from : http://www.cartoweb.org/downloads/pgdijkstra/README.txt Dijkstra - Shortest path algorithm computation on PostgreSQL REQUIREMENT * Postgresql >= 8 * The boost graph library. See http://www.boost.org/libs/graph/doc/index.html * C++ compiler INSTALLATION * Edit Makefile, and set the BOOST_PATH with the location of your boost library (if you are on Debian, just type "apt-get install libboost-graph-dev" and you don't need to modify anything) * Type "make install" * Execute the sql file dijkstra.sql to install the functions in your database (you need the plpgsql language enabled on your database. Type "createlang plpgsql YOUR_DATABASE" if not) * If you have PostGIS installed, you can launch dijkstra_postgis.sql which will create PostGIS import and manipulation functions. USAGE The core module is a function which computes a shortest path from a set of edges and vertices. The function expects data in a specific format in input. Some functions are provided for importing data from geometric tables, and for generating results as geometries. The shortest_path function has the following signature: CREATE OR REPLACE FUNCTION shortest_path(sql text, source_id integer, target_id integer, directed boolean, has_reverse_cost boolean) RETURNS SETOF path_result Where path_result is: CREATE TYPE path_result AS (vertex_id integer, edge_id integer, cost float8); arguments are: * sql: a SQL query, which should return a set of rows with the following columns: - id: an int4 identifier of the edge - source: an int4 identifier of the source vertex - target: an int4 identifier of the target vertex - cost: an float8 value, of the edge traversal cost. (a negative cost will prevent the edge from being inserted in the graph). - reverse_cost (optional): the cost for the reverse traversal of the edge. This is only used when the directed and has_reverse_cost parameters are true (see the above remark about negative costs). * source_id: the identifier if the starting vertex * target_id: the identifier of the ending vertex * directed: true if the graph is directed * has_reverse_cost: if true, the reverse_cost column of the SQL generated set of rows will be used for the cost of the traversal of the edge in the opposite direction. The function returns a set of rows. There is one row for each crossed edge, and an additional one containing the terminal vertex. The columns of each row are: - step: current step number, going from 0 to the number of edges minus one - vertex_id: the identifier of source vertex of each edge. There is one more row after the last edge, which contains the vertex identifier of the target path. - edge_id: the identifier of the edge crossed - cost: The cost associated to the current edge. It is 0 for the row after the last edge. Thus, the path total cost can be computated using a sum of all rows in the cost column. examples: SELECT * from shortest_path('SELECT source, id, target, cost FROM edges', 3, 7, false, false); vertex_id | edge_id | cost -----------+---------+------ 3 | 2 | 0 4 | 21 | 0 6 | 5 | 0 7 | -1 | 0 (4 rows) The power of SQL can be used for more complex cost computation: SELECT shortest_path('SELECT gid as id, node1_id as source, node2_id as target, coalesce(CASE WHEN gid IN (1956, 123) THEN 12 ELSE weights1.weight END, 99999) as cost FROM lines2 LEFT JOIN weights1 USING (gid)', 12, 78, false, false); GRAPH IMPORTATION The shortest_path function expects edges id and vertices id to be integer ranging from zero to the maximum number of edges or vertices (holes are allowed, but it will be less efficient). However, you may want to compute shortest path on a table which has vertex identifier stored as strings, like in the following example: SELECT * FROM graph1; gid | source_id | target_id | edge_id -----+-----------+-----------+--------- 0 | A | B | 1 | A | C | 2 | D | A | 3 | B | C | (4 rows) A function called "create_graph_tables" is available which will create two tables for edges and vertices. Example: SELECT create_graph_tables('graph1', 'varchar'); The first argument is the name of the table containing the graph, and the second is the type of the source and target vertex identifiers. It will create the following tables: SELECT * FROM graph1_edges; id | source | target | cost | reverse_cost ----+--------+--------+------+-------------- 1 | 1 | 2 | | 2 | 1 | 3 | | 3 | 4 | 1 | | 4 | 2 | 3 | | (4 rows) And SELECT * FROM graph1_vertices; id | geom_id ----+--------- 1 | A 2 | B 3 | C 4 | D (4 rows) Notice the function will also update the edge_id column of graph1: gid | source_id | target_id | edge_id -----+-----------+-----------+--------- 0 | A | B | 1 1 | A | C | 2 2 | D | A | 3 3 | B | C | 4 (4 rows) Then, you can use the shortest_path function, as below: SELECT * FROM shortest_path('SELECT id, source, target, 1::float8 AS cost FROM graph1_edges', (SELECT id FROM graph1_vertices WHERE geom_id = 'A'), (SELECT id FROM graph1_vertices WHERE geom_id = 'C'), false, false); The initial table has to contain the following columns: - gid anyelement: a unique identifier for each edge in you graph - source_id anyelement: an identifier for the starting vertex of the line - target_id anyelement: an identifier for the target vertex of the line (if the graph is not directed, source or target has the same meaning) - edge_id integer: this column will be filled by the allocated edge identifier. All data there will be overwritten, and you need to create this column if it does not exists before. The function "drop_graph_tables" will simply delete the edges and vertices associated tables. Example: SELECT drop_graph_tables('graph1'); POSTGIS GEOMETRIES IMPORTATION Some pl/pgsql functions are available for working with geographical data from PostGis tables. The table containing the graph has to contain the columns described in the previous section, and an additional geometric column called "the_geom" of type MULTILINESTRING. Only the first line in the multiline geomety will be handled. This restriction is because the shp2pgsql tool provided with postgis creates MULTILINESTRING geometric tables for shapefiles containing a set of lines. The importation should however handle more that only the first line in the multi line geometry (see TODO). Here's an example of such a table: SELECT gid, source_id, target_id, astext(the_geom) FROM graph2 LIMIT 4; gid | source_id | target_id | astext -----+-----------+-----------+-------------------------------------------------------------------------------------------- 0 | | | MULTILINESTRING((-0.357902298850575 51.2777105057471,-0.364822129560221 51.455488954023)) 1 | | | MULTILINESTRING((-0.415775862068966 51.6386587816092,-0.478232130809596 51.5784636541729)) 2 | | | MULTILINESTRING((-0.478232130809596 51.5784636541729,-0.382585141804099 51.5791468469515)) 3 | | | MULTILINESTRING((-0.364822129560221 51.455488954023,-0.433824600199901 51.5244914246627)) (4 rows) If the graph table does not contain identifier values in the source_id and target_id columns, a function is able to generate such ids, by extracting the starting and ending points of the line, and generating an unique id, for all points that are in a given distance. Example: SELECT assign_vertex_id('graph2', 0.1); SELECT gid, source_id, target_id, astext(the_geom) FROM graph2 LIMIT 4; gid | source_id | target_id | astext -----+-----------+-----------+-------------------------------------------------------------------------------------------- 0 | 1 | 2 | MULTILINESTRING((-0.357902298850575 51.2777105057471,-0.364822129560221 51.455488954023)) 1 | 3 | 3 | MULTILINESTRING((-0.415775862068966 51.6386587816092,-0.478232130809596 51.5784636541729)) 2 | 3 | 3 | MULTILINESTRING((-0.478232130809596 51.5784636541729,-0.382585141804099 51.5791468469515)) 3 | 2 | 2 | MULTILINESTRING((-0.364822129560221 51.455488954023,-0.433824600199901 51.5244914246627)) Now that the source_id and target_id are filled, the function create_graph_tables() can be used to create the edges and vertices tables (see above for the detailed description of create_graph_tables): SELECT create_graph_tables('graph2', 'int4'); Here's the content of the edges table: SELECT * FROM graph2_edges LIMIT 3; id | source | target | cost | reverse_cost ----+--------+--------+------+-------------- 1 | 1 | 2 | | 2 | 3 | 3 | | 4 | 2 | 2 | | (3 rows) We can see that it contains NULL values for the cost column. The function update_cost_from_distance can update the cost column with the distance of the lines contained in the geometry table, attached to each edge: SELECT update_cost_from_distance('graph2'); The costs are now: SELECT * FROM graph2_edges LIMIT 3; id | source | target | cost | reverse_cost ----+--------+--------+-------------------+-------------- 1 | 1 | 2 | 0.230081516048264 | 2 | 3 | 3 | 0.446760794328524 | 4 | 2 | 2 | 0.174348470878483 | Now, all is set up correctly for using the shortest_path() on these data: SELECT * FROM shortest_path('SELECT id, source, target, cost FROM graph2_edges', 1, 2, false, false); vertex_id | edge_id | cost -----------+---------+------ 1 | 1 | 0 2 | -1 | 0 An additional function shortest_path_as_geometry() can be used to retrieve the result as a set of rows containing the geometry identifier and the geometry itself: SELECT gid, astext(the_geom) FROM shortest_path_as_geometry('graph2', 1, 2); gid | astext -----+-------------------------------------------------------------------------------------------- 0 | MULTILINESTRING((-0.357902298850575 51.2777105057471,-0.364822129560221 51.455488954023)) 22 | MULTILINESTRING((-0.417298850574714 51.3371070574713,-0.408546467391305 51.3885360617191)) (2 rows) MAPSERVER INTEGRATION The function shortest_path_as_geometry() can be used inside mapserver to draw shortest path directly, as in the following example: LAYER NAME "europe2" TYPE LINE STATUS DEFAULT CONNECTIONTYPE postgis CONNECTION "user=postgres host=localhost dbname=geo" DATA "the_geom from (SELECT the_geom, gid from shortest_path_as_geometry('bahnlinien_europa_polyline', 2629, 10171)) as foo using unique gid using srid=-1" TEMPLATE "t" CLASS NAME "0" STYLE SYMBOL "circle" SIZE 10 COLOR 50 50 100 END END END Notice however, that this function will be called at each map display, computing the shortest path every time. A better approach would be to generate the shortest path in a temporary table. LIMITATION / TODO Usage of the Boost graph library can certainly be optimised. Help from Boost/STL experts is welcomed. Might not work on very large datasets due to memory requirements. (Tested with sucess on a 2 million edges table). Some function don't handle both directed and not directed path. Some more useful wrapper functions could be added. LICENCE Same as CartoWeb (GPL) AUTHORS Sylvain Pasche Many thanks to Jocelyn Turcotte for new features and fixes