Re: [postgis-users] Query to select dangling line segments

Kevin Neufeld Tue, 30 Nov 2010 19:32:56 -0800

Glad you got it working.

Just a little note of warning ... I noticed you are using "UNION", not"UNION ALL". The difference is that the later will perform a simpleconcatenation of the two results sets. The former is kind of like a setunion, returning results that are in the first or second query, removingduplicate rows. In your 8 second query, not a big deal, but in your 3second query, you've include a geometry object. In determining recordequivalence to filter duplicate rows, it is the = operator that isinvoked against two geometry objects, which goes back to comparingbounding box equivalence, not geometry equivalence.


i.e.
SELECT 'POINT(0 0)'::geometry
UNION
SELECT 'POINT(0 0.0000001)'::geometry;
                  geometry
--------------------------------------------
 010100000000000000000000000000000000000000
(1 row)


SELECT 'POINT(0 0)'::geometry
UNION ALL
SELECT 'POINT(0 0.0000001)'::geometry;
                  geometry
--------------------------------------------
 010100000000000000000000000000000000000000
 0101000000000000000000000048AFBC9AF2D77A3E
(2 rows)

Not a big deal, but I just wanted to make sure you knew what washappening with your query.UNION ALL should also be faster as it's not trying to merge two resultssets together.


Cheers,
Kevin


On 11/30/2010 2:34 PM, Brian Stempin wrote:


So, I decided to run with the following query:

SELECT osm_id
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND ST_Intersects(ST_StartPoint(t1.way), t2.way)

)
UNION
SELECT osm_id
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND ST_Intersects(ST_EndPoint(t1.way), t2.way)

)

In my dataset, this takes ~ 8 seconds to run. Being a bit of aperformance junkee, I modified it to the following:


SELECT osm_id, ST_StartPoint(way)
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND t2.way ~ t1.way
   AND ST_Intersects(ST_StartPoint(t1.way), t2.way)

)
UNION
SELECT osm_id, ST_EndPoint(way)
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND t2.way ~ t1.way
   AND ST_Intersects(ST_EndPoint(t1.way), t2.way)

)

Note the addition of the "t2.way ~ t1.way" bit. "A ~ B" means "A'sbounding box contains B's bounding box." Since in my case B is apoint, I eliminate a lot of comparisons by only looking at shapeswho's bounding box B is contained in. This reduced my query time from8.x seconds to 3.4x seconds.


Thanks a ton, Kevin!

Brian

PS -- anyone else spot any other improvements that I can make? I planon writing about this in a blog article as part of a larger piecelater tonight/tomorrow.



_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

Re: [postgis-users] Query to select dangling line segments

Reply via email to