Re: [postgis-users] speeding up simple pt-in-poly lookups
It is, and it's counter-intuitive given our expected performance from prepared geometries. This could be a side-effect of our required cache coherence test that sits in front of the actual prepared test. P. On Thu, Dec 22, 2011 at 8:55 PM, Martin Davis wrote: > Interesting that using ST_Simplify makes such a big difference. > > Would it be possible for you to post the sizes of the target geometries > before and after simplification/buffering? This might help in determining a > better solution for the future. > > > On 12/21/2011 12:02 PM, Puneet Kishor wrote: >> >> Thanks everyone. Learned a lot. For now, I have decided to not go down the >> chop-and-reconstruct path. The following query does the job in 1.8 secs >> which is a pretty nice improvement over the previous 200+ seconds. >> >> SELECT Count(*) >> FROM collections c >> WHERE ST_Intersects( >> c.the_geom, ( >> SELECT ST_Buffer(ST_Simplify(n.the_geom, 0.75), 0.75) >> FROM base.continents n >> WHERE n.abbrev = 'NA' >> ) >> ); >> >> >> On Dec 21, 2011, at 11:38 AM, pcr...@pcreso.com wrote: >> >>> Puneet, >>> >>> Chopping polygons is pretty simple, with a grid& st_intersection(), but >>> you can certainly generalise polygons to reduce the number of vertices& >>> size of objects to de-toast... beware however that if you do this then you >>> are actually moving the polygon boundary,& therefore a point very near a >>> boundary may be inside the original country polygon but outside the >>> generalised/simplified one. >>> >>> >>> You can address this by simplifying a buffer of the polygons, with the >>> buffer very slightly larger than the simplify distance, so that every >>> simplified version fully contains the original, but you will also have to >>> check against the original polygons to confirm the point is genuinely inside >>> the original. >>> >>> As an alternative approach, you might also try selecting points where the >>> distance from a polygon is zero, as the ST_distance uses stabbing line >>> algorithm, and may be faster. The distance will be non-zero only for points >>> outside the polygon. >>> >>> Cheers, >>> >>> Brent Wood >>> >>> On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: >>> On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > For more detail check out this thread on the same issue: > > > http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html Thanks. Chopping up my coverage into hundreds of small regions is the last avenue I want to try. Going by the text of that email, it seems that "few, large, regions with many vertices (may be) the problem." I will try generalizing my continents so that I have "few, large regions with *very few* vertices" and see if that speeds up the SELECTs. > On 12/20/2011 5:28 PM, Puneet Kishor wrote: >> >> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >> >>> Chop up the continents into smaller pieces. >>> >> hmmm... I am not sure I understand the above. And then what? UNION >> each smaller piece query? >> >> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor >>> wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" " Join Filter: _st_intersects(c.the_geom, n.the_geom)" " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" " -> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor >> >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users >> >> >> - >> No virus found in this message. >> Checked by AVG - www.avg.com >> Version: 2012.0.1890 / Virus Database: 2109/4694 - Release Date: 12/21/11 >> >> >
Re: [postgis-users] speeding up simple pt-in-poly lookups
On Dec 22, 2011, at 11:09 PM, Martin Davis wrote: > That's what I meant by "size'. That's quite a difference! And would > definitely explain the performance improvement. > > One other thing I'm curious about. Is the coordinate system is lat/long? In > which case the buffer distance of 0.75 is quite large, isn't it? Is that > going to possibly produce incorrect results? Yes, it is lat/lng, and yes, the buffer is rather large. But I viewed the buffered geometry in QGIS, and it is fine for my work. It coalesces a few islands here and there making my map look like the Michelin tire man, but I don't care about the outline. I just want to grab the points, and, for now, speed is winning over too-much-accuracy. The nice thing with this approach is, I don't have to create a new geometry (which I would have to do if I were to chop up the continent), so I can experiment with different simplify and buffer tolerances till things are correct. > > On 12/22/2011 9:04 PM, Mr. Puneet Kishor wrote: >> >> What do you mean by "size"? If you mean "number of points," the original >> geometry (North America continent) had circa 50K points, and the simplified >> geometry had around 2000, so an order of magnitude difference. I am speaking >> from memory because I am not in front of my db server. Will check again >> tomorrow, and repost if the numbers are drastically different from above. >> >> > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
That's what I meant by "size'. That's quite a difference! And would definitely explain the performance improvement. One other thing I'm curious about. Is the coordinate system is lat/long? In which case the buffer distance of 0.75 is quite large, isn't it? Is that going to possibly produce incorrect results? On 12/22/2011 9:04 PM, Mr. Puneet Kishor wrote: What do you mean by "size"? If you mean "number of points," the original geometry (North America continent) had circa 50K points, and the simplified geometry had around 2000, so an order of magnitude difference. I am speaking from memory because I am not in front of my db server. Will check again tomorrow, and repost if the numbers are drastically different from above. ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
You may well be right that for this particular case of testing polygon/point intersection it's efficient to use a distance function, since that uses the simple stabbing line point-in-polygon algorithm (whereas ST_Intersects has much more complicated code behind it, which may not be fully optimized). It is probably better to use ST_DWithin(geom1, geom2, 0) rather than ST_Distance(geom1, geom2) = 0, since ST_DWithin can optimize the case where points lie outside the query polygon. When checking intersection, the actual distance is irrelevant (and is expensive to compute when the point lies outside the polygon. And note that using distance functions stops being efficient when testing things that aren't points (such as linestrings or polygons), since this uses a brute-force O(n^2) algorithm. On 12/21/2011 9:38 AM, pcr...@pcreso.com wrote: Puneet, Chopping polygons is pretty simple, with a grid & st_intersection(), but you can certainly generalise polygons to reduce the number of vertices & size of objects to de-toast... beware however that if you do this then you are actually moving the polygon boundary, & therefore a point very near a boundary may be inside the original country polygon but outside the generalised/simplified one. You can address this by simplifying a buffer of the polygons, with the buffer very slightly larger than the simplify distance, so that every simplified version fully contains the original, but you will also have to check against the original polygons to confirm the point is genuinely inside the original. As an alternative approach, you might also try selecting points where the distance from a polygon is zero, as the ST_distance uses stabbing line algorithm, and may be faster. The distance will be non-zero only for points outside the polygon. Cheers, Brent Wood On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: > > On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > >> For more detail check out this thread on the same issue: >> >> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html > > > Thanks. Chopping up my coverage into hundreds of small regions is the last avenue I want to try. Going by the text of that email, it seems that "few, large, regions with many vertices (may be) the problem." I will try generalizing my continents so that I have "few, large regions with *very few* vertices" and see if that speeds up the SELECTs. > > >> >> On 12/20/2011 5:28 PM, Puneet Kishor wrote: >>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >>> Chop up the continents into smaller pieces. >>> >>> hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? >>> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor> wrote: > This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). > > SELECT Count(c_id) > FROM c, continents n > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > > Both tables have gist indexes on the geometries. The above query has the following plan > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > "Join Filter: _st_intersects(c.the_geom, n.the_geom)" > "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" > " Filter: ((continent)::text = 'North America'::text)" > "-> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" > " Index Cond: (c.the_geom&& n.the_geom)" > > The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. > > > > -- > Puneet Kishor >>> ___ >>> postgis-users mailing list >>> postgis-users@postgis.refractions.net >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> - >>> No virus found in this message. >>> Checked by AVG - www.avg.com >>> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 >>> >>> >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > ___ > postgis-users mailing list > postgis-users@postgis.refra
Re: [postgis-users] speeding up simple pt-in-poly lookups
On Dec 22, 2011, at 10:55 PM, Martin Davis wrote: > Interesting that using ST_Simplify makes such a big difference. > > Would it be possible for you to post the sizes of the target geometries > before and after simplification/buffering? What do you mean by "size"? If you mean "number of points," the original geometry (North America continent) had circa 50K points, and the simplified geometry had around 2000, so an order of magnitude difference. I am speaking from memory because I am not in front of my db server. Will check again tomorrow, and repost if the numbers are drastically different from above. > This might help in determining a better solution for the future. > > On 12/21/2011 12:02 PM, Puneet Kishor wrote: >> Thanks everyone. Learned a lot. For now, I have decided to not go down the >> chop-and-reconstruct path. The following query does the job in 1.8 secs >> which is a pretty nice improvement over the previous 200+ seconds. >> >> SELECT Count(*) >> FROM collections c >> WHERE ST_Intersects( >> c.the_geom, ( >> SELECT ST_Buffer(ST_Simplify(n.the_geom, 0.75), 0.75) >> FROM base.continents n >> WHERE n.abbrev = 'NA' >> ) >> ); >> >> >> On Dec 21, 2011, at 11:38 AM, pcr...@pcreso.com wrote: >> >>> Puneet, >>> >>> Chopping polygons is pretty simple, with a grid& st_intersection(), but >>> you can certainly generalise polygons to reduce the number of vertices& >>> size of objects to de-toast... beware however that if you do this then you >>> are actually moving the polygon boundary,& therefore a point very near a >>> boundary may be inside the original country polygon but outside the >>> generalised/simplified one. >>> >>> You can address this by simplifying a buffer of the polygons, with the >>> buffer very slightly larger than the simplify distance, so that every >>> simplified version fully contains the original, but you will also have to >>> check against the original polygons to confirm the point is genuinely >>> inside the original. >>> >>> As an alternative approach, you might also try selecting points where the >>> distance from a polygon is zero, as the ST_distance uses stabbing line >>> algorithm, and may be faster. The distance will be non-zero only for points >>> outside the polygon. >>> >>> Cheers, >>> >>> Brent Wood >>> >>> On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: >>> On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > For more detail check out this thread on the same issue: > > http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html Thanks. Chopping up my coverage into hundreds of small regions is the last avenue I want to try. Going by the text of that email, it seems that "few, large, regions with many vertices (may be) the problem." I will try generalizing my continents so that I have "few, large regions with *very few* vertices" and see if that speeds up the SELECTs. > On 12/20/2011 5:28 PM, Puneet Kishor wrote: >> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >> >>> Chop up the continents into smaller pieces. >>> >> hmmm... I am not sure I understand the above. And then what? UNION each >> smaller piece query? >> >> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor >>> wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " ->Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "->Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" "->Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users >> >> >> - >> No virus found in this message
Re: [postgis-users] speeding up simple pt-in-poly lookups
Interesting that using ST_Simplify makes such a big difference. Would it be possible for you to post the sizes of the target geometries before and after simplification/buffering? This might help in determining a better solution for the future. On 12/21/2011 12:02 PM, Puneet Kishor wrote: Thanks everyone. Learned a lot. For now, I have decided to not go down the chop-and-reconstruct path. The following query does the job in 1.8 secs which is a pretty nice improvement over the previous 200+ seconds. SELECT Count(*) FROM collections c WHERE ST_Intersects( c.the_geom, ( SELECT ST_Buffer(ST_Simplify(n.the_geom, 0.75), 0.75) FROM base.continents n WHERE n.abbrev = 'NA' ) ); On Dec 21, 2011, at 11:38 AM, pcr...@pcreso.com wrote: Puneet, Chopping polygons is pretty simple, with a grid& st_intersection(), but you can certainly generalise polygons to reduce the number of vertices& size of objects to de-toast... beware however that if you do this then you are actually moving the polygon boundary,& therefore a point very near a boundary may be inside the original country polygon but outside the generalised/simplified one. You can address this by simplifying a buffer of the polygons, with the buffer very slightly larger than the simplify distance, so that every simplified version fully contains the original, but you will also have to check against the original polygons to confirm the point is genuinely inside the original. As an alternative approach, you might also try selecting points where the distance from a polygon is zero, as the ST_distance uses stabbing line algorithm, and may be faster. The distance will be non-zero only for points outside the polygon. Cheers, Brent Wood On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: For more detail check out this thread on the same issue: http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html Thanks. Chopping up my coverage into hundreds of small regions is the last avenue I want to try. Going by the text of that email, it seems that "few, large, regions with many vertices (may be) the problem." I will try generalizing my continents so that I have "few, large regions with *very few* vertices" and see if that speeds up the SELECTs. On 12/20/2011 5:28 PM, Puneet Kishor wrote: On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: Chop up the continents into smaller pieces. hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " ->Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "->Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" "->Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users - No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.1890 / Virus Database: 2109/4694 - Release Date: 12/21/11 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
> Chopping polygons is pretty simple, with a grid & st_intersection() Just for interest, I posted a method for gridding/chopping polygons the other day in a set of comparative posting with SQL Server and Oracle.http://www.spatialdbadvisor.com/postgis_tips_tricks/258/gridding-a-geometric-objectThere may be faster ways of doing this, but my attempt was about functionality not speed. Also, the approach here is not optimized for a singlepolygon: it is designed to use the same grid no matter how many polygons etc are put into the chopper.regardsSimon-- Holder of "2011 Oracle Spatial Excellence Award for Education and Research."SpatialDB Advice and Design, Solutions Architecture and Programming,Oracle Database 10g Administrator Certified Associate; Oracle Database 10g SQL Certified ProfessionalOracle Spatial, SQL Server, PostGIS, MySQL, ArcSDE, Manifold GIS, FME, Radius Topology and Studio Specialist.39 Cliff View Drive, Allens Rivulet, 7150, Tasmania, Australia.Website: www.spatialdbadvisor.com Email: si...@spatialdbadvisor.com Voice: +61 362 396397Mobile: +61 418 396391Skype: sggreenerLongitude: 147.20515 (147° 12' 18" E)Latitude: -43.01530 (43° 00' 55" S)GeoHash: r22em9r98wgNAC:W80CK 7SWP3___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Thanks everyone. Learned a lot. For now, I have decided to not go down the chop-and-reconstruct path. The following query does the job in 1.8 secs which is a pretty nice improvement over the previous 200+ seconds. SELECT Count(*) FROM collections c WHERE ST_Intersects( c.the_geom, ( SELECT ST_Buffer(ST_Simplify(n.the_geom, 0.75), 0.75) FROM base.continents n WHERE n.abbrev = 'NA' ) ); On Dec 21, 2011, at 11:38 AM, pcr...@pcreso.com wrote: > Puneet, > > Chopping polygons is pretty simple, with a grid & st_intersection(), but you > can certainly generalise polygons to reduce the number of vertices & size of > objects to de-toast... beware however that if you do this then you are > actually moving the polygon boundary, & therefore a point very near a > boundary may be inside the original country polygon but outside the > generalised/simplified one. > > You can address this by simplifying a buffer of the polygons, with the buffer > very slightly larger than the simplify distance, so that every simplified > version fully contains the original, but you will also have to check against > the original polygons to confirm the point is genuinely inside the original. > > As an alternative approach, you might also try selecting points where the > distance from a polygon is zero, as the ST_distance uses stabbing line > algorithm, and may be faster. The distance will be non-zero only for points > outside the polygon. > > Cheers, > > Brent Wood > > On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: > >> >> On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: >> >>> For more detail check out this thread on the same issue: >>> >>> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html >> >> >> Thanks. Chopping up my coverage into hundreds of small regions is the last >> avenue I want to try. Going by the text of that email, it seems that "few, >> large, regions with many vertices (may be) the problem." I will try >> generalizing my continents so that I have "few, large regions with *very >> few* vertices" and see if that speeds up the SELECTs. >> >> >>> >>> On 12/20/2011 5:28 PM, Puneet Kishor wrote: On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > Chop up the continents into smaller pieces. > hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? > On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor > wrote: >> This is probably a really basic question... my ST_Within or >> ST_Intersects selecting points in a continent are way too slow (both >> take upward of 200 secs). >> >>SELECT Count(c_id) >>FROM c, continents n >>WHERE ST_Intersects(c.the_geom, n.the_geom) AND >>n.continent = 'North America'; >> >> >> Both tables have gist indexes on the geometries. The above query has the >> following plan >> >> "Aggregate (cost=9.66..9.67 rows=1 width=4)" >> " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" >> "Join Filter: _st_intersects(c.the_geom, n.the_geom)" >> "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 >> width=32)" >> " Filter: ((continent)::text = 'North America'::text)" >> "-> Index Scan using pbdb__collections_the_geom on collections >> c (cost=0.00..8.30 rows=1 width=104)" >> " Index Cond: (c.the_geom&& n.the_geom)" >> >> The table c has approx 120K rows, and the continents table has 8 >> rows.Suggestions on how I can improve this? Yes, the computer is >> otherwise very swift and modern. >> >> >> >> -- >> Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Puneet, Chopping polygons is pretty simple, with a grid & st_intersection(), but you can certainly generalise polygons to reduce the number of vertices & size of objects to de-toast... beware however that if you do this then you are actually moving the polygon boundary, & therefore a point very near a boundary may be inside the original country polygon but outside the generalised/simplified one. You can address this by simplifying a buffer of the polygons, with the buffer very slightly larger than the simplify distance, so that every simplified version fully contains the original, but you will also have to check against the original polygons to confirm the point is genuinely inside the original. As an alternative approach, you might also try selecting points where the distance from a polygon is zero, as the ST_distance uses stabbing line algorithm, and may be faster. The distance will be non-zero only for points outside the polygon. Cheers, Brent Wood On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: > > On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > >> For more detail check out this thread on the same issue: >> >> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html > > > Thanks. Chopping up my coverage into hundreds of small regions is the last > avenue I want to try. Going by the text of that email, it seems that "few, > large, regions with many vertices (may be) the problem." I will try > generalizing my continents so that I have "few, large regions with *very few* > vertices" and see if that speeds up the SELECTs. > > >> >> On 12/20/2011 5:28 PM, Puneet Kishor wrote: >>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >>> Chop up the continents into smaller pieces. >>> >>> hmmm... I am not sure I understand the above. And then what? UNION each >>> smaller piece query? >>> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: > This is probably a really basic question... my ST_Within or ST_Intersects > selecting points in a continent are way too slow (both take upward of 200 > secs). > > SELECT Count(c_id) > FROM c, continents n > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > > Both tables have gist indexes on the geometries. The above query has the > following plan > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > " Join Filter: _st_intersects(c.the_geom, n.the_geom)" > " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" > " Filter: ((continent)::text = 'North America'::text)" > " -> Index Scan using pbdb__collections_the_geom on collections > c (cost=0.00..8.30 rows=1 width=104)" > " Index Cond: (c.the_geom&& n.the_geom)" > > The table c has approx 120K rows, and the continents table has 8 > rows.Suggestions on how I can improve this? Yes, the computer is > otherwise very swift and modern. > > > > -- > Puneet Kishor >>> ___ >>> postgis-users mailing list >>> postgis-users@postgis.refractions.net >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> - >>> No virus found in this message. >>> Checked by AVG - www.avg.com >>> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 >>> >>> >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Puneet, Trying to find which country a large set of points was in we have actually found it much quicker to find points in provinces (smaller polygons) and then return the country code associated with the province. No chopping anything, so I would guess you could use a world map, allocate a continent to each country in the world map and query it. The speed up we saw was (I guess) for the same reason - the bbox was efficient. cheers Ben On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote: > > On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > >> For more detail check out this thread on the same issue: >> >> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html > > > Thanks. Chopping up my coverage into hundreds of small regions is the last > avenue I want to try. Going by the text of that email, it seems that "few, > large, regions with many vertices (may be) the problem." I will try > generalizing my continents so that I have "few, large regions with *very few* > vertices" and see if that speeds up the SELECTs. > > >> >> On 12/20/2011 5:28 PM, Puneet Kishor wrote: >>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >>> Chop up the continents into smaller pieces. >>> >>> hmmm... I am not sure I understand the above. And then what? UNION each >>> smaller piece query? >>> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: > This is probably a really basic question... my ST_Within or ST_Intersects > selecting points in a continent are way too slow (both take upward of 200 > secs). > > SELECT Count(c_id) > FROM c, continents n > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > > Both tables have gist indexes on the geometries. The above query has the > following plan > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > "Join Filter: _st_intersects(c.the_geom, n.the_geom)" > "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" > " Filter: ((continent)::text = 'North America'::text)" > "-> Index Scan using pbdb__collections_the_geom on collections > c (cost=0.00..8.30 rows=1 width=104)" > " Index Cond: (c.the_geom&& n.the_geom)" > > The table c has approx 120K rows, and the continents table has 8 > rows.Suggestions on how I can improve this? Yes, the computer is > otherwise very swift and modern. > > > > -- > Puneet Kishor >>> ___ >>> postgis-users mailing list >>> postgis-users@postgis.refractions.net >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> - >>> No virus found in this message. >>> Checked by AVG - www.avg.com >>> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 >>> >>> >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Martin, Stephen In intersect testing the winding number algorithm is used. Stabbing line algorithm is used in distance calculations in PostGIS. Puneet Is it geometry type you are using or is it geography. I have no idea about the algorithms in geography functions. It sounds very much with 200 s, your continent polygons must be very detailed with many vertex points. What do you get from SELECT ST_NPoints(the_geom) from continents; ? /Nicklas On Tue, 2011-12-20 at 23:15 -0800, Martin Davis wrote: > Yes, this so-called stabbing line algorithm for Point-In-Polygon is > already implemented in the GEOS PreparedGeometry code used by PostGIS > (which I assume is being invoked in this query - Paul, can you > confirm?) In fact, the PreparedGeometry algorithm goes one better and > uses a 1-D spatial index on the edges in order to quickly discard edges > which cannot intersect the stabbing line (the standard algorithm still > requires an O(n) scan of all the edges) > > However, I think I'm correct in saying that even though the > PreparedGeometry is cached in memory it is still necessary to detoast > the large continent geometry from disk into memory for each point which > is being queried, in order to confirm the cache hit. This is due to the > way the Postgres query engine works. The detoasting dominates the time, > even though the actual Point-In-Polygon test is pretty efficient. (Paul > can perhaps confirm this too). > > It would be nice to find some more efficient strategy for testing for a > positive cache hit > > On 12/20/2011 9:54 PM, Stephen Woodbridge wrote: > > Hi Martin, > > > > There is a very fast algorithm for testing point in polygon by > > creating a vector from the point to another point beyond the extents > > of the polygon and counting the number of edge crossings. If the count > > is odd then the point is in the polygon and if the count is even it is > > not. If the vector is horizontal you can very rapidly reject all edges > > above or below the vector and all edges east or west of the vector. > > Then a simple intersection test on each of the remain edges will > > determine the count. > > > > Given your extensive knowledge in computational geometry, I assume you > > already are aware of this algorithm and it is not suitable for some > > reason. But since this question on performance comes up pretty often, > > it might have a place as long as people know what the potential > > limitations are. > > > > -Steve > > > > On 12/20/2011 10:48 PM, Martin Davis wrote: > >> For more detail check out this thread on the same issue: > >> > >> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html > >> > >> > >> > >> > >> On 12/20/2011 5:28 PM, Puneet Kishor wrote: > >>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > >>> > Chop up the continents into smaller pieces. > > >>> > >>> hmmm... I am not sure I understand the above. And then what? UNION > >>> each smaller piece query? > >>> > >>> > On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor > wrote: > > This is probably a really basic question... my ST_Within or > > ST_Intersects selecting points in a continent are way too slow (both > > take upward of 200 secs). > > > > SELECT Count(c_id) > > FROM c, continents n > > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > > n.continent = 'North America'; > > > > > > Both tables have gist indexes on the geometries. The above query has > > the following plan > > > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > > " Join Filter: _st_intersects(c.the_geom, n.the_geom)" > > " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" > > " Filter: ((continent)::text = 'North America'::text)" > > " -> Index Scan using pbdb__collections_the_geom on collections c > > (cost=0.00..8.30 rows=1 width=104)" > > " Index Cond: (c.the_geom&& n.the_geom)" > > > > The table c has approx 120K rows, and the continents table has 8 > > rows.Suggestions on how I can improve this? Yes, the computer is > > otherwise very swift and modern. > > > > > > > > -- > > Puneet Kishor > >>> ___ > >>> postgis-users mailing list > >>> postgis-users@postgis.refractions.net > >>> http://postgis.refractions.net/mailman/listinfo/postgis-users > >>> > >>> > >>> - > >>> No virus found in this message. > >>> Checked by AVG - www.avg.com > >>> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: > >>> 12/20/11 > >>> > >>> > >> ___ > >> postgis-users mailing list > >> postgis-users@postgis.refractions.net > >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > ___ > > postgis-users mailing list > > postgis-users@postgis.refractio
Re: [postgis-users] speeding up simple pt-in-poly lookups
Yes, this so-called stabbing line algorithm for Point-In-Polygon is already implemented in the GEOS PreparedGeometry code used by PostGIS (which I assume is being invoked in this query - Paul, can you confirm?) In fact, the PreparedGeometry algorithm goes one better and uses a 1-D spatial index on the edges in order to quickly discard edges which cannot intersect the stabbing line (the standard algorithm still requires an O(n) scan of all the edges) However, I think I'm correct in saying that even though the PreparedGeometry is cached in memory it is still necessary to detoast the large continent geometry from disk into memory for each point which is being queried, in order to confirm the cache hit. This is due to the way the Postgres query engine works. The detoasting dominates the time, even though the actual Point-In-Polygon test is pretty efficient. (Paul can perhaps confirm this too). It would be nice to find some more efficient strategy for testing for a positive cache hit On 12/20/2011 9:54 PM, Stephen Woodbridge wrote: Hi Martin, There is a very fast algorithm for testing point in polygon by creating a vector from the point to another point beyond the extents of the polygon and counting the number of edge crossings. If the count is odd then the point is in the polygon and if the count is even it is not. If the vector is horizontal you can very rapidly reject all edges above or below the vector and all edges east or west of the vector. Then a simple intersection test on each of the remain edges will determine the count. Given your extensive knowledge in computational geometry, I assume you already are aware of this algorithm and it is not suitable for some reason. But since this question on performance comes up pretty often, it might have a place as long as people know what the potential limitations are. -Steve On 12/20/2011 10:48 PM, Martin Davis wrote: For more detail check out this thread on the same issue: http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html On 12/20/2011 5:28 PM, Puneet Kishor wrote: On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: Chop up the continents into smaller pieces. hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" " Join Filter: _st_intersects(c.the_geom, n.the_geom)" " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" " -> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users - No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users - No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Try this: Create a table with chopped up polygons derived from your continents table. Say you have continents (continent_id, name, the_geom) then you have continents_chopped(chop_id, continent_id, continent_name, the_geom) Spatial index on continents_chopped.polygon SELECT Count(distinct continent_id) FROM c, continents_chopped n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; the BB index will very quickly identify a few chopped polygons which contain the point, then the much slower test against all the vertices in the continent polygons is carried out against a few chopped polygons with vastly fewer vertices, making for a much faster operation. This, from an email I sent a colleague a few weeks ago may be helpful: As far as an easy way to generate a grid of specified size cells over a specified area, see: http://trac.osgeo.org/postgis/wiki/UsersWikiCreateFishnet Thus allowing arbitrary slicing/binning of polygons via a simple overlay operation - ST_Intersection() This can be wrapped up in a shell script which generates the grid, as below: #! /bin/bash # script to generate an arbitrary grid # test only # B Wood 12/11/2011 DB=test_grid # create db dropdb $DB createdb $DB psql -d $DB -qf /usr/share/postgresql/9.1/contrib/postgis-1.5/postgis.sql psql -d $DB -qf /usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql # create function psql -d $DB -c "CREATE OR REPLACE FUNCTION ST_CreateGrid( nrow integer, ncol integer, xsize float8, ysize float8, x0 float8 DEFAULT 0, y0 float8 DEFAULT 0, srid integer DEFAULT 4326) RETURNS SETOF geometry AS \$\$ SELECT ST_Translate(cell, j * \$3 + \$5, i * \$4 + \$6) FROM generate_series(0, \$1 - 1) AS i, generate_series(0, \$2 - 1) AS j, ( SELECT setsrid(('POLYGON((0 0, 0 '||\$4||', '||\$3||' '||\$4||', '||\$3||' 0,0 0))')::geometry, \$7) AS cell ) AS foo; \$\$ LANGUAGE sql IMMUTABLE STRICT;" # create table psql -d $DB -c "create table test_grid ( id serial primary key);" psql -d $DB -c "select addgeometrycolumn('','test_grid','geom',4326,'POLYGON',2);" # create lat/long grid # set extent W=150 E=190 S=-55 N=-30 SRID=4326 # set cell size in degrees SIZE=0.05 # get cell counts for extent NX=`echo "($E - $W) / $SIZE" | bc` NY=`echo "($N - $S) / $SIZE" | bc | tr -d "-"` # create grid psql -d $DB -c "delete from test_grid;" psql -d $DB -c "insert into test_grid (geom) select ST_CreateGrid($NY,$NX,$SIZE,$SIZE,$W.0,$S.0,$SRID);" --- On Wed, 12/21/11, Puneet Kishor wrote: From: Puneet Kishor Subject: Re: [postgis-users] speeding up simple pt-in-poly lookups To: "PostGIS Users Discussion" Date: Wednesday, December 21, 2011, 2:28 PM On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > Chop up the continents into smaller pieces. > hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? > On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: >> This is probably a really basic question... my ST_Within or ST_Intersects >> selecting points in a continent are way too slow (both take upward of 200 >> secs). >> >> SELECT Count(c_id) >> FROM c, continents n >> WHERE ST_Intersects(c.the_geom, n.the_geom) AND >> n.continent = 'North America'; >> >> >> Both tables have gist indexes on the geometries. The above query has the >> following plan >> >> "Aggregate (cost=9.66..9.67 rows=1 width=4)" >> " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" >> " Join Filter: _st_intersects(c.the_geom, n.the_geom)" >> " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" >> " Filter: ((continent)::text = 'North America'::text)" >> " -> Index Scan using pbdb__collections_the_geom on collections c >> (cost=0.00..8.30 rows=1 width=104)" >> " Index Cond: (c.the_geom && n.the_geom)" >> >> The table c has approx 120K rows, and the continents table has 8 >> rows.Suggestions on how I can improve this? Yes, the computer is otherwise >> very swift and modern. >> >> >> >> -- >> Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Hi Martin, There is a very fast algorithm for testing point in polygon by creating a vector from the point to another point beyond the extents of the polygon and counting the number of edge crossings. If the count is odd then the point is in the polygon and if the count is even it is not. If the vector is horizontal you can very rapidly reject all edges above or below the vector and all edges east or west of the vector. Then a simple intersection test on each of the remain edges will determine the count. Given your extensive knowledge in computational geometry, I assume you already are aware of this algorithm and it is not suitable for some reason. But since this question on performance comes up pretty often, it might have a place as long as people know what the potential limitations are. -Steve On 12/20/2011 10:48 PM, Martin Davis wrote: For more detail check out this thread on the same issue: http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html On 12/20/2011 5:28 PM, Puneet Kishor wrote: On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: Chop up the continents into smaller pieces. hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" " Join Filter: _st_intersects(c.the_geom, n.the_geom)" " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" " -> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users - No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
On Dec 20, 2011, at 9:48 PM, Martin Davis wrote: > For more detail check out this thread on the same issue: > > http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html Thanks. Chopping up my coverage into hundreds of small regions is the last avenue I want to try. Going by the text of that email, it seems that "few, large, regions with many vertices (may be) the problem." I will try generalizing my continents so that I have "few, large regions with *very few* vertices" and see if that speeds up the SELECTs. > > On 12/20/2011 5:28 PM, Puneet Kishor wrote: >> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >> >>> Chop up the continents into smaller pieces. >>> >> >> hmmm... I am not sure I understand the above. And then what? UNION each >> smaller piece query? >> >> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" "-> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users >> >> >> - >> No virus found in this message. >> Checked by AVG - www.avg.com >> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 >> >> > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
For more detail check out this thread on the same issue: http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html On 12/20/2011 5:28 PM, Puneet Kishor wrote: On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: Chop up the continents into smaller pieces. hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" "-> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom&& n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users - No virus found in this message. Checked by AVG - www.avg.com Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11 ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Ah, I see...I totally missed that last line. Sorry. On Tue, Dec 20, 2011 at 8:49 PM, Puneet Kishor wrote: > > On Dec 20, 2011, at 7:44 PM, Brian Stempin wrote: > > > I've been out of the loop for a bit, but try this: > > > > SELECT Count(c_id) > > FROM c, continents n > > WHERE c.the_geom && n.the_geom AND > > ST_Intersects(c.the_geom, n.the_geom) AND > > n.continent = 'North America'; > > > > The key line being: > > c.the_geom && n.the_geom > > > > > If you see the EXPLAIN QUERY PLAN I posted (see below), ST_Intersects > automatically does an && check. Nevertheless, I tried your specific > suggestion above, and it took 201 seconds. No joy. > > > > > > Check out item 7.7: > > http://postgis.refractions.net/documentation/manual-1.5/reference.html > > > > The && uses the bounding boxes to check if they overlap. This is a very > > quick operation that might eliminate the need to further examine if two > > geoms are intersecting. > > > > HTH, > > Brian > > > > On Tue, Dec 20, 2011 at 8:28 PM, Puneet Kishor > wrote: > > > >> > >> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > >> > >>> Chop up the continents into smaller pieces. > >>> > >> > >> > >> hmmm... I am not sure I understand the above. And then what? UNION each > >> smaller piece query? > >> > >> > >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor > >> wrote: > This is probably a really basic question... my ST_Within or > >> ST_Intersects selecting points in a continent are way too slow (both > take > >> upward of 200 secs). > > SELECT Count(c_id) > FROM c, continents n > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > > Both tables have gist indexes on the geometries. The above query has > >> the following plan > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > "Join Filter: _st_intersects(c.the_geom, n.the_geom)" > "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 > >> width=32)" > " Filter: ((continent)::text = 'North America'::text)" > "-> Index Scan using pbdb__collections_the_geom on > collections > >> c (cost=0.00..8.30 rows=1 width=104)" > " Index Cond: (c.the_geom && n.the_geom)" > > The table c has approx 120K rows, and the continents table has 8 > >> rows.Suggestions on how I can improve this? Yes, the computer is > otherwise > >> very swift and modern. > > > > -- > Puneet Kishor > >> > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
On Dec 20, 2011, at 7:44 PM, Brian Stempin wrote: > I've been out of the loop for a bit, but try this: > > SELECT Count(c_id) > FROM c, continents n > WHERE c.the_geom && n.the_geom AND > ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > The key line being: > c.the_geom && n.the_geom > If you see the EXPLAIN QUERY PLAN I posted (see below), ST_Intersects automatically does an && check. Nevertheless, I tried your specific suggestion above, and it took 201 seconds. No joy. > Check out item 7.7: > http://postgis.refractions.net/documentation/manual-1.5/reference.html > > The && uses the bounding boxes to check if they overlap. This is a very > quick operation that might eliminate the need to further examine if two > geoms are intersecting. > > HTH, > Brian > > On Tue, Dec 20, 2011 at 8:28 PM, Puneet Kishor wrote: > >> >> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: >> >>> Chop up the continents into smaller pieces. >>> >> >> >> hmmm... I am not sure I understand the above. And then what? UNION each >> smaller piece query? >> >> >>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor >> wrote: This is probably a really basic question... my ST_Within or >> ST_Intersects selecting points in a continent are way too slow (both take >> upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has >> the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 >> width=32)" " Filter: ((continent)::text = 'North America'::text)" "-> Index Scan using pbdb__collections_the_geom on collections >> c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom && n.the_geom)" The table c has approx 120K rows, and the continents table has 8 >> rows.Suggestions on how I can improve this? Yes, the computer is otherwise >> very swift and modern. -- Puneet Kishor >> ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
I've been out of the loop for a bit, but try this: SELECT Count(c_id) FROM c, continents n WHERE c.the_geom && n.the_geom AND ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; The key line being: c.the_geom && n.the_geom Check out item 7.7: http://postgis.refractions.net/documentation/manual-1.5/reference.html The && uses the bounding boxes to check if they overlap. This is a very quick operation that might eliminate the need to further examine if two geoms are intersecting. HTH, Brian On Tue, Dec 20, 2011 at 8:28 PM, Puneet Kishor wrote: > > On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > > > Chop up the continents into smaller pieces. > > > > > hmmm... I am not sure I understand the above. And then what? UNION each > smaller piece query? > > > > On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor > wrote: > >> This is probably a really basic question... my ST_Within or > ST_Intersects selecting points in a continent are way too slow (both take > upward of 200 secs). > >> > >>SELECT Count(c_id) > >>FROM c, continents n > >>WHERE ST_Intersects(c.the_geom, n.the_geom) AND > >>n.continent = 'North America'; > >> > >> > >> Both tables have gist indexes on the geometries. The above query has > the following plan > >> > >> "Aggregate (cost=9.66..9.67 rows=1 width=4)" > >> " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > >> "Join Filter: _st_intersects(c.the_geom, n.the_geom)" > >> "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 > width=32)" > >> " Filter: ((continent)::text = 'North America'::text)" > >> "-> Index Scan using pbdb__collections_the_geom on collections > c (cost=0.00..8.30 rows=1 width=104)" > >> " Index Cond: (c.the_geom && n.the_geom)" > >> > >> The table c has approx 120K rows, and the continents table has 8 > rows.Suggestions on how I can improve this? Yes, the computer is otherwise > very swift and modern. > >> > >> > >> > >> -- > >> Puneet Kishor > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote: > Chop up the continents into smaller pieces. > hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? > On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: >> This is probably a really basic question... my ST_Within or ST_Intersects >> selecting points in a continent are way too slow (both take upward of 200 >> secs). >> >>SELECT Count(c_id) >>FROM c, continents n >>WHERE ST_Intersects(c.the_geom, n.the_geom) AND >>n.continent = 'North America'; >> >> >> Both tables have gist indexes on the geometries. The above query has the >> following plan >> >> "Aggregate (cost=9.66..9.67 rows=1 width=4)" >> " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" >> "Join Filter: _st_intersects(c.the_geom, n.the_geom)" >> "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" >> " Filter: ((continent)::text = 'North America'::text)" >> "-> Index Scan using pbdb__collections_the_geom on collections c >> (cost=0.00..8.30 rows=1 width=104)" >> " Index Cond: (c.the_geom && n.the_geom)" >> >> The table c has approx 120K rows, and the continents table has 8 >> rows.Suggestions on how I can improve this? Yes, the computer is otherwise >> very swift and modern. >> >> >> >> -- >> Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speeding up simple pt-in-poly lookups
Chop up the continents into smaller pieces. On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor wrote: > This is probably a really basic question... my ST_Within or ST_Intersects > selecting points in a continent are way too slow (both take upward of 200 > secs). > > SELECT Count(c_id) > FROM c, continents n > WHERE ST_Intersects(c.the_geom, n.the_geom) AND > n.continent = 'North America'; > > > Both tables have gist indexes on the geometries. The above query has the > following plan > > "Aggregate (cost=9.66..9.67 rows=1 width=4)" > " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" > " Join Filter: _st_intersects(c.the_geom, n.the_geom)" > " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" > " Filter: ((continent)::text = 'North America'::text)" > " -> Index Scan using pbdb__collections_the_geom on collections c > (cost=0.00..8.30 rows=1 width=104)" > " Index Cond: (c.the_geom && n.the_geom)" > > The table c has approx 120K rows, and the continents table has 8 > rows.Suggestions on how I can improve this? Yes, the computer is otherwise > very swift and modern. > > > > -- > Puneet Kishor > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
[postgis-users] speeding up simple pt-in-poly lookups
This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs). SELECT Count(c_id) FROM c, continents n WHERE ST_Intersects(c.the_geom, n.the_geom) AND n.continent = 'North America'; Both tables have gist indexes on the geometries. The above query has the following plan "Aggregate (cost=9.66..9.67 rows=1 width=4)" " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)" "Join Filter: _st_intersects(c.the_geom, n.the_geom)" "-> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)" " Filter: ((continent)::text = 'North America'::text)" "-> Index Scan using pbdb__collections_the_geom on collections c (cost=0.00..8.30 rows=1 width=104)" " Index Cond: (c.the_geom && n.the_geom)" The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern. -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users