Following advice from Mark and Remi, I am now tiling the troublesome layer. It's not going quickly. I'm trying an intersection of the layer with only 5 of my 10kmX10km grids and it has been running for 30 min. I have 8680 grids, which is more features than the layer I was originally working with (2709 polygons). I was hoping that intersection with squares would go much faster than an intersection with arbitrary polygons, but I can confirm that it's definitely not orders of magnitude faster.
So, I think my idea of simplifying things recursively with a quad grid is worth a shot. Mark, can you point me to your Quadgrid function? Mark suggested counting points. Counting points (and rounding to the nearest 10000) gives: select round(st_npoints(geom)/100,-2) as point100, count(*) from lancover_polygons_snap group by round(st_npoints(geom)/100,-2) order by point100 desc; 26300;1 1500;1 1100;1 800;1 600;2 500;1 400;5 300;10 200;22 100;141 0;997846 The one with about 2630000 points represents the developed land in the region (towns and cities including roads that connect them). Mark also suggested counting interior rings. select round(ST_NumInteriorRings(st_geometryN(geom,1)),-1) as rings10, avg(st_numpoints(geom)) as avgpts, count(*) as numfeatures from lancover_polygons_snap group by round(ST_NumInteriorRings(st_geometryN(geom,1)),-1) order by rings10 desc; 43140;2629905;1 4230;147121;1 3800;110083;1 2480;76515;1 1760;50261;1 1720;55576;1 1680;64588;1 1350;41826;1 1130;38059;1 970;36425;1 950;37686;1 860;32952;1 850;34657;1 820;35285;2 810;24769;1 800;28113;1 760;30726;1 740;26296;1 720;24788;2 670;18874;1 660;22428;1 640;26258;1 630;27213;2 620;22056;2 610;15771;1 560;19942;2 540;23988;1 500;15763;2 480;16598;1 470;19865;2 410;15466;1 400;16023;1 380;13536;4 370;14646;3 360;14122;1 340;12910;1 330;13273;1 320;15222;2 300;13421;1 290;9072;3 280;10025;4 270;12505;4 260;11002;5 250;9843;3 240;9980;6 230;11456;3 220;9049;3 210;9194;6 200;8586;5 190;7089;5 180;6796;8 170;6913;8 160;6955;6 150;5911;10 140;5769;15 130;5377;13 120;5301;17 110;5246;17 100;4422;28 90;4332;28 80;3664;35 70;3292;43 60;2860;66 50;2236;74 40;1882;141 30;1493;215 20;993;618 10;445;3325 0;20;993265 -- John Abraham j...@hbaspecto.com 403-232-1060 "POSTGIS="2.0.3 r11128" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.8" LIBJSON="UNKNOWN" (core procs from "2.0.3 r11132" need upgrade) TOPOLOGY (topology procs from "2.0.3 r11132" need upgrade) RASTER (raster procs from "2.0.3 r11132" need upgrade)" On Feb 24, 2015, at 3:31 PM, Rémi Cura <remi.c...@gmail.com> wrote: > Hey, > I highjack this interesting discussion, sorry =) > > If you have one single massive Polygon (no multi). > You can generate a point table (keeping path of course) with ST_DumpPoints. > > Then you construct an index on this table. > Then you perform your Intersection with indexed points, which will be very > fast. > Then you reconstruct parts of polygon. > It will require some advanced to very advanced SQL (like windows function) to > make it work tough. > > I think it would be faster than other alternatives, because basicaly it > amounts to using the optimized R-Tree index on points, vs a custom and > non-otpimized Quad tree index on polygon. > Moreover when using a recusrive strategy, you end up computing the same think > a lot with geos. > > Cheers, > Rémi-C > > 2015-02-24 23:10 GMT+01:00 Mark Wynter <m...@dimensionaledge.com>: > > > Thanks Mark. I will indeed do the st_dump, but I don't think it will help > > much (very few multis). I think the tiling will help a lot. What I > > wonder, though, is how long it will take to tile? Afterall, it's still an > > st_intersection operation to tile each geometry against each tile. > > I’ve almost finished writing the tutorial - where I address many of these > points. The variables that affect performance are: > * how you’ve written your ST_Intersection query > * multi vs. non-multi > * size and complexity of geoms > * no. available CPUs (for parallelisation) > * tile batch size - important!!! > > All strategies in combination may be necessary if your queries are taking > forever. > > For the demonstration dataset (a multi polygon representing whole of > Australia), my tutorial tiling query incorporates ST_Intersection and > ST_Difference subqueries to produce tiled features representing land and > water. > I achieved a 49x reduction in the query time to tile the whole of Australia, > starting with a single multi polygon. > > The more complex the query, the more significant this time saving is in > absolute terms. > > > Is there a quad tree type tiling algorithm in a function? If I do 256 x > > 256 tiles, doing it all at once would be 65536 operations of > > st_intersection(complex_geom, square). With a quad tree I'll only have 4 > > operations of st_intersection(complex_geom, square), and then 16 of > > (a_little_less_complex_geom, square), and then 64 of > > (even_less_complex_geom, square) and so on, for 8 nests. The last one will > > still be 65536 operations, but the complex geoms should be a lot simpler by > > the time I get down to that level. What do you think, is this worth > > trying? Or is intersecting with a square fairly simple regardless of how > > complex the other geometry is? > > I do have a SQL quadgrid tiling function - where a cell divides recursively > subject to a maximum number of levels or “value thresholds” - but I’m not > sure if that’s the right approach. > _______________________________________________ > postgis-users mailing list > postgis-users@lists.osgeo.org > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > > _______________________________________________ > postgis-users mailing list > postgis-users@lists.osgeo.org > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
_______________________________________________ postgis-users mailing list postgis-users@lists.osgeo.org http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users