For your query #1, it looks like you are computing the ST_Intersection
twice. Does Postgres optimize this away? IF not, you might want to use
a subquery to avoid this expensive second computation.
I also agree with Chris, that query #2 is probably not doing what you
want it to. What you need to do is for each parcel, subtract the union
of the water features covered by it. This is still likely to be slow,
however.
A general comment is that ArcGIS is using a very different approach to
compute erase and intersect (aka overlay). It evaluates the entire set
of geometries together, rather than piece-wise like the SQL query is
doing. This generally results in much better performance for large
datasets, since there is less I/O and more efficient algorithms available.
By it's nature, using SQL for spatial computation is most efficient for
operations which can be carried out in a feature-wise manner.
Unfortunately, overlay does not fall into this category (since there is
a large amount of interaction between features.
Implementing a more efficient overlay algorithm in PostGIS is a nice
challenge for the future...
On 8/24/2011 8:18 PM, Sheara Cohen wrote:
Hi all --
I have what is likely to sound like the newbie question it is. I am in
the process of shifting some of our modeling workload from ArcGIS to
PostGIS. While PostGIS seems much faster for most non-spatial
operations, I'm finding the exact opposite for spatial operations like
"erase," "intersect," etc. And I'm sure there is some basic thing I
just don't know about how to write these scripts to get fast performance.
Below are details for two issues I have run into.
1."Intersect": In ArcGIS, I used the intersect tool to return a
polygon file from the intersection of two different polygon files.
They were large input files -- one the size of the state of California
and one the size of a county in California, both with between 200-300
thousand records. In ArcGIS, this took 43 minutes. In PostGIS, I used
the script below, and it took over 17 hours.
CREATE TABLE public.fresno_parcels_lt_intersect as
SELECT
ST_Intersection(p.wkb_geometry,
lt.wkb_geometry) as wkb_geometry,
id_parcel,
(st_area(ST_Intersection(p.wkb_geometry, lt.wkb_geometry))) *
0.000247105381 as acres_lt_parcel,
Landtype
FROM fresno_parcels_unique_id as p, ca_landtypes_010211 as lt
WHERE ST_Intersects(p.wkb_geometry, lt.wkb_geometry);
2."Erase": In ArcGIS, I used the erase tool to remove water features
(polygons) from a county parcel file. Both files were large. The water
features covered the state of California with 100K records, and the
parcel file had almost 300K records. In ArcGIS, this took 17 minutes.
In PostGIS, I had to cancel the run after 16 hours. I used the script
below.
CREATE TABLE fresno_parcels_minus_ca_water as
SELECT ST_GeomFromWKB (ST_Difference (wkb_geometry
(ca_water_final_082211), wkb_geometry (fresno_parcels_clean)))
FROM ca_water_final_082211, fresno_parcels_clean;
I added a spatial index to all of the input files in PostGIS (CREATE
INDEX ____ ON ____ USING gist (wkb_geometry)). Do any of you all have
suggestions as to how to make these sorts of operations run more quickly?
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users