I'm working on updating the tables instead. The idea was to save a lot of preprocessing time on the whole table, and morph the classifications when extracting small regions.
distinct on + order by (to get the same priority of reclassification as the case statement) was really slow even for a small region. I imagine group by would be similarly slow, and as you say other attributes would have to be restored in another step. st_touches() appears to be the time consuming part. I'm trying st_distance() with start-end points on both sides (4 joins) to speed this up. st_coveredby() seems to go a lot faster than the touch join. On Sep 25, 2012, at 11:19 PM, Martin Davis wrote: > This is one of those cases that SQL seems to make unnecessarily difficult. > Writing this query in a procedural way would be trivial, so why is it so hard > to express in a declarative relational way? > > DISTINCT ON (a.id) is the classic non-standard Postgres way of handling this. > > The other hack to get uniqueness of the driving table is to GROUP BY a.id. > You then need to use appropriate aggregation functions to maintain the > dependent columns from a. You might be able to use MAX on newclass. MAX > might work for the other invariate a columns too, but it might be faster to > use FIRST. This is not built into Postgres, but is available as an extension: > > http://wiki.postgresql.org/wiki/First/last_%28aggregate%29 > > Or, you could just SELECT a.id, MAX(newclass) and then join back to a to > recover the dependent columns. > > No idea whether any of these ideas will be performant. Maybe someone else > has a more clever way of doing this. > > On 9/25/2012 10:48 AM, William Kyngesburye wrote: >> I'm trying to do a series of joins to classify some lines based on other >> lines they are touching in the same table and whether they are in polygons >> in another table, and lines are replicating. >> >> in a simplified form, it's something like (I imported with GDAL, so all my >> geometry fields are wkb_geometry): >> >> select a.wkb_geometry, a.id, a.name, a.class, >> case >> when a.class = 'a' and b.class is not NULL and (d1.newclass = 'b' or >> d2.newclass = 'b') then 'b' >> ... >> else a.class >> as newclass >> from source as a left outer join polys as b st_coveredby(a.wkb_geometry, >> b.wkb_geometry) >> left outer join source as c1 on st_touches(st_startpoint(a.wkb_geometry), >> c1.wkb_geometry) >> left outer join source as c2 on st_touches(st_endpoint(a.wkb_geometry), >> c2.wkb_geometry) >> >> All lines are replicated 4-9 times (maybe more that I didn't notice), each >> duplicate with different newclass values though some newclass values >> duplicated also (I suppose dependent on which join it came from). >> >> How can I stop the replication? Probably some SQL thing, the different join >> types still confuse me some. >> >> Adding a DISTINCT on the id to the select (and a sort to make that work) >> adds a LOT to the processing time processing time. >> >> ----- >> William Kyngesburye <kyngchaos*at*kyngchaos*dot*com> >> http://www.kyngchaos.com/ >> >> "Oh, look, I seem to have fallen down a deep, dark hole. Now what does that >> remind me of? Ah, yes - life." >> >> - Marvin >> >> >> _______________________________________________ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users >> >> >> ----- >> No virus found in this message. >> Checked by AVG - www.avg.com >> Version: 2012.0.2221 / Virus Database: 2441/5290 - Release Date: 09/24/12 >> >> > > _______________________________________________ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ----- William Kyngesburye <kyngchaos*at*kyngchaos*dot*com> http://www.kyngchaos.com/ The equator is so long, it could encircle the earth completely once. _______________________________________________ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users