Re: [postgis-users] replicating features on self-join

Martin Davis Tue, 25 Sep 2012 21:20:19 -0700

This is one of those cases that SQL seems to make unnecessarilydifficult. Writing this query in a procedural way would be trivial, sowhy is it so hard to express in a declarative relational way?

DISTINCT ON (a.id) is the classic non-standard Postgres way of handlingthis.

The other hack to get uniqueness of the driving table is to GROUP BYa.id. You then need to use appropriate aggregation functions tomaintain the dependent columns from a. You might be able to use MAX onnewclass. MAX might work for the other invariate a columns too, but itmight be faster to use FIRST. This is not built into Postgres, but isavailable as an extension:


http://wiki.postgresql.org/wiki/First/last_%28aggregate%29

Or, you could just SELECT a.id, MAX(newclass) and then join back to a torecover the dependent columns.

No idea whether any of these ideas will be performant. Maybe someoneelse has a more clever way of doing this.


On 9/25/2012 10:48 AM, William Kyngesburye wrote:

I'm trying to do a series of joins to classify some lines based on other lines 
they are touching in the same table and whether they are in polygons in another 
table, and lines are replicating.

in a simplified form, it's something like (I imported with GDAL, so all my 
geometry fields are wkb_geometry):

select a.wkb_geometry, a.id, a.name, a.class,
   case
     when a.class = 'a' and b.class is not NULL and (d1.newclass = 'b' or 
d2.newclass = 'b') then 'b'
     ...
     else a.class
   as newclass
from source as a left outer join polys as b st_coveredby(a.wkb_geometry, 
b.wkb_geometry)
left outer join source as c1 on st_touches(st_startpoint(a.wkb_geometry), 
c1.wkb_geometry)
left outer join source as c2 on st_touches(st_endpoint(a.wkb_geometry), 
c2.wkb_geometry)

All lines are replicated 4-9 times (maybe more that I didn't notice), each 
duplicate with different newclass values though some newclass values duplicated 
also (I suppose dependent on which join it came from).

How can I stop the replication?  Probably some SQL thing, the different join 
types still confuse me some.

Adding a DISTINCT on the id to the select (and a sort to make that work) adds a 
LOT to the processing time processing time.

-----
William Kyngesburye <kyngchaos*at*kyngchaos*dot*com>
http://www.kyngchaos.com/

"Oh, look, I seem to have fallen down a deep, dark hole.  Now what does that remind 
me of?  Ah, yes - life."

- Marvin


_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


-----
No virus found in this message.
Checked by AVG - www.avg.com
Version: 2012.0.2221 / Virus Database: 2441/5290 - Release Date: 09/24/12


_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

Re: [postgis-users] replicating features on self-join

Reply via email to