This thread is tedious, but I will soldier on.

Gleb Smirnoff <gleb...@glebi.us> writes:
> Looks like the acceptance of multipolygons here is not as bad as I initially
> read in this email thread. So, we agree that at some level they are easier
> to maintain than shared nodes. Do we?

I cannot offer you an absolute Yes or No here:  it depends.  Sometimes yes, as 
in the many excellent use cases that Kevin lists as examples (and more!); often 
or usually no when the data are "curated" or part of an import that does not 
already use "shared ways" (as OSM might, were it to be as efficient as that 
technique allows).  Again, I believe it important to say that one might be 
"more correct" than the other in particular circumstances.  However, it would 
be kind to say that neither is fully wrong:  they are simply different styles 
of entering data.  One is more efficient, and allows those 
efficiencies to propagate forward into a future where editing things "around 
them" is easier (due to not needing to replicate a large number of shared 
nodes).  The other is simpler (to edit, to understand by novice editors, to 
improve with newer data as part of an import or curated data...) and, while 
the simpler style might take up more space in the planet.osm file, there are 
sometimes good reasons why "simpler is better."  However, "simpler is better" 
is not always true, and shouldn't "win" by default.  OK, I have beaten that 
point to death by now.
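
To put rough numbers on the efficiency point, here is a minimal Python sketch.  
It is not OSM tooling of any kind:  the node IDs, the feature names and the 
helper functions are all made up for illustration.  It models one long curve 
shared by three features, first as three closed ways that each run through 
every shared node, then as one shared way plus one small closing way per 
feature, tied together by multipolygon-style relations.

```python
# Hypothetical node IDs: a 1000-node coastline curve shared by three features.
shared_curve = list(range(1000))
# Each feature also has a couple of nodes of its own that close its outline.
own_nodes = {
    "beach": [1000, 1001],
    "marine_preserve": [1002, 1003],
    "county": [1004, 1005],
}


def build_polygon_style(shared, own):
    """One closed way per feature; every way repeats every shared node."""
    return {name: shared + rest + shared[:1] for name, rest in own.items()}


def build_multipolygon_style(shared, own):
    """Shared curve stored once; each feature is a relation over two ways."""
    ways = {"shared_curve": list(shared)}
    relations = {}
    for name, rest in own.items():
        # The closing way starts and ends on the shared curve's end nodes.
        ways[name + "_closing"] = shared[-1:] + rest + shared[:1]
        relations[name] = ["shared_curve", name + "_closing"]
    return ways, relations


def count_node_refs(ways):
    """Total node references held by a collection of ways."""
    return sum(len(node_ids) for node_ids in ways.values())


polygons = build_polygon_style(shared_curve, own_nodes)
ways, relations = build_multipolygon_style(shared_curve, own_nodes)
print(count_node_refs(polygons), count_node_refs(ways))
```

With these made-up numbers the polygon style holds 3009 node references and the 
multipolygon style holds 1012, and adding a node to the shared curve means 
touching three ways in the first case but only one in the second.  The price is 
the extra layer of relations, which is exactly the "simpler" side of the 
trade-off.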

> Of course multipolygonizing couple of buildings that touch coastline in
> Monterey was wrong. Sorry, I was in a multipolygonizing rage as I was
> going through the coastline. :)

Apology accepted!

> But the rest of the coastline? The nodes can be shared by any of:
> 1) coastline, 2) beach 3) county & state borders 4) marine preserve.
> And there are ten's of thousands of nodes, because this is a natural
> crazy curved line. This is the most clear example of where multipolygons
> are way easier to draw and maintain than running multiple lines through
> the same set of nodes!
> 
> O> ... it is the process of CONVERTING existing polygons to multipolygons on 
> O> a widespread basis where it seems there is no good reason for this to occur 
> O> (and indeed even frustrates import updates).  This is what we are asking you 
> O> not to do (so much of).
> 
> Well, OSM started in 2006 and support for advanced multipolygons appeared
> in 2011 (correct me if I am wrong). So, at time they came in there were
> already fairly enough data in the database. Should we treat OSM as "write
> only" database? E.g. we only add data, but don't improve already entered?
> The advanced multipolygons appeared because there was a demand for them,
> so why not use them for existing data, if resulting product becomes better?

Improving existing data, especially when they are outdated or just plain wrong, 
is one of the most important things OSM can do.  (We all can agree to that).

> Now, for the import updates. Here I am starting to understand the strong
> pushback against my edits. Import updates is something I never heard
> about! Please tell me more about that. Because in all other places that
> I have edited (Russia, Ukraine, Georgia) imports were treated as something
> that comes in once, and then is adopted into OSM. Do I understand you
> correct that here you got recurring imports, where the import script needs
> to find the object it created previously and edit it? Shouldn't this object
> be protected then? At least a tag note="DO NOT EDIT ME"?

Imports are seldom, if ever, "a script."  They are nearly always human-curated 
data, carefully pulled into OSM with a strict, vetted process which includes a 
good deal of Quality Assurance and manual, human oversight.  
THEN, as these data age and likely become outdated, they must be (well, should 
be, by responsible importers) updated in OSM.  If in-between those two phases 
of OSM having "excellent but becoming outdated" data, polygons become 
multipolygons, but the updated data are NOT multipolygons, this greatly 
frustrates the "update the imported data" process.  It doesn't make it 
impossible, it makes it harder, as the process must be refined to include it, 
and the multipolygonization which might have occurred doesn't always have a 
"clean and easy" way to describe it that makes mechanical editing (even humans 
following a list of instructions) easy to complete.  It may be that the 
ultimate winning strategy is, as Kevin says in a recent post to this thread, to 
"convert to multipolygon where correct to do so, the imported data not being 
multipolygon be damned."  I take a deep breath as I do so, but I tend to agree.
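
As a rough illustration of why the update process "must be refined," here is a 
small Python sketch.  Everything in it is an assumption made for the example:  
the dictionary layout, the element IDs and especially the "import_ref" tag key 
are hypothetical and do not describe how any real import is tagged or updated.  
It shows an updater that looks for previously imported features only among 
tagged closed ways, and a refined one that also looks at multipolygon 
relations.

```python
def is_closed_way(element):
    """A way whose first and last node references are the same."""
    nodes = element.get("nodes", [])
    return element["type"] == "way" and len(nodes) >= 4 and nodes[0] == nodes[-1]


def is_multipolygon(element):
    """A relation tagged type=multipolygon."""
    return (element["type"] == "relation"
            and element["tags"].get("type") == "multipolygon")


def find_polygons_only(elements, key, value):
    """Updater that only knows the 'simple polygon' style."""
    return [e for e in elements
            if is_closed_way(e) and e["tags"].get(key) == value]


def find_either_style(elements, key, value):
    """Refined updater that accepts both styles."""
    return [e for e in elements
            if (is_closed_way(e) or is_multipolygon(e))
            and e["tags"].get(key) == value]


# After conversion: the tags moved from the way to a new relation.
# "import_ref" is a hypothetical tag key used only for this sketch.
after_conversion = [
    {"type": "way", "id": 1, "nodes": [1, 2, 3, 1], "tags": {}},
    {"type": "relation", "id": 10,
     "members": [{"type": "way", "ref": 1, "role": "outer"}],
     "tags": {"type": "multipolygon", "landuse": "forest",
              "import_ref": "parcel-17"}},
]

print(find_polygons_only(after_conversion, "import_ref", "parcel-17"))
print(find_either_style(after_conversion, "import_ref", "parcel-17"))
```

The first lookup comes back empty because the tags no longer live on the way; 
the second finds the relation.  Finding the feature is the easy part, of 
course:  reconciling new polygon geometry with shared member ways is where the 
real manual work is.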

> O> > A longer version (I'll try). I assume we all agree that overlapping
> O> > or not reaching polygons where there is adjacency on the ground is
> O> > wrong.
> O> 
> O> "Not-reaching," meaning they create small gaps or "gores," yes, those 
> O> polygons are technically wrong.  Polygons with overlapping ways, even where 
> O> they share nodes (and even if they don't share nodes), no, those are not 
> O> wrong.  You may believe that these are "sloppy" or have superfluous data, and 
> O> you may even prefer your multipolygon approach, but what that does is 
> O> replaces simple and correct data with complex and correct data.  I and others 
> O> here see little point in doing that, especially as it frustrates beginners 
> O> and complicates import updates.
> 
> Actually by overlapping I meant polygons with non-zero shared surface.
> I still assume you agree that this is wrong.

"Non-zero shared surface" is effective language to mean "overlapping," thank 
you.  Yes, that sort of overlapping is wrong, for example where a 
landuse=forest overlaps with a landuse=residential — they can't be both, and 
any overlap is an error.  We agree.
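
For anyone who wants the distinction between "adjacent" and "non-zero shared 
surface" spelled out, here is a minimal Python sketch.  It assumes convex 
polygons given as counter-clockwise lists of planar (x, y) coordinates, which 
real landuse polygons usually are not, so treat it as an illustration rather 
than a validator; the function names and the sample squares are made up.  It 
clips one polygon against the other (Sutherland-Hodgman) and measures what is 
left with the shoelace formula.

```python
def shoelace_area(poly):
    """Absolute area of a polygon given as [(x, y), ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_convex(subject, clipper):
    """Sutherland-Hodgman: clip subject by a convex, counter-clockwise clipper."""
    def inside(p, a, b):
        # True when p is on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def crossing(p, q, a, b):
        # Point where the line through p, q meets the line through a, b.
        x1, y1 = p
        x2, y2 = q
        x3, y3 = a
        x4, y4 = b
        denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / denom
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break
        current, output = output, []
        for p, q in zip(current, current[1:] + current[:1]):
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(crossing(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(crossing(p, q, a, b))
    return output


def shared_area(poly_a, poly_b):
    """Area common to two convex polygons; 0.0 when they only touch."""
    return shoelace_area(clip_convex(poly_a, poly_b))


forest = [(0, 0), (2, 0), (2, 2), (0, 2)]
adjacent_residential = [(2, 0), (4, 0), (4, 2), (2, 2)]     # shares an edge
overlapping_residential = [(1, 0), (3, 0), (3, 2), (1, 2)]  # shares surface

print(shared_area(forest, adjacent_residential))
print(shared_area(forest, overlapping_residential))
```

The two squares that only share an edge have a shared area of zero (adjacent, 
and correct), while the pair that overlaps shares two square units (an error).  
A real check on OSM data would need a proper geometry library and projected 
coordinates.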

> Those that share ways, or share nodes, or use different nodes with
> exactly same coordinates, aren't overlapping, they are adjacent. Yes,
> they are technically correct! However, maintaining them is a hell.
> If you want to create an object that reuses already existing curves
> in the database, you need to do a lot of click-job.

This might now be a "different strokes for different folks" discussion:  I 
don't find either style any more difficult to edit or maintain than the 
other, EXCEPT when they are part of an imported dataset which IS 
different.  THEN it is "hell."  Perhaps THAT is the key point.  Sometimes a 
"large click job" can be simplified with Merge and/or Split commands (copy 
existing data, extend it, perhaps, or connect it to something else relatively 
easy to edit, MERGE with existing data, done).  I suppose what I might be doing 
is imagining that I can prevent some hard work from being done by being 
insistent.  If so, I CAN have my mind changed!

> O> > So how can we properly express adjacency?<redacted for brevity>
> O> 
> O> We know.  We agree.  We simply don't think this is a good idea to go and 
> O> do this on existing data (on a medium- or large-scale, as you and your JOSM 
> O> plugin do) where to do so simply isn't needed, and indeed complicates further 
> O> data editing.
> 
> I still stand that it makes easier further data editing. :(

Yes, "different strokes," yet I don't think we are that far apart, as I am 
facile with both, EXCEPT when the data are part of an import and later updates 
make things messy because of multi- vs. not multi-.  Messy is hard, but maybe 
the work simply needs to be done.

> All these coastline multipolygonizing was prerequisite to importing
> State Marine Reserves. And indeed after preparations adding SMR
> boundaries was 100x times easier. Here is example changeset of
> adding a couple of SMRs once coastline is multipolygon:
> 
> http://www.openstreetmap.org/changeset/47115827
> 
> How small and concise it is! And if next year Department of Wildlife
> will announce a new one, adding it would be again a minute task.

See, you have to start from "once coast is multipolygon."  Your approach to 
such "existing data (import, official data, whatever) which gets updated" 
severely breaks down when that is not true.  Please, start from a place where 
existing data are NOT multipolygon, and you see how difficult this gets.  I 
know you wish the whole planet's data were already multipolygon, but we are a 
long distance away from that.  Maybe, someday, we will be closer or even 
largely there.  In the meantime, we must accommodate both kinds of data.  I 
think we can agree that mixing them is OK, but mixing them where we are 
anticipating an update to them "back to" their other method presents 
difficulties.

> O> ... In the meantime, let's agree that polygons are also correct data 
> O> structures to use, and indeed are sometimes even preferred (as with imports). 
> O> They are not wrong, they are not sloppy, they might use a bit more data, but 
> O> to many, they are preferred.  It is possible for more than one style of data 
> O> to represent accurately the truth on the ground.
> 
> Sure, I don't claim that polygons are incorrect. I just find out that
> at some level of map detail and fullness they are much more difficult
> to maintain than advanced multipolygons. Yes, I mistakenly assumed
> that everyone is familiar with the reltoolbox plugin, which is an
> important part of making multipolygons easier.

Candidly, I have never heard of reltoolbox, but that means very little.  I ask 
you to understand (thank you for the good dialog!) that IN THE CASE OF IMPORTED 
(polygon) DATA WHICH GET UPDATED, having those data converted from polygon to 
multipolygon, and then trying to update them, is quite frustrating.  I think 
you do understand that.  While importing has its believers and detractors, I 
think everybody agrees that when a successful import has a dedicated importer 
(like me) who is willing and able to UPDATE these data, and does so when newer 
data become available, we wish to encourage this behavior, not frustrate 
it.

> Here is example, please open the area around this meadow in JOSM:
> 
> http://www.openstreetmap.org/relation/5926445#map=15/55.8569/37.2520
> 
> Just browse around in JOSM and try to do edits without committing them.
> Add imaginary natural preserve there, or a military closed area. Or imagine
> you want to go to higher detail, so you want to split existing forest into
> a forest and a scrub. Anything that follows existing lines, can be instantly
> added, with as much clicks as there are _lines_ in the boundary. Lines, not
> nodes!

Sure, but you start from multipolygon and say "let's keep going multipolygon."  
That's not only doable, it's as easy as you say.  But, read on.

> Compare that area to what we have around Santa Cruz. Imagine we could do
> it better :)

Gleb, if you wish to take on the >3000 polygons of the SCCGIS landuse import 
(now in its Version 3, and V4 might be in 2019 or 2020), and convert the whole 
county to multipolygon KNOWING that the new V4 data will be "only" polygon (and 
must be updated accordingly), I invite you to do so.  But, take it from me, 
that is a mighty huge amount of work:  I know, I've manually inspected those 
thousands of polygons many times.  While I do wish to "imagine" it, I DON'T 
wish to imagine the amount of work the conversion would be.  I prefer to step 
aside and let someone else do that!

Polygons:  good.  Multipolygons:  good, sometimes better than polygons, or even 
the only way to do something.  One "better" than the other?  I'm not sure it is 
ALWAYS true that multipolygons are better.  Usually, often, mostly?  Yes, 
especially in the use cases that Kevin noted (and more).  Always?  No.  
Converting from one to the other when an import or curated data are involved?  
Mmmm, not by me, unless a great deal of effort is expended to do what amounts 
to a complex data translation.  This is often a difficult nightmare of editing, 
and we shouldn't discourage updating imported data.

Good dialog, even if it is tedious!  (I don't know if we're solving anything, 
but I appreciate that there is more light than there is heat).

SteveA
California
_______________________________________________
Talk-us mailing list
Talk-us@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-us
