With previous large uploads I have experienced the same behaviour resulting
in massive dupes. So I guess it is not a conversion issue.
I don't have experience with conversions or (mass) imports -- but I _have_
had massive dupe problems a number of times when uploading larger
amounts of data with JOSM over a bad connection. The problem has always
been related to the combination of large uploads and bad connections where
(if I understand right) the JOSM data upload connection gets a hiccup at
some point and isn't able to finish the job -- and doesn't leave a note for
itself about where it left off. Then, for reasons I don't _exactly_
understand, data gets duplicated on the next upload attempt(s).
My vague understanding is that this is at least partly because JOSM
uploads nodes first and only after that the information about ways (i.e.
which nodes belong to which ways). Then, when it hasn't gotten a
confirmation of successful uploads (or hasn't recorded that in its data
file(?)), it considers the uploaded nodes to still be new at the next
upload attempt(s).
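
To illustrate the failure mode described above, here is a toy model in Python. It is only a sketch under my reading of the explanation: ToyServer, upload() and the fail_after knob are hypothetical names made up for the illustration, and none of this is JOSM's code or the real OSM API. The client only records server IDs after a whole upload succeeds, so a dropped connection leaves every node looking "new" on the retry.

```python
import itertools


class ToyServer:
    """Stand-in for a map API: every create call gets a fresh server ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.nodes = {}  # server ID -> (lat, lon)

    def create_node(self, lat, lon):
        node_id = next(self._ids)
        self.nodes[node_id] = (lat, lon)
        return node_id


def upload(server, local_nodes, fail_after=None):
    """Create every local node that is still marked new (id is None).

    fail_after is a hypothetical knob: the connection "drops" once that many
    nodes have been created, before the client has recorded any server ID.
    """
    assigned = {}
    for key, node in local_nodes.items():
        if node["id"] is not None:
            continue  # already known to the server
        if fail_after is not None and len(assigned) >= fail_after:
            raise ConnectionError("connection lost mid-upload")
        assigned[key] = server.create_node(node["lat"], node["lon"])
    # Only reached if the whole upload went through.
    for key, node_id in assigned.items():
        local_nodes[key]["id"] = node_id


def simulate():
    server = ToyServer()
    local = {k: {"id": None, "lat": 38.36 + k * 1e-4, "lon": -75.60}
             for k in range(5)}
    try:
        upload(server, local, fail_after=3)  # 3 nodes reach the server
    except ConnectionError:
        pass  # client still thinks all 5 nodes are new
    upload(server, local)  # the retry creates all 5 again
    total = len(server.nodes)
    distinct = len(set(server.nodes.values()))
    return total, distinct
```

Calling simulate() returns (8, 5): eight nodes on the toy server for five distinct positions, i.e. three dupes from a single interrupted attempt.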
I feel that duplication sometimes also happens with partial uploads where
the ways have been uploaded, too, resulting in duplicate ways, but I haven't
documented this well enough to say so with confidence.
If you have a bad connection / feel that this may be your problem it is a
good idea to tweak the JOSM advanced upload settings (in the Upload
dialog's Advanced tab: "Upload data in chunks of objects", Chunk size: N,
where N is your number of objects per chunk). I use 200 with my Haitian
connection.
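
Why smaller chunks help can be sketched roughly like this in Python. This is an assumption-laden illustration, not how JOSM implements chunked uploads: upload_in_chunks() and the send callback are hypothetical. The point is that progress is recorded after each confirmed chunk, so a dropped connection puts at most one chunk in doubt instead of the whole upload.

```python
def chunked(objects, chunk_size=200):
    """Yield consecutive slices of at most chunk_size objects."""
    for start in range(0, len(objects), chunk_size):
        yield objects[start:start + chunk_size]


def upload_in_chunks(objects, send, chunk_size=200, confirmed=0):
    """Send objects chunk by chunk, resuming after `confirmed` objects.

    send(chunk) is a hypothetical callback that raises ConnectionError
    when the connection drops. Returns how many objects are confirmed.
    """
    for chunk in chunked(objects[confirmed:], chunk_size):
        try:
            send(chunk)
        except ConnectionError:
            break  # everything before this chunk is confirmed
        confirmed += len(chunk)
    return confirmed
```

A retry would pass the returned count back in as confirmed. Note that the failed chunk itself may still have reached the server, so in this sketch up to chunk_size objects remain at risk of duplication.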
Cheers,
-Jaakko
http://osm.org/user/jaakkoh
--
jaa...@helleranta.com * Skype: jhelleranta * Mobile: +509-37-269154 *
http://go.hel.cc/MyProfile
On Thu, Mar 22, 2012 at 8:28 AM, Marc Zoss marcz...@gmail.com wrote:
Nick and Josh
thanks for the clarification on your upload strategy. With previous large
uploads I have experienced the same behaviour resulting in massive dupes. So
I guess it is not a conversion issue.
If you want me to commit the remove duplicates changeset, I can do so. But
you will have to go through the data subsequently and check if the issues
are resolved and no new ones emerged.
M
On 22.03.2012, at 14:12, Nick Chamberlain wrote:
Josh and Marc,
Thank you! I apologize that I'm unable to speak the OSM language as
well as everyone, I'm working on it :) I posted on the Salisbury,
Maryland Import page that Josh created to give more detail about my
uploads.
I didn't really think that I created so many duplicates, because I did a
lot of things in JOSM before I actually chose to upload. One thing I
know for sure is that I didn't upload until I was actually able to - I
was getting a proxy error and the uploads were timing out when I
attempted to upload the entire batch. I assumed that these attempts
were unsuccessful, which I might be wrong about and might have resulted
in duplication.
I assumed that my successful attempts started, maybe @ 10901673, when I
realized I needed to break the original shapefile up tabularly into
percentiles and upload 10 segments of the building footprint dataset,
one after the other. These were all definitely successful, and were
only done once per percentile.
Josh, where are you finding the list of changesets in the format you
posted? I can only figure out how to list them in my editor profile
with my comments.
If you believe that the method you mention that removes the 71,000 nodes
is the best approach, please feel free to do so. I will also gladly
manually fix the inner ring tagging issue as the data gets fixed.
Please let me know what I can do to help. I am also willing to share
the .osm files and/or shapefiles if that will help. Thanks.
- Nick
-Original Message-
From: joshthephysic...@gmail.com [mailto:joshthephysic...@gmail.com] On
Behalf Of Josh Doe
Sent: Thursday, March 22, 2012 8:51 AM
To: Marc Zoss
Cc: impo...@openstreetmap.org; talk-us@openstreetmap.org; Nick
Chamberlain
Subject: Re: [Imports] [Talk-us] Uploads to City of Salisbury, MD
On Thu, Mar 22, 2012 at 8:04 AM, Marc Zoss marcz...@gmail.com wrote:
I briefly downloaded all sby:bldgtype-tagged ways and relations of
Maryland through the overpass-api. Then I removed the ones having only a
sby:bldgtype tag, ran the validator and deleted the duplicated nodes and
ways.
This would result in a changeset to remove the roughly 71'000
duplicate nodes and ways.
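
For readers unfamiliar with what a duplicate-node check amounts to, here is a minimal Python sketch. It is not the JOSM validator's actual logic: find_duplicate_nodes() is a hypothetical helper that takes (id, lat, lon) tuples, ignores tags entirely, and treats nodes at the same position (rounded to 7 decimal places, the precision OSM stores) as duplicates, keeping the lowest ID.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, precision=7):
    """Group nodes by rounded position.

    nodes: iterable of (node_id, lat, lon) tuples.
    Returns {kept_id: [duplicate_ids]} for positions with more than one
    node, where kept_id is the lowest node ID at that position.
    """
    by_position = defaultdict(list)
    for node_id, lat, lon in nodes:
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    return {
        min(ids): sorted(ids)[1:]
        for ids in by_position.values()
        if len(ids) > 1
    }
```

A real cleanup also has to re-point ways at the kept node before deleting the others, which this sketch does not attempt.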
If the area was edited since the import and reverting gets tricky,
this might be the way to go; at least the result looks OK at
first glance.
Please also note that the conversion step seems to add a building=yes
tag on the inner ring of building polygons, which is certainly bad
tagging, despite the correct rendering (52 occurrences, so could be
fixed manually).
Thanks for doing that, as that was the next step I was going to try. I