Hello Frederik, ok, it really must have been late. :-) Thank you for the explanation, sounds perfect.
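Just to check that I understood the two behaviours, here is a minimal Python sketch using the node from your example. It is not Osmosis code: the (action, id, version) tuples and both function names are my own assumptions, only meant to show "keep the highest version" next to "drop objects that were created and deleted within the range".

```python
# Sketch only: a change is modelled as (action, node_id, version).
# This is NOT Osmosis code; the function names are made up for illustration.

def keep_highest_version(changes):
    """Keep one entry per node: the one with the highest version."""
    latest = {}
    for action, node_id, version in changes:
        if node_id not in latest or version > latest[node_id][2]:
            latest[node_id] = (action, node_id, version)
    # Sort by node id so the result is stable and easy to compare.
    return sorted(latest.values(), key=lambda c: c[1])


def drop_cancelled(changes):
    """Like keep_highest_version, but drop nodes that were both
    created and deleted inside the covered time range."""
    created = {node_id for action, node_id, _ in changes if action == "create"}
    return [c for c in keep_highest_version(changes)
            if not (c[0] == "delete" and c[1] in created)]


# The two nodes from the replication diff excerpt.
changes = [
    ("create", 1490162261, 1),
    ("create", 1490162262, 1),
    ("delete", 1490162262, 2),
]

print(keep_highest_version(changes))
print(drop_cancelled(changes))
```

The first function still emits the delete request for node 1490162262, the second one drops that node altogether.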
I wouldn't call it a bug at all, because it may be necessary to keep such delete requests. Let's say you found an out-of-date .osm file and want to update it. You guess the file is from last Saturday 12:00, but you're not sure. Therefore you cumulate the replication diffs for the time range between Saturday 10:00 (2 hours earlier) and today. Let's further assume that a node was created at 10:15 and deleted at 11:45. This node would be excluded from an "ideal" simplified diff. If the old .osm file in fact has the state of Saturday 11:00, it would know about the created node but never become aware of its deletion.

In the end: I'm happy about this "bug". :-)

However, this doesn't make it easier to determine how much data you lose by taking the normal diffs instead of the replication ones. But eventually I will get the answer... somehow.

Markus

-------- Original Message --------
> Date: Mon, 07 Nov 2011 09:06:32 +0100
> From: Frederik Ramm <frede...@remote.org>
> To: mar...@gmx.eu
> CC: dev@openstreetmap.org
> Subject: Re: [OSM-dev] Incomplete diffs?

> Hi,
>
> On 11/07/2011 02:24 AM, mar...@gmx.eu wrote:
> > # normal diff
> > $ zcat 20111103-20111104.osc.gz |grep -c "timestamp=\"2011-11-03T12:"
> > 58968
> >
> > # replication diff
> > $ cat 1103-1104.osc |grep -c "timestamp=\"2011-11-03T12:"
> > 59068
> >
> > And yes, I thought on cumulating the version in the second file before I
> > started counting with grep.
>
> I think you may have found a bug in Osmosis' --simplify-change
> algorithm. (Or, if you created the above 1103-1104.osc file yourself,
> you have re-implemented a bug already present in Osmosis.)
>
> Both the normal diff and the daily diff are correct as far as I can see,
> but the simplified version that you created - the one with 59068
> elements - is not.
>
> An object created earlier on that particular day and deleted between
> 12:00 and 13:00 will not show up in the normal daily diff:
>
> $ zgrep -A1 -B1 '<node id="1490162262"' 20111103-20111104.osc.gz
> $
>
> It will show up twice in the replication diff, once for creation and
> once for deletion:
>
> $ zgrep -A1 -B1 '<node id="1490162262"' 1103-1104.osc.gz
> <node id="1490162261" version="1" timestamp="2011-11-03T08:09:48Z"
> uid="419929" user="hoti" changeset="9728137" lat="47.4399545"
> lon="16.4376938"/>
> <node id="1490162262" version="1" timestamp="2011-11-03T08:09:48Z"
> uid="547666" user="Igor Kurvanor" changeset="9728123" lat="45.7510611"
> lon="6.2813975"/>
> </create>
> <delete>
> <node id="1490162262" version="2" timestamp="2011-11-03T12:42:36Z"
> uid="547666" user="Igor Kurvanor" changeset="9730094" lat="45.7510611"
> lon="6.2813975"/>
> </delete>
> $
>
> Now if such a replication diff is simplified with Osmosis, in my opinion
> it should drop the node altogether, but what it does is it always keeps
> the highest version even if that corresponds to a deletion that
> counteracts a previous creation:
>
> $ osmosis -q --read-xml-change 1103-1104.osc.gz --simc
> --write-xml-change - | grep -A1 -B1 '<node id="1490162262"'
> <delete>
> <node id="1490162262" version="2" timestamp="2011-11-03T12:42:36Z"
> uid="547666" user="Igor Kurvanor" changeset="9730094" lat="45.7510611"
> lon="6.2813975"/>
> </delete>
> $
>
> Now this is a minor bug because I don't know any consumer that will trip
> on a deletion request for a non-existing object but still it is a
> behaviour that I would not have expected. Anyway, it should explain the
> discrepancy you are seeing.
>
> Bye
> Frederik

_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev