Hi,

Frederik Ramm wrote:
> I just did a little test, prepared an .osc document that removed the
> node tags from about 1000 nodes:
> http://www.openstreetmap.org/browse/changeset/1894387
> It came out at roughly 10 node changes per second.

Some more tests made directly from the dev server suggest that performance is around 20 changes per second, slightly deteriorating if you upload too many changes in one diff upload (peak performance seems to be at around 1k-2k changes per diff upload). Anything larger than 10k changes per diff upload is not feasible: you get into territory where you have to manually increase default timeouts and so on, and it also takes performance down into the 10-15 changes per second range, plus it increases the probability of edit conflicts.

If we wanted to do this cleanup through normal API requests, the best way thus seems to be dividing the data into roughly 88k batches of 2k edits each and uploading them as diff uploads, possibly grouped into changesets of up to 25 batches each (=50k edits), which would result in roughly 3500 changesets. Each diff upload would take about 100 seconds and each changeset about 40 minutes; we'd be doing about 30-35 changesets per day and would finish after about 100 days (some time in November if we start soon).

An average day in OSM currently has roughly 150k node modifications. For the 100 days of this operation, this would increase to about 1.5m node modifications per day (factor 10).

An average daily OSM diff is currently roughly 200 MB uncompressed (some days it's 100 MB, some days it's 400 MB). For the 100 days of this operation, daily diffs would be approximately 150 MB larger, increasing the strain on downstream systems by roughly 75%.

I have not done any osm2pgsql testing. If it is clever, it will detect that the node modification has not changed any geometry, and the additional cost would mainly result from having to parse roughly 75% more diff data.
If, however, it automatically recalculates the geometry of every way that contains a modified node, then any osm2pgsql-based site running incremental updates would likely take anywhere between 2 and 10 times as long to process updates during the 100 days of this operation.

Everything said here is of course highly speculative and based on the haphazard assumption that our systems always perform roughly as they did when I did my tests.

Bye
Frederik

_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
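
For anyone who wants to replay the batch arithmetic in this mail with different numbers, here is a minimal sketch. It assumes only the rough figures quoted above (20 changes per second, 2k edits per diff upload, 25 uploads per changeset, roughly 88k uploads in total) and uninterrupted round-the-clock uploading. The function name `upload_plan` and its parameter names are illustrative and do not come from any OSM tool.

```python
# Back-of-the-envelope check of the upload plan described in this mail.
# All defaults are the rough figures quoted there; names are hypothetical.

def upload_plan(rate=20, edits_per_batch=2_000,
                batches_per_changeset=25, total_batches=88_000):
    """Derive durations and counts from the quoted throughput figures.

    Assumes uploads run back to back, 24 hours a day, at a constant rate.
    """
    seconds_per_batch = edits_per_batch / rate
    edits_per_changeset = edits_per_batch * batches_per_changeset
    minutes_per_changeset = batches_per_changeset * seconds_per_batch / 60
    changesets = total_batches / batches_per_changeset
    changesets_per_day = 24 * 60 / minutes_per_changeset
    return {
        "seconds_per_batch": seconds_per_batch,
        "edits_per_changeset": edits_per_changeset,
        "minutes_per_changeset": minutes_per_changeset,
        "changesets": changesets,
        "changesets_per_day": changesets_per_day,
        "days": changesets / changesets_per_day,
        "edits_per_day": changesets_per_day * edits_per_changeset,
    }


if __name__ == "__main__":
    for name, value in upload_plan().items():
        print(f"{name}: {value:,.1f}")
```

With the defaults this lands close to the rounded figures in the mail: about 100 seconds per diff upload, a bit over 40 minutes per changeset, and on the order of 100 days in total. Lowering `rate` to 10-15 shows how much longer the operation would run if uploads were made too large.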