At 2012-05-13 02:49, Frederik Ramm wrote:
> Removing ele=0 from objects is, in my opinion, totally unnecessary;

And maybe incorrect, as ele=0 means we know the elevation is 0, while no
ele tag means we do not know the elevation.

> like created_by, over which WorstFixer made a similar fuss, such
> information could be removed where an object is touched for some other
> reason but I don't see why it would have to be mass-removed.

The reason for this may not be obvious to some. I assume it's because we
store history of all objects, and it's a waste of space, not to mention
bandwidth and processing resources to push the changes out to the mirrors,
for almost no benefit. I just add "created_by=''" to my JOSM presets (or
maybe it does this automatically now) so I clean it up when performing
other edits.

> Even so, a mass-removal would be ok if proposed, discussed, and accepted
> by the community like we expect everyone to; it's not ok to just do it on
> your own and see if someone notices.

Yes. Having said all that, OSMTI says there are 23 million nodes (33% of
the total) with created_by tags! This seemed surprisingly high to me.
I retrieved nodes from 300 random 0.1x0.1 degree bboxes. Of those, only 37
returned any nodes at all**. Of those 37, all but 6 had no "created_by"
tags on their nodes. Of the 6 that did, only 2 had a significant percentage*
of tagged nodes, both in Norway.
#137 had 1558 nodes, 801 of which (51%) have created_by tags. BLTR
(bottom, left, top, right): 68.137 13.766 68.237 13.866
#264 had 2297 nodes, 1946 of which (85%) have created_by tags. BLTR:
60.787 4.900 60.887 5.000
In #137, they are mostly tagged:
<tag k="created_by" v="JOSM"/> (TI says this makes up 63% of the values)
In #264, they are mostly tagged:
<tag k="created_by" v="almien_coastlines"/> (TI says this makes up 10%
of the values)
<tag k="source" v="PGS(could be inacurately)"/>
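
For anyone who wants to repeat the sampling, here is a minimal Python
sketch. It assumes the nodes are fetched with the API 0.6 map call, which
may not be how the numbers above were obtained. The function names are made
up for illustration, and picking latitude and longitude uniformly is also an
assumption. No request is actually sent here; map_url() only builds the URL.

```python
import random
import xml.etree.ElementTree as ET


def random_bbox(rng, size=0.1):
    """Return (bottom, left, top, right) for a random size x size degree box.

    Assumption: lat/lon are drawn uniformly, which over-samples polar areas.
    """
    bottom = round(rng.uniform(-90.0, 90.0 - size), 3)
    left = round(rng.uniform(-180.0, 180.0 - size), 3)
    return (bottom, left, round(bottom + size, 3), round(left + size, 3))


def map_url(bbox):
    """Build an API 0.6 map URL; the API wants left,bottom,right,top order."""
    bottom, left, top, right = bbox
    return ("https://api.openstreetmap.org/api/0.6/map"
            f"?bbox={left},{bottom},{right},{top}")


def count_created_by(osm_xml):
    """Return (total_nodes, nodes_with_created_by) for an OSM XML string."""
    root = ET.fromstring(osm_xml)
    total = tagged = 0
    for node in root.iter("node"):
        total += 1
        if any(tag.get("k") == "created_by" for tag in node.findall("tag")):
            tagged += 1
    return total, tagged


def summarize(index, bbox, total, tagged):
    """Format one result line in the same style as the figures above."""
    pct = round(100 * tagged / total) if total else 0
    b, l, t, r = bbox
    return (f"#{index} had {total} nodes, {tagged} of which ({pct}%) "
            f"have created_by tags. BLTR: {b:.3f} {l:.3f} {t:.3f} {r:.3f}")
```

Usage would be along the lines of: make 300 boxes with
random_bbox(random.Random()), download each map_url(bbox) with the tool of
your choice, and pass the response text to count_created_by(). Empty boxes
(most of them, being ocean) simply return (0, 0).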
My questions are:
1. Would removing the created_by from 33% of the nodes in the database save
significant storage space, dump size, backup time, etc.?
2. Is it possible to remove these in bulk from the database without having
to keep the history, push those diffs to mirrors, etc.? Do the mirrors
occasionally start fresh from a new dump? Or can they run the same bulk
purge? Or do I overestimate the necessity of doing it this way (and we can
just clean it up with the regular tools and processes)?
* While not a significant portion of the total nodes in the area (only 4%),
there were almost 600 created-by-tagged nodes in this file from England:
#123 had 14013 nodes, 594 of which (4%) have created_by tags. BLTR:
51.086 0.088 51.186 0.188
** I guess this clarifies why old satellites that fall from their orbits
and other space junk never seem to hit anything, even if they survive
re-entry :)
--
Alan Mintz <alan_mintz+...@earthlink.net>
_______________________________________________
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us