At 2012-05-15 15:21, Toby Murray wrote:
On Tue, May 15, 2012 at 4:24 PM, Alan Mintz
<alan_mintz+...@earthlink.net> wrote:
> Yes. Having said all that, OSMTI says there are 23 million nodes (33% of the
> total) with created_by tags! This seemed surprisingly high to me.
Err, last time I checked we had over 1 billion nodes. So 2%, not 33%.
I'm guessing that Taginfo drops (understandably) any objects without tags
before it analyzes the data, and the percentages are based on this filtered
number of objects. There are 1,458,341,105 nodes, only 70,486,257 of which
have any tags. 33% of _those_ (70M) have created_by tags.
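
For anyone who wants to check the arithmetic, here is a quick sketch in Python. The variable and function names are my own, and the 23 million figure is the rounded number quoted above, so the results are approximate. It shows both readings of the percentage: about 33% of tagged nodes, but only about 2% of all nodes.

```python
# Node counts quoted in this thread; the created_by count is a rounded figure.
total_nodes = 1_458_341_105      # all nodes in the database
tagged_nodes = 70_486_257        # nodes that carry at least one tag
created_by_nodes = 23_000_000    # nodes with a created_by tag (approximate)


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Percentage based on the filtered (tagged-only) count.
pct_of_tagged = share(created_by_nodes, tagged_nodes)

# Percentage based on the total, unfiltered node count.
pct_of_all = share(created_by_nodes, total_nodes)

# How much of the database survives the "has any tag" filter.
pct_tagged_of_all = share(tagged_nodes, total_nodes)

print(f"created_by among tagged nodes: {pct_of_tagged:.1f}%")
print(f"created_by among all nodes:    {pct_of_all:.1f}%")
print(f"tagged nodes among all nodes:  {pct_tagged_of_all:.1f}%")
```

The gap between the two figures comes entirely from the choice of denominator, since fewer than 5% of nodes have any tags at all.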
TI maintainers: Would you think about basing the percentages on the total
unfiltered counts and possibly adding rows for <no tag> to the lists where
appropriate?
> My questions are:
>
> 1. Would removing the created_by from 33% of the nodes in the database save
> significant storage space, dump size, backup time, etc.?
>
> 2. Is it possible to remove these in bulk from the database without having
> to keep the history, push those diffs to mirrors, etc.? Do the mirrors
> occasionally start fresh from a new dump? Or can they run the same bulk
> purge? Or do I overestimate the necessity of doing it this way (and we can
> just clean it up with the regular tools and processes)?
Not even the license change bot is going to completely delete/hide
history and I think it is going to be the biggest automated change in
the history of the project. It will cause some parts of the history to
be hidden from public view but they will continue to exist in the
database. Makes me wonder... how many created_by tags are going to be
nuked by the license change bot? :)
I can understand why that is - it's being worked on by many people, may
need partial revertability, will probably run for a long time, etc. Removal
of one tag in bulk doesn't present these issues, and may be possible, which
is why I'm asking: a) does it help; and b) is it possible?
--
Alan Mintz <alan_mintz+...@earthlink.net>
_______________________________________________
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us