2011/2/22 Peter Budny <pet...@gatech.edu>:
> Anders Arnholm <and...@arnholm.se> writes:
> On the contrary... the bigger the database, the more we need tools to
> help us understand and manipulate the data. When there are only 100
> POI nodes in a city, I can easily check them all by hand. When there
> are 100000, that's when automated or semi-automated tools are necessary.

To do what? Why would you want to check all the POIs? If I come across
something that is missing, I add it; if something is mapped that is not
there in reality and I am aware of it, I delete it (or, more often, move
it to the right position). The more the data gets used, the more errors
get found. OSM is a project with tens of thousands of contributors, but
we will need millions of users, and possibly a back channel, to maintain
all the data. No bot on earth can tell you whether a POI is at the right
position, is well described, or is there at all (in the real world).

> Sorry, I avoided your question. As for imports: the bigger OSM gets,
> the harder it is to ensure coverage. If I got the supposed McDonald's
> POI dataset, how would we know whether OSM already has 100% of them, or
> only 98%?

Ask McDonald's or, even better, let them check ;-) Who cares whether we
have all the McDonald's, if not they themselves? Besides that, the answer
to your question is very simple: count them.

> This discussion has somehow conflated robots and tools with imports, and
> that may be partially my fault. But if we had better tools for
> performing imports, it might be easier to stitch them together with
> existing hand-edited data, and imports wouldn't be such a destructive
> process.

While I am not generally against imports, I have become more and more
opposed to them over the past few years. The benefit of importing
publicly available data into our database is very small compared to
keeping a parallel crowd-sourced dataset (e.g. you could also compare
the two against each other to find problems in either one). I don't know
about the quality of the publicly available data in the US (I guess
TIGER was a super-neat and up-to-date dataset, but my knowledge is based
on people admiring it here on the ML), but the so-called "official" data
which I have seen is often worse than people think it is. Nobody can
actually afford to spend as much time on the data as we do ;-).

cheers,
Martin

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk