On Fri, May 31, 2013 at 3:02 PM, stevea <stevea...@softworkers.com> wrote:
> Clifford Snow writes:
> > First you need to define what good data quality is and second, you need to
> > collect data to measure data quality. Once good data is collected then start
> > determining root cause of the problem.
> >
> > Most of what I see is anecdotal evidence of problems. Fixing the cause of
> > those problems is good, but it may not get at the underlying issues.
>
> I say +1 to this, but it is so nebulous as to be only broadly helpful.
> Clifford, care to flesh that out a bit?

You mean you could sense what I was trying to say? Needless to say, I tend to be a bit terse in my emails, so let me try a slightly longer version.

We need quality standards that can be measured. We can and should have standards for mapping objects and ways. With those standards, a quality control sampling process could be initiated to test the quality of new edits as well as the existing data. With a sample of data we could build a histogram of errors, ideally tackling the largest column first. Even a small sample size can work. Statistical Process Control in a manufacturing process only samples some 20 items. This isn't a manufacturing process, but the principles are the same.

Unfortunately, some of what we do is subjective. Take the issue of tagging Subway sandwich shops that was recently discussed on one of the mailing lists. Everyone had a valid solution. Maybe some were more valid than others, but any one of them was workable. Yet tagging POIs is an important step to get right.

Adding a node to say this is a bus stop when it isn't is very clearly a data quality issue. It can be measured. The path of a highway can be checked against GPS traces or Bing imagery. It can be measured. However, whether it is accurately tagged as primary, secondary, tertiary, etc. is somewhat subjective.

Tackling the subjective is more difficult. Take the Subway sandwich shop, for example. If we had a hard and fast rule that every Subway be tagged as amenity=fast_food, then we could easily do a quality check.
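
As a rough sketch of the sample-and-tally idea, here is how it might look in Python. Everything in it is an assumption made for illustration: the element layout (a dict with a "tags" dict), the hypothetical "verified_on_ground" survey flag, the rule functions and their error category names, and the hard Subway rule itself, which OSM does not actually impose. The sample size of 20 just echoes the figure mentioned above.

```python
import random
from collections import Counter


def check_subway_tag(element):
    """Hypothetical hard rule: every Subway must carry amenity=fast_food."""
    tags = element["tags"]
    if tags.get("name") == "Subway" and tags.get("amenity") != "fast_food":
        return "subway_not_fast_food"
    return None


def check_bus_stop(element):
    """Flag a bus stop that a reviewer could not confirm on the ground.

    "verified_on_ground" is an assumed field filled in by a manual survey;
    elements without it are treated as not yet contradicted.
    """
    tags = element["tags"]
    if tags.get("highway") == "bus_stop" and not element.get("verified_on_ground", True):
        return "bus_stop_not_on_ground"
    return None


RULES = [check_subway_tag, check_bus_stop]


def sample_elements(elements, sample_size=20, seed=0):
    """Draw a small random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(elements, min(sample_size, len(elements)))


def error_histogram(sample, rules=RULES):
    """Count errors per category, largest column first."""
    counts = Counter()
    for element in sample:
        for rule in rules:
            category = rule(element)
            if category is not None:
                counts[category] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Made-up elements, only to show the shape of the input.
    elements = [
        {"tags": {"name": "Subway", "amenity": "restaurant"}},
        {"tags": {"name": "Subway", "amenity": "fast_food"}},
        {"tags": {"name": "Subway", "shop": "deli"}},
        {"tags": {"highway": "bus_stop"}, "verified_on_ground": False},
    ]
    for category, count in error_histogram(sample_elements(elements)):
        print(f"{category:28s} {'#' * count}")
```

The first row printed is the largest column, which is where a root-cause investigation would start. Subjective questions, such as primary versus secondary, do not fit this kind of rule function and would still need human review.
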
But OSM gives people a lot of tagging freedom.

One last thing. My sense is that the problem generally isn't the mappers. Yes, I have screwed up more than my fair share of edits. But most problems are system problems. To fix those we need good data and a willingness to get at the root cause of the problem.

Short summary: sample edits, categorize errors, determine root cause, then fix root cause. That process will drastically improve the quality of OSM.

Hopefully someone with more recent background in Quality Control can step in here to help me out.

--
Clifford

OpenStreetMap: Maps with a human touch
_______________________________________________
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us