Re: [Talk-us] OSM Data Quality

2013-06-09 Thread Bryce Nesbitt
Possible drivers of quality:

   1. Peer reviewing, as a social gateway to community engagement with new
   mappers.

   2. Hiring a psychologist on retainer to understand obsessed trolls like
   NE2, and respond appropriately.

   3. Supporting single feature mappers.  There's a vibrant community of
   people who collect narrow data: for example RV dump stations.  Not everyone
   has to be an area mapper.

   4. Building tools that make it more awkward to make common mistakes.
   For example, certain tags could be semi-locked (producing an educational
   warning message when altered).  The source tag is a candidate for this.

   5. Building tools that show before and after as a visual diff prior
   to upload.

   6. A point system that unlocks capabilities as a mapper progresses.  For
   example new accounts may be able to edit only 10 features at a time.
   Accounts can earn and unlock additional capability with successful edits.

   7. Ongoing data imports (e.g. conflating a store's database of hours
   with OSM's cache of the same data).

   8. Using select import projects to grow the mapping community.

   9. Focusing on finding niches where OpenStreetMap gets used by people
   with no (current) interest in mapping.  We can't compete with Google Maps
   for driving directions, but we can *blow Google Maps away* in a huge
   variety of other ways.  *Focus on what map products would be compelling
   not to create, but to view.*
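
A rough sketch of how the point system in item 6 might work. Only the
10-feature starting cap comes from the list above; the tier thresholds,
the Mapper class and its method names are all hypothetical, and no OSM
editor or API is claimed to work this way.

```python
from dataclasses import dataclass

# Hypothetical tiers: (successful edits required, features allowed per upload).
# Only the first cap of 10 is taken from the proposal; the rest are invented.
TIERS = [(0, 10), (25, 50), (100, 250), (500, 1000)]


@dataclass
class Mapper:
    name: str
    successful_edits: int = 0

    def upload_limit(self) -> int:
        """Return the largest cap whose threshold this mapper has reached."""
        limit = TIERS[0][1]
        for needed, cap in TIERS:
            if self.successful_edits >= needed:
                limit = cap
        return limit

    def can_upload(self, feature_count: int) -> bool:
        """True if an upload touching feature_count features is within the cap."""
        return feature_count <= self.upload_limit()


newcomer = Mapper("new_account")
veteran = Mapper("long_time_mapper", successful_edits=120)
print(newcomer.name, newcomer.upload_limit())
print(veteran.name, veteran.upload_limit())
```

What counts as a "successful" edit is the hard part and is left open here;
it could be tied to peer review (item 1) rather than to raw edit counts.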
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] OSM Data Quality

2013-06-07 Thread Bryce Nesbitt
On Fri, May 31, 2013 at 12:23 PM, Mike Thompson miketh...@gmail.com wrote:

 Frederic,
 How about more mappers?
 Mike


I think the key is more users of the maps.
Fewer than one in ten of the people I mention OSM to have ever heard of it,
and I tend to run with geeks, outdoor enthusiasts, graduate students, etc.


Re: [Talk-us] OSM Data Quality

2013-06-07 Thread Bryce Nesbitt
On Fri, Jun 7, 2013 at 11:24 AM, Bryce Nesbitt bry...@obviously.com wrote:

 On Fri, May 31, 2013 at 12:23 PM, Mike Thompson miketh...@gmail.com wrote:

 Frederic,
 How about more mappers?
 Mike


 I think the key is more users of the maps.


By that I mean more eyeballs on the output: more passive users of the great
maps the community creates.


Re: [Talk-us] OSM Data Quality

2013-06-07 Thread Bryce Nesbitt
On Fri, May 31, 2013 at 12:15 PM, Frederic Julien fjulie...@yahoo.com wrote:

 Dear all,
 I'm working on a presentation and interested to hear your thoughts. What
 are the top 2-3 changes that could improve OSM data quality? That could be
 processes, tools, methods, training, peer review, attributes, etc.


Peer review is definitely a candidate, but not for the reason you might
expect.  Sure, it might catch the occasional bad-quality edit.

But the real goal of peer review would be to draw mappers into the community.
Make it feel like a shared effort.  Have mappers feel they are responsible
and connected to other mappers.  Share the love, get recognized for
your effort, be social about it.


Re: [Talk-us] OSM Data Quality

2013-06-07 Thread Thomas Colson
For those of you at SOTM, check out the NPS presentations: we're about to
add 10 million users (per year), we hope.

We'll hear about data quality issues very quickly!

And yes, we're hosting our own tiles.



Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Mike Thompson
Frederic,

How about more mappers?

Mike


On Fri, May 31, 2013 at 1:15 PM, Frederic Julien fjulie...@yahoo.com wrote:

 Dear all,

 I'm working on a presentation and interested to hear your thoughts. What
 are the top 2-3 changes that could improve OSM data quality? That could be
 processes, tools, methods, training, peer review, attributes, etc.

 If this sort of info is available elsewhere let me know.

 Looking forward to your answers.

 Many thanks,

 Frederic



Re: [Talk-us] OSM Data Quality

2013-05-31 Thread John F. Eldredge
One thing that would help in the editor software: once you select a tag and
the editor lists the preset values available, have an option to show the wiki
descriptions of what those values mean.  This should be optional, and should
come up in a separate window so you don't lose track of what you are editing.




Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Clifford Snow
On Fri, May 31, 2013 at 12:15 PM, Frederic Julien fjulie...@yahoo.com wrote:

 I'm working on a presentation and interested to hear your thoughts. What
 are the top 2-3 changes that could improve OSM data quality? That could be
 processes, tools, methods, training, peer review, attributes, etc.


First you need to define what good data quality is, and second, you need to
collect data to measure data quality. Once good data is collected, then start
determining the root cause of the problem.

Most of what I see is anecdotal evidence of problems. Fixing the cause of
those problems is good, but it may not get at the underlying issues.

-- 
Clifford

OpenStreetMap: Maps with a human touch


Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Martijn van Exel
As already noted, quality is in the eye of the beholder. That said, there
are some objective quality indicators such as positional accuracy,
completeness, resolution. I summarized this in a paper a few years ago from
another source, where I also introduced the notion of 'crowd quality' in an
academic attempt to capture specific quality considerations for
crowdsourced geospatial data:
http://www.giscience2010.org/pdfs/paper_213.pdf

Not much of an academic, I later picked this up in a more pragmatic manner
to create the notion of data temperature I presented at SOTM US 2011:
http://oegeo.wordpress.com/2011/09/19/taking-the-temperature-of-local-openstreetmap-communities/

Someone else mentioned we need more mappers. There is truth to that, but we
also need to care about building out the community. How do we reduce the
churn rate, or in other words, how do we keep mappers involved and motivated
to continue mapping? How do we nurture the power mappers, those 5% who create
80% or more of the map data, especially in light of the large numbers of
new mappers coming in? And finally, how do we make local communities work?

The latter is super important because great local data (transit, businesses,
addresses) is key to the usefulness (hey, another way of thinking about
quality!) of OSM. Great local data is something you only get if folks who
know a place, folks with different interests and from different walks of
life, work on the map together. Currently that happens in too few places. I
think one of the most important keys to making good OSM data great lies in
figuring out how to build strong local communities.

In Europe, we have that down. It all started with that. Get together and map.
Have fun, figure it out together. While traveling in Germany recently, I did
not have to go online once to find my way, my hotel, restaurant, bus stop etc.
The map is *that good*. Sure, there are more mappers per sqm there. But it is
just as much about people getting together, motivating each other, and
collaborating on more complex mapping tasks (stuff like transit relations[1]).
We have a long way to go still in the US, and we may need a different approach
than Europe.

I think I just wrote half of one of my SOTM US talks. Thanks Frederic ;)

hth
Martijn

[1] http://www.overpass-api.de/api/sketch-line?network=VBB&ref=M1&operator=


On Fri, May 31, 2013 at 2:28 PM, Richard Welty rwe...@averillpark.net wrote:

 On 5/31/13 3:15 PM, Frederic Julien wrote:

 Dear all,

 I'm working on a presentation and interested to hear your thoughts. What
 are the top 2-3 changes that could improve OSM data quality? That could be
 processes, tools, methods, training, peer review, attributes, etc.

 at one level, i agree with Clifford Snow's comment that first you need
 to define data quality.

 at another level, i think that we can talk about the following:

 1) consistency in tagging. editor improvements, better documentation,
 and better training materials can all help with this

 2) improved processes and controls for data import (this is work that is
 happening on the US import committee). there are a lot of past imports
 that suffer from quality control issues, and lots of imports that never
 should have been done because of problems with the data quality.

 3) in the US (and you did ask on talk-us), identifying and dealing with
 the shaky TIGER data from the 2007 TIGER import. some of this has been
 done, but it's an ongoing effort and is one of those things that is
 easier to say than it is to do

 richard






-- 
Martijn van Exel
http://oegeo.wordpress.com/
http://openstreetmap.us/


Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Clifford Snow
On Fri, May 31, 2013 at 2:21 PM, Martijn van Exel m...@rtijn.org wrote:

 and finally how do we make local communities work? Latter is super
 important because great local data (transit, businesses, addresses) is key
 to the usefulness (hey, another way of thinking about quality!) of OSM.
 Great local data is something you only get if folks who know a place, folks
 with different interests and from different walks of life, work on the map
 together. Currently that happens in too few places. I think one of the most
 important keys to making good OSM data great lies in figuring out how to
 build strong local communities.


Well said. I've been looking at the statistics (compliments to Johan C for
pointing me to the resource, which I can't find right now). When looking at
the number of mappers as a percentage of population, the US lags. I'd like
to see the US agree to track the metric of mappers per 100,000 population,
with a goal of drastically improving the numbers. Sorry for being off topic,
but Martijn's comments were too good to pass up.
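
For what it's worth, the metric itself is simple arithmetic. A minimal
sketch, assuming we already have a count of active mappers and a population
figure per region; the region names and numbers below are placeholders for
illustration, not real OSM or census statistics.

```python
def mappers_per_100k(active_mappers: int, population: int) -> float:
    """Active mappers per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return active_mappers * 100_000 / population


# Placeholder figures only: (active mappers, population).
regions = {
    "Region A": (1_200, 8_000_000),
    "Region B": (900, 1_500_000),
}

# Rank regions from highest to lowest mapper density.
ranked = sorted(regions, key=lambda r: mappers_per_100k(*regions[r]), reverse=True)
for name in ranked:
    print(f"{name}: {mappers_per_100k(*regions[name]):.1f} mappers per 100,000")
```

The harder question is what counts as an "active" mapper (any edit ever, or
an edit in the last month?), and that definition would need to be agreed
before numbers are compared.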



-- 
Clifford

OpenStreetMap: Maps with a human touch


Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Frederic Julien
Thanks to Martijn and others for their input. I'll share my presentation via 
slideshare once completed.

Please continue to share your insights :)

Kind Regards,

Frederic





Re: [Talk-us] OSM Data Quality

2013-05-31 Thread stevea

Frederic:

Validator is an excellent tool, but currently only works with JOSM.
I'd love to see Potlatch and/or iD do something similar.  True, many
(most) mappers ignore what Validator reports, and while Errors are always
Errors, Warnings are a bit more subtle and really must be taken one
at a time, case by case.  Doing the right thing with a Validator Warning
takes experience, and for ultimate data quality, Validator needs to be
paid attention to much more often than it is now.  You may argue that
entry-level editors do not lend themselves well to such an approach.  We
might talk about that; done well, it could work.


I would also vote for experienced OSM importers looking over the
shoulder of less experienced contributors as they import data.  We
try to do this with the import guidelines, but there is nothing like
an experienced OSM contributor who has faced (and overcome) difficult
choices about data format translations, exactly what should go in
which tags, what to do with fuzzily defined concepts (like landuse vs.
landcover), etc.  This really comes from building a good OSM
community, where the "Elmers" (an old ham radio term meaning those
more experienced) are around to answer questions, mentor and be a
good example to the less experienced.  That's a tall order, and a
little non-specific, I know.


The recent game/goal-oriented sub-projects (connectivity, Zorro...)
have had fantastic results.  We could use more of these, as their
success is proven.  Wider attempts like Operation Cowboy, not so
much; however, the cake map (http://mapcraft.nanodesu.ru/)
used in Cowboy is an exceedingly useful tool when used properly (by
an active, communicative OSM community -- I speak from personal,
recent experience).


The OSM Inspector tools (http://tools.geofabrik.de/osmi/) I find 
highly valuable.  If we could get something like a Zorro/Cowboy 
approach to work with the first three tool categories (Geometry, 
Routing and Multipolygons), that would be fantastic.  This would have 
to be really, really well-prepped, and again, it does take experience 
to use this tool and see what is wrong, then fix it to be right. 
However, if possible, (I think it is, but it is admittedly 
ambitious), imagine the results in data quality!


Clifford Snow writes:
First you need to define what good data quality is and second, you
need to collect data to measure data quality. Once good data is
collected then start determining root cause of the problem.


Most of what I see is anecdotal evidence of problems. Fixing the 
cause of those problems is good, but it may not get at the 
underlying issues.


I say +1 to this, but it is so nebulous as to be only broadly helpful.
Clifford, care to flesh that out a bit?


The USA OSM community is, as Martijn pointed out, lagging in many
ways compared to those in Europe.  I say that not to disrespect us,
but to point out that we are a few years behind, we have a much
broader area (one of the most important reasons why, truthfully), and
we have lower population densities (a corollary of the broader area
just mentioned, and a more specific way of saying it).  So the traction
to get good mappers mapping is slower to grip and go.  We are doing the
right things, we just seem to be getting slower results.  But our
results are very good, even excellent in some cases.  Yet, per this
thread, we really can do better.


Good thread!

SteveA
California




Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Kai Krueger
Martijn van Exel-3 wrote
 As already noted, quality is in the eye of the beholder. 

Yes, quality lies in the eye of the beholder. Or perhaps better said in the
eye of the data consumer. Therefore the assessment of quality will depend on
the application and use case you have in mind.

I think OSM has enough commercial users by now to be able to get a decent
(subjective) overview of data quality without doing a scientific analysis of
data quality oneself. Instead, one can probably ask the developers
of frequently used software based on OSM data what the most common
complaints of their respective end users are about the data. That should
give a pretty decent overview of the data quality in practical terms and
of where the OSM community could best focus its efforts to improve
the quality of the data, either through more mappers or through quality
control tools and perhaps even bots.

Kai





Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Charlotte Wolter

Richard,

We need:
1. More people. A big part of the map is untouched. We could 
reach out more to the educational community to get middle-school and 
high-school students involved.
2. Better training for people who are new to OSM. I think 
learnosm.org is very good. I'm a little apprehensive that iD is too 
geeky for people who are not coders.
3. Clear priorities. If I've just joined OSM, and I'm rarin' 
to go, what should I do first? I don't mean that we should constrain 
people's creativity, but a little guidance would be helpful. Should 
they align streets, check street names, add all street lights? Find 
all turn restrictions in their area? What kinds of things would 
improve the quality of the data? I have no agenda here. I'm waiting 
to be guided, too.


Charlotte




Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Russ Nelson
Richard Welty writes:

  3) in the US (and you did ask on talk-us), identifying and dealing
  with the shaky Tiger data from the 2007 tiger import. some of this
  has been done, but it's an ongoing effort and is one of those
  things that is easier to say than it is to do

I've been adding lakes and ponds to New York State. I have a list of
points and names[1]. I'm using the lat/lon to point me to a lake/pond
which is in this list. I trace it using Bing aerials, and look at the
(public domain) USGS topographic maps to add the name. I'm making good
progress. I've been at it since August of last year, and I'm now into
the S names.

It's fun! For a small pond or lake, it takes less than 30 seconds to
add it.

So, what if we could automatically identify misaligned TIGER ways and
make a list? Then people could take a few minutes, grab a few ways and
armchair fix them.

I've been working on finding and fixing them in New York State. I've
probably got more than half -- maybe 60% fixed. Hopefully even
70%. And I'm just one mapper (well, and you're another mapper who's
done a ton, plus there's a few more I'm sure). My main difficulty
right now is finding areas of misalignment. I've gone through entire
counties and aligned every road I could find, but I'm sure I missed a
few.

It doesn't have to be a perfect process; it just needs to be easy and
quick to use. It's okay if a few already-fixed roads get pointed to,
as long as it's not too many, and it's okay if a few unfixed roads get
skipped, as long as it's not too many.
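
One imperfect way to build such a list, sketched under loud assumptions: the
TIGER import tagged its ways with tiger:reviewed=no, so a way that still
carries that tag and is still at version 1 has probably never been touched.
That is only a proxy for "misaligned" (it finds untouched ways, not
necessarily bad ones). The records below are invented sample data in a
made-up dict layout, and the function names are hypothetical.

```python
def untouched_tiger_candidates(ways):
    """Return ways that look unedited since the TIGER import.

    Proxy only: tiger:reviewed=no plus version 1 suggests nobody has
    edited the way, not that its geometry is actually wrong.
    """
    return [
        way for way in ways
        if way.get("tags", {}).get("tiger:reviewed") == "no"
        and way.get("version", 1) == 1
    ]


def worklist_by_county(ways):
    """Group candidate way IDs by their tiger:county tag."""
    grouped = {}
    for way in untouched_tiger_candidates(ways):
        county = way["tags"].get("tiger:county", "unknown")
        grouped.setdefault(county, []).append(way["id"])
    return grouped


# Invented sample records; IDs and county names are placeholders.
sample_ways = [
    {"id": 1001, "version": 1, "tags": {"highway": "residential",
     "tiger:reviewed": "no", "tiger:county": "Example County, NY"}},
    {"id": 1002, "version": 4, "tags": {"highway": "residential",
     "tiger:reviewed": "no", "tiger:county": "Example County, NY"}},
    {"id": 1003, "version": 1, "tags": {"highway": "tertiary",
     "tiger:county": "Another County, NY"}},
    {"id": 1004, "version": 1, "tags": {"highway": "residential",
     "tiger:reviewed": "no", "tiger:county": "Another County, NY"}},
]

for county, ids in sorted(worklist_by_county(sample_ways).items()):
    print(county, ids)
```

This matches the "doesn't have to be perfect" spirit: way 1002 is skipped
because someone has edited it, even though nobody cleared the tag, and a few
false hits either way are acceptable.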

The better we can make the map data, the more people will want to use
it to make maps and the more people will want to fix That One Last Flaw.

[1] I'm not sure of the copyright license on the list, so I'm using it
to point me to the general area of a lake or pond, and digitizing it
de novo.

-- 
--my blog is at http://blog.russnelson.com
Crynwr supports open source software
521 Pleasant Valley Rd. | +1 315-600-8815
Potsdam, NY 13676-3213  | Sheepdog   



Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Richard Welty

On 5/31/13 9:39 PM, Russ Nelson wrote:


I've been working on finding and fixing them in New York State. I've
probably got more than half -- maybe 60% fixed. Hopefully even
70%. And I'm just one mapper (well, and you're another mapper who's
done a ton, plus there's a few more I'm sure). My main difficulty
right now is finding areas of misalignment. I've gone through entire
counties and aligned every road I could find, but I'm sure I missed a
few.

Warren and Schoharie Counties still need a lot of work. Schoharie is a
candidate for testing a selective replacement with TIGER 2010/11/12 data.

richard




Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Clifford Snow
On Fri, May 31, 2013 at 3:02 PM, stevea stevea...@softworkers.com wrote:

 Clifford Snow writes:

 First you need to define what good data quality is and second, you need to
 collect data to measure data quality. Once good data is collected then start
 determining root cause of the problem.


 Most of what I see is anecdotal evidence of problems. Fixing the cause of
 those problems is good, but it may not get at the underlying issues.


 I say +1 to this, but it is nebulous as to be only broadly helpful.
 Clifford, care to flesh that out a bit?


You mean you could sense what I was trying to say?  Needless to say, I tend
to be a bit terse with my emails. So let me try a slightly longer version.

We need quality standards that can be measured.  We can and should have
standards for mapping objects and ways. With those standards, a quality
control sampling process could be set up to test the quality of new
edits as well as the existing data. With a sample of data we could build
a histogram of errors, and ideally tackle the largest column first. Even a
small sample size can work. Statistical Process Control in a manufacturing
process only samples some 20 items. This isn't a manufacturing process, but
the principles are the same.
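
To make the sampling idea concrete, here is a minimal sketch: draw a random
sample of about 20 edits, have someone review each one and assign an error
category (or none), then count the categories and look at the tallest column
first. The edit records, the category names and the function names are all
invented for illustration; the review step itself is assumed to have
happened already.

```python
import random
from collections import Counter


def sample_edits(edits, k=20, seed=0):
    """Draw a reproducible random sample of up to k edits."""
    rng = random.Random(seed)
    return rng.sample(edits, min(k, len(edits)))


def error_histogram(reviewed_edits):
    """Count error categories, ignoring edits that were found to be fine."""
    return Counter(e["error"] for e in reviewed_edits if e["error"] is not None)


# Invented review results: each edit has an error category or None if correct.
CATEGORIES = ["misaligned way", "missing name", "wrong tag", None, None]
edits = [{"id": i, "error": CATEGORIES[i % len(CATEGORIES)]} for i in range(200)]

reviewed = sample_edits(edits, k=20)
# Largest column first, as a crude text histogram.
for category, count in error_histogram(reviewed).most_common():
    print(f"{category:15} {'#' * count}")
```

The sample size of 20 follows the manufacturing figure mentioned above;
whether that is enough for map edits is an open question.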

Unfortunately, some of what we do is subjective. Take the issue of
tagging Subway sandwich shops that was recently discussed on one of the
mailing lists. Everyone had a valid solution. Maybe some were more valid
than others, but any one of them was workable. Yet tagging POIs is an
important step to get right.

Adding a node that says "this is a bus stop" when it isn't is very clearly a
data quality issue. It can be measured. The path of a highway can be
checked to see whether it tracks GPS traces or Bing imagery. It can be
measured. However, whether it is accurately tagged as primary, secondary,
tertiary, etc. is somewhat subjective.

Tackling the subjective is more difficult. Take the Subway sandwich
shop again. If we had a hard and fast rule that every Subway be tagged as
amenity=fast_food, then we could easily do a quality check. But OSM gives
people a lot of tagging freedom.

One last thing. My sense is that the problem generally isn't the mappers.
Yes, I have screwed up more than my fair share of edits. But most problems are
system problems. To fix those we need good data and a willingness to get at
the root cause of the problem.

Short summary: sample edits, categorize errors, determine root cause, then
fix root cause. That process will drastically improve the quality of OSM.
Hopefully someone with more recent background in Quality Control can step
in here to help me out.

-- 
Clifford

OpenStreetMap: Maps with a human touch


Re: [Talk-us] OSM Data Quality

2013-05-31 Thread Richard Fairhurst
Martijn van Exel wrote:
 I think I just wrote half of one of my SOTM US talk.

I think you just wrote half of mine too. ;)

cheers
Richard







Re: [Talk-us] OSM Data Quality

2013-05-31 Thread stevea

ramble++;

Clifford, yes I could sense what you were trying to say:  I have a 
thirty+ year Quality background at Apple, Adobe, IBM, the University 
of California (and others) as an employee, contractor, subcontractor 
and consultant.  You are doing fine, you just did fine.


OSM does sample edits, and some people listen and pay attention when
the tools talk to them:  your step 1.  OSM does categorize errors
(your step 2):  both within tools, as JOSM does with Validator, and
as longer-term problems that can be solved either by humans, one at a
time (usually somewhat manually), or by bots -- bots only if they are
first proven on small samples and are smart about how they'll be
unleashed.  A (selfish, but valid) example:  correct (within
parameters) all the geographical mistakes in multipolygons in
California caught by Geofabrik's OSM Inspector.  A human or a bot might do
that if you have some time on your hands, some of which might go
towards crafting bots.


But first we pull and tug about what the right set of those samples
is.  Briefly, assume we can identify and reach consensus upon some.
Then we land in a fuzzy part of your step 3, "determine root cause,"
so we can get to step 4.  Sounds about right, but we have bifurcated
(multi-furcated?) into so many root causes that we have to get very
plural (root causes) and then even begin to categorize those.
Continuing, we can apply smarts and tools and a quality approach even
to those.  Such a long-term, multi-rolling approach to quality must
continue.  This is an important middle part of how it both gets talked
about and implemented.


(Potential root causes are likely manyfold:  a fundamental 
misunderstanding about the concept and implementation of 
multipolygon is probably one, mapping tools which don't fully 
express multipolygon concepts across data format translations is 
probably another, and so on).


There is another thing about Quality which doesn't often get said out
loud:  "I know superb quality when I [see, experience...] it."  That
is a sublime, slippery, elusive "don't forget" about the topic.  This
means finish lines and checkered flags, while they can be reached
many ways, usually are reached in a way that makes a large number of
people happiest:  the ones who clearly articulated not only what the
finish line is, but the milestones along the way and how we cross them.
That means consensus, good project management, being stepwise, thoughtful,
communicative and achieving a definable goal with harmony.  It is
much easier talked about than done, but that doesn't make it
impossible, just worthy.


Good specifications of finish lines (milestones, hurdles along the 
way...) are worth a great deal.  OSM has some difficulty now 
articulating the decades-away finish line (which is OK, but let's 
keep an eye on it), but we can set up short hurdles to hop over 
during the upcoming intermediates.  How we do that is an important 
part of the next ten or twenty years of OSM (in my opinion).


We can't just say "someday this'll be the best damn map on Earth."
We have to say how.


I recently said "no" to an important OSM contributor who wants to do
a building and address import.  I know for a fact that the data are
noisy and obsolete and that we can do better, so I said I'd rather get them
right offline first than import known-wrong data.  That's the
right call.  How do I know?  I live among the data and, because of
their age and errors, found them less rather than more useful in the
map.  Sometimes Quality is that simple.  Mostly it is not.  I just
know using old map data sucks.  Upload is the last step, not the
first:  get it right in your editor offline before you start spilling
buckets of paint.


OSM lives and breathes as Earth's cave wall, where we paint our neon
tubing and scribbles alike.  Think before you upload.  Make each
changeset a few smart brushstrokes on a shared canvas.  Leave the
place better than you found it.  Your mother doesn't live here.
Tinkering with OSM's gears is allowed, especially if you are handy,
an artist, a cartographer or any of a lot of things life has to offer, such
as a thinker about Quality.


Many people, long process.  Lather, lather, rinse, repeat.  Talking 
about Better can, even should result in Better.  I'll close by saying 
it again:  good of you to urge along the conversation in this thread.


ramble--;

SteveA
California
