Re: [Talk-us] A Friendly Guide to 'Bots and Imports
>> As I manually survey various features (POIs, some hydro, etc.), I usually try to merge in the data from existing imports so as to maintain the link (e.g. gnis:feature_id) back to the original database, in case we want to exchange updates with them again.
>
> this is impossible due to the license terms,

That may be the short, quick answer, but it is not the long answer. The link will be valuable as we figure out other ways to synchronize the data and/or make dual-licensed updates, whether they originate from OSM or from the other party, such as USGS. Simple? No. Impossible? No.

- Alan
___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Serge Wroclawski writes:

> Moving away from discussions of specific imports, I'd like to explore
> what people think about a few areas of this discussion:
>
> 1) When someone says "I want to import X", what should our first response be?

I think your reaction to point out the danger is fair. But, living in an area with a lot of high-quality data that has been imported rather well, I'm not anti-import. I am, however, in the "imports should be exceedingly well thought out" camp.

> 2) When someone points out a widespread problem (such as the Salt Lake
> City addresses), how do we want to proceed?

Some things need automated edits to fix. I would like to see safe frameworks for this in osm svn/git/whatever, and more or less require that the code to be run for fixups be stored as part of the community history. It's clear that things need to be fixed, and the challenge is to make the fixes net positive.

> 3) Is it better to discourage bots and imports (as we do currently) or
> better to heavily document bots and set up standardized methods? (and
> do people think those methods will be used?)

I think most people doing automated imports are doing so because they want to fix something that's broken, and most are patient. If we provide skeleton code and especially a way to see how the fix works before it's really committed, I think most people would be cooperative.

In my case, I've thought about several automated edits (and done zero):

- Duplicate nodes at town boundaries in roads, due to the massgis highway layer. I wrote on talk-us about what I think ought to be done, in terms of outlining a precondition for "two nodes on same place, massgis tags, each the end node in a highway way with massgis tags". Somehow, most of this got fixed, and I don't know if it was part of the general de-dupe rampage or someone doing a more targeted edit. But as far as I can tell it was done right, and it was a good outcome.
- In MA, landuse=reservoir is on lots that are really "reservoir protection". They render blue, and I think they should be retagged. Or maybe mapnik and the tagging rules should be fixed. I haven't gotten around to this; I have gotten the clue to tread lightly, and I've been busy.
- Fuzzy matching on GNIS vs massgis points, and merging them, taking massgis locations, in cases where no human has edited the GNIS points.

Bots are another story; a bot is a long-running process that does automated edits whenever its preconditions are satisfied. Those are scarier than someone grabbing a state extract, running an automated edit, reviewing the results, maybe sharing them for review by others, and choosing to push upload.

For imports, I've thought about several, and the common theme is ENOSPARTIME, but the list is:

- Parcel data, not imported because a) I'm not sure what I think is right, and b) I'm not sure what community consensus is.
- Merging updates to massgis highway data, but this is hard.
- Importing NHD or massgis hydro.
- Importing more massgis rails/trails/etc.
- Importing the towns w/o highway data, but there's a lot of manual merging (e.g. Gloucester). This leads to thoughts of writing code to auto-merge, which leads to it not happening due to not enough time.

> 4) In the US, what (if any) role should OSM US play in imports?

Perhaps helping with the above, and being elder statesmen about advice.

So all in all, my level of restraint, but with a higher level of spare time, is probably where we want people to be. One thought is that someone wanting to import should probably have done some manual mapping first, to get their head around the norms and community.
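The precondition quoted above for the duplicate-node fix can be sketched roughly as follows. This is only an illustration: the in-memory data layout, the function names, and the idea that imported objects carry a tag key starting with "massgis:" are all assumptions, and "same place" is taken to mean exactly equal coordinates.

```python
from collections import defaultdict


def is_massgis(tags):
    # Assumption: imported objects carry some tag whose key starts with "massgis:".
    return any(key.startswith("massgis:") for key in tags)


def merge_candidates(nodes, ways):
    """Find pairs of nodes that satisfy the stated precondition.

    nodes: {node_id: (lat, lon, tags)}
    ways:  {way_id: (list_of_node_ids, tags)}
    """
    # Collect the end nodes of highway ways that carry massgis tags.
    end_nodes = set()
    for node_ids, tags in ways.values():
        if node_ids and "highway" in tags and is_massgis(tags):
            end_nodes.add(node_ids[0])
            end_nodes.add(node_ids[-1])

    # Group those end nodes by exact position, keeping only massgis-tagged nodes.
    by_position = defaultdict(list)
    for node_id in sorted(end_nodes):
        lat, lon, tags = nodes[node_id]
        if is_massgis(tags):
            by_position[(lat, lon)].append(node_id)

    # Exactly two nodes at one place: a candidate pair for human review.
    return [ids for ids in by_position.values() if len(ids) == 2]
```

The function only reports candidate pairs and changes nothing, which fits the "see how the fix works before it's really committed" idea: the list can be reviewed or shared before any upload.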
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
At 2010-08-06 06:11, Serge Wroclawski wrote:
> ...
> 1. I think the first reactions to a request to import should be something that outlines the danger to OSM of importing.

The biggest danger of which, IMO, is duplication of existing data. I believe many newbies will want to import datasets that already have at least some representation in existing data, given that we already have transportation, hydrography, admin boundaries, and some POIs. This last category might be one of the only ones that could be genuinely useful, like importing a chain of restaurants, fuel stations, etc.

"Import" of most county land datasets (parcels, addresses, centerlines) is far more difficult, in that it is really more of a comparison and synchronization than an addition of data.

Someone else noted the import in the city of Bakersfield, CA, which included parcel and building outlines, as well as landuse polygons that follow street edges in excruciating detail. It seems that, while interesting to look at, at least some of this should probably have been discussed first, as it resulted in 10x the number of objects found in similar areas with just centerlines.

> 2. I think widespread "bot fixes" should be encouraged to wait 10 days.

Yes. Someone said something like "just long enough to annoy the author". Anyone who subscribes to multiple lists could easily miss something important, or be unable to comment on it, for several days. The importer should also send a "last call" a day or two before.

> 3. I think imports and bots are inevitable, so the more documented we make the process, the less we encourage people to go wild and write their own. At the same time, we want to discourage bots and imports in general.

It would be nice to have some boilerplate search/replace code or an app to use.

Another issue is that of co-ordinating efforts. A few times, I walked through tagwatch and downloaded/corrected/uploaded by hand, one bad key at a time, until I got bored. I know there are people out there doing this, too, but it would be nice if there were a page we could use to divvy up and co-ordinate those efforts.

-- Alan Mintz
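As a rough idea of what such boilerplate search/replace code might look like, here is a minimal dry-run sketch for renaming one bad key. The function name and the dictionary layout are hypothetical, and it deliberately produces a plan for review rather than touching any data.

```python
def plan_key_rename(elements, old_key, new_key):
    """Return (element_id, old_tags, new_tags) for each element that would change.

    elements: {element_id: tags_dict}
    """
    plan = []
    for element_id, tags in sorted(elements.items()):
        if old_key not in tags:
            continue
        if new_key in tags and tags[new_key] != tags[old_key]:
            # Both keys are present with different values: leave this one
            # for a human to sort out instead of guessing.
            continue
        new_tags = {k: v for k, v in tags.items() if k != old_key}
        new_tags[new_key] = tags[old_key]
        plan.append((element_id, tags, new_tags))
    return plan
```

Elements with conflicting values are skipped on purpose, on the assumption that a careful manual fix is better than an automatic overwrite.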
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, 6 Aug 2010, Kevin Atkinson wrote:
> On Fri, 6 Aug 2010, Katie Filbert wrote:
>> 1) Anyone that wants to run a bot or new tasks for an existing bot (automated or semi-automated tasks) must submit a request to the bot approval group (BAG). Others are free to comment on the request, in addition to BAG.
>
> If I had to go through a formal approval process I would not have even bothered with my script. I'm not sure I would really call it a bot, because I manually downloaded the data, ran a script on the data, then manually uploaded the data, as opposed to something completely automatic.

Again, maybe I was a little strong. But if you do want some sort of formal approval process, can you please at least cut out some of the steps for those who have already gone through the process once, have proven to be competent, and won't do anything that will lead to a mess later.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, 6 Aug 2010, Serge Wroclawski wrote:
> On Fri, Aug 6, 2010 at 3:13 PM, Kevin Atkinson wrote:
>> On Fri, 6 Aug 2010, Serge Wroclawski wrote:
>>> 2. I think widespread "bot fixes" should be encouraged to wait 10 days. It's just too easy to make a large change and too hard to fix it. I'd also suggest that we (as a community) develop tools to make it easier to demonstrate what an import or bot would do on a test server.
>>
>> If I had to wait 10 days there is a good chance I would have likely lost interest. I have been trying to say that there are different levels of bots and the amount of damage they can do.
>
> If you lose interest quickly, it can't be very important to you.

Maybe "most likely" is a little strong, but I was trying to make a point. Just because a change is not very important to me doesn't mean it isn't a good change that can make the map better. Also, it might not be that I lost interest, but rather that I simply don't have the time. I may have some time now, but may not in two weeks.

I can fully understand that bots can do a lot of damage. But I was also very careful and limited the scope of what my bot did. I also have a clear plan to undo the controversial part of my change if anyone should object in the future.

I honestly don't think I can say anything else without coming off as a reckless, impatient jerk who wants things done now or never.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, 6 Aug 2010, Katie Filbert wrote:
> 1) Anyone that wants to run a bot or new tasks for an existing bot (automated or semi-automated tasks) must submit a request to the bot approval group (BAG). Others are free to comment on the request, in addition to BAG.

If I had to go through a formal approval process I would not have even bothered with my script. I'm not sure I would really call it a bot, because I manually downloaded the data, ran a script on the data, then manually uploaded the data, as opposed to something completely automatic.

And what exactly constitutes a bot? Would the cleanup of Florida's county route "ref" tagging have been a bot? It was a large-scale systematic change, even if a script was not used (I think he used search and replace in an editor).

If you want to go through with this, I think you need a better definition of a "bot", which should consist of at least one of:

1) Large-scale change
2) Fully automatic

Defining 1) would be tricky: something covering the whole United States would count. But what about an entire state, if the change is limited in scope?

> For OSM, something else we ought to do better with is using the dev API server (http://api06.dev.openstreetmap.org/).

I did not know that site existed. It needs to be better documented.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, Aug 6, 2010 at 3:13 PM, Kevin Atkinson wrote:
> On Fri, 6 Aug 2010, Serge Wroclawski wrote:
>
>> 2. I think widespread "bot fixes" should be encouraged to wait 10
>> days. It's just too easy to make a large change and too hard to fix
>> it. I'd also suggest that we (as a community) develop tools to make it
>> easier to demonstrate what an import or bot would do on a test server.
>
> If I had to wait 10 days there is a good chance I would have likely lost
> interest. I have been trying to say that there are different levels of bots
> and the amount of damage they can do.

If you lose interest quickly, it can't be very important to you.

Minor edits can be done immediately, but any time you're making a mass change across a wide geographic region (like an entire city), that requires planning, thinking, feedback. Those things take time, not just for the person who wants to make the change, but for the rest of the community to catch up, check the edit out, give feedback, etc.

The best edits in OSM took months of planning.

- Serge
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, 6 Aug 2010, Serge Wroclawski wrote:
> 2. I think widespread "bot fixes" should be encouraged to wait 10 days. It's just too easy to make a large change and too hard to fix it. I'd also suggest that we (as a community) develop tools to make it easier to demonstrate what an import or bot would do on a test server.

If I had to wait 10 days there is a good chance I would have likely lost interest. I have been trying to say that there are different levels of bots and the amount of damage they can do.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Good to see your comments getting through, Katie (I was one of the people who didn't get your emails before).

On Fri, Aug 6, 2010 at 11:59 AM, Katie Filbert wrote:
> On Fri, Aug 6, 2010 at 9:11 AM, Serge Wroclawski wrote:
>>
>> Moving away from discussions of specific imports, I'd like to explore
>> what people think about a few areas of this discussion:
>>
>> 1) When someone says "I want to import X", what should our first response
>> be?
>
> The nature of OSM with few rules (compared to say the many rules on
> Wikipedia) is appealing in some aspects and I don't want to see OSM become
> burdened with so many rules.
>
> At the same time, we might learn some lessons from how Wikipedia handles
> bots...
>
> 1) Anyone that wants to run a bot or new tasks for an existing bot
> (automated or semi-automated tasks) must submit a request to the bot
> approval group (BAG). Others are free to comment on the request, in addition
> to BAG.
>
> 2) You explain what the bot will be doing. The BAG assesses whether it's a
> good idea, and gives constructive feedback
>
> 3) Bot operators are encouraged to share the code, at least with BAG, but
> ideally make it open source so others can review it.
>
> 4) The bot then goes through a trial (e.g. doing 50 edits)
>
> 5) The bot runs on a separate account from the user's normal account. The
> bot account is flagged, so it's hidden by default from Special:RecentChanges
> and gets higher API rate limits.
>
> The bot's user page has information on who's running the bot, what it's
> doing, bot shutoff button that anyone can use if the bot is AWOL, info on
> how to contact the bot operator, and the bot operator needs to be
> responsive.
>
> http://en.wikipedia.org/wiki/Wikipedia:Bots

I think these are all very reasonable.

>> 2) When someone points out a widespread problem (such as the Salt Lake
>> City addresses), how do we want to proceed?
>
> I'm not totally convinced it's effective, but Wikipedia handles disputes and
> issues with "requests for comments" and tries to reach consensus. For
> something like the addresses, there may not be 100% consensus but say,
> 3/4 agreement would be good, making compromises necessary to get there.
>
> http://en.wikipedia.org/wiki/WP:RFC

I think if we have a process like this, we'd want it more streamlined, but I like the approach of having a validation and feedback period.

>> 3) Is it better to discourage bots and imports (as we do currently) or
>> better to heavily document bots and set up standardized methods? (and
>> do people think those methods will be used?)
>
> See above (1).
>
> Furthermore, Wikipedia users have gone as far as to create bot frameworks
> (pywikipedia) that are well-tested and there are tools (e.g.
> autowikibrowser) for semi-automated edits.

I agree with this. I don't think we need officially blessed bots, but most of us have already made our own bot frameworks (I know I did), so unless there's a compelling reason, why replicate the work?

Having a tool to display the changes is really important IMHO. Sometimes the changes will be obvious, and rendering them as tiles would be good. Other times the changes won't be something that renders, and we'll need to find a way to display the differences in a meaningful way, but if we made a framework for it, hopefully we could plug in that functionality as we went along.

> For OSM, something else we ought to do better with is using the dev API
> server (http://api06.dev.openstreetmap.org/). Last I knew, it's not
> populated with data except what individuals put in it. It would be great
> if the dev server were instead a full, up-to-date mirror of OSM that people
> could use to test imports and semi/fully automated edits. I think this is
> especially important since, unlike setting up MediaWiki, it's not so simple
> for individuals to set up their own OSM stack

api06 is meant for testing out calls to the API. That's why I suggest something else altogether. I also think if we start something, it'll be easier to have it adopted by the larger OSM community.

- Serge
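For changes that do not render, one simple way to display the differences is a tag-level text diff between a "before" and an "after" snapshot. The sketch below assumes both snapshots have already been loaded into plain dictionaries; the function name and output format are made up for illustration and are not part of any existing OSM tool.

```python
def tag_diff(before, after):
    """Describe what changed between two snapshots.

    before, after: {element_id: tags_dict}
    Returns a list of human-readable lines.
    """
    lines = []
    for element_id in sorted(set(before) | set(after)):
        old = before.get(element_id)
        new = after.get(element_id)
        if old is None:
            lines.append(f"+ {element_id} created with {new}")
        elif new is None:
            lines.append(f"- {element_id} deleted (had {old})")
        else:
            # Element exists in both snapshots: report each changed key.
            for key in sorted(set(old) | set(new)):
                if old.get(key) != new.get(key):
                    lines.append(
                        f"~ {element_id} {key}: {old.get(key)!r} -> {new.get(key)!r}"
                    )
    return lines
```

A report like this could be posted to the list during the feedback period, so reviewers can read the proposed edit without setting up their own OSM stack.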
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On 6 Aug 2010, at 1:45 , Nathan Edgars II wrote: > On Fri, Aug 6, 2010 at 3:50 AM, Apollinaris Schoell > wrote: >> >> On 5 Aug 2010, at 14:43 , Alan Mintz wrote: >>> >>> As I manually survey various features (POIs, some hydro, etc.), I usually >>> try to merge in the data from existing imports so as to maintain the link >>> (e.g. gnis:feature_id) back to the original database, in case we want to >>> exchange updates with them again. >>> >> >> this is impossible due to the license terms, > > There are no (valid) license terms applicable to something of the form > "OSM deleted feature 687645; check independently whether it exists and > delete it from GNIS if not". sure not in this form, this form requires so much work on GNIS side that it will probably never happen. the deletion of the node can happen for so many reasons that without documentation it has no value. ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, Aug 6, 2010 at 9:11 AM, Serge Wroclawski wrote: > Moving away from discussions of specific imports, I'd like to explore > what people think about a few areas of this discussion: > > 1) When someone says "I want to import X", what should our first response > be? > The nature of OSM with few rules (compared to say the many rules on Wikipedia) is appealing in some aspects and I don't want to see OSM become burdened with so many rules. At the same time, we might learn some lessons from how Wikipedia handles bots... 1) Anyone that wants to run a bot or new tasks for an existing bot (automated or semi-automated tasks) must submit a request to the bot approval group (BAG). Others are free to comment on the request, in addition to BAG. 2) You explain what the bot will be doing. The BAG assesses whether it's a good idea, and gives constructive feedback 3) Bot operators are encouraged to share the code, at least with BAG, but ideally make it open source so others can review it. 4) The bot then goes through a trial (e.g. doing 50 edits) 5) The bot runs on a separate account from the user's normal account. The bot account is flagged, so it's hidden by default from Special:RecentChanges and gets higher API rate limits. The bot's user page has information on who's running the bot, what it's doing, bot shutoff button that anyone can use if the bot is AWOL, info on how to contact the bot operator, and the bot operator needs to be responsive. http://en.wikipedia.org/wiki/Wikipedia:Bots Certainly not all bots and imports are bad, but I would be happy to have such careful attention and review for OSM bots and imports to help ensure the task is suitable, the bot works properly, and is not disruptive or harmful to the community. > 2) When someone points out a widespread problem (such as the Salt Lake > City addresses), how do we want to proceed? 
> I'm not totally convinced it's effective, but Wikipedia handles disputes and issues with "requests for comments" and tries to reach consensus. For something like the addresses, there may not be 100% consensus but, say, 3/4 agreement would be good, making compromises necessary to get there. http://en.wikipedia.org/wiki/WP:RFC Things can escalate from there, if necessary. For OSM, we tend to discuss things on the mailing list, and we may want to do things differently. Not sure what's best. > 3) Is it better to discourage bots and imports (as we do currently) or > better to heavily document bots and set up standardized methods? (and > do people think those methods will be used?) > See above (1). Furthermore, Wikipedia users have gone as far as to create bot frameworks (pywikipedia) that are well-tested and there are tools (e.g. autowikibrowser) for semi-automated edits. For OSM, something else we ought to do better with is using the dev API server (http://api06.dev.openstreetmap.org/). Last I knew, it's not populated with data except what individuals put in it. It would be great if the dev server were instead a full, up-to-date mirror of OSM that people could use to test imports and semi/fully automated edits. I think this is especially important since, unlike setting up MediaWiki, it's not so simple for individuals to set up their own OSM stack. The more testing and the more eyes on bots and imports, the better: bad bots and imports can be weeded out, and the good, useful ones can proceed. > 4) In the US, what (if any) role should OSM US play in imports? > > Not sure it needs to be OSM US specifically, but having a staging area (e.g. to store copies of data imported -- in original & osm format?) and a good development server for testing are important. > > And now my .02: > > 1. I think the first reactions to a request to import should be > something that outlines the danger to OSM of importing. That's the > guide this thread talks about. 
We want to instill on the user the > potential pitfalls and encourage them to work with the community- > maybe even discovering that the data set was known previously and not > imported for a reason. > Community feedback is indeed important. > 2. I think widespread "bot fixes" should be encouraged to wait 10 > days. It's just too easy to make a large change and too hard to fix > it. I'd also suggest that we (as a community) develop tools to make it > easier to demonstrate what an import or bot would do on a test server. > > Imagine I want to fix all the streets in Cleveland. I could spin up an > instance of Cleveland as of a certain time, apply my changes to that > test site, and show it off to the large community, soliciting > feedback. > Agree. > > This isn't really feasible right now using existing OSM methods. > > 3. I think imports and bots are inevitable, so the more documented we > make the process, the less we encourage people to go wild and write > their own. At the same time, we want to discourage bots and imports in > general. > I agree to some extent about discouraging bots and imports, a
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On 5 August 2010 20:27, Ian Dees wrote: > On Sat, Jan 8, 2000 at 3:20 PM, Katie Filbert wrote: > >> The difference with NHD is that we are leaving conversion to osm format >> for the local mapper / importer. Since OSM US has server space, maybe >> that's good use of it to host converted data ready for import. >> > > I like this... the NHD status page on the wiki sort of already does this in > a backwards way. Perhaps I will look in to writing a web tool to keep track > of the import and give easy access to the pre-generated OSM files for > subbasins. > > I think keeping the data in a database and generating on the fly with a more advanced interface like (http://clc.openstreetmap.fr) would be very good. I have been meaning to implement a webservice which would generate OSM file with specific functionalities based on some kind of layer and requests to power a site like the one shown previously. Emilie Laffray ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
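To make the "generate the OSM file on the fly" idea a little more concrete, here is a minimal sketch that turns stored point records into OSM XML. It is an assumption-laden simplification: real NHD data would mostly be ways and multipolygons, the record layout and function name are invented, and the negative ids simply follow the editor convention for objects that do not exist on the server yet.

```python
import xml.etree.ElementTree as ET


def features_to_osm_xml(features):
    """Build an OSM XML string from point records.

    features: iterable of (lat, lon, tags_dict) tuples, e.g. rows from a database.
    """
    root = ET.Element("osm", version="0.6", generator="import-sketch")
    for index, (lat, lon, tags) in enumerate(features, start=1):
        # Negative ids mark new objects that have not been uploaded yet.
        node = ET.SubElement(
            root, "node", id=str(-index), lat=f"{lat:.7f}", lon=f"{lon:.7f}"
        )
        for key, value in sorted(tags.items()):
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")
```

A web service along these lines could filter the records by layer or bounding box before calling the function, so that a local mapper downloads only the piece they intend to review and merge.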
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Moving away from discussions of specific imports, I'd like to explore what people think about a few areas of this discussion: 1) When someone says "I want to import X", what should our first response be? 2) When someone points out a widespread problem (such as the Salt Lake City addresses), how do we want to proceed? 3) Is it better to discourage bots and imports (as we do currently) or better to heavily document bots and set up standardized methods? (and do people think those methods will be used?) 4) In the US, what (if any) role should OSM US play in imports? And now my .02: 1. I think the first reactions to a request to import should be something that outlines the danger to OSM of importing. That's the guide this thread talks about. We want to instill on the user the potential pitfalls and encourage them to work with the community- maybe even discovering that the data set was known previously and not imported for a reason. 2. I think widespread "bot fixes" should be encouraged to wait 10 days. It's just too easy to make a large change and too hard to fix it. I'd also suggest that we (as a community) develop tools to make it easier to demonstrate what an import or bot would do on a test server. Imagine I want to fix all the streets in Cleveland. I could spin up an instance of Cleveland as of a certain time, apply my changes to that test site, and show it off to the large community, soliciting feedback. This isn't really feasible right now using existing OSM methods. 3. I think imports and bots are inevitable, so the more documented we make the process, the less we encourage people to go wild and write their own. At the same time, we want to discourage bots and imports in general. 4. I think OSM US can play a significant role in two ways. I think the organization can help by working with governments to make data sets available. And I think it could possibly help with some equipment and infrastructure. 
Those are why I'm involved in OSM US now, and (blatant plug) why I'm running for office on the next board. At the same time, I think the process needs to be bottom-up community driven. - Serge ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thursday, August 05, 2010 06:32:04 pm you wrote: > On Thu, Aug 5, 2010 at 6:10 PM, James U wrote: > > I have to say that after importing a large amount of NHD data (most of NC > > and MN) that it is of varying quality, as was the preexisting water > > related data already on the server. In general, I agree with Ian that > > it is higher quality (both resolution and accuracy) than the preexisting > > data that largely consisted of quickly drawn Yahoo traces. I saw very > > little evidence of on the ground surveying of these features and don't > > think the import will hinder most people from participating in OSM writ > > large. > > On the other hand, my (extremely limited) experience is that the > aerial water traces for Disney World were superior to the NHD import > (so I quickly deleted all the dupes from NHD). But the swamps were a > lot more useful, since you can't really tell if something's swampy > without physically going there. I love how all these "islands" > suddenly made sense: > http://www.openstreetmap.org/?lat=28.29&lon=-81.5191&zoom=14&layers=M You are certainly correct that there are many examples of other imported waterbodies and streams that _are_ better than NHD, this is why I strongly recommend that importers of the data do a careful scan of the data and when there are duplicate features to try to use a high quality aerial photo to make a judgment of which is the better data. For some features, such as multipolygon lakes with islands, I have mixed and matched different parts of the NHD and the previously uploaded feature (sometimes you can tell where mappers got tired of tracing). In some cases, for example in dams where water levels fluctuate and aerial photos might represent a low water phase, the NHD may have better data that are representative of the level that the reservoir is maintained at. ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Fri, Aug 6, 2010 at 3:50 AM, Apollinaris Schoell wrote: > > On 5 Aug 2010, at 14:43 , Alan Mintz wrote: >> >> As I manually survey various features (POIs, some hydro, etc.), I usually >> try to merge in the data from existing imports so as to maintain the link >> (e.g. gnis:feature_id) back to the original database, in case we want to >> exchange updates with them again. >> > > this is impossible due to the license terms, There are no (valid) license terms applicable to something of the form "OSM deleted feature 687645; check independently whether it exists and delete it from GNIS if not". ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
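A notice of that form could be produced by comparing the gnis:feature_id values present in an older extract with those in a newer one. The sketch below is only an illustration under stated assumptions: the extracts are plain dictionaries, the function names are hypothetical, and a missing id does not say why the feature disappeared, so each notice is a prompt for independent checking rather than a fact about the ground.

```python
def gnis_ids(elements):
    """Collect gnis:feature_id values from {element_id: tags_dict}."""
    return {
        tags["gnis:feature_id"]
        for tags in elements.values()
        if "gnis:feature_id" in tags
    }


def deletion_notices(old_extract, new_extract):
    """List GNIS ids that were present in the old extract but not the new one."""
    missing = gnis_ids(old_extract) - gnis_ids(new_extract)
    return [
        f"OSM deleted feature {feature_id}; check independently whether it exists"
        for feature_id in sorted(missing)
    ]
```

Because the comparison is by gnis:feature_id rather than by OSM element id, a feature that was merged into a surveyed object and kept its id would not be reported.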
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On 5 Aug 2010, at 14:43 , Alan Mintz wrote: > At 2010-08-05 11:52, Ian Dees wrote: >> ... >> It isn't any different. I had made the (bad) decision at the time to import >> over any existing data because in the several hundred places I spot-checked, >> NHD was vastly superior in resolution (and probably quality). > > By "import over", do you mean to add duplicates, replace the existing > features, or merge the info from the two manually? > > As I manually survey various features (POIs, some hydro, etc.), I usually try > to merge in the data from existing imports so as to maintain the link (e.g. > gnis:feature_id) back to the original database, in case we want to exchange > updates with them again. > this is impossible due to the license terms, > One thing that occurs to me that may be a problem is that I occasionally have > to delete a feature that is no longer present (e.g. > http://www.openstreetmap.org/browse/node/358808220). If we were to feed an > update back to GNIS or get one from them, this situation would have to be > taken into account. > > -- > Alan Mintz > > > ___ > Talk-us mailing list > Talk-us@openstreetmap.org > http://lists.openstreetmap.org/listinfo/talk-us ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Hi, On 5 August 2010 21:46, Richard Weait wrote: > On Sat, Jan 8, 2000 at 4:20 PM, Katie Filbert wrote: >> Leaving imports to local mappers is good. They are best able to assess the >> quality of the data for that area an care about quality of their local map >> data. It also leaves "low hanging fruit" for them. Some areas without >> local mappers may take longer to "finish". That is okay. Definitely there are advantages from the import being done by a local, but, as always, there are also advantages from the import being done by the author of conversion script, someone who understands exactly what parts need to be checked manually and someone who has done many such imports instead of only a limited area. (I have taken part in an import where I made converted data available on the web for locals to import and often had to spend longer fixing stuff after them than it would have taken me to do it myself). So it's hard to stand on one side or the other, probably best to look at it case by case. > > I have no arguments with this. > > Consider this: Does importing to an area where there is no thriving > OSM community inhibit the creation of that thriving community in > future? > > At SotM, one of our friends suggested that imports are, "okay except > road networks. Never import road networks." The suggestion is that > building the road network also builds the community. An existing road > network inhibits the community. I apologize for not attributing that > comment. I've forgotten who said it to me. > > Or from another point of view. If the local community isn't > substantial enough to maintain the imported data and keep it up to > date, is it better to not import until the community can maintain it? > Why import 2004 data, if it will be unchanged when the 2006 update is > published? Does that mean that you should only import once you have > such a thriving community and high quality local data that you no > longer would benefit substantially from that import? 
I totally agree here; choosing the right moment is a bit of a trade-off. If you do it too soon, you get an unmaintained map of the area. If you do it too late, local mappers who didn't know about the data source spend their time re-collecting the data, which later clashes with the data source. It then costs time to choose the better version and to merge, and it is frustrating when someone finds out they could have spent that time on the finer details. Cheers ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Some guidance aimed at focused scripts that address a particular problem in a well-defined area would be useful, as most of the guide is aimed at automatic fixup bots and large-scale imports. For example, a note in big bold letters that large uploads take a long time would be very helpful, as would a guide to using JOSM's advanced chunked upload feature. I also think what the bot does is very important: tag fixups are generally a lot safer than bots which affect nodes, and those in turn are safer than bots that remove ways.
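To illustrate the chunked-upload point: a minimal sketch, not JOSM's actual implementation, of splitting a large set of changes into fixed-size batches so that each batch can be uploaded and checked on its own. The function name `chunked`, the batch size, and the use of a plain list to stand in for changes are all assumptions made for illustration.

```python
def chunked(changes, chunk_size):
    """Yield successive batches of at most chunk_size items from changes.

    `changes` is a plain list standing in for whatever objects a script
    would upload (hypothetical; no OSM API call is made here).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(changes), chunk_size):
        yield changes[start:start + chunk_size]


if __name__ == "__main__":
    # Five placeholder changes split into batches of two.
    for batch in chunked(["c1", "c2", "c3", "c4", "c5"], 2):
        print(batch)
```

Smaller batches mean that a failure partway through leaves less to clean up, at the cost of a longer total upload.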
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 6:10 PM, James U wrote: > I have to say that after importing a large amount of NHD data (most of NC > and MN) that it is of varying quality, as was the preexisting water related > data already on the server. In general, I agree with Ian that it is higher > quality (both resolution and accuracy) than the preexisting data that > largely consisted of quickly drawn Yahoo traces. I saw very little evidence > of on the ground surveying of these features and don't think the import > will hinder most people from participating in OSM writ large. On the other hand, my (extremely limited) experience is that the aerial water traces for Disney World were superior to the NHD import (so I quickly deleted all the dupes from NHD). But the swamps were a lot more useful, since you can't really tell if something's swampy without physically going there. I love how all these "islands" suddenly made sense: http://www.openstreetmap.org/?lat=28.29&lon=-81.5191&zoom=14&layers=M ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
I have to say, after importing a large amount of NHD data (most of NC and MN), that it is of varying quality, as was the preexisting water-related data already on the server. In general, I agree with Ian that it is higher quality (both resolution and accuracy) than the preexisting data that largely consisted of quickly drawn Yahoo traces. I saw very little evidence of on-the-ground surveying of these features and don't think the import will hinder most people from participating in OSM writ large. I have a fair bit of experience in converting data and would be happy to convert subbasins (these appear to be roughly 2500 square mile areas and are documented on the wiki) for people if they want to go through the process of double-checking to make sure the data don't conflict with or overlap already existing data. James On Thursday, August 05, 2010 04:29:19 pm David Carmean wrote: > On Thu, Aug 05, 2010 at 01:38:36PM -0500, Ian Dees wrote: > > I think the NHD "import" is a good example of a well-intentioned importer > > (me) gone wrong. I had initially planned to import the whole darn thing > > in one swoop, but various technical and life challenges came up before I > > could get it going. While I was working on those issues, people started > > importing it themselves (sometimes marking so on the wiki, sometimes > > not). Now that there are some areas imported, the import of the whole > > dataset becomes infinitely harder because we have to match existing data > > with new OSM-ified data. > > And I'll add my own mea-culpa. I created some wiki pages/features to > help partition and coordinate NHD import efforts, and then also found > that I didn't have the time to follow up. > > I would agree that the partial imports will have increased the difficulty > of a large-scale bulk import, but we already had hydrographic features > from TIGER, did we not. And hand-drawn features from aerial traces and > actual boots-on-the-ground mapping. 
Conflation in general is a tough > problem, I gather. There are tools, algorithms and heuristics in the > GIS world but the OSM data model makes translation between the two models > somewhat difficult. > > For example, something that looks very interesting which I plan to examine > is the Java Conflation Suite [1], which looks like it could be used over > relatively small areas (probably about the size of the API limit... 0.25 > degrees square?). But as a component of the JUMP[2] platform, it operates > only on Shapefiles and GML out of the box. (If we could get some Java > expertise I think it would be very worthwhile working with the JUMP team > to create an OSM driver.) > > At any rate, while I think we could mitigate a number of problems > given some development effort, I also agree that we might want to > spend more time thinking about why we want to make the imports--and > perhaps publically debate, if only in talking to yourself on the > project wiki page, the pros and cons of a particular import. > > > > [1] http://www.vividsolutions.com/JCS/ > [2] http://www.vividsolutions.com/jump/ > > > > ___ > Talk-us mailing list > Talk-us@openstreetmap.org > http://lists.openstreetmap.org/listinfo/talk-us ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 4:43 PM, Alan Mintz wrote:
> At 2010-08-05 11:52, Ian Dees wrote:
>> ...
>> It isn't any different. I had made the (bad) decision at the time to import over any existing data because in the several hundred places I spot-checked, NHD was vastly superior in resolution (and probably quality).
>
> By "import over", do you mean to add duplicates, replace the existing features, or merge the info from the two manually?

Add duplicates.

> As I manually survey various features (POIs, some hydro, etc.), I usually try to merge in the data from existing imports so as to maintain the link (e.g. gnis:feature_id) back to the original database, in case we want to exchange updates with them again.
>
> One thing that occurs to me that may be a problem is that I occasionally have to delete a feature that is no longer present (e.g. http://www.openstreetmap.org/browse/node/358808220). If we were to feed an update back to GNIS or get one from them, this situation would have to be taken into account.

When I made the original GNIS import I saved the resulting XML and IDs (which would have allowed us to detect deletions) but promptly lost it in a hard drive crash.
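The deletion check that a saved ID list would allow can be sketched as a set difference: any feature ID recorded at import time that no longer appears in the current data has presumably been deleted (or has lost its tag). This is a minimal sketch under assumptions; the function name `find_deleted` and the sample IDs are hypothetical, and reading the IDs out of the saved XML or a current extract is left out.

```python
def find_deleted(imported_ids, current_ids):
    """Return IDs present at import time but missing from the current data.

    Both arguments are iterables of feature IDs, for example the values of
    gnis:feature_id tags. How they are extracted is outside this sketch.
    """
    return sorted(set(imported_ids) - set(current_ids))


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    saved_at_import = ["1001", "1002", "1003", "1004"]
    found_today = ["1001", "1003", "1004", "2000"]
    print(find_deleted(saved_at_import, found_today))
```

An ID in the result only shows that the tagged feature is gone from the current data; whether a mapper deleted it on purpose, as in the example node above, still needs a human look before anything is fed back to the source.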
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
At 2010-08-05 11:52, Ian Dees wrote:
> ...
> It isn't any different. I had made the (bad) decision at the time to import over any existing data because in the several hundred places I spot-checked, NHD was vastly superior in resolution (and probably quality).

By "import over", do you mean to add duplicates, replace the existing features, or merge the info from the two manually?

As I manually survey various features (POIs, some hydro, etc.), I usually try to merge in the data from existing imports so as to maintain the link (e.g. gnis:feature_id) back to the original database, in case we want to exchange updates with them again.

One thing that occurs to me that may be a problem is that I occasionally have to delete a feature that is no longer present (e.g. http://www.openstreetmap.org/browse/node/358808220). If we were to feed an update back to GNIS or get one from them, this situation would have to be taken into account.

--
Alan Mintz
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 05, 2010 at 10:38:47PM +0200, Frederik Ramm wrote:
> Katie,
>
> your computer thinks it is the year 2000. I see you sent that from your iPhone. Maybe you had your fingers on the wrong spot so it didn't get a time signal.

Not only that, all of your messages (Katie) are being trapped as spam by my provider's system, probably because of the bad date.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Katie,

your computer thinks it is the year 2000. I see you sent that from your iPhone. Maybe you had your fingers on the wrong spot so it didn't get a time signal.

Katie Filbert wrote:
> Bad imports are bad for the osm. High quality data carefully imported is helpful.

Not unconditionally. For example, high quality data carefully imported which is a copy of someone else's, and which is maintained professionally at the source, may not be helpful (because while we import it as high quality, the quality vis-a-vis the original source will deteriorate over time, with the original source issuing updates that we cannot import easily). Also, high quality data carefully imported which depicts things we cannot possibly edit - example: official airspace boundaries - is not helpful, since we are not a collector of data, but a data maintenance machine - anything static that cannot be modified by our mappers will always remain a foreign object.

> Leaving imports to local mappers is good. They are best able to assess the quality of the data for that area and care about quality of their local map data. It also leaves "low hanging fruit" for them. Some areas without local mappers may take longer to "finish". That is okay.

+1 to that.

Bye
Frederik

--
Frederik Ramm ## eMail frede...@remote.org ## N49°00'09" E008°23'33"
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 05, 2010 at 01:38:36PM -0500, Ian Dees wrote: > I think the NHD "import" is a good example of a well-intentioned importer > (me) gone wrong. I had initially planned to import the whole darn thing in > one swoop, but various technical and life challenges came up before I could > get it going. While I was working on those issues, people started importing > it themselves (sometimes marking so on the wiki, sometimes not). Now that > there are some areas imported, the import of the whole dataset becomes > infinitely harder because we have to match existing data with new OSM-ified > data. And I'll add my own mea-culpa. I created some wiki pages/features to help partition and coordinate NHD import efforts, and then also found that I didn't have the time to follow up. I would agree that the partial imports will have increased the difficulty of a large-scale bulk import, but we already had hydrographic features from TIGER, did we not. And hand-drawn features from aerial traces and actual boots-on-the-ground mapping. Conflation in general is a tough problem, I gather. There are tools, algorithms and heuristics in the GIS world but the OSM data model makes translation between the two models somewhat difficult. For example, something that looks very interesting which I plan to examine is the Java Conflation Suite [1], which looks like it could be used over relatively small areas (probably about the size of the API limit... 0.25 degrees square?). But as a component of the JUMP[2] platform, it operates only on Shapefiles and GML out of the box. (If we could get some Java expertise I think it would be very worthwhile working with the JUMP team to create an OSM driver.) 
At any rate, while I think we could mitigate a number of problems given some development effort, I also agree that we might want to spend more time thinking about why we want to make the imports, and perhaps publicly debate, if only by talking to yourself on the project wiki page, the pros and cons of a particular import. [1] http://www.vividsolutions.com/JCS/ [2] http://www.vividsolutions.com/jump/
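The idea of running a conflation tool over relatively small areas can be sketched as splitting a larger bounding box into cells of a fixed size and processing each cell in turn. This is only an illustration: the 0.25-degree side length is the guess made in the message above rather than a confirmed API limit, the function name `tile_bbox` is hypothetical, and nothing here calls JCS, JUMP or the OSM API.

```python
import math


def tile_bbox(min_lon, min_lat, max_lon, max_lat, step=0.25):
    """Split a bounding box into cells at most `step` degrees on a side.

    Returns a list of (west, south, east, north) tuples, row by row from
    the south-west corner. Cells on the far edges are clipped to the box.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    cols = max(1, math.ceil((max_lon - min_lon) / step))
    rows = max(1, math.ceil((max_lat - min_lat) / step))
    tiles = []
    for r in range(rows):
        for c in range(cols):
            west = min_lon + c * step
            south = min_lat + r * step
            east = min(west + step, max_lon)
            north = min(south + step, max_lat)
            tiles.append((west, south, east, north))
    return tiles


if __name__ == "__main__":
    # A one-degree by half-degree box split into 0.25-degree cells.
    for tile in tile_bbox(-82.0, 28.0, -81.0, 28.5):
        print(tile)
```

Features that cross a cell boundary would still need care, since a lake split across two cells has to be matched as one object; the sketch does not address that.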
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Sat, Jan 8, 2000 at 4:20 PM, Katie Filbert wrote: > Bad imports are bad for the osm. High quality data carefully imported is > helpful. If such high quality data is available for us that is as good or > better than what we can do ourselves, then it's fine not to reinvent the > wheel. Where it's lower quality data than what we can do ourselves, then > let's not use it. [ ... ] > Leaving imports to local mappers is good. They are best able to assess the > quality of the data for that area an care about quality of their local map > data. It also leaves "low hanging fruit" for them. Some areas without > local mappers may take longer to "finish". That is okay. I have no arguments with this. Consider this: Does importing to an area where there is no thriving OSM community inhibit the creation of that thriving community in future? At SotM, one of our friends suggested that imports are, "okay except road networks. Never import road networks." The suggestion is that building the road network also builds the community. An existing road network inhibits the community. I apologize for not attributing that comment. I've forgotten who said it to me. Or from another point of view. If the local community isn't substantial enough to maintain the imported data and keep it up to date, is it better to not import until the community can maintain it? Why import 2004 data, if it will be unchanged when the 2006 update is published? Does that mean that you should only import once you have such a thriving community and high quality local data that you no longer would benefit substantially from that import? ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Sat, Jan 8, 2000 at 3:20 PM, Katie Filbert wrote:
> The difference with NHD is that we are leaving conversion to osm format for the local mapper / importer. Since OSM US has server space, maybe that's good use of it to host converted data ready for import.

I like this... the NHD status page on the wiki sort of already does this in a backwards way. Perhaps I will look into writing a web tool to keep track of the import and give easy access to the pre-generated OSM files for subbasins.
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 1:47 PM, Nathan Edgars II wrote:
> On Thu, Aug 5, 2010 at 2:38 PM, Ian Dees wrote:
> > I think the NHD "import" is a good example of a well-intentioned importer (me) gone wrong. I had initially planned to import the whole darn thing in one swoop, but various technical and life challenges came up before I could get it going. While I was working on those issues, people started importing it themselves (sometimes marking so on the wiki, sometimes not). Now that there are some areas imported, the import of the whole dataset becomes infinitely harder because we have to match existing data with new OSM-ified data.
>
> But how is this any different from importing it into an area where people have already mapped some lakes from aerials?

It isn't any different. I had made the (bad) decision at the time to import over any existing data because in the several hundred places I spot-checked, NHD was vastly superior in resolution (and probably quality).

If we wanted to support imports as a community, a tool like [0] should be the only way of letting imports into OSM.

[0] http://wiki.openstreetmap.org/wiki/OSM_Import_Database#French_Corine_Import_as_Template
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 2:38 PM, Ian Dees wrote: > I think the NHD "import" is a good example of a well-intentioned importer > (me) gone wrong. I had initially planned to import the whole darn thing in > one swoop, but various technical and life challenges came up before I could > get it going. While I was working on those issues, people started importing > it themselves (sometimes marking so on the wiki, sometimes not). Now that > there are some areas imported, the import of the whole dataset becomes > infinitely harder because we have to match existing data with new OSM-ified > data. But how is this any different from importing it into an area where people have already mapped some lakes from aerials? Personally I think the best US example of a bad import is the "environmental hazards". ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 1:27 PM, Nathan Edgars II wrote: > One thing I'm wondering about: how useful is a small piece of a future > larger import? For example, there's the National Hydrography Dataset, > import of which is apparently being coordinated on the wiki. I've > imported individual lakes and swamps from it, as well as all of those > in small areas (such as Disney World). Obviously this is a good thing > if I'm working on the area. But does it help at all for a future > larger import, or is it just more 'noise' like lakes drawn from > aerials? > > I think the NHD "import" is a good example of a well-intentioned importer (me) gone wrong. I had initially planned to import the whole darn thing in one swoop, but various technical and life challenges came up before I could get it going. While I was working on those issues, people started importing it themselves (sometimes marking so on the wiki, sometimes not). Now that there are some areas imported, the import of the whole dataset becomes infinitely harder because we have to match existing data with new OSM-ified data. What I'm trying to say is that once a small part of an import happens, the larger import probably doesn't make sense to do. ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
One thing I'm wondering about: how useful is a small piece of a future larger import? For example, there's the National Hydrography Dataset, import of which is apparently being coordinated on the wiki. I've imported individual lakes and swamps from it, as well as all of those in small areas (such as Disney World). Obviously this is a good thing if I'm working on the area. But does it help at all for a future larger import, or is it just more 'noise' like lakes drawn from aerials? ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] A Friendly Guide to 'Bots and Imports
Hi, Richard Weait wrote: Required reading: http://www.asklater.com/matt/wordpress/2009/09/imports-and-the-community/ http://www.asklater.com/matt/wordpress/2009/09/imports-and-the-community-ii/ I also like "The Pottery Club": http://www.gravitystorm.co.uk/shine/archives/2009/11/10/the-pottery-club/ Bye Frederik -- Frederik Ramm ## eMail frede...@remote.org ## N49°00'09" E008°23'33" ___ Talk-us mailing list Talk-us@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-us
[Talk-us] A Friendly Guide to 'Bots and Imports
On Thu, Aug 5, 2010 at 1:19 PM, Serge Wroclawski wrote:
[ ... ]
> What do people think of a something like "A friendly guide to bots and imports"?

I like it. Let's start.

Required reading:
http://www.asklater.com/matt/wordpress/2009/09/imports-and-the-community/
http://www.asklater.com/matt/wordpress/2009/09/imports-and-the-community-ii/

And previous guidance on 'bots and imports:
http://wiki.openstreetmap.org/wiki/Imports
http://wiki.openstreetmap.org/wiki/Import/Guidelines
http://wiki.openstreetmap.org/wiki/Automated_Edits/Code_of_Conduct

And support, the imports mailing list:
http://lists.openstreetmap.org/listinfo/imports

Motto: "It doesn't take years of experience to mess up an import really badly. But it helps."