Sam Vekemans wrote:
> Im not making a new script, the folks who understand python, are
> making a script that can solve the 'duplicate intersecting nodes' and
> 'inner/outer relation' problem.
>
> (Yes you might have fixed it, but "i've reached the crux of my
> technical ability")
>
> Sure, they can use java, if that works for them.
>
> Im just isolating my canvec-to-osm script to handle the 80 map
> features that i know converts correctly. The other 10 can be dealt
> with using another option.
>
> Since i dont know how to program (accept in basic DOS) im not that much help.
>
> Others who are experienced in python & java are welcome to take the lead.
>
> Frank already started to help out maptastically :)
>
> Sam
>
> On 11/6/09, Ian Dees <ian.d...@gmail.com> wrote:
>
>> On Nov 6, 2009, at 9:28 PM, Sam Vekemans
>> <acrosscanadatra...@gmail.com> wrote:
>>
>>
>>> This python script is yet to be made 'canvec2osm.py' its open to
>>> anyone to make, and i recommend that who ever does, its ONLY 1 person
>>> who is in charge of maintaining the script.
>>>
>> Why are you creating another shp to osm converter? Is there something
>> the existing tools don't do? I thought you were using shp-to-osm? What
>> changed?
>>
Hi Sam,

The idea I mentioned is to convert your batch files to Python. The main reason is that people using Linux or other OSs can then also perform this conversion locally. I know you want to convert all of Canada yourself, but that is still a lot of work. Many hands make light work. The only thing we should be concerned about is that we're all using the same rule file. This is the most vital part in ensuring consistency across the country.

To Ian: shp-to-osm won't disappear in the process. Sam's batch files are still calling it. It would be pointless to create a second version. There is already a Python script with the same name, but it was written specifically for the MassGIS import.

Sam, can you clarify what your intentions are? I'm afraid I'm not following them well. When you mention removing duplicate nodes and relations, it looks as if you intend to create a script that does some post-processing. Is that correct? I haven't started anything in that area. (I actually still need to start on the Python version of your batch script, but I'm going to work on that today.)

While we're on this topic: in shp-to-osm (Java), tags are now put properly on the multipolygon relationship. They also still appear on the inner polygons (mentioned to Ian already), but that should be fixed.

Since shp-to-osm is called for one feature type at a time, there are some new challenges when multiple feature types are involved. I guess you've been thinking about that already. Duplicate nodes will become an issue when you have, for example, a residential area with an adjacent wooded area (assuming that the boundaries match exactly). It will be difficult to deal with this. I'm not sure whether it would be technically possible to adjust shp-to-osm for that, but the result would be that the files become huge.
The output files already have to be split up for certain feature types, and I don't think it is possible to use the same set of IDs over multiple output files.

From what I understand about the upload process (and someone please correct me if this isn't right), the OSM server will return new ID numbers for any nodes, ways, and relationships uploaded. In the OSM files generated by Ian, and also when you're editing in JOSM yourself, temporary IDs are assigned. They have a negative value, which indicates that these objects don't exist on the server yet. This means that, after you have uploaded file00.osm and you open file01.osm, neither JOSM nor the server remembers which objects any IDs are _referring_ to, if those objects are not _defined_ in the same file.

The same issue arises with multipolygon relationships where some of the ways are reused. This can only work if everything is defined in the same file. And such a file will be way too large to upload safely to the server. Recently I noticed that if you try to create, update, or delete about 10k objects in one upload, the server starts to "act difficult".

Regarding relationships and reuse of the geometry: I think we not only have to remove duplicate nodes, but also split up ways; otherwise the JOSM validator will complain about overlapping ways. A way can be used in multiple relationships.

A third thing that might need to be resolved is map features that cross the boundaries of the NTS tiles. Do we want to merge them? If these features have the same Geobase metadata (ID, etc.), then it shouldn't be a big problem; otherwise we need to decide whether we prefer to keep the metadata or to have merged features.

Does all of this mean we can't do anything to clean up the data? Sure we can, but only after an initial upload to the server. That way we can still apply any logic to deal with duplicate nodes, reuse of features in multiple relationships, and merging features.
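
To make the duplicate-node part a bit more concrete, here is a minimal Python sketch of what such a cleanup step could look like. It is only an illustration under stated assumptions: the function name merge_duplicate_nodes, the plain-dict data layout, and the rounding precision are hypothetical and not taken from shp-to-osm or JOSM. It also ignores tags and relations, and it does not split ways. The example data is the residential area next to a wooded area mentioned earlier, with two corner nodes at identical coordinates.

```python
from collections import defaultdict


def merge_duplicate_nodes(nodes, ways, precision=7):
    """Collapse nodes that share the same (rounded) coordinates.

    nodes: dict mapping node id -> (lat, lon)
    ways:  dict mapping way id -> list of node ids
    Returns (merged_nodes, rewritten_ways, replaced), where `replaced`
    maps every dropped node id to the node id that was kept.
    """
    # Group node ids by position; rounding is an assumption so that
    # coordinates differing only by float noise still match.
    by_position = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)

    # Keep the first node seen at each position, drop the others.
    replaced = {}
    for ids in by_position.values():
        keep = ids[0]
        for duplicate in ids[1:]:
            replaced[duplicate] = keep

    merged_nodes = {nid: pos for nid, pos in nodes.items()
                    if nid not in replaced}

    # Point every way at the surviving nodes.
    rewritten_ways = {}
    for way_id, refs in ways.items():
        new_refs = []
        for ref in refs:
            ref = replaced.get(ref, ref)
            # Skip consecutive repeats that the merge may have created.
            if not new_refs or new_refs[-1] != ref:
                new_refs.append(ref)
        rewritten_ways[way_id] = new_refs

    return merged_nodes, rewritten_ways, replaced


# Temporary (negative) ids, as in a freshly generated .osm file.
nodes = {
    -1: (45.000, -75.000), -2: (45.000, -75.001),   # residential area
    -3: (45.001, -75.001), -4: (45.001, -75.000),
    -5: (45.000, -75.001), -6: (45.000, -75.002),   # wooded area
    -7: (45.001, -75.002), -8: (45.001, -75.001),
}
ways = {
    -101: [-1, -2, -3, -4, -1],
    -102: [-5, -6, -7, -8, -5],
}

merged_nodes, rewritten_ways, replaced = merge_duplicate_nodes(nodes, ways)
print(replaced)  # {-5: -2, -8: -3}
```

After the merge, both closed ways refer to the same two nodes along their common edge. Note that this only works because all the objects are available in one place, which is exactly why it cannot be done across separately generated files.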
The script will have to work live on the server: download an area, do the cleanup, and upload. In that case I think it would be safest (and required!) that the script only does the download and the cleanup, and that a human verifies the result before upload. If we implement such a cleanup, it needs to be executed as soon as possible after the upload, because users are sometimes very quick to make changes to freshly uploaded data.

Whew, another long one. I hope you don't mind :) Any thoughts about this essay? Keep in mind this is just my opinion, and by no means the thing we should actually do. Many of you know the Canvec data better than I do, so you'll also know better if this approach makes sense.

Cheers,
Frank

_______________________________________________
Imports mailing list
Imports@openstreetmap.org
http://lists.openstreetmap.org/listinfo/imports