Hi, I have written a Python script to complete missing tags for Japanese train stations (like the missing romaji names, in the Latin alphabet) from Wikipedia. I would like some confirmation that it is safe to import this information.
At the beginning, I did everything manually:
- I downloaded a region in JOSM
- I found the stations with missing attributes
- most of the time the Japanese name was present, so I used it to find the Japanese Wikipedia page (by just appending 駅 / "station" to the name)
- then I completed the name fields (romaji, English, French, kana, name) as needed. I also looked at the English page, because the romaji name was sometimes better there (long vowels written with a macron)

This was quite slow, so I automated the task:
- I still have to download the stations in JOSM (using the relations saves a lot of time)
- I export that to an XML file
- I find the nodes that are stations
- if a romaji name is present (romaji, English, French), I use it to complete the others
- I download the Japanese and English Wikipedia pages and extract the info
- I complete the missing attributes
- I write the modified nodes into another XML file
- I open it in JOSM
- I double-check manually that it looks fine and commit

Note that it does not handle disambiguation pages (when two stations share the same name). And it is not really clean, as it works by extracting data from HTML.

I would like to know what you think about this, concerning the legality of the import (I hope the station information was not put into Wikipedia illegally), and also whether you have any ideas to improve the process (for example, to remove the need to download the nodes manually in JOSM).

Cheers,
Fabien

PS: my Japanese is pretty poor :-)

_______________________________________________
Imports mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/imports
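For reference, the "find the station nodes and build the Wikipedia title" steps above could be sketched roughly like this (this is my own minimal sketch, not the actual script: the sample XML, the tag keys checked, and the helper names are assumptions; the real script also fetches and parses the Wikipedia HTML, which is omitted here):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of JOSM-exported OSM XML with one station node.
OSM_XML = """<osm version="0.6">
  <node id="1" lat="35.68" lon="139.77">
    <tag k="railway" v="station"/>
    <tag k="name" v="東京駅"/>
  </node>
</osm>"""

# Keys that may already hold a Latin-alphabet name (assumed set).
ROMAJI_KEYS = ("name:ja_rm", "name:en", "name:fr")

def find_incomplete_stations(xml_text):
    """Return (node, tags) pairs for station nodes missing any romaji name."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter("node"):
        tags = {t.get("k"): t.get("v") for t in node.iter("tag")}
        if tags.get("railway") != "station":
            continue
        if not any(k in tags for k in ROMAJI_KEYS):
            results.append((node, tags))
    return results

def wikipedia_title(tags):
    """Guess the Japanese Wikipedia title: the name, with 駅 appended if absent."""
    name = tags.get("name", "")
    return name if name.endswith("駅") else name + "駅"

for node, tags in find_incomplete_stations(OSM_XML):
    print(node.get("id"), wikipedia_title(tags))
```

A node found this way would then have the page downloaded, the missing name tags filled in, and the node written back out to a second XML file for review in JOSM, as described above.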
