Stefan Keller wrote:
> You are right that XML names (= keys/tags) are valid in unicode
> in which case the encoding of the whole XML document (exchange file)
> must support this.
>
> But you know well that many tools have problems with non-ASCII XML
> element and attribute names (for content/value UTF-8 is ok since
> chars can be escaped)!
>
> So, my last 20 cents for valid key names before I give up is the following:
> 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ_-.0123456789'
> whereas such qualified names must begin with a letter and contain at
> most one colon and have at most a length of 255.

Stefan, if I am coming across in this message as a bit harsh, then you're not mistaken: I am a grumpy old man, and damn proud of it. Just remember, it's not personal. I try to go after the ball, not the man. (No, FakeSteveC, it doesn't mean I try to go after a guy's ball(s) in THAT way.)

Three times you have posted that you want to limit the characters used in tag naming, revising your proposal first to include the colon, and now to include numbers. At each previous attempt you have been told that UTF-8 is valid, for good reasons, and yet still you persist.

You have not once given a valid TECHNICAL reason, WITHIN THE SCOPE OF OSM, for limiting the characters allowable in tag names.

As far as I can see from your first message on this subject, your idea stems from converting OSM data from its XML format to GML. Your project might need GML; OSM doesn't. If you are in need of GML-compliant output, then it is your task to massage the OSM-provided data into GML-compliant output. It is not the task of OSM to keep its data in a GML-compliant format, since the XML format with UTF-8 allowed just plain works for OSM.

The tools that you say have problems with non-ASCII characters should be fixed so that they can handle UTF-8 characters. Not the other way around, by changing the dataset to comply with the requirements of the tools.

You might think it's a hen and egg situation, although in this case the egg definitely is the important part, and has priority. The egg (the data) in this case has attributes that can contain non-ASCII characters, thereby allowing non-Latin-based nationalities to define their own tags in their own language. This is a GOOD thing, which should NOT be changed. The hen (tools and programs utilizing OSM data) must take this into account.
If a tool can't do that, then the farmer (the user of that tool) has to either change that tool, or use the egg to prepare a dish that the tool can digest (massage the OSM data into a format the tool can use). The farmer should not try to persuade the egg that it is better off as a watermelon.

So to recap: the character set currently allowed in OSM tag names is UTF-8. Deal with it, instead of trying to impose limitations on OSM to make OSM data comply with YOUR requirements.

Dutch
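
For illustration only, the "massage the OSM data" route could look roughly like the sketch below, done on the consumer side so that the OSM data itself stays untouched. It rewrites an arbitrary UTF-8 tag key into a name built only from ASCII letters, digits, '-', '.' and '_', and can reverse the mapping. The function names, the `osm.` prefix and the `_<hex>_` escape scheme are assumptions made up for this example; they are not part of OSM, GML or any existing tool, and the 255-character limit from the quoted proposal is not enforced here.

```python
import re
import string

# Characters passed through unchanged. '_' and ':' are deliberately NOT in
# this set: '_' is reserved as the escape delimiter, and ':' is escaped so
# that keys with several colons cannot clash with XML namespace syntax.
SAFE = set(string.ascii_letters + string.digits + "-.")

# Hypothetical fixed prefix; it guarantees the result starts with a letter.
PREFIX = "osm."


def encode_key(key: str) -> str:
    """Map any OSM tag key to an ASCII-only name (assumed scheme)."""
    parts = [ch if ch in SAFE else f"_{ord(ch):x}_" for ch in key]
    return PREFIX + "".join(parts)


def decode_key(name: str) -> str:
    """Reverse encode_key, restoring the original UTF-8 key."""
    if not name.startswith(PREFIX):
        raise ValueError("not an encoded key: %r" % name)
    body = name[len(PREFIX):]
    # Each escape is '_' + lowercase hex code point + '_'.
    return re.sub(r"_([0-9a-f]+)_", lambda m: chr(int(m.group(1), 16)), body)
```

With this scheme, `encode_key("name:ru")` gives `osm.name_3a_ru`, and `decode_key` turns it back into `name:ru`, so a tool that only accepts ASCII names can still round-trip keys written in any script.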

