Stefan Keller wrote:
>Sent: 12 February 2008 1:27 PM
>To: Dave Stubbs
>Cc: [email protected]
>Subject: Re: [OSM-dev] Restrict key names on order to retain reusability of
>OSM
>
>Thanks for the pointer to XML, Dave.
>
>UTF-8 is a good choice for content, but this is about *keys* (i.e.
>attributes).
>Keys correspond to XML elements which are defined as names [1](!)
>... which nicely fits the definition I proposed.
>
The only reason that you have found a large chunk of the keys fitting your limited character set is that most are based on Map Features, which was written in English. Anyone could create a set of keys in any other format they wish, for instance one using Chinese.

Cheers

Andy

>And to get the discussion a little more specific I made some statistics
>with some recent OSM data from a European area of about 75MB:
>From about 100'000 key-value pairs there are about 8000 distinct pairs
>and I found about 8 outliers, listed below. This is at least what came
>out perhaps of OSM REST API 0.5 (or Osmosis)?
>
>So, the benefit of valid attribute names costs almost nothing to clean,
>almost nothing to prevent (e.g. in editors) but lets us write nice
>applications - and I mean a lot more than those you mentioned above...
>
>Stefan
>
>[1] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-common-syn
>
>Outliers found in recent OSM data:
>'Node/Linear/Area '='Route sans Nom'
>'Tunnel '='yes'
>'opm:capacity'='2'
>'wdb:source'='CIA World database II - europe-bdy.txt - segment 100'
>'whc:criteria'='(ii)(iv)'
>'whc:id'='268'
>'whc:inscription_date'='1983'
>'¨name'='Südstrasse'
>
>
>2008/2/12, Dave Stubbs <[EMAIL PROTECTED]>:
>> 2008/2/12 Stefan Keller <[EMAIL PROTECTED]>:
>> > GML/XML is *not* the issue, you know that:
>> > It's almost any application outside OSM database.
>> > It's about reusability and consistency!
>> >
>> > I love the approach of key-value pairs (and I like beers too... ;->).
>> > I agree with Martijn that before all, spaces must be kept out.
>> > I agree too with Frederik: Colons can be included as namespace delimiters.
>> > Namespace, tags and keys remind us that OSM is a database and
>> > *not* a Wiki on an island (whereas I'm loving Wikis used as they are)!
>> >
>> > So I'm sorry, guys, but I have to insist:
>> > I propose distinctly to restrict key names (element, tag) to the set
>> > 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ_', now
>> > plus colon as namespace delimiter, allowed once and not at the beginning or
>> > the end.
>>
>> Even XML allows significantly more than that -- pretty much anything
>> but whitespace [1], with a ":" as namespace delimiter.
>> So insist all you like, but personally I think making people handle
>> UTF-8 nicely is probably a good thing given the number of values that
>> will rely on it heavily anyway. Most reasonable programming
>> environments have decent unicode support these days, and certainly
>> every XML parser that isn't a hack.
>>
>> Dave
>>
>> [1] http://www.w3.org/TR/2006/REC-xml-20060816/#charsets

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
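
For reference, the restriction proposed in the quoted message (ASCII letters and underscore, plus at most one colon that is neither first nor last) could be checked roughly as follows. This is only a sketch of one reading of that proposal: the names `PROPOSED_KEY` and `is_proposed_key` are made up for illustration and do not come from any OSM tool, and the sample keys are the outliers listed in the thread.

```python
import re

# Hypothetical pattern for the proposed rule: one or more ASCII letters or
# underscores, optionally followed by a single colon and another such run.
# A leading colon, a trailing colon, or a second colon cannot match.
PROPOSED_KEY = re.compile(r"[A-Za-z_]+(?::[A-Za-z_]+)?")


def is_proposed_key(key: str) -> bool:
    """Return True if the whole key fits the proposed character set."""
    return PROPOSED_KEY.fullmatch(key) is not None


# Keys of the outliers quoted in the thread (trailing spaces are intentional).
outlier_keys = [
    "Node/Linear/Area ",
    "Tunnel ",
    "opm:capacity",
    "wdb:source",
    "whc:criteria",
    "whc:id",
    "whc:inscription_date",
    "¨name",
]

for key in outlier_keys:
    status = "accepted" if is_proposed_key(key) else "rejected"
    print(f"{key!r}: {status}")
```

Once the colon is allowed as a namespace delimiter, only the keys containing a slash, a trailing space or the stray diaeresis are rejected. The same check would also reject any key written in a non-Latin script, which is the objection raised in the reply above.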

