I started mapping my local mountain village in Norway a month ago. I find the fundamental OSM data model very simple and elegant: the three basic elements (node, way/area, relation), with properties as key-value pairs. But I don’t like that free-form tagging has been elevated to a religion.

As a mapper, I want a much more structured, well-defined tagging scheme.

- When I use a key, I want to know precisely what type of value is expected.
- When I have entered a tag, I want to see in the editor immediately whether the tagging is valid according to an approved rule, a proposed rule, or no rule at all.
- When I spend time mapping, I want to know that the data I enter is useful and can be used for rendering, for route planning, and for future interactive maps. Clearly, tags must follow a structure if that is going to happen.
- I want the editor tools to support and enforce the tagging structure.
- I want the tagging dialogs in editor tools to be easy to localise to different languages, attracting non-English-speaking mappers. I think more mappers will increase the probability of success for OSM, and hence increase my own motivation.
- I DO NOT WANT TO SEARCH TALK-MAIL ARCHIVES TO LEARN HOW TO TAG!

I think the tagging schema should be formally described. It should be a pragmatic mix of strictly encoded values and free-text values. It must of course be based on the existing (but loosely defined) tagging structure. The tagging structure must be represented as tables in the OSM database, along with an XML API.

Of course, such a scheme will not do away with the problem of classifying real-world things. There will always be cases where it is difficult to classify what you see as a track or as a road, as a service road or as an unclassified road, and so on. And making the Grand Unified Hierarchy will always fail in some places. But once I have selected an option, I want to know that it has a well-defined meaning in the OSM system.

Some ideas off the top of my head:

Data types:

Every key should be assigned a type (or class). Could be:

- boolean
- enumeration
- numeric
- string
- free text

Boolean is yes/no. Editor tools should present a boolean key as a checkbox.

Many of the existing tags fall into enumeration.
"highway" is an enumeration. An enumeration may often include an "unclassified" value (e.g. building=yes). Editor tools should present an enumeration as a listbox of some sort. The defined values should be stored in the OSM database. It should be possible to enter new values, but then the system should prefix the value with "proposed:". If proposed values are approved later, an administrator can remove the "proposed:" prefix globally. Idea for editor tools: approved values are marked green, previously proposed values yellow, and any other value red.

Numeric is a decimal number. Editor tools should enforce digits only. Subclasses may be useful: numeric:meters, numeric:kilometres, numeric:currency. Currency obviously needs special handling: a toll fee should always specify which currency it refers to.

String is typically something like a name, an address, or a house number. Editor tools should present a single-line text input. Telephone numbers, URIs, and Wikipedia references may be modelled as strings, as separate classes, or as subclasses of string; to be discussed. Language versions may sometimes exist.

Free text is an end-user description, a mapper note, etc.: more or less complete sentences. Language versions may often exist. Editor tools should present a multiline, scrollable text input.

Comment, BTW: tag typing can make tag-use statistics more to the point. Statistics on the most used enumeration values are useful. Most used phone numbers/street names are less useful...

I think a multilevel key scheme should be formalised. ':' seems to be a de facto standard. Special-purpose mapping should use a sub-key-space. Hiking, climbing, ornithology, agriculture, archaeology and so on are examples of special interests which should be assigned their own top-level key, to be defined by the respective special interest groups.

Some generic sub-level keys should be predefined for every key, like "note", "description", "source", "fixme". For example, key "ele" may have "ele:source"="GPS".
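
To make the typing ideas above a bit more concrete, here is a minimal Python sketch of how an editor might classify a tag as approved (green), proposed (yellow) or unknown (red). It assumes a small in-memory registry in place of the database tables, and ':' as the sub-key separator. All names here (KeyType, Status, KeyDef, REGISTRY, check_tag, propose) and the sample values such as "goat_path" are invented for illustration and are not part of any existing OSM tool or API.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class KeyType(Enum):
    BOOLEAN = "boolean"
    ENUMERATION = "enumeration"
    NUMERIC = "numeric"
    STRING = "string"
    FREE_TEXT = "free text"


class Status(Enum):
    APPROVED = "green"
    PROPOSED = "yellow"
    UNKNOWN = "red"


# Generic sub-level keys predefined for every key, as listed in the post.
GENERIC_SUBKEYS = {"note", "description", "source", "fixme"}
PROPOSED_PREFIX = "proposed:"


@dataclass
class KeyDef:
    name: str
    key_type: KeyType
    approved: set = field(default_factory=set)  # approved enumeration values
    proposed: set = field(default_factory=set)  # stored "proposed:..." values


# Hypothetical registry; sample keys and values are assumptions.
REGISTRY = {
    "highway": KeyDef("highway", KeyType.ENUMERATION,
                      approved={"cycleway", "track", "service", "unclassified"},
                      proposed={"proposed:goat_path"}),
    "ele": KeyDef("ele", KeyType.NUMERIC),
    "oneway": KeyDef("oneway", KeyType.BOOLEAN),
    "name": KeyDef("name", KeyType.STRING),
}


def check_tag(key: str, value: str) -> Status:
    """Classify one key/value pair against the registry."""
    base, _, sub = key.partition(":")
    definition = REGISTRY.get(base)
    if definition is None:
        return Status.UNKNOWN
    if sub:
        # Only the predefined generic sub-keys are recognised in this sketch.
        return Status.APPROVED if sub in GENERIC_SUBKEYS else Status.UNKNOWN
    if definition.key_type is KeyType.BOOLEAN:
        return Status.APPROVED if value in {"yes", "no"} else Status.UNKNOWN
    if definition.key_type is KeyType.NUMERIC:
        ok = re.fullmatch(r"-?[0-9]+(\.[0-9]+)?", value) is not None
        return Status.APPROVED if ok else Status.UNKNOWN
    if definition.key_type is KeyType.ENUMERATION:
        if value in definition.approved:
            return Status.APPROVED
        if value in definition.proposed:
            return Status.PROPOSED
        return Status.UNKNOWN
    # String and free-text keys accept any value in this sketch.
    return Status.APPROVED


def propose(key: str, value: str) -> str:
    """Store a new enumeration value with the 'proposed:' prefix."""
    stored = PROPOSED_PREFIX + value
    REGISTRY[key].proposed.add(stored)
    return stored
```

With this registry, check_tag("highway", "cycleway") gives Status.APPROVED, check_tag("highway", "proposed:goat_path") gives Status.PROPOSED, and check_tag("ele", "high") gives Status.UNKNOWN. Whether unknown values should be rejected outright or only flagged is a policy question the sketch leaves open.
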
Another example of a generic sub-level key: "highway"="cycleway" may have "highway:description"="Sign says so, but lots of sharp bends and rough edges" (which can be said about 98.5% of Norwegian cycleways). Editor tools should allow entering informal information in addition to every formal key-value pair, but in a structured way.

Database:

In the OSM database/API there should be a table of keys:

- Key name
- Key type/class
- Short description in English (authoritative)
- Optionally a png/svg of the rendering

A language translation table for keys:

- Key name
- Language ID (ISO 639)
- Localised description

A table of literal values:

- Key name
- Literal value
- Short description in English (authoritative)
- Optionally a png/svg of the rendering

A language translation table for literal values:

- Key name
- Literal value
- Language ID (ISO 639)
- Localised name for value
- Localised description

Organisation:

OSM is a community of volunteers, so neither bureaucracy nor dictatorship is probably the way to go. I would guess that forking off a "tagging" mail group with a strict "keep-to-topic" policy would be the way to proceed. It could deal with tagging schema/policy in general, as well as core tagging, and with assigning top-level keys to other sub-level tagging groups.

Well, it is time to get some sleep before work calls tomorrow. I am not going to implement any of this. I just hope these ideas can spawn some productive debate.

Best regards
Egil Hjelmeland

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk