On Mon, 2009-05-11 at 11:31 +0100, Jakob Voss wrote:
> A format should be described with a schema (XML Schema, OWL etc.) or at
> least a standard. Mostly this schema already has a namespace or similar
> identifier that can be used for the whole format.

This is unfortunately not the case.

> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML
> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to
> identify MODS.

And this is a perfect example of why this is not the case. The same mods
schema (let alone namespace) defines TWO formats, mods and
modsCollection. To quote from the schema:

------------------------------------------------
***** An instance of this schema is (1) a single MODS record:
-->
<xsd:element name="mods" type="modsType"/>
<!--
or (2) a collection of MODS records:
-->
<xsd:element name="modsCollection">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="mods" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<!--
***** End of "instance" definition
-------------------------------------------------

So you're using the same identifier to identify two different things at
the same time. We discussed this a lot during the development of SRU and
there simply isn't an existing identifier for an XML 'format'.

Also consider the following more hypothetical, but perfectly feasible
situations:

* One namespace is used to define two _totally_ separate sets of
  elements. There's no reason why this can't be done.

* One namespace defines so many elements that it's meaningless to call
  it a format at all. Even though the top level tag might be the same,
  the contents are so varied that you're unable to realistically
  process it.

Rob
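
To make the mods / modsCollection point concrete, here is a minimal
Python sketch. The two sample documents and the helper name
root_identity are made up for illustration; they are not taken from the
MODS schema or from SRU. It only shows that two instances can share the
namespace http://www.loc.gov/mods/v3 and still have different top-level
elements, so the namespace alone cannot tell them apart.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical minimal instances, written for this example only.
SINGLE = f'<mods xmlns="{MODS_NS}"><titleInfo><title>A</title></titleInfo></mods>'
COLLECTION = f'<modsCollection xmlns="{MODS_NS}"><mods/><mods/></modsCollection>'


def root_identity(xml_text):
    """Return (namespace, local name) of the document's root element."""
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        # ElementTree spells namespaced tags as "{namespace}localname".
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


single = root_identity(SINGLE)
collection = root_identity(COLLECTION)

# Both documents live in the same namespace ...
print(single[0] == collection[0])  # True
# ... but they are not the same kind of document.
print(single == collection)        # False
```

Pairing the namespace with the root element name separates these two
cases, but it would not help with the second hypothetical above, where
the top-level tag is the same and the contents vary too widely.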