Some critics of an XML-based go format seem to be involved in a paranoid fantasy that they are going to be forced by evil goblins to use it against their will. No, Jennifer, that's not the case. Sure, if the format becomes popular they may end up having to deal with it, but XML-formatted game records and SGF will be round-trippable in the blink of an eye.
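
To make the round-tripping claim a little more concrete, here is a minimal Python sketch. It assumes a hypothetical XML layout (a game element holding move elements with color and at attributes) that is not taken from any proposed format, and it only handles a single linear sequence of B/W moves: no variations, no passes, no other SGF properties.

```python
import re
import xml.etree.ElementTree as ET

# Matches one SGF move node such as ";B[pd]": a colour, then a two-letter point.
MOVE = re.compile(r";\s*([BW])\[([a-s]{2})\]")


def sgf_to_xml(sgf: str) -> str:
    """Convert a linear SGF move sequence to a hypothetical <game> element."""
    game = ET.Element("game")
    for color, point in MOVE.findall(sgf):
        # Element and attribute names are invented for this sketch.
        ET.SubElement(game, "move", color=color, at=point)
    return ET.tostring(game, encoding="unicode")


def xml_to_sgf(xml_text: str) -> str:
    """Convert the hypothetical <game> element back to a linear SGF string."""
    game = ET.fromstring(xml_text)
    nodes = "".join(f";{m.get('color')}[{m.get('at')}]" for m in game.iter("move"))
    return f"({nodes})"


sgf = "(;B[pd];W[dp];B[pp];W[dd])"
xml_text = sgf_to_xml(sgf)
print(xml_text)
print(xml_to_sgf(xml_text) == sgf)  # True: the move sequence survives the round trip
```

A real converter would have to cover the whole SGF property set and the game tree, but the basic mapping is mechanical.
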
Many of those complaining about XML don't seem to really know too much about it. Griping that it's too big is roughly equivalent to wanting to go back to six-bit character encodings. Carping that it's not readable misses the point. Those who wonder if their favorite platform/language might not support XML haven't checked recently.

The point is that XML offers an incredibly rich environment of transformability, extensibility, and interoperability. Go programmers are mostly interested in move sequence data, and that's natural, but we should remember that there are lots of other pieces to the overall puzzle, including commentary, organization of problem sets, and how diagrams are handled.

Let's take just a few examples.

Say we want to store metadata in or alongside SGF files, and/or retrieve/search/index the metadata already in them, such as the name of the source. If the metadata is in XML, probably in a well-established format such as Dublin Core or an extension thereof, it can be discovered and processed by any search engine or, in the not too distant future, reasoning engine, to answer queries such as "find all games played by Shuko in 1971".

In addition, XML has a built-in mechanism for extending vocabularies (namespaces). This allows information specific to a particular application to be included in a document, with well-understood characteristics that allow other applications to ignore the extra stuff.

DocBook is an example of an XML-based document format for articles and books, technical and otherwise. For instance, O'Reilly uses a variant of DocBook for all its publishing. Using XML to represent go information would make it much easier to integrate with document formats such as DocBook.

As someone pointed out, using XML would lay to rest once and for all any questions about character encodings.
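
As a rough illustration of the metadata and namespace points above, the Python sketch below runs the "games played by Shuko in 1971" query over a small made-up collection. The Dublin Core namespace URI is the real one; the go: and app: namespaces, every element name under them, and the sample records themselves are invented for this example. The app:windowLayout element stands in for application-specific data, which the query never looks at.

```python
import xml.etree.ElementTree as ET

# "dc" is the real Dublin Core element namespace. The "go" namespace (and the
# "app" one used in the sample data) are hypothetical, invented for this sketch.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "go": "http://example.org/ns/go",
}

# Made-up sample records; titles, dates, and opponents are placeholders.
COLLECTION = """\
<go:collection xmlns:go="http://example.org/ns/go"
               xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:app="http://example.org/ns/some-editor">
  <go:game>
    <dc:title>Sample game A</dc:title>
    <dc:date>1971-03-10</dc:date>
    <go:player color="B">Shuko</go:player>
    <go:player color="W">Player X</go:player>
    <app:windowLayout>wide</app:windowLayout>
  </go:game>
  <go:game>
    <dc:title>Sample game B</dc:title>
    <dc:date>1972-06-01</dc:date>
    <go:player color="B">Player Y</go:player>
    <go:player color="W">Shuko</go:player>
  </go:game>
  <go:game>
    <dc:title>Sample game C</dc:title>
    <dc:date>1971-11-20</dc:date>
    <go:player color="B">Player X</go:player>
    <go:player color="W">Player Y</go:player>
  </go:game>
</go:collection>
"""


def games_by_player_in_year(xml_text, player, year):
    """Return the dc:title of every game with the given player and year."""
    root = ET.fromstring(xml_text)
    titles = []
    for game in root.findall("go:game", NS):
        players = [p.text for p in game.findall("go:player", NS)]
        date = game.findtext("dc:date", default="", namespaces=NS)
        # Elements from unknown namespaces (app:...) are simply never matched.
        if player in players and date.startswith(str(year)):
            titles.append(game.findtext("dc:title", default="", namespaces=NS))
    return titles


print(games_by_player_in_year(COLLECTION, "Shuko", 1971))  # ['Sample game A']
```

The same query keeps working if another application adds its own namespaced elements to the records, which is the point of the namespace mechanism.
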
XML also provides the built-in xml:lang mechanism to represent parallel textual information in different languages, very useful for Oriental players' names, to give just one example.

Many people think of XML documents only as text files, but in fact they can take any form, including being stored in databases which are optimized for performance in executing e.g. XQuery queries. How're ya gonna do that with SGF?

Many go applications are going to require additional types of information, such as the threaded commentaries mentioned by Bill S. Certainly, compared to the option of forking SGF into a dozen proprietary formats which don't interoperate, or stuffing random s**t into the C[] field, would it not make sense to take the opportunity to upgrade to a single yet extensible model? What say ye, architects of the computing universe?

The XML formats that have been proposed thus far for go, unfortunately, lack imagination; they are little more than SGF with a thin XML veneer. In particular, they have the problem that they hang information such as commentary and diagrams off the game tree. What we need is a new format defined ground-up from an XML perspective.

Realistically, putting up ivory-tower proposals is not going to be successful. What is needed is real-world proposals on which real-world applications are built, so that people can see the real-world benefit.

Bob Myers

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stuart A. Yeates
Sent: Thursday, October 25, 2007 1:04 AM
To: computer-go
Subject: Re: [computer-go] XML alternatives to SGF

I sat down and read the DTD and the documentation and have some direct feedback on it. I'm aware that the DTD is quite old, and some of the ideas and solutions I'm going to suggest might not have been available (or as popular) when the DTD was written.

Lines starting with <! are quotes from the DTD.
<!-- P is the Paragraph of HTML -->
<!ELEMENT P (#PCDATA)>

Referencing HTML in this way doesn't allow validation. Defining the standard using schemas allows importing of concepts such as "Paragraph of HTML" directly from an appropriate HTML standard.

<!-- Each Go file consists of one or several GoGame -->
<!ELEMENT Go (GoGame*)>

I believe it is a mistake not to have a protocol version number here.

<!ELEMENT Application (#PCDATA)>
<!ATTLIST Application format CDATA #IMPLIED>

It seems unfortunate that there is no explicit version number here and no URL link to the application website.

<!ELEMENT Date (#PCDATA)>
<!ATTLIST Date format CDATA #IMPLIED>

It would be great to define this in terms of a standard format (i.e. the ISO 8601 date format), since more than once I've had to infer the formatting of a date in an SGF file.

<!ELEMENT User (#PCDATA)>

The User tag is ambiguous: is this a person's name? A user name? A user name on what server?

<!ELEMENT Copyright (P+)>

It would be great to use a URL here to the licence under which the file is being distributed, for example, the Creative Commons licences used on a lot of web content these days.

<!ELEMENT Rules (#PCDATA)>
<!ATTLIST Rules format CDATA #IMPLIED>

Using a URL to a ruleset here would be great. Even better would be a machine-interpretable ruleset, but I'm not counting on that anytime soon.

<!ELEMENT Black ((at)*)>

Using schemas allows the content of tags to be restricted. See also discussion in the docs.

<!-- This is to take care of SGF tags, which are not translated -->
<!ELEMENT SGF (Arg*)>
<!ATTLIST SGF type CDATA #REQUIRED>
<!ELEMENT Arg (#PCDATA)>

This introduces ambiguity into the file format, since it is unclear what the precedence is. If the XML says one thing and the embedded SGF tags say another, which has precedence?

cheers
stuart

On 10/22/07, Jason House <[EMAIL PROTECTED]> wrote:
> An XML alternative [1] to SGF has recently come to my attention. What do
> others think of this alternative?
> Personally, the effect of a tag affecting
> the previous tag seems kind of strange to me.
>
> PS: I found out about this from [2], a recently closed GoGui feature request
> to write more sane sgf files that contain the standard algebraic notation
> used in all GUIs.
>
> [1] http://www.rene-grothmann.de/jago/Documentation/xml.html
> [2] https://sourceforge.net/tracker/?func=detail&atid=489967&aid=1752711&group_id=59117
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/