Bob,

I'm sorry if you read my message as saying that I'm not in favour of an XML file format for go. I'm actually very much in favour of such a thing, which is why I spent two hours getting to understand the current contender and pointing out some of the issues that need to be fixed.

cheers
stuart

On 10/27/07, Bob Myers <[EMAIL PROTECTED]> wrote:
> Some critics of an XML-based go format seem to be involved in a paranoid
> fantasy that they are going to be forced by evil goblins to use it against
> their will. No, Jennifer, that's not the case. Sure, if the format becomes
> popular they may end up having to deal with it, but XML-formatted game
> records and SGF will be round-trippable in the blink of an eye.
>
> Many of those complaining about XML don't seem to really know too much about
> it. Griping that it's too big is roughly equivalent to wanting to go back to
> six-bit character encodings. Carping that it's not readable misses the
> point. Those who wonder if their favorite platform/language might not
> support XML haven't checked recently. The point is that XML offers an
> incredibly rich environment of transformability and extensibility and
> interoperability.
>
> Go programmers are mostly interested in move sequence data, and that's
> natural, but we should remember that there are lots of other pieces to the
> overall puzzle, including commentary, organization of problem sets, and how
> diagrams are handled.
>
> Let's take just a few examples. Say we want to store metadata in or
> alongside SGF files, and/or retrieve/search/index the metadata already in
> them, such as the name of the source. If the metadata is in XML, probably in
> a well-established format such as Dublin Core or an extension thereof, it
> can be discovered and processed by any search engine or, in the not too
> distant future, reasoning engine, to answer queries such as "find all games
> played by Shuko in 1971".
>
> In addition, XML has a built-in mechanism for extending vocabularies
> (namespaces). This allows information specific to a particular application
> to be included in a document, with well-understood characteristics that
> allow other applications to ignore the extra stuff.
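
The namespace and metadata points above can be illustrated with a minimal Python sketch. It is not taken from any proposed go format: the element names (game, move, app:clock) and the two example.org namespace URIs are made up for illustration, and only the Dublin Core namespace URI is a real one. The reader picks out the elements it understands and skips everything else.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch.
GO_NS = "http://example.org/go-record"
APP_NS = "http://example.org/my-go-app"
# Real namespace URI of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A made-up game record mixing three vocabularies.
RECORD = f"""
<game xmlns="{GO_NS}" xmlns:dc="{DC_NS}" xmlns:app="{APP_NS}">
  <dc:source>Example game collection</dc:source>
  <dc:date>1971</dc:date>
  <move colour="black" at="Q16"/>
  <app:clock seconds-left="1200"/>
  <move colour="white" at="D4"/>
</game>
""".strip()


def read_moves(xml_text):
    """Return (colour, point) pairs; elements in other namespaces are ignored."""
    root = ET.fromstring(xml_text)
    return [(m.get("colour"), m.get("at")) for m in root.iter(f"{{{GO_NS}}}move")]


def read_metadata(xml_text):
    """Return the Dublin Core children of the root as a name -> text dict."""
    root = ET.fromstring(xml_text)
    prefix = f"{{{DC_NS}}}"
    return {el.tag[len(prefix):]: el.text for el in root if el.tag.startswith(prefix)}


if __name__ == "__main__":
    print(read_moves(RECORD))
    print(read_metadata(RECORD))
```

A move-only tool never needs to know that app:clock exists, and an indexer can read the dc: elements without understanding the moves.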
>
> DocBook is an example of an XML-based document format for articles and
> books, technical and otherwise. For instance, O'Reilly uses a variant of
> DocBook for all its publishing. Using XML to represent go information would
> make it much easier to integrate with document formats such as DocBook.
>
> As someone pointed out, using XML would lay to rest once and for all any
> questions about character encodings. It also provides the built-in xml:lang
> mechanism to represent parallel textual information in different languages,
> very useful for Oriental players' names, to give just one example.
>
> Many people think of XML documents only as text files, but in fact they can
> take any form, including being stored in databases which are optimized for
> performance in executing e.g. XQuery queries. How're ya gonna do that with
> SGF?
>
> Many go applications are going to require additional types of information,
> such as the threaded commentaries mentioned by Bill S. Certainly compared to
> the option of forking SGF into a dozen proprietary formats which don't
> interoperate, or stuffing random s**t into the C[] field, would it not make
> sense to take the opportunity to upgrade to a single yet extensible model?
> What say ye, architects of the computing universe?
>
> The XML formats that have been proposed thus far for go, unfortunately, lack
> imagination; they are little more than SGF with a thin XML veneer. In
> particular, they have the problem that they hang information such as
> commentary and diagrams off the game tree. What we need is a new format
> defined ground-up from an XML perspective. Realistically, putting up
> white-tower proposals is not going to be successful. What is needed is
> real-world proposals on which real-world applications are built, so that
> people can see the real-world benefit.
>
> Bob Myers
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Stuart A. Yeates
> Sent: Thursday, October 25, 2007 1:04 AM
> To: computer-go
> Subject: Re: [computer-go] XML alternatives to SGF
>
> I sat down and read the DTD and the documentation and have some direct
> feedback on it. I'm aware that the DTD is quite old, and some of the
> ideas and solutions I'm going to suggest might not have been available
> (or as popular) when the DTD was written. Lines starting with <! are
> quotes from the DTD.
>
> > <!-- P is the Paragraph of HTML -->
> > <!ELEMENT P (#PCDATA)>
>
> Referencing HTML in this way doesn't allow validation. Defining the
> standard using schemas allows importing of concepts such as "Paragraph
> of HTML" directly from an appropriate HTML standard.
>
> > <!-- Each Go file consists of one or several GoGame -->
> > <!ELEMENT Go (GoGame*)>
>
> I believe it is a mistake not to have a protocol version number here.
>
> > <!ELEMENT Application (#PCDATA)>
> > <!ATTLIST Application format CDATA #IMPLIED>
>
> It seems unfortunate that there is no explicit version number here and
> no url link to the application website.
>
> > <!ELEMENT Date (#PCDATA)>
> > <!ATTLIST Date format CDATA #IMPLIED>
>
> It would be great to define this in terms of a standard format (i.e.
> ISO date format), since more than once I've had to infer the
> formatting of a date in an SGF file.
>
> > <!ELEMENT User (#PCDATA)>
>
> The user tag is ambiguous: is this a person's name? a user name? a
> user name on what server?
>
> > <!ELEMENT Copyright (P+)>
>
> It would be great to use a URL here to the licence under which the
> file is being distributed, for example, the creative commons licences
> on a lot of web content these days.
>
> > <!ELEMENT Rules (#PCDATA)>
> > <!ATTLIST Rules format CDATA #IMPLIED>
>
> Using a url to a ruleset here would be great.
>
> Even better would be a machine-interpretable ruleset, but I'm not
> counting on that anytime soon.
>
> > <!ELEMENT Black ((at)*)>
>
> Using schemas allows the content of tags to be restricted.
> See also discussion in the docs.
>
> > <!-- This is to take care of SGF tags, which are not translated -->
> > <!ELEMENT SGF (Arg*)>
> > <!ATTLIST SGF
> >     type CDATA #REQUIRED>
> > <!ELEMENT Arg (#PCDATA)>
>
> This introduces ambiguity into the file format, since it is
> unclear what the precedence is. If the XML says one thing and the
> embedded SGF tags say another, which has precedence?
>
> cheers
> stuart
>
> On 10/22/07, Jason House <[EMAIL PROTECTED]> wrote:
> > An XML alternative [1] to SGF has recently come to my attention. What do
> > others think of this alternative? Personally, the effect of a tag affecting
> > the previous tag seems kind of strange to me.
> >
> > PS: I found out about this from [2], a recently closed GoGui feature request
> > to write more sane sgf files that contain the standard algebraic notation
> > used in all GUIs.
> >
> > [1] http://www.rene-grothmann.de/jago/Documentation/xml.html
> > [2] https://sourceforge.net/tracker/?func=detail&atid=489967&aid=1752711&group_id=59117
> >
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
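
Two of the DTD suggestions quoted above, a version number on the Go element and ISO-formatted dates, can be checked mechanically. The Python sketch below is only an illustration under stated assumptions: the version attribute is hypothetical (the quoted DTD does not define one), the placement of Date directly under GoGame is assumed for the sample, and check_record is a made-up helper name.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Sample record; the version attribute and the nesting of Date are assumptions.
SAMPLE = """<Go version="1.0">
  <GoGame>
    <Date>2007-10-25</Date>
  </GoGame>
</Go>"""


def check_record(xml_text):
    """Return (version, [dates]); raise ValueError on a missing version
    attribute or on a Date that is not in ISO YYYY-MM-DD form."""
    root = ET.fromstring(xml_text)
    version = root.get("version")  # hypothetical attribute on <Go>
    if version is None:
        raise ValueError("missing version attribute on <Go>")
    # date.fromisoformat raises ValueError for anything but YYYY-MM-DD here.
    dates = [date.fromisoformat((el.text or "").strip()) for el in root.iter("Date")]
    return version, dates


if __name__ == "__main__":
    print(check_record(SAMPLE))
```

With a fixed date format there is nothing to infer: a record carrying 25/10/2007 is rejected outright instead of being guessed at.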