Some critics of an XML-based go format seem to be involved in a paranoid
fantasy that they are going to be forced by evil goblins to use it against
their will. No, Jennifer, that's not the case. Sure, if the format becomes
popular they may end up having to deal with it, but XML-formatted game
records and SGF will be round-trippable in the blink of an eye. 

Many of those complaining about XML don't seem to really know too much about
it. Griping that it's too big is roughly equivalent to wanting to go back to
six-bit character encodings. Carping that it's not readable misses the
point. Those who wonder if their favorite platform/language might not
support XML haven't checked recently. The point is that XML offers an
incredibly rich environment of transformability and extensibility and
interoperability.

Go programmers are mostly interested in move sequence data, and that's
natural, but we should remember that there are lots of other pieces to the
overall puzzle, including commentary, organization of problem sets, and how
diagrams are handled.

Let's take just a few examples. Say we want to store metadata in or
alongside SGF files, and/or retrieve/search/index the metadata already in
them, such as the name of the source. If the metadata is in XML, probably in
a well-established format such as Dublin Core or an extension thereof, it
can be discovered and processed by any search engine or, in the not too
distant future, reasoning engine, to answer queries such as "find all games
played by Shuko in 1971".
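
To make that query concrete, here is a minimal sketch in Python using only
the standard library. The <games>/<game> wrapper elements, the ids, the
"Player A/B" names and the dates are invented for illustration, and putting
the players in dc:contributor is just one assumption about how such a record
might be laid out. Only the Dublin Core namespace URI is real.

```python
import xml.etree.ElementTree as ET

DC = {"dc": "http://purl.org/dc/elements/1.1/"}  # real Dublin Core namespace

# Made-up records: the wrapper elements, ids, and dates are illustrative only.
RECORDS = """\
<games xmlns:dc="http://purl.org/dc/elements/1.1/">
  <game id="g1">
    <dc:contributor>Fujisawa Shuko</dc:contributor>
    <dc:contributor>Player A</dc:contributor>
    <dc:date>1971-05-02</dc:date>
  </game>
  <game id="g2">
    <dc:contributor>Fujisawa Shuko</dc:contributor>
    <dc:contributor>Player B</dc:contributor>
    <dc:date>1975-11-20</dc:date>
  </game>
  <game id="g3">
    <dc:contributor>Player A</dc:contributor>
    <dc:contributor>Player B</dc:contributor>
    <dc:date>1971-08-14</dc:date>
  </game>
</games>
"""

def games_by(xml_text, player, year):
    """Return ids of games where `player` appears and dc:date falls in `year`."""
    root = ET.fromstring(xml_text)
    hits = []
    for game in root.findall("game"):
        players = [(e.text or "") for e in game.findall("dc:contributor", DC)]
        date = game.findtext("dc:date", default="", namespaces=DC)
        if any(player in p for p in players) and date.startswith(f"{year}-"):
            hits.append(game.get("id"))
    return hits

print(games_by(RECORDS, "Shuko", 1971))
```

The point is that the query code knows nothing about go; it only knows the
Dublin Core element names.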

In addition, XML has a built-in mechanism for extending vocabularies
(namespaces). This allows information specific to a particular application
to be included in a document, with well-understood characteristics that
allow other applications to ignore the extra stuff.
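
A rough sketch of what "ignoring the extra stuff" looks like in practice,
again in Python. Both namespace URIs and all element and attribute names
here are hypothetical; no actual go vocabulary is being quoted.

```python
import xml.etree.ElementTree as ET

GO_NS = "http://example.org/go"  # hypothetical core go vocabulary

# The app: elements stand in for one editor's private extensions (made up).
DOC = """\
<game xmlns="http://example.org/go" xmlns:app="http://example.org/some-editor">
  <move color="B" at="Q16"/>
  <app:cursor x="3" y="4"/>
  <move color="W" at="D4"/>
  <app:window-size>800x600</app:window-size>
</game>
"""

def core_moves(xml_text):
    """Collect (color, point) pairs, skipping elements from other namespaces."""
    root = ET.fromstring(xml_text)
    moves = []
    for child in root:
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if child.tag == f"{{{GO_NS}}}move":
            moves.append((child.get("color"), child.get("at")))
        # Anything from a namespace we don't know is simply passed over.
    return moves

print(core_moves(DOC))
```

A reader written this way keeps working when another application adds its
own elements, because it never has to guess what an unknown tag means.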

DocBook is an example of an XML-based document format for articles and
books, technical and otherwise. For instance, O'Reilly uses a variant of
DocBook for all its publishing. Using XML to represent go information would
make it much easier to integrate with document formats such as DocBook.

As someone pointed out, using XML would lay to rest once and for all any
questions about character encodings. It also provides the built-in xml:lang
mechanism to represent parallel textual information in different languages,
very useful for Oriental players' names, to give just one example.
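
For illustration, a small Python sketch of picking a player's name by
language. The <player> and <name> elements are assumptions made up for this
example. xml:lang itself is standard, and ElementTree exposes it under the
reserved XML namespace URI.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predeclared; this is how ElementTree spells xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical player record carrying the same name in two languages.
DOC = """\
<player>
  <name xml:lang="en">Fujisawa Shuko</name>
  <name xml:lang="ja">藤沢秀行</name>
</player>
"""

def name_in(xml_text, lang, fallback="en"):
    """Return the player's name in `lang`, or in `fallback` if it is missing."""
    player = ET.fromstring(xml_text)
    names = {n.get(XML_LANG): n.text for n in player.findall("name")}
    return names.get(lang, names.get(fallback))

print(name_in(DOC, "ja"))
print(name_in(DOC, "ko"))  # no Korean name present, so the English one is used
```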

Many people think of XML documents only as text files, but in fact they can
take any form, including being stored in databases which are optimized for
performance in executing e.g. XQuery queries. How're ya gonna do that with
SGF?

Many go applications are going to require additional types of information,
such as the threaded commentaries mentioned by Bill S. Compared to the
option of forking SGF into a dozen proprietary formats which don't
interoperate, or stuffing random s**t into the C[] field, would it not make
sense to take the opportunity to upgrade to a single yet extensible model?
What say ye, architects of the computing universe?

The XML formats that have been proposed thus far for go, unfortunately, lack
imagination; they are little more than SGF with a thin XML veneer. In
particular, they have the problem that they hang information such as
commentary and diagrams off the game tree. What we need is a new format
defined ground-up from an XML perspective. Realistically, putting up
ivory-tower proposals is not going to be successful. What is needed is
real-world proposals on which real-world applications are built, so that
people can see the real-world benefit.

Bob Myers


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Stuart A. Yeates
Sent: Thursday, October 25, 2007 1:04 AM
To: computer-go
Subject: Re: [computer-go] XML alternatives to SGF

I sat down and read the DTD and the documentation and have some direct
feedback on it. I'm aware that the DTD is quite old, and some of the
ideas and solutions I'm going to suggest might not have been available
(or as popular) when the DTD was written. Lines starting with <! are
quotes from the DTD.

<!-- P is the Paragraph of HTML -->

<!ELEMENT P (#PCDATA)>


Referencing HTML in this way doesn't allow validation. Defining the
standard using schemas allows importing of concepts such as "Paragraph
of HTML" directly from an appropriate HTML standard.


<!-- Each Go file consists of one or several GoGame -->

<!ELEMENT Go (GoGame*)>



I believe it is a mistake not to have a protocol version number here.


<!ELEMENT Application (#PCDATA)>

<!ATTLIST Application format CDATA #IMPLIED>


It seems unfortunate that there is no explicit version number here and
no url link to the application website.


<!ELEMENT Date (#PCDATA)>

<!ATTLIST Date format CDATA #IMPLIED>


It would be great to define this in terms of a standard format (i.e.
the ISO 8601 date format), since more than once I've had to infer the
formatting of a date in an SGF file.
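
As a small illustration of why a fixed format helps, here is a Python
sketch that accepts only ISO-style YYYY-MM-DD dates and rejects anything
else instead of guessing. The function name is made up for this example.

```python
from datetime import date

def parse_game_date(text):
    """Return a date for an ISO-format string, or None if it is not one."""
    try:
        return date.fromisoformat(text.strip())
    except ValueError:
        # Free-form dates like "25 Oct 2007" would need guesswork; refuse them.
        return None

print(parse_game_date("2007-10-25"))
print(parse_game_date("25 Oct 2007"))
```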


<!ELEMENT User (#PCDATA)>


The User tag is ambiguous: is this a person's name? A user name? A
user name on what server?

<!ELEMENT Copyright (P+)>

It would be great to use a URL here to the licence under which the
file is being distributed, for example, the creative commons licences
on a lot of web content these days.



<!ELEMENT Rules (#PCDATA)>

<!ATTLIST Rules format CDATA #IMPLIED>

Using a url to a ruleset here would be great.



Even better would be a machine-interpretable ruleset, but I'm not
counting on that anytime soon.



<!ELEMENT Black ((at)*)>

Using schemas allows the content of tags to be restricted. See also
discussion in the docs.




<!-- This is to take care of SGF tags, which are not translated -->

<!ELEMENT SGF (Arg*)>

<!ATTLIST SGF

        type CDATA #REQUIRED>

<!ELEMENT Arg (#PCDATA)>



This introduces ambiguity into the file format, since it is
unclear what the precedence is. If the XML says one thing and the
embedded SGF tags say another, which has precedence?

cheers
stuart

On 10/22/07, Jason House <[EMAIL PROTECTED]> wrote:
> An XML alternative [1] to SGF has recently come to my attention.  What do
> others think of this alternative?  Personally, the effect of a tag
> affecting the previous tag seems kind of strange to me.
>
> PS: I found out about this from [2], a recently closed GoGui feature
> request to write more sane sgf files that contain the standard algebraic
> notation used in all GUIs.
>
> [1]
> http://www.rene-grothmann.de/jago/Documentation/xml.html
> [2]
>
> https://sourceforge.net/tracker/?func=detail&atid=489967&aid=1752711&group_id=59117
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>