On 23/10/2007, Gunnar Farnebäck <[EMAIL PROTECTED]> wrote:

> A potential problem with an XML library is the internal representation
> of the game tree. For debugging purposes it's not unusual to dump
> reading trees containing literally millions of moves, sometimes up to
> the limit of the available RAM. If an XML tree requires more bytes per
> move, the functionality would suffer. Does anybody know how big a node
> would become in expat for a move tag?

There are two widespread models for parsing XML: DOM and SAX. SAX is a
streaming model, so it does not require you to hold the whole XML file
in memory.
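
To make the streaming point concrete, here is a minimal sketch in
Python, whose xml.sax module uses expat underneath by default. The
<game>/<move> element names and the MoveCounter class are made up for
illustration and are not any existing game-record format. The handler's
own state is just a counter and the current depth, so it does not grow
with the number of moves:

```python
import io
import xml.sax


class MoveCounter(xml.sax.ContentHandler):
    """Counts <move> elements as they stream past; no tree is built."""

    def __init__(self):
        super().__init__()
        self.moves = 0        # total <move> elements seen so far
        self.max_depth = 0    # deepest nesting reached
        self._depth = 0       # current nesting level

    def startElement(self, name, attrs):
        self._depth += 1
        self.max_depth = max(self.max_depth, self._depth)
        if name == "move":    # hypothetical tag name
            self.moves += 1

    def endElement(self, name):
        self._depth -= 1


def count_moves(stream):
    """Parse a file-like object and return (moves, max_depth)."""
    handler = MoveCounter()
    xml.sax.parse(stream, handler)
    return handler.moves, handler.max_depth


# Tiny hypothetical record: one move with two variations below it.
SAMPLE = b"<game><move at='dd'><move at='pp'/><move at='pd'/></move></game>"
```

Calling count_moves(io.BytesIO(SAMPLE)) walks the sample above; an open
file on disk can be passed in the same way.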

> Next problem is of course the file size of the game records. If they are
> 5 or 10 times as large we're talking 9 MB or 18 MB for the game records.
>   Not a huge amount by itself but when considering the number of copies
> of GNU Go being distributed it sums up.

I'm not sure about C, but in Java it's normal to pipe XML through gzip
(which is included in the Java standard libraries), as this has been
found to increase read/write speed (i.e. the CPU cost of
(de-)compression is less than the speed-up from writing fewer bytes to
the disk). I've not studied it deeply, but I imagine a compressed XML
file would be smaller than an uncompressed SGF file.
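
As a rough sketch of the gzip idea, here it is in Python rather than
Java, using only the standard gzip module. The make_record helper and
its <game>/<move> markup are hypothetical, and the synthetic record is
far more repetitive than a real game, so this shows the mechanism
rather than a realistic compression ratio:

```python
import gzip

COLUMNS = "abcdefghjklmnopqrst"  # 19 board columns, skipping "i"


def make_record(n_moves):
    """Build a synthetic XML game record (hypothetical markup) as bytes."""
    moves = "".join(
        '<move color="%s" at="%s%d"/>'
        % ("BW"[i % 2], COLUMNS[i % 19], 1 + (i // 19) % 19)
        for i in range(n_moves)
    )
    return ("<game>" + moves + "</game>").encode("utf-8")


def pack(record):
    """Compress a record with gzip."""
    return gzip.compress(record)


def unpack(blob):
    """Recover the original bytes from a gzip-compressed record."""
    return gzip.decompress(blob)


raw = make_record(1000)
packed = pack(raw)
print(len(raw), "bytes raw,", len(packed), "bytes gzipped")
```

For files on disk, gzip.open(path, "rb") returns a file-like object
that can be handed straight to a streaming parser, so the uncompressed
text never has to exist in full.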

> So what are the benefits? So far I haven't seen anything that is
> relevant for GNU Go. The readability is not really an issue, it's almost
> never possible to visualize a game record without a graphical viewer
> anyway, regardless of coordinate representation, and from the examples
> I've seen XML has been worse off than sgf on readability. Character sets
> are a non-issue for GNU Go, information about players is simply ignored.
> Version control conflicts have never happened with game records and I
> don't foresee it for the future.

Where I would see a win for GNU Go might be:

(*) a standard notation for "move X should have been played at A rather
than B", which would allow clients to provide direct, machine-readable
feedback to developers and a potential format for regression tests.

(*) a standard notation for representing compile-time and runtime arguments.

(*) a standard notation for representing runtime information such as
the top N moves considered.

...
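
To give a feel for the first item, here is a small sketch of what a
machine-readable "should have played A rather than B" record might look
like. The <feedback> and <correction> names and their attributes are
invented here purely for illustration; no such standard exists as far
as I know. The Python below only uses the standard ElementTree module:

```python
import xml.etree.ElementTree as ET


def correction(move_number, played, better, note=""):
    """Build one hypothetical <correction> element."""
    el = ET.Element(
        "correction",
        {"move": str(move_number), "played": played, "better": better},
    )
    if note:
        el.text = note
    return el


def read_corrections(xml_text):
    """Return (move_number, played, better) tuples from a feedback document."""
    root = ET.fromstring(xml_text)
    return [
        (int(c.get("move")), c.get("played"), c.get("better"))
        for c in root.iter("correction")
    ]


# A client could attach something like this to a game record.
feedback = ET.Element("feedback")
feedback.append(correction(42, "q16", "r14", "illustrative note only"))
text = ET.tostring(feedback, encoding="unicode")
```

A regression suite could then loop over read_corrections(text) and
check that the engine now prefers the "better" point at each listed
move.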


> But I can provide a hint for something I would find useful. If it's
> something I'm missing in today's sgf viewers it's a good way to dump and
> inspect a transposition table. It's possible to expand the
> transpositions into a big tree with duplicate subtrees but that makes it
> very difficult to traverse it efficiently. Alternatively the tree is cut
> off when the same position is reached again but then there's no easy way
> to find where the position was first reached, which is needed to follow
> the continuations.

My program doesn't use transposition tables, so I don't understand
them well enough to know whether this is practical.

cheers
stuart
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/