I find the position I am taking on this interesting. I'm a bit of a believer in not making your parser too forgiving, or gratuitously breaking standards, because it encourages sloppiness. In the days when PGN files for chess were just starting to become popular, it was horrible what you would find in them. Castling could appear as o-o, O-O, 0-0 and e1g1, among other things. Instead of Nbd2 you might have the incorrect disambiguation N1d2, which describes the move unambiguously but is non-standard (unless there is also a knight on b3, in which case N1d2 is correct.)
So many people made their parsers "forgiving" because many of the problems were easily corrected and it was easy to tell what move was intended. The idea was to make your parser strict on export, forgiving on import.

The problem with being forgiving on input is that it doesn't encourage others to adhere strictly to the standards. Suddenly new de facto standards become acceptable, which can complicate things. Once everyone else forgives, you must forgive too, or your software becomes non-standard in a de facto way.

I think the right way to deal with this is to always be strict but provide cleaner tools. If a tool can correct a non-standard format, it should make clear that the file is broken and needs correction. Then the file can be passed on in a properly working state.

However, I feel differently about this issue of allowing "standard" notation in SGF. I advocate "defining" it as acceptable. One of the primary reasons for having non-binary formats such as SGF and XML is to make them human readable. It's supposed to be easy to edit these files manually without absolutely requiring special tools, not to mention the ease of debugging. I would rather look at a text file than a hex dump, which is why we use text formats instead of binary formats.

Despite posts to the contrary, "ae" is harder for humans to read than "E5" for two reasons:

1. We are used to "E5" notation. This IS the standard used by everything except SGF. Esperanto may be just as easy to learn as English but nobody speaks it (or only about 1 million out of 6.5 billion do.)

2. I would argue that having a letter and a number is clearer than having 2 letters or 2 numbers. This is an aid to distinguishing file and rank, or row and column. If I say the A edge, it means something with "e5" notation but it could mean 2 different things with "aa" notation. Number is always row, letter is always column. No big deal, but nice.
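To make the difference between the two notations concrete, here is a rough sketch in Python of converting between them. It is only an illustration, not code from any existing tool. It assumes a 19x19 board, that SGF "aa" is the top-left point with the first letter giving the column and the second the row counted from the top, and that the letter-number style skips the letter "I" and counts rows from the bottom. The function names are made up for this example.

```python
# Illustrative sketch only; the names and the fixed 19x19 size are assumptions.
SIZE = 19
COLUMNS = "ABCDEFGHJKLMNOPQRST"  # 19 column letters, "I" is skipped


def sgf_to_human(coord: str) -> str:
    """Convert an SGF point such as "ae" to letter-number form such as "A15"."""
    if len(coord) != 2 or not all("a" <= c <= "s" for c in coord):
        raise ValueError(f"not a 19x19 SGF point: {coord!r}")
    col = ord(coord[0]) - ord("a")           # 0 = leftmost column
    row_from_top = ord(coord[1]) - ord("a")  # 0 = top row
    return f"{COLUMNS[col]}{SIZE - row_from_top}"


def human_to_sgf(coord: str) -> str:
    """Convert a letter-number point such as "E5" to SGF form such as "eo"."""
    letter, digits = coord[:1].upper(), coord[1:]
    if not letter or letter not in COLUMNS or not digits.isdecimal():
        raise ValueError(f"not a letter-number point: {coord!r}")
    row = int(digits)                        # 1 = bottom row
    if not 1 <= row <= SIZE:
        raise ValueError(f"row out of range: {coord!r}")
    col = COLUMNS.index(letter)
    return chr(ord("a") + col) + chr(ord("a") + SIZE - row)


if __name__ == "__main__":
    print(sgf_to_human("ae"))
    print(human_to_sgf("E5"))
```

Both functions reject anything outside the board instead of guessing, in keeping with the strict-parser argument above.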
- Don

Jeff Nowakowski wrote:
> On Tue, 2007-10-23 at 08:42 -0400, Don Dailey wrote:
>
>> GTP pretty much replaced GMP. A lot of resistance because GMP was the
>> de facto standard at the time. It would have been foolish to insist on
>> being backwards compatible.
>>
>
> GTP was a huge change in protocol with clear benefits. What's being
> quibbled over now is a minor change in the coordinate system at the cost
> of breaking all existing tools, with the exception of a couple that have
> implemented this incompatible change. The benefit does not outweigh
> the cost.
>
> -Jeff
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/