Joost ´t Hart wrote:
Hello Joost!
> I am willing to take this on, provided that we agree on
> some sort of specification first.
This is great news :)
> So it is either
> a) a configurable quicker scan, without knowing what is at
> stake beyond,
> or
> b) a full scan, taking some time
I prefer b), but that may be because I admit that I cannot
see the use of "correct only the first 1000 misspelled
names".
> There is something to say for (a) as well.
>
> [Q] Do you (all) agree that having a configurable
> limitation implies approach (b) for "infinite" and (a) for
> the other options?
I would agree, though as stated above I personally see no
use in a limitation at all.
> [Q] Do you have any suggestions to enhance the hour glass
> experience during the investigation phase?
> A bar on the progress of the scan through the dBase name
> list seems the most logical for approach (b), although it
> will not necessarily reflect the time spent/to-go on
> actually suggested corrections.
Well, it seems logical and it seems consistent. IMHO it is
also better than some %-bar that sticks for 99% of the time
at 99% done.
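
To make the two approaches concrete, here is a minimal sketch in Python (not Scid's actual code; `scan_names`, `suggest` and `on_progress` are hypothetical names). It assumes `limit=None` means the full scan (b), an integer limit means the quicker scan (a), and progress is reported as position in the name list, as discussed above.

```python
def scan_names(names, suggest, limit=None, on_progress=None):
    """Collect (name, correction) pairs from a list of names.

    limit=None  -> full scan, approach (b)
    limit=N     -> stop after N corrections, approach (a)
    suggest     -> hypothetical callback returning a corrected name or None
    on_progress -> hypothetical callback receiving (position, total)
    """
    names = list(names)
    total = len(names)
    corrections = []
    for position, name in enumerate(names, start=1):
        fixed = suggest(name)
        if fixed is not None and fixed != name:
            corrections.append((name, fixed))
        # Progress follows the position in the name list, not the
        # time spent on the corrections themselves.
        if on_progress is not None:
            on_progress(position, total)
        if limit is not None and len(corrections) >= limit:
            break
    return corrections
```

With a limit the bar simply stops early, which is one more reason I would default to the full scan.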
> [Q] So we add another configuration item somewhere in
> "Options" or is there room for a "Config" button somewhere
> in the maintenance dialog? I do not like the idea of
> having an intermediate dialog each time (before the
> investigation is started).
I might add that it would be very helpful to have a set of
spell checking files that can easily be switched there.
Besides the FIDE names I have some other SSP files, e.g.
for club names. Additionally, an ICCF file is work in
progress. I have it for my personal use, but it needs work
before a release.
Additionally, if you dive into the spell checking code you
might notice that ratings can be added to games that have
none. However, the current code can only handle ELO, i.e.
FIDE ratings. That is, the current spell checker only
evaluates the ELO lines in a ratings file and silently
ignores other lines like ICCF. :(
It would be great if the other rating systems supported by
Scid (there are quite a few: ELO, ICCF, USCF, DWZ, BCF,
Rating) could be added there as well. E.g. it does not make
much sense to add ELO ratings to a base of correspondence
games rated by ICCF; there these values should go to the ICCF
field. (The aforementioned ICCF spell checking file is
actually a rating file.) The same is true for club games
using the club's internal rating system; German players would
probably like to add DWZ to the proper field, and so on.
As said, it is just about the interpretation of those lines
in the rating file. A syntax for the rating file itself is
already defined. I do not know whether it would be much work
to get broader coverage here.
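
To illustrate what I mean by interpreting those lines, here is a rough Python sketch (not the Scid implementation; `parse_rating_line` and `RATING_TAGS` are hypothetical names). It assumes lines of the form `%TAG year:rating[,rating...]` as in my ICCF sample further down, and that tags are matched case-insensitively. What the comma-separated values within a year mean exactly is left open here.

```python
import re

# Rating systems named in this mail; matching them case-insensitively
# is an assumption of this sketch.
RATING_TAGS = {"ELO", "ICCF", "USCF", "DWZ", "BCF", "RATING"}

# One entry such as "2003:1918,1918"
ENTRY = re.compile(r"^(\d{4}):(\d+(?:,\d+)*)$")


def parse_rating_line(line):
    """Return (system, [(year, [ratings...]), ...]) or None for other lines."""
    line = line.strip()
    if not line.startswith("%"):
        return None
    parts = line[1:].split()
    if not parts or parts[0].upper() not in RATING_TAGS:
        return None
    history = []
    for token in parts[1:]:
        match = ENTRY.match(token)
        if match is None:
            continue  # skip anything that is not year:rating[,rating]
        year = int(match.group(1))
        ratings = [int(value) for value in match.group(2).split(",")]
        # A list of tuples, because a year may occur more than once.
        history.append((year, ratings))
    return parts[0].upper(), history
```

The returned system name could then decide which rating field of the game receives the value, instead of everything going to ELO.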
> [Q] What do you think of a producing a game list filter to
> produce a list of games with inconsistent game date vs.
> player life times?
Sounds like a good idea that could also help to find errors
in a given database.
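
Such a filter could look roughly like this Python sketch (hypothetical names throughout; it assumes games are available as dicts of PGN header values and lifespans as a name to (born, died) mapping by year, and it compares years only):

```python
def year_of(pgn_date):
    """Year of a PGN date like '2009.02.10', or None if unknown ('????.??.??')."""
    year = pgn_date.split(".")[0]
    return int(year) if year.isdigit() else None


def date_conflicts_with_lifespan(pgn_date, born=None, died=None):
    """True if the game year lies before birth or after death."""
    year = year_of(pgn_date)
    if year is None:
        return False  # unknown dates cannot be checked
    if born is not None and year < born:
        return True
    if died is not None and year > died:
        return True
    return False


def inconsistent_games(games, lifespans):
    """Return the games in which either player's lifespan excludes the date."""
    result = []
    for game in games:
        for colour in ("White", "Black"):
            born, died = lifespans.get(game[colour], (None, None))
            if date_conflicts_with_lifespan(game["Date"], born, died):
                result.append(game)
                break
    return result
```

Players without lifespan data are simply never flagged, so the filter only reports what it can actually verify.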
> [Q] Dunno if you got the chance to notice this, but the
> Chess Assistant guys follow the approach to explicitly add
> a player's country (FIDE style) to the name in a pair of
> ()'s. In the game list this country part is stripped off.
> What do you think of this?
I'd prefer some additional header fields like:
[whiteCountry "HUN"]
[blackCountry "EUR"]
This would be more compatible with the PGN standard, as
Pascal also pointed out.
> It avoids quite some name conflicts, but certainly not
> all.
In the best of all worlds those additional fields would be
used for checking, if present.
> [Q] The CA guys have added a 'FIDE-id' field to the player database.
This exists (or could exist) in the ratings files as well.
Such lines start with %ID; a dataset can look like this:
Aaldijk, A. A. # NED [2141]
= Aaldijk, AA
= Aaldijk, A
%ID 370251 (ICCF)
%ICCF 1996:1995 2000:2141 2000:2060,2017 2001:2004,1973 2002:1914,1921
%ICCF 2003:1918,1918 2005:1866,1867 2006:1816,1813
Notice that at least ICCF and FIDE use identical schemes
for their respective IDs, therefore a distinction is needed.
One could use the scheme suggested above, but also something
like
%ICCFID
%ELOID
etc. would make sense (i.e. to link the ID with the rating
system used). Probably, from our database point of view the
latter is preferable.
However, Franz told me that FIDE IDs are not unique in the
sense that they are recycled. Therefore, one shouldn't rely
too much on them; they probably make most sense only
together with the player's lifespan.
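
Both ID notations could be read by the same code. A small Python sketch (hypothetical `parse_id_line`; it assumes the `%ICCFID` / `%ELOID` variant carries the numeric ID after the tag, which I have not specified above):

```python
import re

# "%ID 370251 (ICCF)": ID first, rating system in parentheses
ID_WITH_SUFFIX = re.compile(r"^%ID\s+(\d+)\s+\((\w+)\)$")
# "%ICCFID 370251": rating system fused into the tag (assumed layout)
ID_WITH_PREFIX = re.compile(r"^%(\w+?)ID\s+(\d+)$")


def parse_id_line(line):
    """Return (system, id_string) or None if the line is not an ID line."""
    line = line.strip()
    match = ID_WITH_SUFFIX.match(line)
    if match:
        return match.group(2).upper(), match.group(1)
    match = ID_WITH_PREFIX.match(line)
    if match:
        return match.group(1).upper(), match.group(2)
    return None
```

The ID is kept as a string so nothing is lost should an organisation use leading zeros. Since IDs may be recycled, a caller would still want to check the lifespan before trusting a match.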
> Looks nice for cross referencing with the improving FIDE
> member lists (and we could add such id to the ssp as
> well), but to my experience there are too many
> (interesting) games played by players who do not even have
> a FIDE id. Comments?
It would not make sense to require an ID. Agreed. In
particular, you surely would not want to strip off the games
of Aljekine etc ;)
But adding the ID where it can be assigned IMHO adds value,
indeed.
> Would it imply a change to the existing dBase format, then
> forget about it, imho.
I'd hold those IDs again in additional header fields. For CC
I currently use this kind of metadata:
[Event "Welcome Game"]
[Site "Internet Correspondence Chess Club"]
[Date "2009.02.10"]
[Round "?"]
[White "whitename"]
[Black "blackname"]
[Result "*"]
[WhiteRating "1234"]
[BlackRating "1234"]
[TimeControl "2592000+86400"]
[GameId "123456"]
[Source "http://www.iccc.com/game.aspx?game_id=123456"]
[WhiteNA "[email protected]"]
[BlackNA "[email protected]"]
[Mode "XFCC"]
[CmailGameName "ICCC-123456"]
[whiteCountry "SRB"]
[blackCountry "EUR"]
[whiteIccfID "12345"]
[blackIccfID "54321"]
From a database point of view this is horrible, indeed, and
one would surely replace e.g.
[White "whitename"]
[WhiteNA "[email protected]"]
[whiteCountry "SRB"]
[whiteIccfID "12345"]
by just one unique ID that links to a properly normalised
dataset, but this WOULD require changing the database's
internal format. (Quite heavily, I might say.) However, it
would have many advantages.
Still, the above has the charming aspect that it is
perfectly compatible with PGN while adding quite a lot of
additional and sometimes very valuable data to a game's set
of metadata.
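
As a rough illustration of how such extra tags could be picked up without touching the database format, here is a Python sketch (hypothetical helper names; the regular expression is a simplification that ignores escaped quotes inside tag values, and the tag names are the ones from my example above, not part of the PGN standard):

```python
import re

# One PGN tag pair per line, e.g. [whiteCountry "SRB"]
TAG_PAIR = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def parse_headers(pgn_header_text):
    """Return a dict of tag name -> value for all tag pairs found."""
    headers = {}
    for line in pgn_header_text.splitlines():
        match = TAG_PAIR.match(line.strip())
        if match:
            headers[match.group(1)] = match.group(2)
    return headers


def player_record(headers, colour):
    """Collect what the header says about one player ('White' or 'Black')."""
    lower = colour.lower()
    return {
        "name": headers.get(colour),
        "country": headers.get(lower + "Country"),
        "iccf_id": headers.get(lower + "IccfID"),
    }
```

Tags that are absent just yield None, so games without country or ID information are handled the same way as before.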
--
Kind regards,
Alexander Wagner

War is Peace. Freedom is Slavery. Ignorance is Strength.
Theory: G. Orwell, "1984"
In practice: USA, since 2001
_______________________________________________
Scid-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scid-users