Re: [computer-go] Collection of games for train? (jgogears)

2008-03-09 Thread D Gilder
Here is a Perl script that downloads games from the KGS archives. You need 
some usernames, which you can get simply by logging on to KGS and looking at 
who's playing. Some players have 1000 or more 19x19 games in the archives, 
others hardly any. 

In my editor the code below looks odd so I've put it in an attachment as well.

dan


# This script downloads players' games from the KGS archives.
# It requires a file containing a list of KGS usernames, one per line.
# Names can be commented out by preceding them with a '#'.
# The name of this file is stored in $playerfile (see below).
# The directory where the archives will be saved is stored in
# $localbaseurl. A player's index page is also stored there as user.html.
# It makes some assumptions about how the KGS archives are arranged:
# it assumes a player's games (e.g. Abc's) are stored at
# http://www.gokgs.com/servlet/archives/en_US/Abc-2008-3.zip
# for March 2008 (this script fetches the .tar.gz with the same base name),
# and that the link to April's archive is written
# gameArchives.jsp?user=Abc&year=2008&month=4
# Written by D.Gilder 2007

use strict;
use warnings;
use HTML::TokeParser;
use LWP::UserAgent;

# Change these two variables as necessary
#--
my $localbaseurl = '/home/dan/Desktop/KGS/tar/';
my $playerfile =   '/home/dan/Desktop/KGS/kgsplayers.txt';
#--

my $webbaseurl = 'http://www.gokgs.com/';
my $ua = LWP::UserAgent->new;

open(INPUT, "<", $playerfile) or die "Couldn't open $playerfile: $!\n";
my $elapsedtime;
while (<INPUT>) {
  chomp;
  $elapsedtime = time;
  # skip commented-out names and blank lines
  unless (/^#/ || /^\s*$/) {
    print $_, "\n";
    getuserpage($_);
    # wait at least 4 seconds between calls to getuserpage
    # as requested by William Shubert
    # Do not delete the following line
    sleep($elapsedtime - time + 4) if time - $elapsedtime < 4;
  }
}
close(INPUT) or die "Couldn't close $playerfile: $!\n";

sub myconnect {
  my $url = shift;
  my $reply = $ua->get($webbaseurl.$url);
  # Check the outcome of the response. LWP does not set $!, so report
  # the HTTP status line instead.
  $reply->is_success
    or die 'Couldn\'t connect to '.$url.'. Stopped: '.$reply->status_line;
  return $reply;
}

sub getuserpage {
  my $user = shift;
  my $url = 'gameArchives.jsp?user='.$user;
  my $mylocalurl = $localbaseurl.'user.html';
  open(OUTFILE, ">", $mylocalurl) or die 'Can\'t open '.$mylocalurl.": $!";
  my $reply = myconnect($url);
  print OUTFILE $reply->content;
  close OUTFILE or die 'Can\'t close '.$mylocalurl.": $!";

  my $p = HTML::TokeParser->new($mylocalurl)
    or die 'Can\'t parse '.$mylocalurl.": $!";

  # Skip to start of user data
  $p->get_tag("table");
  $p->get_tag("table");
  while (1) {
    # get_tag returns undef once there are no more links
    my $a_token = $p->get_tag("a") or last;
    my $str = $a_token->[1]{href};
    last unless defined $str && $str =~ /year=(\d+)&month=(\d+)/;
    my $file = $user.'-'.$1.'-'.$2.'.tar.gz';
    my $target = $webbaseurl.'servlet/archives/en_US/'.$file;
    # the list form of system avoids passing the URL through a shell
    system('lwp-download', $target, $localbaseurl)
      unless -e $localbaseurl.$file;
  }
}
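
The URL layout assumed in the script's header comments can be isolated in a small helper, which makes the assumption easy to check without touching the network. This is only a minimal sketch: archive_url is a hypothetical name that does not appear in the script, and the .tar.gz suffix simply mirrors what getuserpage builds.

```perl
use strict;
use warnings;

# Same base URL as in the script above.
my $webbaseurl = 'http://www.gokgs.com/';

# archive_url is a hypothetical helper, not part of the original script.
# Given a user name and an href taken from the player's index page, it
# returns the assumed archive URL, or undef if the href is not a
# monthly archive link.
sub archive_url {
  my ($user, $href) = @_;
  return undef unless defined $href
    && $href =~ /year=(\d+)&month=(\d+)/;
  return $webbaseurl . 'servlet/archives/en_US/' . "$user-$1-$2.tar.gz";
}

# Prints http://www.gokgs.com/servlet/archives/en_US/Abc-2008-4.tar.gz
print archive_url('Abc', 'gameArchives.jsp?user=Abc&year=2008&month=4'), "\n";
```

If KGS ever changes the link format, only the regular expression and the path in this one function would need to change.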

On Sunday 09 March 2008, Stuart A. Yeates wrote:
> Hello Everyone
>
> I've been working for a while on a computer go player which takes a
> rather different tack[0]. Rather than using embedded programmatic
> domain knowledge (like GNU Go) or dynamic evaluation of board
> positions (UCT etc), it uses domain knowledge inferred from game
> records and a complex look-up during play.
>
> My approach is to define a linearisation of the board with respect to
> a position (or a set of linearisations, taking into account symmetry).
> I then use classical string processing techniques, principally a large
> prefix tree. Conceptually this tree is very large (one leaf for every
> vertex for every possible board position), but it is not fully
> expanded. I foresee that crafting the rules for expanding the tree
> will be a core problem. Does anyone know of any research
> into similar approaches?
>
> The program will be slow and memory hungry to train, but should be
> fast to play. I'm anticipating it will be strong at the opening but
> possibly confused by random moves (i.e. playing on the edge of the
> board).
>
> Currently I have developed a core system which now plays games that
> look at least a little like go.
>
> What I'm after now is a good collection of games to train it on, so I
> can check whether further developments are making a positive
> difference. What I think I need is a relatively homogeneous collection
> of tens or hundreds of thousands of 19x19 games of varying levels.
> Does anyone know of a collection such as this I can download
> relatively simply?
>
> cheers
> stuart
> [0] http://code.google.com/p/jgogears/
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/




kgshoover.pl
Description: Perl program

[computer-go] Collection of games for train? (jgogears)

2008-03-09 Thread Stuart A. Yeates
Hello Everyone

I've been working for a while on a computer go player which takes a
rather different tack[0]. Rather than using embedded programmatic
domain knowledge (like GNU Go) or dynamic evaluation of board
positions (UCT etc), it uses domain knowledge inferred from game
records and a complex look-up during play.

My approach is to define a linearisation of the board with respect to
a position (or a set of linearisations, taking into account symmetry).
I then use classical string processing techniques, principally a large
prefix tree. Conceptually this tree is very large (one leaf for every
vertex for every possible board position), but it is not fully
expanded. I foresee that crafting the rules for expanding the tree
will be a core problem. Does anyone know of any research
into similar approaches?
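
To make the idea concrete, here is a minimal sketch of a linearisation feeding a prefix tree. It is not taken from jgogears: the ordering (points sorted by squared distance from the chosen vertex, ties broken by row then column), the board encoding ('.' empty, 'b' black, 'w' white) and the names linearise, trie_insert and trie_match are all assumptions made for illustration, and symmetry is ignored.

```perl
use strict;
use warnings;

# Hypothetical linearisation: read the board's points in order of
# increasing squared distance from ($r, $c), ties broken by row, then
# column. $board is a reference to an array of row strings.
sub linearise {
  my ($board, $r, $c) = @_;
  my @points;
  for my $i (0 .. $#$board) {
    for my $j (0 .. length($board->[$i]) - 1) {
      push @points, [ ($i - $r)**2 + ($j - $c)**2, $i, $j ];
    }
  }
  my @sorted = sort {
    $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2]
  } @points;
  return join '', map { substr($board->[ $_->[1] ], $_->[2], 1) } @sorted;
}

# Insert a string into a prefix tree held as nested hashes. Each node
# counts how many inserted strings passed through it.
sub trie_insert {
  my ($node, $string) = @_;
  for my $ch (split //, $string) {
    $node = $node->{kids}{$ch} //= {};
    $node->{count}++;
  }
}

# Return the length of the longest stored prefix of $string and the
# count recorded at the node where the match stopped.
sub trie_match {
  my ($node, $string) = @_;
  my ($depth, $count) = (0, 0);
  for my $ch (split //, $string) {
    my $kids = $node->{kids} or last;
    my $next = $kids->{$ch} or last;
    $node = $next;
    $depth++;
    $count = $node->{count};
  }
  return ($depth, $count);
}

# A 3x3 toy board, linearised around the top-left corner.
my @board = ('.b.', '.w.', '...');
my %tree;
trie_insert(\%tree, linearise(\@board, 0, 0));
my ($depth, $count) = trie_match(\%tree, '.b.wb....');
print "matched $depth symbols, seen $count time(s)\n";
```

Because the tree only grows along strings that were actually inserted, it stays far smaller than the conceptual tree over every possible position; how deep to expand each branch is the open question raised above.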

The program will be slow and memory hungry to train, but should be
fast to play. I'm anticipating it will be strong at the opening but
possibly confused by random moves (i.e. playing on the edge of the
board).

Currently I have developed a core system which now plays games that
look at least a little like go.

What I'm after now is a good collection of games to train it on, so I
can check whether further developments are making a positive
difference. What I think I need is a relatively homogeneous collection
of tens or hundreds of thousands of 19x19 games of varying levels.
Does anyone know of a collection such as this I can download
relatively simply?

cheers
stuart
[0] http://code.google.com/p/jgogears/