Re: [computer-go] How to properly implement RAVE?

2009-01-15 Thread Daniel Waeber
Thanks you. I think that I understand it now :)

On 23:21 Wed 14 Jan , Mark Boon wrote:
 You have to understand that the 'start' variable really starts at the  
 root from the position for which we do the search. So all the moves  
 'below' the playoutNode are also taken into account. The check in the  
 earlier part if (_colorMap[moveXY]==0) makes sure there's no move  
 between the playoutNode and the n.move as you call it.

Ah, yes, I did not get the meaning of start right. But still I think you
have to incrementally add values to the maps as you go down the tree.

I think there's a problem with caputres/ko-fights inside the tree.
All nodes after the capture should get the amaf color/value for the
stones put there after the capture and not the value of the stones that
were put there and captured.

Most likely I missed a little piece of code again, but hey, perhaps its
a real bug.

 That is, if I understand you correctly. Because I'm not quite sure  
 what you mean by 'finding all these nodes n'. This is not a sub-set of  
 RAVE as I understand it. What you see in the code is the accumulation  
 of the AMAF values after each playout. The RAVE value is calculated  
 using these values when comparing move-candidates, which is in an  
 altogether different place. (The MonteCarloTreeSearchResult  class in  
 my project).
 
 I'm afraid I haven't made things much clearer for you.

The problem is that I was confused, but now, as I understand it, all
your and all the other comments suddenly make sense ;)

 Mark

Thank again,
  wabu

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] How to properly implement RAVE?

2009-01-15 Thread Daniel Waeber
Hi,

On 11:24 Thu 15 Jan , Mark Boon wrote:
 
 On Jan 15, 2009, at 10:47 AM, Daniel Waeber wrote:
 
  Thanks you. I think that I understand it now :)
 
  On 23:21 Wed 14 Jan , Mark Boon wrote:
  You have to understand that the 'start' variable really starts at the
  root from the position for which we do the search. So all the moves
  'below' the playoutNode are also taken into account. The check in the
  earlier part if (_colorMap[moveXY]==0) makes sure there's no move
  between the playoutNode and the n.move as you call it.
 
  Ah, yes, I did not get the meaning of start right. But still I think  
  you
  have to incrementally add values to the maps as you go down the tree.
 
 But it does that. This happens when playoutNode is set to its parent  
 and the AMAF values are added again (now for the other color) until  
 the root is reached.

yes, but the weight/color maps stay the same for all updated nodes.

I think the first playoutNode (the one most deep inside the tree) only
should get amaf values for the random playout, the next one form random
playout + from the first playoutNode ... and the root node amaf values
form all the nodes.

I attached code for what I would do, but had no time to test it.

  I think there's a problem with caputres/ko-fights inside the tree.
  All nodes after the capture should get the amaf color/value for the
  stones put there after the capture and not the value of the stones  
  that
  were put there and captured.
 
  Most likely I missed a little piece of code again, but hey, perhaps  
  its
  a real bug.
 
 I did stop to think about ko and 'ishi no shita' (something like  
 'playing under the stones') a little bit but couldn't come to an  
 immediate conclusion what would be the best thing to do. So I decided  
 to leave it as it is. You didn't miss any little piece of code there.  
 Maybe there's room for improvement in case of ko, but my guess is the  
 difference will be minimal at best. If you find it does make a big  
 difference everyone here will be delighted to hear it.
 
 Given how it's performing I doubt there are major problems with my  
 AMAF-RAVE implementation.


regards
  wabu

 
 Mark
 
diff --git a/TesujiRefBot/source/tesuji/games/go/reference/search/MonteCarloTreeSearch.java b/TesujiRefBot/source/tesuji/games/go/reference/search/MonteCarloTreeSearch.java
index 5e47e00..ee650e1 100644
--- a/TesujiRefBot/source/tesuji/games/go/reference/search/MonteCarloTreeSearch.java
+++ b/TesujiRefBot/source/tesuji/games/go/reference/search/MonteCarloTreeSearch.java
@@ -566,6 +566,7 @@ public class MonteCarloTreeSearchMoveType extends Move
 			
 			TreeNodeMonteCarloTreeSearchResultMoveType node = getNodeToExpand(_rootNode, _searchAdministration);
 			TreeNodeMonteCarloTreeSearchResultMoveType playoutNode = node;
+			int start = _searchAdministration.getMoveStack().getSize(); // begin at playoutNode

 			if (node != null)
 {
@@ -580,7 +581,6 @@ public class MonteCarloTreeSearchMoveType extends Move
 				{
 		IntStack playoutMoves = _searchAdministration.getMoveStack();
 		byte color = _monteCarloAdministration.getColorToMove();
-		int start = _monteCarloAdministration.getMoveStack().getSize();
 		int end = playoutMoves.getSize();
 		double weight = 1.0;
 		double weightDelta = 1.0 / (end - start + 1); // Michael Williams' idea to use decreasing weights
@@ -613,6 +613,13 @@ public class MonteCarloTreeSearchMoveType extends Move
 if (_colorMap[xy]==color)
 	result.increaseVirtualPlayouts(_weightMap[xy]*score,_weightMap[xy]);
 			}
+			// add the move to the maps now
+			GoMove move = (GoMove) playoutNode.getContent().getMove();
+			int xy = move.getXY();
+			byte col = move.getColor();
+			_colorMap[xy] = col;
+			_weightMap[xy] = 1.0; //TODO: handle weight
+			
 			playoutNode = playoutNode.getParent();
 		}
 				}
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] How to properly implement RAVE?

2009-01-15 Thread Daniel Waeber
On 12:53 Thu 15 Jan , Mark Boon wrote:
 
 On Jan 15, 2009, at 12:33 PM, Daniel Waeber wrote:
 
  yes, but the weight/color maps stay the same for all updated nodes.
 
  I think the first playoutNode (the one most deep inside the tree) only
  should get amaf values for the random playout, the next one form  
  random
  playout + from the first playoutNode ... and the root node amaf values
  form all the nodes.
 
 OK, I think I see now what you're trying to say. This is something I  
 did think about. I hope my memory serves me well enough to say why I  
 didn't do it that way.
 
 - What you propose adds complexity and possibly computation (if it  
 means recalculating or adjusting the weight map).
 - I don't think it makes all that much difference.
 
 The reason I don't think it makes much difference is that adding AMAF  
 values for the moves above (closer to the root) the playoutNode is  
 that most likely those points are now occupied. Since the AMAF values  
 are used to compute which empty points are the best next candidate,  
 the AMAF values at occupied points are immaterial. They are not even  
 added. So it only makes a difference in cases where played stones get  
 captured and those points then are occupied again. This brings us back  
 to the issue discussed earlier about ko and ishi-no-shita.

Yes, I don't know if it would be a big improvement. It is not a real
speed decrease, as you just move the point were you add the additional
values to the maps.
Perhaps I'll run some tests when I have the time.

 Mark
 
 P.S. what do I need to open that file? Is it a SVN patch?

it's standard diff format, outputed by git. You can view it with a text
editor and apply it with unix patch commandline tool or use
eclipse-team-apply-patch. it's done agains svn-root, so you have to ignore
2 leading pathnames if you apply it inside the refbot folder.

regards
  wabu

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] How to properly implement RAVE?

2009-01-14 Thread Daniel Waeber
Hi,

I'm also interested at the same thing.

 snip
 
  I tried putting this into pseudo code, but it already looks like C. ;p
  http://pastie.org/357231
  If you could look at it, I would be most grateful.
 
 It sounds good but it seems to lack the check of whether a given move is
 first played in a given intersection. When you add that, it because a little
 more tricky to update in the tree. You also update the value of each move
 independently of the color, i.e. a position in which it is black turn will
 be update with white moves. You should restrict the update.

I have a question about the the part were the stats are updated.
(l.15-25). having an array of amaf-values in every node seems very memory
intensive and I don't get how you would access these values.

Something like searching for all nodes below the visited node with the same
pos+color as the ones visited and updating amaf-stats directly in these nodes
would make more sense to me. (but also costs more cpu time)

Hope I don't bring in more confusion into the topic, but I really would
like to get rave done right.

 Hopes that helps,
 Sylvain
 
 
 
 
  -Isaac
 

thanks for you help
  wabu

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] How to properly implement RAVE?

2009-01-14 Thread Daniel Waeber
Ok, I still have same questions about the refbot code.

On 10:29 Wed 14 Jan , Mark Boon wrote:
 
 On Jan 14, 2009, at 9:36 AM, Daniel Waeber wrote:
 
  I have a question about the the part were the stats are updated.
  (l.15-25). having an array of amaf-values in every node seems very  
  memory
  intensive and I don't get how you would access these values.
 
 You are right, it is memory intensive. I believe this is one of the  
 reasons that most implementations wait a certain number of playouts  
 before creating the next level of nodes.

  form http://pastie.org/pastes/357231 :
  node[visitedNode[i]].AMAFSum[visitedPos[j]]+=result; 
  node[visitedNode[i]].AMAFPlayed[visitedPos[j]]+=1;

I had some problems with these lines. looks to me like the nodes have an
array of boardsize*boardsize inside *all* nodes, increasing the memory
of the tree by a factor 10.
But, correct me if I'm wrong, the refbot code just adds two simple
values to the node for rave.

 Accessing the AMAF values depends on your implementation. The  
 following is a code-snippet from my MCTS reference implementation that  
 updates the AMAF values in the tree:
 
 if (_useAMAF)
 {
   IntStack playoutMoves = _searchAdministration.getMoveStack();
   byte color = _monteCarloAdministration.getColorToMove();
   int start = _monteCarloAdministration.getMoveStack().getSize();
   int end = playoutMoves.getSize();
   double weight = 1.0;
   double weightDelta = 1.0 / (end - start + 1); // Michael Williams' idea 
 to use decreasing weights
   GoArray.clear(_weightMap);
   GoArray.clear(_colorMap);
   for (int i=start; iend; i++)
   {
   int moveXY = playoutMoves.get(i);
   if (_colorMap[moveXY]==0)
   {
   _colorMap[moveXY] = color;
   _weightMap[moveXY] = weight;
   }
   weight -= weightDelta;
   color = opposite(color);
   }

until here it's clear to me.

   while (playoutNode!=null)
   {
   color = opposite(playoutNode.getContent().getMove().getColor());
   boolean playerWins = (blackWins  color==BLACK) || (!blackWins 
  color==WHITE);
   double score = playerWins ? 
 MonteCarloTreeSearchResult.MAX_SCORE : MonteCarloTreeSearchResult.MIN_SCORE;
   for (int i=0; iplayoutNode.getChildCount(); i++)
   {
   TreeNodeMonteCarloTreeSearchResultMoveType nextNode 
 = playoutNode.getChildAt(i);
   MonteCarloTreeSearchResultMoveType result = 
 nextNode.getContent();
   GoMove move = (GoMove) result.getMove();
   int xy = move.getXY();
   if (_colorMap[xy]==color)

 result.increaseVirtualPlayouts(_weightMap[xy]*score,_weightMap[xy]);

if i understand this code correctly, it updates the amaf vaules of all
direct children of the playoutNode according to the weight/color maps.

And that update is done for all nodes on the selected path.

   }
   playoutNode = playoutNode.getParent();

First of all, I miss an weight/colorMap update for xy here. Souldn't the
move of the current playoutNode be considered as an amaf move for all
the nodes below this one?

   }
 }

But, most of all, I miss that the code only updates the amaf values of
all direct children, and not of all nodes n below the playoutNode, where
there is no play at n.move on the path between n and the playoutNode.

Finding all these nodes n would be a costy thing to do, but wouldn't
that be the right thing to do? Implementing a realistic subset of RAVE
is another story, but first of all I want to understand the pure concept
of RAVE.

 playoutNode is the move-node from which the playout was done. The amaf  
 values are stored in its children by the increaseVirtualPlayout()  
 method. Note that it goes up the tree by assigning the parent to  
 playoutNode until it gets to the root.
 
 For more context it would be better to lookup the whole source at 
 http://plug-and-go.dev.java.net
 If you think some more comments in the code could clarify things  
 better I'm open to suggestions.

Thanks for the code, didn't know amaf already is inside the mcts refbot.

 
 Good luck.
 
  Mark

Regards,
   wabu


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/