Re: [Computer-go] MC in games of imperfect information

Isaac Deutsch Thu, 20 May 2010 04:57:28 -0700

> Does that make it clear enough?

Yes, thanks for the great explanation! I still have a few questions, of course,
which I hope I can ask here.



> Yes, precisely. Perhaps this would be easier to explain with an
> example. Imagine we are playing Texas Hold'em, and we want to figure
> out the probability distribution over the possible hands that an
> opponent has. At the beginning of the round, he may have any 2 cards,
> and all combinations are equally likely (except that we have seen our
> own two cards, so we know he doesn't have those). Imagine now that it
> is his turn to play, he happens to raise. Now my estimate of what he
> is likely to have should change. For instance, how likely is he to
> have two aces? Before he raised, if we don't have an ace ourselves, we
> would believe the probability of him having two aces is 4/50*3/49, or
> about 0.005. After he raises, we have more information. Following
> Bayes' theorem:
> 
> P(two_aces | raise) = P(two_aces) * P(raise | two_aces) / P(raise)
> 
> We just said that P(two_aces) = 0.005. Imagine that our quick
> probabilistic model says that someone with two aces raises 80% of the
> time. And the probability of raising in general is 10%. Plug
> everything in the formula, and you'll get 0.04, which is the new
> probability for two aces.
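
As a quick check of the arithmetic in the quoted example, here is a minimal
Python sketch. The 80% and 10% figures are the assumed model values from the
example, not measured data, and the variable names are only illustrative.

```python
from fractions import Fraction

# Prior: the opponent's two cards are both aces, drawn from the 50 cards
# we have not seen (we hold no ace ourselves).
prior_two_aces = Fraction(4, 50) * Fraction(3, 49)

# Assumed model values, copied from the example above.
p_raise_given_two_aces = Fraction(8, 10)
p_raise = Fraction(1, 10)

# Bayes' theorem: P(two_aces | raise) = P(two_aces) * P(raise | two_aces) / P(raise)
posterior_two_aces = prior_two_aces * p_raise_given_two_aces / p_raise

print(float(prior_two_aces))
print(float(posterior_two_aces))
```

With the unrounded prior the posterior comes out slightly below 0.04; the 0.04
in the example follows from rounding the prior to 0.005 first.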

All clear up to here. :) I guess the "P(raise | two_aces)" bit could be 
extracted from professional games, for example?
Is it possible to update these probabilities incrementally? In the example, you
start with a uniform probability. Then you see that the opponent raises, and you
update the probability for the hand "two aces" to 0.04. Let's say he then takes
another action that makes two aces more (or less) likely: can we use
P(two_aces) = 0.04 as the starting point? Does it matter whether the two
actions are correlated?
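
To make the incremental question concrete, here is a small Python sketch of a
repeated Bayes update. The three hand buckets, all the probabilities, and the
name bayes_update are hypothetical and chosen only for illustration. It shows
that using the posterior after the first action as the prior for the second
gives the same result as a single update with the product of the likelihoods.
That equivalence assumes each likelihood is P(action | hand); if the second
action also depends on the first, its likelihood would have to be
P(action2 | hand, action1) instead.

```python
def bayes_update(belief, likelihood):
    """Multiply each hand's probability by P(action | hand), then rescale to sum to 1."""
    unnormalized = {hand: p * likelihood[hand] for hand, p in belief.items()}
    total = sum(unnormalized.values())  # P(action) under the current belief
    return {hand: p / total for hand, p in unnormalized.items()}

# Hypothetical coarse hand model and action models (illustrative numbers only).
prior = {"two_aces": 0.005, "other_pair": 0.055, "no_pair": 0.94}
p_raise = {"two_aces": 0.8, "other_pair": 0.3, "no_pair": 0.08}
p_reraise = {"two_aces": 0.9, "other_pair": 0.2, "no_pair": 0.05}

# Incremental: the posterior after the raise is the prior for the re-raise.
after_raise = bayes_update(prior, p_raise)
after_both = bayes_update(after_raise, p_reraise)

# Batch: one update using the product of both likelihoods.
joint_likelihood = {hand: p_raise[hand] * p_reraise[hand] for hand in prior}
batch = bayes_update(prior, joint_likelihood)

for hand in prior:
    print(hand, after_both[hand], batch[hand])
```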


> So you don't need a separate model for that. You simply multiply the
> probability of each hand by the probability of the action given the
> hand, and when you have done it for each hand, you rescale everything
> so the sum continues to be 1.
> Since in MC we are trying to compute the expected value of our utility
> function after each one of our possible actions, you should compute
> the average utility, weighted by the probability of each scenario (see
> the formal definition of "expected value" in Wikipedia). You could do
> this by biasing the dealing to give that opponent two aces 4% of the
> time, instead of the natural 0.5%. Since doing this will get
> complicated quickly (as other people take actions that further change
> the probabilities), it is easier to deal the cards normally, but then
> scale each scenario with its actual probability of being what happened
> (this is the method the formal definition of "expected value"
> suggests).
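
The "deal normally, then scale each scenario" idea can be sketched in Python
as a weighted average. Everything here is a toy assumption: the 50-card deck
with four aces, the likelihood and utility functions, and the name
weighted_expected_utility are hypothetical. Dividing by the sum of the weights
does the rescaling, so P(raise) never has to be computed separately.

```python
import random

def weighted_expected_utility(sample_scenario, likelihood, utility, n_samples, rng):
    """Average utility over uniformly dealt scenarios, weighted by P(actions | scenario)."""
    total_weight = 0.0
    weighted_sum = 0.0
    for _ in range(n_samples):
        scenario = sample_scenario(rng)   # deal the cards normally
        weight = likelihood(scenario)     # how well the scenario explains what we saw
        total_weight += weight
        weighted_sum += weight * utility(scenario)
    if total_weight == 0.0:
        raise ValueError("all sampled scenarios had zero weight")
    return weighted_sum / total_weight

# Toy setup: 50 unseen cards, four of them aces ("A"), the rest irrelevant ("x").
DECK = ["A"] * 4 + ["x"] * 46

def deal_opponent(rng):
    return tuple(rng.sample(DECK, 2))

def p_raise_given(hand):
    # Hypothetical action model: two aces raise 80% of the time, anything else 10%.
    return 0.8 if hand == ("A", "A") else 0.1

def toy_utility(hand):
    # Hypothetical payoff for one of our actions: we lose against two aces, win otherwise.
    return -1.0 if hand == ("A", "A") else 1.0

rng = random.Random(0)
estimate = weighted_expected_utility(deal_opponent, p_raise_given, toy_utility, 20000, rng)
print(estimate)
```

In a real bot the utility of a scenario would itself come from a playout, and
one such estimate would be computed for each candidate action.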

I see, this scaling method does seem easier. But in Tichu, everyone has 14
cards at the beginning, so it is infeasible to do this for all possible hands.
How should this problem be tackled? It seems beneficial to simulate the most
likely hands first.

> 
> Notice that P(raise) = Sum_over_all_possible_hands ( P(hand) * P(raise | hand) ).

Again, we cannot calculate all possible hands most of the time. Should we say 
that all hands except the most likely ones should be neglected because their 
weight would be minimal anyway?


> Before trying to bias / weight the MC playouts, it would be worth trying a 
> pure-MC approach.  As you've described it below, this would be "Give all 
> players random cards, then play the game out randomly".  If you have access 
> to the rules-based bot, that is ideal, as you have a fixed-strength opponent 
> you can test against.  Although pure-MC in Go has been left behind by MCTS, 
> it should be a good place to start to validate the approach.  The fact that 
> the players will respond with bad moves most of the time doesn't invalidate 
> the approach (at least in Go).  I wouldn't go down the route of playing out 
> with deterministic rules as the choice of these could have a major influence 
> on the validity of the playout results.

OK. I have access to the rule-based bot, but alas it implements only an
incomplete subset of the rules of Tichu.

Thanks,
Isaac Deutsch

_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
