John Smith wrote:
> Paige Miller <[EMAIL PROTECTED]> wrote in message
>
>>I dislike binning numbers that are essentially on a continuous
>>scale. I think methods designed to treat the ELO ratings as
>>continuous will be more powerful statistically than methods based on
>>binning. But for the sake of my understanding of your proposal,
>>let's go with bins.
>
> I defer to your knowledge here, it was just an idea as I don't really
> know how to proceed otherwise!
>
>>Oooh, average of ratios. Another not-so-good idea. Better to compute
>>a ratio of the total number of wins divided by the total number of
>>games of everyone in the bin.
>
> Would this not just give me some sort of discrete approximation to ELO
> pdf?
>
>>Now you're coming close to stating an hypothesis, without actually
>>stating one. Of course, figuring out what the distribution of this
>>"binned-ratio-difference of series" statistic could be a difficult
>>problem.
>
> I'll try the example route you suggest
>
> Lets take three players; Tom, Dick and Harry who have elo ratings of
> 1600,1500 and 1400 respectively.
> Now according to http://tournaments.tantrix.co.uk/ratings/simple.shtml,
> the ELO ratings can be interpreted probabilistically as follows:
> Tom would be expected to beat Dick 57% of the time and Harry 64% of
> the time. Dick would also expect to beat Harry 57% of the time.
>
> Now lets imagine they had played each other 100 times, so that the
> following table could be drawn up:
>
> Tom v Dick   - Tom has 57 wins, 43 losses
> Tom v Harry  - Tom has 50 wins, 50 losses
> Dick v Harry - Dick has 57 wins, 43 losses
>
> It can be seen that, mirable dictu, Toms record against Dick and Dicks
> record against Harry are in line (exactly!) with the expected win/loss
> record.
>
> The 'anomaly' seems to be Toms record against Harry - we would expect
> 64 wins and 36 losses, but we have a 50:50 record. Is this just
> chance, or is there a 'head to head effect'?
>
> If there is an effect, a follow-on question might be how can one
> modify the probabilistic interpretation of ELO above to account for
> this new effect.

I believe this illustrates a problem that I am having with your
question and subsequent analysis. I find it inconceivable that Harry
can beat Tom 50% of the time and yet have an ELO rating 200 points
lower than Tom. Beating someone (Tom) 200 points above you 50% of the
time should pull your ELO rating up more than losing 57% of the time
to Dick, 100 points above you, will pull your rating down. Your
example is inconsistent ... unless ... there are other players (other
than Tom and Dick) who played games against Harry, and those games
leave him with an ELO rating 200 points below Tom. If there are other
players, then of course the ELO ratings could be what you claim.

But I wasn't asking for an example with just data. I was asking for an
example showing how YOU would analyze the data in this situation. You
have proposed something which I called a "binned-ratio-difference of
series" approach (the so-called BRDOS method), and I cannot follow
that. I cannot see where you are going nor why this BRDOS method makes
any sense at all. You didn't use your example to explain that.

Given that you have 100 games head-to-head, I would think that there
is a good chance that this is indeed more useful information for
predicting than the ELO rating alone. But ... I'm sure that with real
data you don't have 100 head-to-head games for all pairs of players,
and in some cases you may have only 1 or 2 head-to-head games. So the
question would be ... do you have enough head-to-head games for this
to make a statistically significant improvement in predicting wins, or
are the data so sparse that head-to-head information is relatively
unreliable? And that depends on the model you use and the assumptions
you make when fitting the model and testing parameters.

In another post, Kjetil Halvorsen proposes a model, similar to a
logistic regression, that may indeed work. That seems like a good
place to start.

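To make the sample-size point concrete, here is a minimal sketch
(Python, standard library only) of an exact binomial test of a
head-to-head record against the win probability implied by the
ratings. It is only an illustration, not the model proposed elsewhere
in this thread. Assumptions: the games are independent, the 64% figure
is simply taken from the quoted example, and the function names
binom_pmf and two_sided_binom_p are made up for this sketch.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent games with win chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binom_p(wins, games, p_expected):
    """Exact two-sided p-value for a head-to-head record.

    Sums the probabilities of every outcome that is no more likely
    than the observed one under the rating-implied win probability.
    """
    observed = binom_pmf(wins, games, p_expected)
    cutoff = observed * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(
        prob
        for prob in (binom_pmf(k, games, p_expected) for k in range(games + 1))
        if prob <= cutoff
    )


# Tom v Harry from the quoted example: 50 wins in 100 games, 64% expected.
p_many_games = two_sided_binom_p(50, 100, 0.64)

# The same 50:50 split when only 2 games have been played.
p_few_games = two_sided_binom_p(1, 2, 0.64)

print(f"100 games, 50 wins: p = {p_many_games:.4f}")
print(f"  2 games,  1 win : p = {p_few_games:.4f}")
```

With 100 games the 50:50 record is hard to reconcile with a 64% win
probability, while the same split over 2 games says essentially
nothing, which is why sparse head-to-head data need a model that pools
information across pairs of players.
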
-- 
Paige Miller
Eastman Kodak Company
paige dot miller at kodak dot com
http://www.kodak.com

"It's nothing until I call it!" -- Bill Klem, NL Umpire
"When you get the choice to sit it out or dance,
I hope you dance" -- Lee Ann Womack
