On 28 Feb 2004 at 3:30, John Smith wrote:

Some more ideas. Before speculating on or investigating the nature of the "head to head" or "style" effects, we should check whether there is any statistical evidence that these effects exist. Without style variables, which would be difficult to get, the hypothesized "head to head" effect could be modeled as a random effect with expectation zero and a common variance in the population.
From earlier posts I gather that the probability of a win is given by a logistic regression on the ELO difference, so that, with x the ELO difference and b a parameter (possibly known by definition), we have

    Prob(win) = exp(b x) / (1 + exp(b x)).

Now, if player pair i has random effect e_i, this model changes into

    Prob(win) = exp(b x + e_i) / (1 + exp(b x + e_i)).

The variance of the e_i could be estimated, and one could test whether the random effects model gives a better fit than the simple fixed effects model. It will of course give a better fit numerically, since it extends the simple model and has more parameters. The real question is whether the fit is so much better that the improvement cannot simply be explained by the expected misfit of the simple model under the binomial probability model with independence.

At least an analysis along these lines is doable, and not that difficult, with modern software.

Any opinions?

Kjetil Halvorsen

> Paige Miller <[EMAIL PROTECTED]> wrote in message
> > I dislike binning numbers that are essentially on a continuous
> > scale. I think methods designed to treat the ELO ratings as
> > continuous will be more powerful statistically than methods based on
> > binning. But for the sake of my understanding of your proposal,
> > let's go with bins.
>
> I defer to your knowledge here, it was just an idea as I don't really
> know how to proceed otherwise!
>
> > Oooh, average of ratios. Another not-so-good idea. Better to compute
> > a ratio of the total number of wins divided by the total number of
> > games of everyone in the bin.
>
> Would this not just give me some sort of discrete approximation to ELO
> pdf?
>
> > Now you're coming close to stating an hypothesis, without actually
> > stating one. Of course, figuring out the distribution of this
> > "binned-ratio-difference of series" statistic could be a difficult
> > problem.
>
> Thanks ;-)
>
> I'll try the example route you suggest
>
> Let's take three players: Tom, Dick and Harry, who have ELO ratings of
> 1600, 1500 and 1400 respectively. Now according to
> http://tournaments.tantrix.co.uk/ratings/simple.shtml , the ELO
> ratings can be interpreted probabilistically as follows: Tom would be
> expected to beat Dick 57% of the time and Harry 64% of the time. Dick
> would also expect to beat Harry 57% of the time.
>
> Now let's imagine they had played each other 100 times, so that the
> following table could be drawn up:
>
> Tom v Dick   - Tom has 57 wins, 43 losses
> Tom v Harry  - Tom has 50 wins, 50 losses
> Dick v Harry - Dick has 57 wins, 43 losses
>
> It can be seen that, mirabile dictu, Tom's record against Dick and Dick's
> record against Harry are in line (exactly!) with the expected win/loss
> record.
>
> The 'anomaly' seems to be Tom's record against Harry - we would expect
> 64 wins and 36 losses, but we have a 50:50 record. Is this just
> chance, or is there a 'head to head effect'?
>
> If there is an effect, a follow-on question might be how can one
> modify the probabilistic interpretation of ELO above to account for
> this new effect.
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================
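
As a rough illustration of the "expected misfit of the simple model" question, applied to the Tom, Dick and Harry table, here is a minimal Python sketch. Several things in it are assumptions and not part of the thread: b is calibrated so that a 100-point gap gives 57% (matching the quoted Tantrix figures), b is then treated as known, and the names win_prob, lower_tail, pearson and monte_carlo_p are made up for this example. It does not fit the random effects model itself. It only asks how unusual the observed misfit would be if every e_i were zero and the games were independent.

```python
import math
import random


def win_prob(x, b, e=0.0):
    """Prob(win) = exp(b x + e) / (1 + exp(b x + e)); e is the pair's random effect."""
    return 1.0 / (1.0 + math.exp(-(b * x + e)))


# Assumption: choose b so that a 100-point rating gap gives a 57% win probability.
B = math.log(0.57 / 0.43) / 100.0

# (rating gap, wins by the higher-rated player, games played) for the three pairs
RESULTS = [(100, 57, 100), (200, 50, 100), (100, 57, 100)]


def lower_tail(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def pearson(results, b):
    """Pearson misfit of the simple model (all e_i = 0), summed over pairs."""
    total = 0.0
    for x, wins, n in results:
        p = win_prob(x, b)
        total += (wins - n * p) ** 2 / (n * p * (1 - p))
    return total


def monte_carlo_p(results, b, n_sims=2000, seed=0):
    """Fraction of data sets simulated under the simple model whose
    Pearson misfit is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = pearson(results, b)
    hits = 0
    for _ in range(n_sims):
        fake = []
        for x, _wins, n in results:
            p = win_prob(x, b)
            # Draw n independent games, each won with probability p.
            sim_wins = sum(rng.random() < p for _ in range(n))
            fake.append((x, sim_wins, n))
        if pearson(fake, b) >= observed:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    p_tom_harry = win_prob(200, B)
    print("model P(Tom beats Harry):", round(p_tom_harry, 3))
    print("P(at most 50 wins in 100):", lower_tail(50, 100, p_tom_harry))
    print("Pearson misfit, 3 pairs:", round(pearson(RESULTS, B), 2))
    print("Monte Carlo p-value:", monte_carlo_p(RESULTS, B))
```

A small Monte Carlo p-value would suggest that the simple model misfits by more than binomial variation alone explains, which is the situation where estimating the variance of the e_i becomes worthwhile. With only three pairs, and with the Tom v Harry pair singled out after looking at the data, any such number should be read with caution.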
