Paige Miller asked <<< Statistical hypothesis tests are usually stated more formally, such as the parameter of interest is equal to X, against the parameter of interest is not equal (or less than, or greater than) X. By the way, how can you compare one hypothesis where you state "if their ELO ratings were all equal" to another hypothesis where ELO ratings don't have to be equal? That comparison, as you state it, doesn't make sense to me.
Does the ELO system allow you to say that a superiority of 100 points over your opponent leads to a victory Z% of the time? What percent of the time would a superiority of 50 points over your opponent lead to a victory? Before you can test an hypothesis, you will need to be more specific about what the hypothesis is, and what the ELO system predicts. By the way, I don't think the ELO system was set up to be predictive of wins, I believe it was set up to rank players' relative strengths based on past experience, so that they could be seeded into tournaments. >>>

The ELO system is intended to rate players' strengths, but it also allows one to compute a 'predicted wins' figure when two players play a match, and there are tables of this prediction.

What I am unclear on is what question the original poster is interested in. If it is solely the accuracy of the ELO system, then one could fit a model something like proportion(wins) = f(difference in ratings), using some appropriate model (I forget whether the proportion is related to the rating difference linearly or in some other way).

But it seems to me that the poster is interested in something more complex; only I am not sure what. As a very amateur chess player, I can certainly think of reasons beyond ELO why one player might have a particularly good record against another: differences in style, psychology, and so on. What would be much harder is to come up with an operationalized theory of how these factors affect a player's chances. Although there is certainly a consensus regarding some players' styles (particularly those of the strongest grandmasters), AFAIK there is no quantified rating of style, and this would seem to be necessary for what he has in mind. Also, such things are not available for the more average player.

OTOH, perhaps what could be done is an MDS (multidimensional scaling) of the head-to-head records, with an attempt to describe and analyze why some are discrepant with the ELO ratings.

HTH

Peter

Peter L.
Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)
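
For reference, the 'predicted wins' figure mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the commonly quoted logistic form of the expected score with a 400-point scale, whereas published tables may be based on a normal curve and so differ slightly. The names expected_score and predicted_total are made up for this sketch. Note also that the quantity is an expected score (a draw counts as half a point), not a pure win probability.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B.

    Assumption: the logistic approximation with a 400-point scale.
    A draw counts as 0.5, so this is an expected score per game,
    not the probability of an outright win.
    """
    diff = rating_a - rating_b
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def predicted_total(rating_a: float, rating_b: float, n_games: int) -> float:
    """Predicted total points for A over n_games against B (ratings held fixed)."""
    return n_games * expected_score(rating_a, rating_b)


# Rating advantages of 100 and 50 points, as asked in the quoted message.
print(round(expected_score(1600, 1500), 3))
print(round(expected_score(1550, 1500), 3))
```

Under these assumptions a 100-point advantage corresponds to an expected score of about 0.64, and a 50-point advantage to about 0.57. Comparing predicted_total with the points actually scored in a head-to-head record is one simple way to spot pairings that look discrepant with the ratings.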
