Paige Miller asked
<<<
Statistical hypothesis tests are usually stated more formally, such 
as the parameter of interest is equal to X, against the parameter of 
interest is not equal (or less than, or greater than) X. By the way, 
how can you compare one hypothesis where you state "if their ELO 
ratings were all equal" to another hypothesis where ELO ratings 
don't have to be equal? That comparison, as you state it, doesn't 
make sense to me.

Does the ELO system allow you to say that a superiority of 100 
points over your opponent leads to a victory Z% of the time? What 
percent of the time would a superiority of 50 points over your 
opponent lead to a victory?

Before you can test an hypothesis, you will need to be more specific 
about what the hypothesis is, and what the ELO system predicts.

By the way, I don't think the ELO system was set up to be predictive 
of wins, I believe it was set up to rank players' relative strengths 
based on past experience, so that they could be seeded into
tournaments.
>>>

The ELO system is intended to rate players' strengths, but it also
allows one to compute 'predicted wins' when two players play a match,
and there are tables of these predictions.

What I am unclear on is what question the original poster is interested
in.  If it is solely the accuracy of the ELO system, then one could have
a model something like

proportion(wins) = f(difference in ratings)

using some appropriate model (I forget whether the proportion is related
to the rating difference linearly or in some other way).
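
For reference, the form of f most often quoted for ELO is a logistic
curve in the rating difference, not a straight line. The sketch below
assumes that logistic form with the usual 400-point scale; Elo's original
tables were built on a normal distribution, so published values can
differ slightly. Note that the result is an expected score (a draw counts
as half a point), not a pure win probability. The function name is only
illustrative.

```python
def elo_expected_score(rating_diff, scale=400.0):
    """Expected score per game for a player rated `rating_diff` points
    above the opponent, under the logistic ELO curve (assumed form).
    Score convention: win = 1, draw = 0.5, loss = 0."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


# Expected score at a few rating advantages
for diff in (0, 50, 100, 200):
    print(f"{diff:4d} points -> expected score {elo_expected_score(diff):.3f}")
```

Under these assumptions, a 100-point advantage corresponds to an expected
score of about 0.64 and a 50-point advantage to about 0.57.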

But it seems to me that the poster is interested in something more
complex, only I am not sure what.  As a very amateur chess player, I can
certainly think of reasons beyond ELO why one player might have a
particularly good record against another: things like differences in
style, psychology, and so on.  What would be much harder is to come up
with some operationalized theory of how these factors affect a
player's chances.  Although there is certainly a consensus regarding
some players' styles (particularly those of the strongest grandmasters),
AFAIK there is no quantified rating of style, and this would seem to be
necessary for what he has in mind.  Also, such assessments are not
available for more average players.

OTOH, perhaps what could be done is an MDS (multidimensional scaling) of
the head-to-head records, with an attempt to describe and analyze why
some are discrepant with the ELO ratings.
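
One possible starting point for finding the discrepant pairs is to
compare each head-to-head score with what the rating difference predicts.
The sketch below is only an illustration: it assumes the logistic ELO
curve with a 400-point scale, the function names are hypothetical, and
the numbers in the usage lines are made up, not real results. The
standard error treats games as independent win/loss trials; because draws
reduce the variance, the z value it gives is on the conservative side.

```python
import math


def elo_expected_score(rating_diff, scale=400.0):
    """Expected score per game under the logistic ELO curve (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


def head_to_head_discrepancy(rating_a, rating_b, points_a, games):
    """Return (observed minus expected score per game, rough z value)
    for player A against player B over `games` games."""
    expected = elo_expected_score(rating_a - rating_b)
    observed = points_a / games
    # Binomial-style standard error; an upper bound when draws occur.
    std_err = math.sqrt(expected * (1.0 - expected) / games)
    return observed - expected, (observed - expected) / std_err


# Made-up example: equal ratings, A scored 14 points in 20 games.
gap, z = head_to_head_discrepancy(2600, 2600, 14.0, 20)
print(f"observed - expected = {gap:+.3f}, rough z = {z:.2f}")
```

A table of such gaps for every pair of players could then serve as the
input to a scaling or clustering method.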


HTH

Peter

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)


=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
