Re: Measure of Association Question.

2001-12-29 Thread Donald Burrill

On Fri, 28 Dec 2001, Petrus Nel wrote:

> I require some advice regarding the following:  One set of variables is 
> the grades obtained by students for different high school subjects 
> (i.e. the symbols candidates obtained such as A, B, C, D, etc. for each
> subject).  The other set of variables are the scores obtained for a
> college level subject (i.e. no symbols, just their percentages
> obtained).  I want to determine the correlation between their grades 
> for different high school subjects (A, B, C, D, etc.) and their 
> percentage scores for a college level subject.

Why do you want to?  _Nobody_ just wants correlation coefficients:  
there's always something more that is desired.

> The grades obtained for their high school subjects were coded on the 
> questionnaire as follows - 1=A, 2=B, 3=C, 4=D, 5=E, 6=F. 
> I've entered the data for the grades as 1,2,3, etc. to indicate the 
> grade (category) and the percentages (as the other variable) into SPSS. 
> How do I proceed?

What follows assumes that the answer to "Why?" implies that correlation 
coefficients are part of the desiderata, for good reason(s).

First, recompute each HS subject grade:  e.g., 
NEWGRADE = 7 - OLDGRADE
 so that both grades and percentages are coded in the same direction 
(higher values = better performance);  else your correlation coefficients 
will be negative where the relationship is positive, etc.
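
A minimal sketch of that recoding in Python, assuming the six-level
coding from the question (1=A ... 6=F).  The function name and the
n_levels parameter are illustrative, not anything SPSS provides:

```python
# Reverse-code grades so that higher numbers mean better performance.
# Assumes the questionnaire coding from the thread: 1=A, 2=B, ..., 6=F.
def reverse_code(old_grade, n_levels=6):
    """Map 1..n_levels onto n_levels..1 (7 - OLDGRADE when n_levels is 6)."""
    if not 1 <= old_grade <= n_levels:
        raise ValueError(f"grade code {old_grade} is outside 1..{n_levels}")
    return (n_levels + 1) - old_grade


old_grades = [1, 2, 3, 4, 5, 6]                      # A, B, C, D, E, F
new_grades = [reverse_code(g) for g in old_grades]   # A becomes 6, F becomes 1
print(new_grades)
```

The range check is there because a stray code (a 0 or 9 used for
"missing", say) would otherwise be silently turned into a plausible
looking grade.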

Second, produce scatterplots of grade vs. percentage, grade vs. grade, 
and percentage vs. percentage, for all pairs whose correlations are of 
interest:  so that you can properly interpret correlation coefficients 
when you get them (and can be prepared to deal with nonlinear 
relationships, should there be any obvious from the plots). 
 For this purpose one needn't be too fancy:  character plots will do, 
high-resolution plots won't tell you anything more unless there are some 
rather odd nonlinearities among the relationships.
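
For anyone without a plotting routine at hand, a character plot of the
kind meant here can be sketched in a few lines of Python.  The data
below are made up for illustration, and char_scatter is a hypothetical
helper, not a library function:

```python
def char_scatter(xs, ys, width=40, height=12):
    """Return a crude text scatterplot; each digit counts points in that cell."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1          # avoid dividing by zero
    y_span = (y_max - y_min) or 1
    counts = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        counts[height - 1 - row][col] += 1  # row 0 is the top of the plot
    lines = []
    for row in counts:
        cells = (" " if c == 0 else str(c) if c < 10 else "*" for c in row)
        lines.append("|" + "".join(cells))
    lines.append("+" + "-" * width)
    return "\n".join(lines)


# Hypothetical data: recoded HS grades (6 = A) and college percentages.
grades = [6, 5, 5, 4, 4, 3, 3, 2, 2, 1]
pct = [82, 75, 68, 64, 59, 55, 51, 47, 40, 33]
plot = char_scatter(grades, pct)
print(plot)
```

With only six grade values the points fall in vertical stripes, which
is expected; what to look for is whether the stripes climb in a roughly
straight line or bend.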

Third, invoke SPSS's CORREL routine to calculate all pairwise zero-order 
correlation coefficients.
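
As a rough check on what that routine reports, the zero-order
(nothing partialled out) Pearson coefficients can be computed by hand.
This is a sketch in plain Python; the variable names and values are
invented for illustration:

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: two recoded HS grades and one college percentage.
data = {
    "hs_math": [6, 5, 5, 4, 4, 3, 3, 2],
    "hs_english": [5, 6, 4, 5, 3, 4, 2, 3],
    "college_pct": [82, 75, 68, 64, 59, 55, 51, 47],
}

# Every pair of variables, each pair once.
for a, b in combinations(data, 2):
    print(f"{a} vs {b}: r = {pearson_r(data[a], data[b]):.3f}")
```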

Fourth, proceed with whatever your answer to "Why?" implies about 
subsequent analyses and interpretation.

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Measure of Association Question.

2001-12-30 Thread Jay Warner

Good advice on all counts.

I'm curious about where you want to take this 'correlation' if found.  A
college admissions person could use the relationship (in the form of a
regression equation), if any, to predict the score on the 'college level
subject' for those students who were unable to take such a course in HS, or
to predict the performance on the college level subject, once admitted.  But
correlation & regression eq. do not produce the same end result, especially
where the independent variable contains significant measurement variance.

Donald Burrill wrote:

> On Fri, 28 Dec 2001, Petrus Nel wrote:
>
> > I require some advice regarding the following:  One set of variables is
> > the grades obtained by students for different high school subjects
> > (i.e. the symbols candidates obtained such as A, B, C, D, etc. for each
> > subject).  The other set of variables are the scores obtained for a
> > college level subject (i.e. no symbols, just their percentages
> > obtained).  I want to determine the correlation between their grades
> > for different high school subjects (A, B, C, D, etc.) and their
> > percentage scores for a college level subject.
>
> Why do you want to?  _Nobody_ just wants correlation coefficients:
> there's always something more that is desired.
>
> > The grades obtained for their high school subjects were coded on the
> > questionnaire as follows - 1=A, 2=B, 3=C, 4=D, 5=E, 6=F.
> > I've entered the data for the grades as 1,2,3, etc. to indicate the
> > grade (category) and the percentages (as the other variable) into SPSS.
> > How do I proceed?
>
> What follows assumes that the answer to "Why?" implies that correlation
> coefficients are part of the desiderata, for good reason(s).
>
> First, recompute each HS subject grade:  e.g.,
> NEWGRADE = 7 - OLDGRADE
>  so that both grades and percentages are coded in the same direction
> (higher values = better performance);  else your correlation coefficients
> will be negative where the relationship is positive, etc.

Translating letter grades into a number implies that the letters were
ordinal categorical data, and further, that they have equal increments
(distance between them).  Inasmuch as the grades came from a numerical
scale, typically with 60 = D, one could argue that this translation is fully
valid.

HOWEVER:  As I recall, the usual tests and confidence intervals for a
correlation assume the data are Normally distributed.  Grades usually
ain't, and the upper limit of 'A'
forces the issue.  If you wish to compute confidence intervals, this may
come back to haunt you.
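
To make the point concrete, here is a sketch of the common Fisher
z-transform interval for a Pearson correlation.  It leans on exactly
the bivariate-normal assumption in question, so with bounded, lumpy
grade data its coverage may be off.  The function name and the example
numbers (r = 0.60, n = 50) are assumptions for illustration:

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z.

    Assumes the two variables are roughly bivariate normal.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = atanh(r)                       # Fisher transform of r
    se = 1.0 / sqrt(n - 3)             # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return tanh(z - crit * se), tanh(z + crit * se)


low, high = fisher_ci(0.60, 50)
print(f"r = 0.60, n = 50: interval from {low:.2f} to {high:.2f}")
```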

Nonetheless, good luck on this project, and don't forget Prof. Burrill's
next urging, that you look carefully at some scatter plots.  Those may tell
you as much or more than your calculations.

>
>
> Second, produce scatterplots of grade vs. percentage, grade vs. grade,
> and percentage vs. percentage, for all pairs whose correlations are of
> interest:  so that you can properly interpret correlation coefficients
> when you get them (and can be prepared to deal with nonlinear
> relationships, should there be any obvious from the plots).
>  For this purpose one needn't be too fancy:  character plots will do,
> high-resolution plots won't tell you anything more unless there are some
> rather odd nonlinearities among the relationships.
>
> Third, invoke SPSS's CORREL routine to calculate all pairwise zero-order
> correlation coefficients.
>
> Fourth, proceed with whatever your answer to "Why?" implies about
> subsequent analyses and interpretation.
>
> -- DFB.
>  
>  Donald F. Burrill [EMAIL PROTECTED]
>  184 Nashua Road, Bedford, NH 03110  603-471-7128

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?








Re: Measure of Association Question.

2002-01-02 Thread John Uebersax

[EMAIL PROTECTED] (Petrus Nel) wrote in message 
news:<000201c18fe2$f73aeee0$ed9e22c4@oemcomputer>...

> I require some advice regarding the following: One set of variables is 
> the grades obtained by students for different high school subjects (i.e. 
> the symbols candidates obtained such as A, B, C, D, etc. for each 
> subject). The other set of variables are the scores obtained for a 
> college level subject (i.e. no symbols, just their percentages 
> ... 
> The grades obtained for their high school subjects were coded on the 
> questionnaire as follows - 1=A, 2=B, 3=C, 4=D, 5=E, 6=F.  
> ...
> How do I proceed?

Simpler answer:

First, change the coding to 1=F, 2=E, 3=D, 4=C, 5=B, 6=A.  In the US at
least there is no 'E'; if so, the correct coding would be 1=F, 2=D, 3=C,
4=B, 5=A.

If the latter coding is used, calculate the Spearman rank correlation
between the grade in a given high school course and the college score.

If the former coding is used, you can use either the Pearson
correlation or the Spearman rank correlation; the Pearson correlation
would probably be better.
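
For reference, the Spearman coefficient is just the Pearson coefficient
computed on ranks, with tied values (common with letter grades) given
the average of the ranks they span.  A sketch in plain Python, with
invented data and hypothetical function names:

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r applied to average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: grades coded 1=F .. 5=A, and college percentages.
grades = [5, 5, 4, 4, 3, 3, 2, 1]
pct = [82, 71, 74, 60, 63, 52, 55, 40]
print(f"Spearman rho = {spearman_rho(grades, pct):.3f}")
print(f"Pearson r    = {pearson_r(grades, pct):.3f}")
```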

More complex answer:

The approach above ignores the fact that within each letter grade
there is variation--e.g., not all students who get a 'B' are at the
same level.  Further, there is censoring at the upper and lower ends
of the scale--e.g., no matter how well a person does, the highest
grade they can get is an 'A'.

The polyserial correlation can account for this.  The polyserial
correlation estimates what the correlation of grade and score would be
if grades were measured on a continuous scale.  An assumption is that
there is a bivariate normal distribution between (1) the continuous
latent variable of which grade is a manifest representation and (2)
the percentage score.
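
To give a feel for the idea, here is a sketch of a simple "ad hoc"
polyserial estimate: the Pearson r between coded grade and score,
rescaled by the standard deviation of the coded grade divided by the
sum of normal densities at the estimated category thresholds.  This is
a quick approximation under the bivariate-normal assumption, not a
maximum-likelihood fit, and not necessarily what any particular package
computes.  Function names and data are invented for illustration:

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist, pstdev


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def polyserial_adhoc(ordinal, continuous):
    """Ad hoc polyserial estimate: r * sd(ordinal) / sum of phi(threshold).

    Simplification (an assumption of this sketch): the observed categories
    are recoded to consecutive integers, so unused grades are ignored.
    """
    levels = sorted(set(ordinal))
    if len(levels) < 2:
        raise ValueError("need at least two observed categories")
    code = {value: i for i, value in enumerate(levels)}
    y = [code[value] for value in ordinal]
    counts = Counter(y)
    std_normal = NormalDist()
    cumulative = 0
    density_sum = 0.0
    for level in range(len(levels) - 1):
        cumulative += counts[level]
        # Threshold on the latent scale between this category and the next.
        tau = std_normal.inv_cdf(cumulative / len(y))
        density_sum += std_normal.pdf(tau)
    return pearson_r(y, continuous) * pstdev(y) / density_sum


# Hypothetical data: grades coded 1=F .. 6=A, and college percentages.
grades = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6]
pct = [48, 35, 60, 44, 58, 51, 70, 49, 63, 57, 75, 59, 66, 72, 90]
print(f"Pearson r           = {pearson_r(grades, pct):.3f}")
print(f"ad hoc polyserial   = {polyserial_adhoc(grades, pct):.3f}")
```

The rescaling factor is never below 1, so the estimate is at least as
large in magnitude as the plain Pearson r, and in small samples it can
land outside the range -1 to 1.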

The polyserial correlation is related to the polychoric correlation. 
For information about the polychoric correlation, see:

http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm

Drasgow F. Polychoric and polyserial correlations. In Kotz L, Johnson
NL (Eds.), Encyclopedia of statistical sciences. Vol. 7 (pp. 69-74).
New York: Wiley, 1988.

I don't know if SPSS will calculate the polyserial correlation--the
last I heard it did not.  If not, the polyserial correlation can be
calculated with the program PRELIS, which is distributed with LISREL.
Many universities have copies of LISREL/PRELIS.

If you are interested in comparing to see which high school classes
best predict college scores, then, as a practical matter, I would
expect you would draw the same conclusions regardless of whether you
used the Pearson, the Spearman, or the polyserial correlation
coefficients.

Good luck!


John Uebersax, PhD (805) 384-7688 
Thousand Oaks, California  (805) 383-1726 (fax)
email: [EMAIL PROTECTED]

Agreement Stats:   http://ourworld.compuserve.com/homepages/jsuebersax/agree.htm
Latent Structure:  http://ourworld.compuserve.com/homepages/jsuebersax
Existential Psych: http://members.aol.com/spiritualpsych
Diet & Fitness:    http://members.aol.com/WeightControl101


