Re: I have a problem evaluating a Grading System.

Stan Maxwell Wed, 01 May 2002 11:45:38 -0700

Dennis,
  The system I described is complex.  The people here think that since I
know a little bit about statistics I can figure out how to analyse it.  That
may not be true, but I am persistent.  Persistence and a chocolate bar make a
good combination to motivate me to work.  The management here wants
everything in simple terms.  I can understand that.  Thank you for reminding
me of the difference between fairness and bias.
  I'm following your suggestion of breaking down this process into steps.
Suppose, for example, we have a series of four application reviews.  Each
review has as input (m as a sample from one group and n as a sample from
another group) new applications, first amendments of applications, second
amendments of applications and third amendments of applications.  For each
of those there are three processes.  The first one is performed by the
reviewer:  an application is either scored or not scored.  Say the fractions
are p scored and q = 1-p not scored.  So if m applications are received,
then m*p are scored and m*q are not scored.  The second process applies to
the m*p scored applications.  The reviewer assigns each application a score
x.  So now I have a set of scores with some distribution f(x) on the support
{x|f(x)>0}.  The third process is performed by the applicant.  Some fraction
r of the m*q (not scored) applications are amended, and r*m*q applications
are submitted in the next review cycle.  Thus the complementary fraction
s = 1-r, or s*m*q applications, are not submitted again.  In my data I have
applications with no more than three amendments.  For notation I subscript
(m,pi,f(x)i,ri), i=1,2,3,4.  Remember I have two groups to compare: the
first with m applications, the second with n applications.
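
The bookkeeping for one review cycle can be sketched as below. This is only
an illustration of the arithmetic described above; the function name
review_flow and the numbers m=100, p=0.6, r=0.5 are made up for the example
and are not taken from my data.

```python
def review_flow(m, p, r):
    """Expected counts for one review cycle.

    m: number of applications received
    p: fraction scored by the reviewer (q = 1 - p is not scored)
    r: fraction of not-scored applications that are amended and
       resubmitted (s = 1 - r is not submitted again)
    """
    q = 1.0 - p
    s = 1.0 - r
    not_scored = m * q
    return {
        "scored": m * p,                 # first process: scored
        "not_scored": not_scored,        # first process: not scored
        "resubmitted": r * not_scored,   # third process: amended, next cycle
        "dropped": s * not_scored,       # third process: not submitted again
    }


# Hypothetical numbers, for illustration only.
flow = review_flow(m=100, p=0.6, r=0.5)
print(flow)
```

Scored and not-scored counts add up to m, and resubmitted and dropped counts
add up to the not-scored count, which is a quick consistency check on real
data.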
Denote the second group by (n,ti,g(y)i,vi).  Eight null hypotheses compare
fractions, Ho: pi=ti and Ho: ri=vi, and four null hypotheses compare
distributions, Ho: fi=gi.  I tried a t-test on the fractions but got into
trouble because the power changed as the proportions changed.  I am
considering the transformation ai=2*arcsin(sqrt(pi)) and
bi=2*arcsin(sqrt(ti)), so that Ho: ai=bi is tested using the absolute
difference |ai-bi|, which should make a given effect size equally detectable
regardless of the location of the proportions p and t.  For the comparison
of the distributions I have tried the Mann-Whitney-Wilcoxon rank-sum test
(the M-W-W), assuming the distributions f and g are the same except for
location, so Ho: Mx=My.
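
A minimal sketch of the arcsine transformation, to show why it is attractive
here. It assumes the usual large-sample result that 2*arcsin(sqrt(p_hat)) has
variance of roughly 1/m for a proportion estimated from m independent
applications, so the difference can be referred to a normal distribution. The
function names and the example proportions are hypothetical.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform a = 2*arcsin(sqrt(p)) of a proportion."""
    return 2.0 * math.asin(math.sqrt(p))


def effect_size_h(p, t):
    """Absolute difference |a - b| on the transformed scale."""
    return abs(arcsine_transform(p) - arcsine_transform(t))


def arcsine_z_test(p, m, t, n):
    """Approximate two-sided z test of Ho: a = b.

    Assumes Var(a_hat) is about 1/m and Var(b_hat) is about 1/n,
    which is a large-sample approximation.
    """
    z = (arcsine_transform(p) - arcsine_transform(t)) / math.sqrt(1.0 / m + 1.0 / n)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # 2 * (1 - Phi(|z|))
    return z, p_value


# The same raw difference of 0.10 is a larger effect near 0 than near 0.5.
h_tail = effect_size_h(0.05, 0.15)
h_middle = effect_size_h(0.45, 0.55)
print(h_tail, h_middle)
```

Because the approximate variance no longer depends on p, a fixed |a-b| needs
about the same sample size to detect wherever the proportions sit, which is
the property the raw t-test on fractions lacked.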
Management here doesn't understand the M-W-W and wants to use average
scores.  But the underlying distribution of the scores is unknown, so I
don't know what to use as a critical value for differences in the mean
scores.  Where can I get suggestions on what tests to use?  Are there other
listservs and discussion groups that may be appropriate for the type of
questions I have?
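
One distribution-free way to get a critical value for a difference in average
scores is a permutation test, sketched below. This is only one possible
approach, not something anyone on the list has recommended. It assumes the
scores are exchangeable between the two groups under the null hypothesis,
which may not hold when applications are not randomly assigned to reviewers.
The function name, the number of permutations and the seed are arbitrary
choices for the illustration.

```python
import random


def permutation_test_mean_diff(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in mean scores.

    Returns the observed |mean(x) - mean(y)| and a Monte Carlo p-value.
    Null hypothesis: group labels are exchangeable (same distribution).
    """
    rng = random.Random(seed)
    k = len(x)
    pooled = list(x) + list(y)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            count += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical integer scores for two applicant groups.
group_x = [1, 2, 3, 4, 5, 6, 7, 8]
group_y = [11, 12, 13, 14, 15, 16, 17, 18]
observed_diff, p_value = permutation_test_mean_diff(group_x, group_y)
print(observed_diff, p_value)
```

The appeal for management is that the test statistic is still the difference
in average scores, while the reference distribution comes from the data
rather than from an assumed normal model.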
Stan





>From: Dennis Roberts <[EMAIL PROTECTED]>
>To: "Stan Maxwell" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
>Subject: Re: I have a problem evaluating a Grading System.
>Date: Wed, 01 May 2002 12:28:57 -0400
>
>I think that the system you originally described is a very complex one ...
>that involves many facets to the decision making process. I will let others
>opine about if it appears to be overly complex or not. But, what I would
>say is that the more facets and/or steps in the process that applicants and
>reviewers/decision makers go through ... the HARDER it is to bring
>statistical evidence to bear that something biased (UNfair?) has been done.
>
>In addition, we have to try to separate the concepts of "bias" from
>"fairness" ... since, they are not necessarily equivalent. Bias IS a
>statistical phenomenon ... say, if we find for two subgroups who take a
>test ... subgroup A and subgroup B ... where both A and B have the SAME
>average ability ... but, on some particular test ITEM ... B has a much
>lower p value for answering the item correctly than A ... we say the item
>is biased ... but, is it unfair? Maybe yes ... maybe no
>
>Fairness seems to involve a value judgement that is not necessarily
>present in the concept of bias.
>
>All I can suggest at this point, and I have not heard any other person
>respond to your inquiry, is to list out IN order, each step of the overall
>process and, carefully examine from a LOGICAL analysis point of view
>(first), what could go awry at this step ... that would make the final
>decision down the line ... something that is not desirable ... and also ask
>at each of these steps ... would reasoned judgement say that this process
>at this step ... is a fair one or not?  That is, is the process we use at
>(say) step 1 ... clearly flawed if we follow it to its logical end point?
>
>
>
>At 02:29 PM 5/1/02 +0000, Stan Maxwell wrote:
>>Dennis
>>Fairness has to be tested using a purely statistical metric.  I have two
>>numeric measures from the outcome of the employment interview
>>process.  The first is the yes or no decision to consider the
>>applicant.  For those applicants where the answer was yes I have an integer
>>score on a fixed interval.  An applicant that is scored in the upper
>>50%tile can be hired.  I have five Groups of Applicants. I have five
>>groups of reviewers.  Each combination has between 5 and 140 applications
>>per wave.  Applications are not randomly assigned to reviewers.  A fair
>>review assumes the joint distribution of yes/no and score is the same for
>>each applicant group within review group.  I need distribution free
>>statistics and tests that control for both alpha and beta risk.  The
>>reviews come in waves.  Applicants that fail in a wave can reapply.  I
>>have data on 18 waves.
>>Stan
>>
>>>From: Dennis Roberts <[EMAIL PROTECTED]>
>>>To: "Stan Maxwell" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
>>>Subject: Re: I have a problem evaluating a Grading System.
>>>Date: Tue, 30 Apr 2002 15:37:33 -0400
>>>
>>>before we can really attempt an answer to this problem ... the question
>>>has to be answered ... what do YOU think or what are YOU considering ...
>>>to be UNfair?
>>>
>>>without some rather clear operational definition of that term ... then i
>>>don't think there is any good answer to your question ..
>>>
>>>At 07:31 PM 4/30/02 +0000, Stan Maxwell wrote:
>>>>I have a problem evaluating a Grading System.
>>>
>>>Dennis Roberts, 208 Cedar Bldg., University Park PA 16802
>>><Emailto: [EMAIL PROTECTED]>
>>>WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm
>>>AC 8148632401
>>
>
>Dennis Roberts, 208 Cedar Bldg., University Park PA 16802
><Emailto: [EMAIL PROTECTED]>
>WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm
>AC 8148632401
>



