Determining Best Performer

2001-02-09 Thread Grant Robertson

Hi All,
I have a client who insists on determining which call centre representative
is the week's best performer. We conduct weekly interviews and ask
respondents to rate each representative. For some reps we conduct 5
interviews, for others 1, and so on. Now the question is how do you go about
determining which rep performed best; we combine the two top-box ratings. We
have looked at margins of error for each rep, but this hasn't proved to be
useful, as you are unlikely to find significant differences with such
small base sizes. Any suggestions as to the best method to do this would be
appreciated.

Regards,
Grant
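
(A side note on the base-size problem: here is a minimal sketch, with
made-up counts and the Wilson interval from statsmodels, of why margins of
error at 1-5 interviews per rep cannot separate the reps.)

# Hypothetical top-two-box counts; the reps and numbers are invented.
from statsmodels.stats.proportion import proportion_confint

reps = {"A": (5, 5), "B": (1, 1), "C": (3, 5), "D": (2, 3)}  # (hits, n)

for name, (hits, n) in reps.items():
    lo, hi = proportion_confint(hits, n, alpha=0.05, method="wilson")
    print(f"Rep {name}: {hits}/{n} top-two-box, 95% CI [{lo:.2f}, {hi:.2f}]")
# The intervals overlap almost completely, so no rep can be declared
# "best" with any confidence at these base sizes.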







Re: Significance Testing in Experiments

2001-02-09 Thread Richard A. Beldin

R A Fisher showed that random assignment of treatments to subjects from
"almost any population" generates a probability distribution of treatment
differences which satisfies the assumptions of ANOVA. Random sampling of
the population is not required, but extrapolation to populations not
studied is risky, of course.
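
For instance, here is a minimal sketch of that randomization logic (made-up
outcome numbers; the reference distribution comes from re-randomizing the
treatment labels, not from sampling any population):

# Randomization (permutation) test on a made-up two-group experiment.
import numpy as np

rng = np.random.default_rng(0)
treat = np.array([7.1, 6.8, 8.0, 7.5, 7.9])   # hypothetical outcomes
ctrl = np.array([6.2, 6.9, 6.4, 7.0, 6.1])
obs = treat.mean() - ctrl.mean()

pooled = np.concatenate([treat, ctrl])
n_t, reps = len(treat), 10_000
diffs = np.empty(reps)
for i in range(reps):
    perm = rng.permutation(pooled)        # re-randomize the assignment
    diffs[i] = perm[:n_t].mean() - perm[n_t:].mean()

p = np.mean(np.abs(diffs) >= abs(obs))    # two-sided randomization p-value
print(f"observed difference = {obs:.2f}, randomization p = {p:.3f}")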







Determining Best Performer (fwd)

2001-02-09 Thread Bob Hayden


The most cost-effective method is to roll a die.  (See Deming.)

- Forwarded message from Grant Robertson -

Hi All,
I have a client who insists on determining which call centre representative
is the week's best performer. We conduct weekly interviews and ask
respondents to rate each representative. For some reps we conduct 5
interviews, for others 1, and so on. Now the question is how do you go about
determining which rep performed best; we combine the two top-box ratings. We
have looked at margins of error for each rep, but this hasn't proved to be
useful, as you are unlikely to find significant differences with such
small base sizes. Any suggestions as to the best method to do this would be
appreciated.

Regards,
Grant





- End of forwarded message from Grant Robertson -

-- 
Robert W. Hayden
Work: Department of Mathematics, Plymouth State College MSC#29,
      Plymouth, New Hampshire 03264 USA; fax (603) 535-2943
Home: 82 River Street, Ashland, NH 03217 (use this in the summer)
      (603) 968-9914 (use this year-round)
[EMAIL PROTECTED] (works year-round)
http://mathpc04.plymouth.edu (works year-round)

"Opportunity is missed by most people because it is dressed in
overalls and looks like work." --Thomas Edison






Re: Significance Testing in Experiments

2001-02-09 Thread Jay Warner

Magill, Brett wrote:

> The more general concern about significance testing notwithstanding, I have
> a question about the use of testing, or other inferential statistical
> techniques, in experiments using human subjects, or any other research
> method that does not use probability sampling...
> 
> Now, all of these tests that we run--whether from ANOVA, regression,
> difference of means, correlations, etc.--are based on the assumption that we
> have sampled from a population using some probability sampling procedure.
> And the meaning of p is inextricably linked to the properties of the
> sampling distribution.
> 
> However, little experimental research with human subjects is done using a
> sample.  

Maybe a nitpick, but on the contrary, they are _all_ done with a 
sample - a group of measured elements of the population.  
A sample must be finite, which this one is.

> Most often, in my experience, these studies use volunteer subjects
> or naturally existing groups.  These subjects are then randomly assigned to
> treatment and control groups.  

If we can assert, loudly, that the subjects are 'representative of the 
population' that we care about, then how we obtained them is of no 
concern, I'd say.  I can, indeed must, define the 'population' to 
include the group of non-subjects I wish to make a prediction about.  For 
example, I would not test females with Viagra, as I don't believe it 
will have any effect (I'm guessing in technical ignorance here).  A 
counterexample: did the first tests with Thalidomide include any 
pregnant women?  I guess not.  Then nobody should have predicted how the 
drug would work on pregnant women (or that it would have no side effects).  
The tragedy was that the predictions made (whether explicitly or by 
uninformed decision/action) were proven absolutely false.

The difficulty, as I see it, is that occasionally we define a 
population, solicit some subjects, and then assert that the subjects 
match the population when they don't.  The solicitation of volunteers 
occasionally has a selection effect.  For example, in the USA it tends 
to underrepresent African Americans, who often distrust medical 
researchers.

> Yet, in every instance that I know of,
> results are presented with tests of significance.  It seems to me that
> outside of the context of probability sampling, these tests have no meaning.
> Despite this, presentation of such results with tests of significance is
> common.  
> 
> Is there a reasonable interpretation of these results that does not rely on
> the assumption of probability sampling? It seems to me that simply
> presenting and comparing descriptive results, perhaps mean differences,
> betas from a regression, or some other measure of effect size without a test
> would be more appropriate. This would however be admitting that results have
> no applicability beyond the participants in the study.  Moreover, these
> results say nothing about the number of subjects one has, which p might help
> with regard to establishing minimum believability. Yet, conceptually, p
> doesn't work.
> 
> Am I missing the boat here? Significance testing in these situations seems to
> go over just fine in journals. Appreciate your clarification of the issue.
> 
> Regards
> 
> Brett

How would you design a study using a Bayesian approach? 

Jay

-- 
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX:(262) 681-1133
email:  [EMAIL PROTECTED]
web:http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?







Re: Significance Testing in Experiments

2001-02-09 Thread Thom Baguley

Richard A. Beldin wrote:
> 
> R A Fisher showed that random assignment of treatments to subjects from
> "almost any population" generates a probability distribution of treatment
> differences which satisfies the assumptions of ANOVA. Random sampling of
> the population is not required, but extrapolation to populations not
> studied is risky, of course.

An excellent summary.

Random sampling from a specified population is (nigh on) impossible unless
you have a finite, proximate target population in mind (e.g., a specific pool
of voters, consumers or whatever).

Extrapolation or generalization will always be tricky. For example, imagine I
could randomly sample from everyone on the planet. I'd still have problems
trying to generalize those findings to people alive in 10 or 50 years time (or
in the past).

Generalization has to rely heavily on theory and domain knowledge no matter
how good the stats.

Thom





Re: Two sided test with the chi-square distribution?

2001-02-09 Thread Thom Baguley

Donald Burrill wrote:
> Only if you are unwilling to assume that the writer of the term shares
> your understanding of proper usage;  no?

True. OTOH I feel I have a good case here (and, of course, I think my opinion
is correct!).

Jim raised a not unreasonable objection - that the use of "one-tailed" to
mean "directional" is very widespread. My counter is that:

- those of us who teach ought to help try to clear up the confusion
- those who write or review ought to propagate good, clear reporting

Thom

(I think an earlier reply of mine bounced - so forgive a repeat).





Re: Statistics is Bunk

2001-02-09 Thread Jay Warner

Eric Bohlman wrote:

> Jeff Rasmussen <[EMAIL PROTECTED]> wrote:
> 
>> I tried to subtly nudge the Pro-Bunk side to look at Quantum Physics and 
>> Heisenberg's Uncertainty Principle as well as Eastern Philosophies, 
>> [snip of excellent stuff]
> 
> I think this ultimately comes down to George Box's famous quip that all 
> models are wrong, but some models are useful.  

'Tain't a quip.  It's fact!  :)

> [more snip of more excellent stuff]
> 
Might also push Pirsig's Zen & the Art of Motorcycle Maintenance.  At 
the very edge of understanding, there is a zone of intuition, not 
objectivity.  A miraculous place.

Jay

-- 
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX:(262) 681-1133
email:  [EMAIL PROTECTED]
web:http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?







Re: careers in statistics

2001-02-09 Thread Jay Warner

a) There is always plenty of work for those who like what they do.

b) If it's money you're wanting, then you have to be able to do 
something that someone else will pay to have done.

c) Happiness comes to those who find (a) and (b) in the same 
activity.  Or who do not compromise (a) very much in pursuit of (b).

d) What can a master's in stats do?  To translate into what CEOs and 
other money sources understand, I'd suggest you emphasize the technical 
issues you can address, and not mention the 'pure' math angle of 
statistics.  Or search well for the CEOs who understand the details of 
the PDCA loop.

JohnD231 wrote:

> I'd like anyone's input on the following:
> 
> 1. job prospects at the masters level

Better than an MS in non-data-driven topics.

> 
> 2. starting pay

You won't starve, and you may pay off your loans quickly.

> 
> 3. job satisfaction

That's your responsibility, not the company's.

> 
> 
> Thanks

-- 
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX:(262) 681-1133
email:  [EMAIL PROTECTED]
web:http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?







Re: Determining Best Performer

2001-02-09 Thread Jay Warner



Grant Robertson wrote:

> Hi All,
> I have a client who insists on determining which call centre representative
> is the week's best performer. We conduct weekly interviews and ask
> respondents to rate each representative. For some reps we conduct 5
> interviews, for others 1, and so on. Now the question is how do you go about
> determining which rep performed best; we combine the two top-box ratings. We
> have looked at margins of error for each rep, but this hasn't proved to be
> useful, as you are unlikely to find significant differences with such
> small base sizes. Any suggestions as to the best method to do this would be
> appreciated.
> 
> Regards,
> Grant

Question 1)  What does the boss want to accomplish with the 
information you build?  If it's improved performance, what measurement 
will help the call centre reps move in that direction?  (And I'm begging 
the question of what 'performance' is.)  As for naming a single best 
performer: you can't, reliably.

If the reps learn that the top performer title is awarded for what 
amounts to random selection, it will lose any motivating 
characteristic.  Selecting only the top individual each week tends to 
include a large randomness component.  How about the top 15% on a rating 
scale, as one improvement?

Also, make some kind of run chart of the ratings (ugh!  but it's your 
client), and see that there is an award for consistently high ratings. 

And re-visit question 1.  Does your survey 'instrument' get measures of 
what the boss _really_ cares about, or an honest indicator of same?

Jay

-- 
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX:(262) 681-1133
email:  [EMAIL PROTECTED]
web:http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?







Re: Statistics is Bunk

2001-02-09 Thread Robert J. MacG. Dawson



Eric Bohlman wrote:

> 
> You might also want to touch on operationalism (W. Edwards Deming's
> writings would be good introductions).  The Florida election would be a
> good example to use: to an operationalist, it's completely meaningless to
> talk about the True Vote without reference to how the vote is counted, and
> it would be nonsense to compare counting methods based on how close they
> get to the "true" totals.

*Would* that be a good example (unless one were trying to attack
operationalism)?  An election seems to me like an excellent example of a
situation in which there *is* a reality in there somewhere, even if you
cannot get to it - the intention of the voter. The true total is what
you would get if every legally eligible voter were able to vote and
permitted to double-check the counting of their own vote. A good
counting system is one that gives a result close to this. If
operationalism says this is nonsense, then operationalism is an ass.

A fairer application of operationalism might be the system up to the
point where the vote is cast; there is a very real question about what
the "true" popular will is when some people choose not to vote.
Obviously, penalizing nonvoters (as I believe Australia still does) or
offering rides to the polling station changes things one way; hassling
voters, having few polling stations, or having byzantine requirements
for potentially eligible voters to be permitted to vote changes things
the other way.  (While the Florida decision to permit the Republicans
to correct technically invalid applications for overseas ballots was
illegal, one can see the point of including voters - what was truly
reprehensible was that the Democrats were not permitted to make similar
corrections.) However, it is probably true that there is no
"platinum-iridium standard" for how to handle the potential voter who
does not care enough to vote.

-Robert Dawson





Re: careers in statistics

2001-02-09 Thread dennis roberts

At 09:12 AM 2/9/01 -0600, Jay Warner wrote:

>>3. job satisfaction
>
>that's your responsibility, not the company''s.

i agree with all the other points that jay made but, i disagree to some 
extent with this latter one ...

how well you are satisfied with your "job" is a mix of:

1. match between your skills and what the job demands
2. the truthfulness of the employer at carefully delineating what your job 
really will be
3. how much effort YOU make
4. what primary and secondary resources the employer provides FOR your work

4 is important ... and it may be lacking to a substantial degree (which 
you may not be able to ascertain UNtil you are on the job) ...

for example ... not related to stat specifically ... but, what if you get a 
faculty appointment where, part of that job will be to teach a large intro 
section of stat ... and, the promise is made to you that there will be 
resources for you to carry out that responsibility ... such as a good 
classroom with good tech for demos, etc. ... teaching assistant(s) to 
handle the volume of office hours, etc. ...

and, while these happen for the first semester or two ... slowly and surely 
they start dwindling away ... can we really expect you to be really 
satisfied? i doubt it ... and it is not all your responsibility to make it 
so either







help needed

2001-02-09 Thread Maciej Teliszewski

Hi!

I need a clear description of the FULL BINOMIAL model. Thanx in advance.

Regards
Maciek

--
Ohm Namah Shivaya
One world One Race One Trance
Invoke the tribal spirit : )) !








ANOVA : Repeated Measures?

2001-02-09 Thread Sylvain Clément


We have data from an experiment in psychology of hearing. There are 3
experimental conditions (factor C). We have collected data from 5
subjects (factor S). For each subject we get 4 measures of performance
(M for Measure factor) in each condition. What is the best way to
analyse these data?

We've seen these possibilities :

a)  ANOVA with repeated measures with 2 fixed factors : subjects &
conditions  and the different measures as the repeated measure factor
(random factor).

b) ANOVA with two fixed factors (condition & measure) and a random
factor (repeated measure -> subject factor).

c) ANOVA with one fixed factor (condition) and the other two as
random.

We think that the a) design is correct (assuming and verifying that
there is no special effect of the measure factor such as training
effects).

Other psychologists advised us to use the b) design because
psychologists usually consider the subject effect as random (in
general, experiments in psychology are run with at least 20 to 30
subjects).

The last design, c), is a possibility if we declare that we have no
hypotheses about the effects of the subject & repetition factors.


I have only a little theoretical background in stats and would like to
know exactly what these possible designs imply.

Thanks in advance for your help

Sylvain Clement
"Auditory function team"
Bordeaux, France







Re: Significance Testing in Experiments

2001-02-09 Thread Mike Granaas


I agree with the attached and only wish to add that this discussion points
to the need for replication.  I have no trouble believing that a study on
undergraduates at college x generalizes to some population.  Which
population that might be remains open to speculation.  Replication gives
us a chance to determine how broad or narrow that population might be.

Michael

On Fri, 9 Feb 2001, Thom Baguley wrote:

> Richard A. Beldin wrote:
> > 
> > R A Fisher showed that random assignment of treatments to subjects from
> > "almost any population" generates a probability distribution of treatment
> > differences which satisfies the assumptions of ANOVA. Random sampling of
> > the population is not required, but extrapolation to populations not
> > studied is risky, of course.
> 
> An excellent summary.
> 
> Random sampling from a specified population is (nigh on) impossible unless
> you have a finite, proximate target population in mind (e.g., a specific pool
> of voters, consumers or whatever).
> 
> Extrapolation or generalization will always be tricky. For example, imagine I
> could randomly sample from everyone on the planet. I'd still have problems
> trying to generalize those findings to people alive in 10 or 50 years time (or
> in the past).
> 
> Generalization has to rely heavily on theory and domain knowledge no matter
> how good the stats.
> 
> Thom
> 

***
Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






test for whether an observation comes from a distribution

2001-02-09 Thread Ravi Bapna

i am looking for a way to determine whether an observation from the
real world (one data point) comes from a distribution that has been created
out of a simulation process. the simulation is repeated 30 times, so there
are that many data points.

the idea is to test whether the simulation process replicates the real-world
phenomenon, of which we have only one observation. i would even be
comfortable with something that tells me, "there is a high likelihood of
drawing the real-world observation from the distribution created by the
simulation."

thanks a lot.

ravi bapna







Re: Significance Testing in Experiments

2001-02-09 Thread Herman Rubin

In article ,
Magill, Brett <[EMAIL PROTECTED]> wrote:
>The more general concern about significance testing notwithstanding, I have
>a question about the use of testing, or other inferential statistical
>techniques, in experiments using human subjects, or any other research
>method that does not use probability sampling...

>Now, all of these tests that we run--whether from ANOVA, regression,
>difference of means, correlations, etc.--are based on the assumption that we
>have sampled from a population using some probability sampling procedure.
>And the meaning of p is inextricably linked to the properties of the
>sampling distribution.

The idea that probability is only related to sampling from a
"population" is what needs to be discarded.  Instead, one should
take the view that observations are random, which means just that
there are unknown probabilities involved.  Deciding which action
to take depends on this consideration.  It is only for historical
reasons that the set of observations made is called a sample.

>However, little experimental research with human subjects is done using a
>sample.  Most often, in my experience, these studies use volunteer subjects
>or naturally existing groups.  These subjects are then randomly assigned to
>treatment and control groups.  Yet, in every instance that I know of,
>results are presented with tests of significance.  It seems to me that
>outside of the context of probability sampling, these tests have no meaning.
>Despite this, presentation of such results with tests of significance is
>common.  

Testing is needed, but not the use of preassigned significance
levels.  The fundamental principle of decision making under
uncertainty is that one must

Simultaneously consider all consequences of the
proposed procedure in all states of nature.

This will involve the probabilities of the various actions
being taken in the various states.

This is what you should use to replace "significance" testing.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558





MIT Sexism & statistical bunk

2001-02-09 Thread Gene Gallagher

Here is an interesting stats problem from this week's Boston Globe.
In 1999, an internal MIT report concluded that women faculty at MIT had
received lower pay and fewer resources than their male colleagues.  This
week, the Independent Women's Forum (IWF) released a statistical
analysis showing that MIT "may have reacted to political correctness
before checking all the evidence."  This report supposedly documents,
"compelling differences in productivity, influence and grant funding
between the more senior males and females."  The report argues that the
gender difference in MIT salary and lab space was justified because "few
would question the fairness of rewarding those who publish more widely,
are more frequently cited, or raise the most in grant funds (p. 8, IWF
report)"

The Independent Women's Forum 13-page report is available at:
http://www.iwf.org/news/mitfinal.pdf

With the headlines "Fuzzy math on women" and  "MIT bias claims
debunked," the Boston Globe reported the IWF's major conclusion, that
women professors probably deserved their lower pay and smaller labs
because of their lower productivity:
http://www.boston.com/dailyglobe2/039/metro/MIT_bias_claims_debunked+.shtml
A Globe op-ed piece by Cathy Young on 2/7/01 states, "Monday, the
Women's Forum followed up with another report that demonstrates that the
senior women in MIT's biology department, however distinguished in their
field, did not quite measure up to their male colleagues in the number
of publications, frequency of citation, or outside grants."  

I had just presented an example of sex discrimination to my graduate
stats class (Chapter 2 in Ramsey & Schafer's Statistical Sleuth), and I
told the class that the IWF had probably done a multiple regression
analysis showing that gender wasn't a significant term in a regression
of salary  after publication number or citation frequency had been added
as an explanatory variable.   When I downloaded the IWF report, I was
quite frankly amazed that a statistician had attached his name to it.
Admittedly, the IWF was hamstrung by being unable to get any of the
relevant data from MIT, but that didn't prevent the IWF from reaching
the conclusion that the productivity of women faculty was less than men's.

I typed the publication and citation data from pp. 11-12 of the report into
an SPSS file and exported it as an Excel file.  Both are available in a
zipped file on my web site:

http://www.es.umb.edu/edg/ECOS611/MIT-IWF.zip

As a class exercise, you could have your stats classes use
non-parametric tests (or parametric) to test whether there really is a
significant difference between male and female faculty in the biology
department at MIT in publication number or citation frequency (the grant
data isn't provided).  Note that the authors of the IWF study divided
the biology faculty into older and younger groups and did separate
analyses on each group.  The Rush-Limbaugh dittoheads in your class, who
want to find striking gender differences in productivity, will be
profoundly disappointed with these IWF data (not one null hypothesis can
be rejected at alpha=0.05 using a Mann-Whitney U test).
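
For instance, here is a sketch of such an exercise (the numbers below are
placeholders only -- substitute the real columns from the zip file):

# Mann-Whitney U test on publication counts, hypothetical stand-in data.
from scipy.stats import mannwhitneyu

male_pubs = [45, 60, 38, 72, 55, 49, 41]    # placeholder values
female_pubs = [50, 33, 58, 44, 61]          # placeholder values

u, p = mannwhitneyu(male_pubs, female_pubs, alternative="two-sided")
print(f"U = {u}, p = {p:.3f}")  # fail to reject H0 at alpha=0.05 if p > 0.05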

--
Eugene D. Gallagher
ECOS, UMASS/Boston


Sent via Deja.com
http://www.deja.com/





Re: The meaning of the p value

2001-02-09 Thread Herman Rubin

In article <[EMAIL PROTECTED]>,
Rich Strauss <[EMAIL PROTECTED]> wrote:
>At 08:50 AM 2/2/01 -0500, Herman Rubin wrote:

>>But this is not the case with fixing a p value.  Most
>>testing problems have the property that the appropriate
>>procedure to be used corresponds to a p value for that
>>problem AND THAT SAMPLE SIZE, but the p value to be used
>>depends quite substantially on the sample size.

>I'm not sure I understand your point.  Are you saying that the p-value
>depends on sample size beyond the use of degrees-of-freedom to choose the
>appropriate null distribution?

>Rich Strauss

This is exactly what I am saying.  As the sample size
increases, the probability of incorrect acceptance when the
hypothesis is false decreases for a fixed p-value, so some
of this improvement should be used instead to decrease the
probability of the type I error.

To give an example, let X_i be independent two dimensional
(only chosen for computational convenience) normal random
variables with mean vector \mu and covariance matrix qI.
Also, again for computational convenience, suppose the
importance of accepting the null hypothesis if \mu lies in
a set of area A not containing 0 to be A/(2\pi) times the
importance of rejecting the null hypothesis if it is true.
Other formulations will give similar results.

Now if we have a sample of size n, the mean has variance
v = q/n.  As expected, rejection should take place if the
norm of the sample mean exceeds some value r.  Then the
probability of incorrect rejection will be exp(-r^2/2v),
and the corresponding importance of incorrect acceptance
will be r^2/2, for a total "loss" of r^2/2 + exp(-r^2/2v).

If we minimize this with respect to r, we find that
r^2 = 2v ln(1/v) if v < 1, and 0 if v >= 1.  So the
p-value for the optimal procedure would be min(v, 1).

Other scenarios are likely to be harder to calculate,
but the conclusions are similar.  The p-value to use
will depend on the precision of the usual estimator,
which depends on the sample size, for any given type
of problem.
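
A quick numerical check of the computation above (a sketch that just
minimizes the stated loss on a grid; it should reproduce the closed form
and an implied p-value of v):

# Minimize r^2/2 + exp(-r^2/(2v)) over r; compare with r^2 = 2v ln(1/v).
import numpy as np

for v in [0.5, 0.1, 0.01]:
    r2 = np.linspace(0.0, 5.0, 2_000_001)   # fine grid over r^2
    loss = r2 / 2 + np.exp(-r2 / (2 * v))
    r2_opt = r2[np.argmin(loss)]
    print(f"v={v}: grid r^2={r2_opt:.4f}, "
          f"closed form={2 * v * np.log(1 / v):.4f}, "
          f"p-value={np.exp(-r2_opt / (2 * v)):.4g}")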

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558





Re: ANOVA : Repeated Measures?

2001-02-09 Thread Donald Burrill

If for each Subject you have 4 Measures in each of the 3 Conditions, then 
both Conditions and Measures are repeated-measures factors:  your design 
may be symbolized as   S x C x M  -- that is, Subjects (5 levels) are 
crossed with both Conditions and Measures.  This design is equivalent to 
  R(SxCxM)  where R (Replications) is "the ubiquitous nested factor" as 
one author has put it, random with one level.  (And since it only has one 
level, it has zero degrees of freedom and zero sum of squares;  but using 
it formally often helps one to see what the proper error mean square 
would be for each effect modelled in the design, even if no such mean 
square is actually available in the data.)
Your choices then are, for each of the three factors, whether to 
treat it as fixed or random.  Conditions are presumably fixed -- they 
usually are, because they usually represent all the conditions one is 
interested in considering.  (I can imagine wanting to treat them as a 
random sample of 3 drawn randomly from a population of possible 
experimental conditions, but that seems to me very unlikely.)
Measures might go either way.  If what they represent is a series 
of opportunities to observe the subjects' response to each condition, one 
might treat the factor as fixed, the levels representing the sequence 
(1st, 2nd, 3rd, 4th) in which the opportunities are presented.  This 
would permit examining differences among the 4 levels as possibly 
reflecting learning (one becomes a little more skilled each time one is 
asked to respond to a condition, perhaps?), or fatigue (after one has 
done it once, the action starts to become boring or otherwise wearisome), 
or a kind of resultant between learning and fatigue.  Or, if you really 
think it reasonable to model each encounter as equivalent to each other 
encounter (in the same Condition), and the only variation among levels of 
Measure is random replication variance, Measure might be treated as 
random. 
Subjects are usually treated as random, because one usually wants 
to generalize to a population of subjects "like these", and one may even 
have selected the Ss randomly from a pool of potential Ss for the 
experiment.  But you haven't very many Subjects, and perhaps you want to 
model individual differences between them of some kind or other;  or, 
for some as yet unspecified reason, you are interested only in these 
particular Ss and not in a population of Ss which they might be argued to 
represent;  in either of which cases you may wish to treat Ss as fixed. 
Of course, to carry out _any_ tests of hypotheses, at least one 
of the three factors must be declared random, or you will have no 
legitimate error mean square against which to test the hypothesis mean 
square for any of the possible effects.
In terms of your three possibilities:
 (a) has C and S fixed, M random;
 (b) has C and M fixed, S random (although I don't think it correct to 
describe S as a "repeated-measure" factor:  in my lexicon, a "repeated 
measure" factor is any factor in a design that is _crossed with_  S);
 (c) has C fixed, S and M random.

It may be informative to carry out more than one formal analysis, 
using different fixed/random choices.  This would tell you what results 
are robust with respect to those choices, and what results depend on how 
you choose to treat one or another of the formal factors.  In case it's 
useful, here is a table of the proper error mean squares for each effect: 

                       Error mean square under
     Source        (a)       (b)       (c)
     C             CM        CS        (CS + CM - CSM)
     S             SM        --        SM
     M             --        SM        SM
     CS            CSM       --        CSM
     CM            --        CSM       CSM
     SM            --        --        CSM
     CSM           --        --        --

(Where the entry is "--", the proper error mean square would be R(SCM), 
if it were available.  In its absence, one could use the mean square for 
CSM, making the assumption that there is no 3-way interaction -- that may 
or may not be a reasonable assumption to make.)
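
If software helps, here is a minimal sketch of choice (b) -- Conditions and
Measures fixed, Subjects random -- using statsmodels' AnovaRM on invented
long-format data (column names are illustrative only):

# Repeated-measures ANOVA: one score per subject x condition x measure cell.
from itertools import product
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
rows = [{"subject": s, "condition": c, "measure": m,
         "score": rng.normal(loc=10 + c)}        # fake condition effect
        for s, c, m in product(range(5), range(3), range(4))]
df = pd.DataFrame(rows)

res = AnovaRM(df, depvar="score", subject="subject",
              within=["condition", "measure"]).fit()
print(res)  # F tests for condition, measure, and their interaction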
-- DFB.

On Fri, 9 Feb 2001, Sylvain Clément wrote:

> We have data from an experiment in psychology of hearing. There are 3
> experimental conditions (factor C). We have collected data from 5
> subjects (factor S). For each subject we get 4 measures of performance
> (M for Measure factor) in each condition. What is the best way to
> analyse these data?
> 
> We've seen these possibilities :
> 
> a)  ANOVA with repeated measures with 2 fixed factors : subjects &
> conditions  and the different measures as the repeated measure factor
> (random factor).
> 
> b) ANOVA with two fixed factors (condition & measure) and a random
> factor (repeated measure -> subject factor).
> 
> c) ANOVA with one fixed factor (condition) and the other two as
> random.
<snip: arguments in favor of one or another of these designs>

Re: test for whether an observation comes from a distribution

2001-02-09 Thread Donald Burrill

Two comments:  (1) You have not told us what other distribution(s) are 
possible;  in the absence of this information it is impossible to assess 
how likely it may be for one particular datum to have "come from" (i.e., 
to have been randomly drawn from) the distribution of interest.
  (2) You cannot "determine" whether the "real-world" datum came from 
this distribution, in any case.  At best, one might be able to describe 
its likelihood of having been drawn from this distribution, relative to 
its likelihood of having emerged if some other distribution were the 
datum-generator, so to speak;  and even this only by making assumptions 
that might well be argued to be unreasonable.

On Fri, 9 Feb 2001, Ravi Bapna wrote:

> i am looking for a way to determine whether an observation from the
> real-world (one data point) comes from a distribution that has been  
> created out of a simulation process. the simulation is repeated 30 
> times, so there are that many data points.
> 
> the idea is to test whether the simulation process replicates the 
> real-world phenomenon, of which we have only one observation. 

With only one datum, this sounds impossible to me.

> i would even be comfortable with something that tells me, "there is a 
> high likelihood of drawing the real-world observation from the 
> distribution created by the simulation."

Don't see how you can even do this.  If the simulation produces a large 
number of different possible values, all you can say is that the 
probability that this distribution produces that particular value is "p", 
and "p" isn't likely to be very large, I shouldn't think.

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128






Re: MIT Sexism & statistical bunk

2001-02-09 Thread Gene Gallagher

The link to the datafiles appears to be case sensitive, so
>
> http://www.es.umb.edu/edg/ECOS611/MIT-IWF.zip

should be:

http://www.es.umb.edu/edg/ECOS611/mit-iwf.zip

Gene Gallagher


Sent via Deja.com
http://www.deja.com/





SLR sensitivity program I wrote

2001-02-09 Thread EAKIN MARK E

If you have Visual Basic 6 installed on your computer, I would be
interested in your feedback on a computer program that I wrote for my
class. It demonstrates the effect of changing the X and/or Y values of a
point on simple linear regression estimates. It will only run under a VB 6
environment, but I am trying to convert it to Java.

The program can be downloaded from:

www2.uta.edu/eakin/busa5325.htm
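
(For those without VB 6, here is a rough sketch of the same demonstration
in Python, with invented data and an arbitrary perturbation:)

# Effect of moving one point on simple-linear-regression estimates.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 6.0])
slope0, intercept0 = np.polyfit(x, y, 1)

x2, y2 = x.copy(), y.copy()
x2[5] += 4.0   # drag the last point to the right (adds leverage)
y2[5] -= 3.0   # and downward
slope1, intercept1 = np.polyfit(x2, y2, 1)

print(f"before: slope={slope0:.3f}, intercept={intercept0:.3f}")
print(f"after:  slope={slope1:.3f}, intercept={intercept1:.3f}")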

 

Mark Eakin  
Associate Professor
Information Systems and Management Sciences Department
University of Texas at Arlington
[EMAIL PROTECTED] or
[EMAIL PROTECTED]







Re: ANOVA : Repeated Measures?

2001-02-09 Thread J. Williams

On Fri, 09 Feb 2001 16:17:04 GMT, [EMAIL PROTECTED]
(Sylvain Clément) wrote:

>
>We have data from an experiment in psychology of hearing. There are 3
>experimental conditions (factor C). We have collected data from 5
>subjects (factor S). For each subject we get 4 measures of performance
>(M for Measure factor) in each condition. What is the best way to
>analyse these data?
>
>We've seen these possibilities :
>
>a)  ANOVA with repeated measures with 2 fixed factors : subjects &
>conditions  and the different measures as the repeated measure factor
>(random factor).
>
>b) ANOVA with two fixed factors (condition & measure) and a random
>factor (repeated measure -> subject factor).
>
>c) ANOVA with one fixed factor (condition) and the other two as
>random.
>
>We think that the a) design is correct (assuming and verifying that
>there is no special effect of the measure factor such as training
>effects).
>
>Other psychologists advised us to use the b) design because
>psychologists usually consider the subject effect as random (in
>general, experiments in psychology are run with at least 20 to 30
>subjects).
>
>The last design, c), is a possibility if we declare that we have no
>hypotheses about the effects of the subject & repetition factors.
>
>
>I have only a little theoretical background in stats and would like to
>know exactly what these possible designs imply.
>
>Thanks in advance for your help
>
>Sylvain Clement
>"Auditory function team"
>Bordeaux, France

First, it seems to me, we need to know the scale type of your "M"
data, i.e., interval, ordinal, etc.  Are your data means?  Scores on
some type of testing device?  Are they like "hears"/"doesn't hear" or
what?  If your data are nominal or ordinal, you may be required to use
another statistical method.   What I glean from your account is
that the M factor is the response variable, right?  Or is it truly a
factor as well, which you want to analyze for main effects,
interaction, etc.?

 The experimental condition variable is fixed (unless it is meant to
be a random selection from a population of experimental conditions).
Essentially, a repeated measures ANOVA is treated similarly to a
randomized block design with your subjects as the random factor.  Each
subject receives all  3 treatments.  
 
I think it might help for you to clarify the hypotheses first, then
develop a statistical plan which will correctly fit your data.  Data
collection and good results are possible if the hypotheses and
statistical method are determined a priori.





Re: Determining Best Performer

2001-02-09 Thread Rich Ulrich

On 9 Feb 2001 02:33:08 -0800, [EMAIL PROTECTED] (Grant Robertson)
wrote:

> Hi All,
> I have a client who insists on determining which call centre representative
> is the week's best performer. We conduct weekly interviews and ask
> respondents to rate each representative. For some reps we conduct 5
> interviews, for others 1, and so on. Now the question is how do you go about
> determining which rep performed best; we combine the two top-box ratings. 
 [ ... ]

 - I agree with the earlier posts that you should point out the
relevant literature to your boss;  the endeavor generally goes
against a whole lot of professional advice about morale and
ratings, etc.

What you are doing seems arbitrary and onerous: combining the
"two top box ratings"  when the number of ratings varies
from 1 to 5.

For the beginnings of fairness:  make the ratings "proportionate to" the
amount of service, and then you might consider some rule like a Bonus
for  *every*  rating of  "Exceptional".


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: MIT Sexism & statistical bunk

2001-02-09 Thread dennis roberts

At 05:19 PM 2/9/01 +, Gene Gallagher wrote:


>  The report argues that the
>gender difference in MIT salary and lab space was justified because "few
>would question the fairness of rewarding those who publish more widely,
>are more frequently cited, or raise the most in grant funds (p. 8, IWF
>report)"


this raises a related but perhaps an even more troubling matter ... (which 
many say ... ah shucks, that is just "market" forces at play ... and thus, 
don't even consider it a legitimate variable to enter into the fray) ... 
but i do

the largest % of the salary variance at most institutions, large ones 
anyway, is NOT rank but, college ... ie, variations across colleges are 
greater than within ranks ...

these differences can be massive ... (if you think the difference between 
male and females anywhere approach college differences, think again)

so, if one wants to examine (IF they do) the matter of productivity ... 
then the argument would go something like this:

if you believe that more productivity (assuming rank were constant) 
deserves more pay ... then that notion should apply ACROSS the 
institution as a whole ...

which we know does not of course ...

the productivity issue is a lame variable in the overall scheme of things 
... since, those making the most money and in the highest salaried colleges 
HAVE the most time to devote to this activity called "scholarship" ... 
because they have the smallest teaching and advising loads, in general ...

at penn state for example, according to our policy manual, salary 
increments are based on MERIT ONLY ... that is, the notion of an across the 
board increment for everyone because cost of living goes up ... has no 
legal place in our system (rather stupid i say) ... so technically, if only 
merit is to be the factor, merit would have to relate (either totally or 
darn close to it) to productivity ...
but, if you try to push the notion of REAL productivity ... the logic 
breaks down quickly since, differences in salary seem to have little to do 
with productivity ... but rather, WHERE you happen to be within the entire 
university system

what DOES productivity mean anyway? the # of articles? who really READS 
them? HOW much money you bring in?? how many students you teach? etc. etc.

it is really difficult, at the micro-management level, to try to 
differentiate salary ... and salary increments ... by productivity measures 
... when it appears that so many NON-productivity factors are the key 
elements in the general level of salary for faculty and in the amount of 
increments given








The number of DOE runs for variable factors

2001-02-09 Thread L. D. Marks

I am looking for software (GNU preferred) that will give possible DOE 
runs for N factors (all of which have 2 levels). I want some flexibility 
in how many runs I choose to make. Any suggestions?

N.B. N will be anything from 20-120, and the number of runs should 
ideally be in the range 1000-4000. The levels are discrete (bits in 
fact) and would be used as initial input for a genetic search algorithm.
I don't need the "best", rather something better than choosing values 
randomly. (We have confirmed in test cases that a DOE or error-correcting 
code starting point is better than random values.)

-- 
Laurence Marks
Department of Materials Science and Engineering
Northwestern University
Tel: (847) 491-3996 Fax: (847) 491-7820
mailto:[EMAIL PROTECTED]  http://www.numis.nwu.edu

Workshop at LBL May 17-19 2001 "New approaches to the Phase Problem"
http://xraysweb.lbl.gov/esg/phasing/index.html





Re: MIT Sexism & statistical bunk

2001-02-09 Thread Gene Gallagher

In article <[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] (dennis roberts) wrote:
>
> this raises a related but perhaps an even more troubling matter ...
>
> the largest % of the salary variance at most institutions, large ones
> anyway, is NOT rank but, college ... ie, variations across colleges are
> greater than within ranks ...
>
> these differences can be massive ... (if you think the difference between
> male and females anywhere approach college differences, think again)
>
> so, if one wants to examine (IF they do) the matter of productivity ...
> then the argument would go something like this:
>
> if you believe that more productivity (assuming rank were constant)
> deserves more pay ... then that notion should apply ACROSS the
> institution as a whole ...
>
> which we know does not of course ...
>
> the productivity issue is a lame variable in the overall scheme of things
> ... since, those making the most money and in the highest salaried colleges
> HAVE the most time to devote to this activity called "scholarship" ...
> because they have the smallest teaching and advising loads, in general ...
>
> it is really difficult, at the micro-management level, to try to
> differentiate salary ... and salary increments ... by productivity measures
> ... when it appears that so many NON-productivity factors are the key
> elements in the general level of salary for faculty and in the amount of
> increments given


I agree wholeheartedly that publication number and citation frequency
should not be the prime determinant of a faculty members' value to an
institution.  The shocking thing about this IWF document is the chutzpah
involved.  The IWF authors tout their report as providing compelling
evidence that MIT women faculty are inferior in publications and
citations, when the data that they present is insufficient to reject the
hypothesis of no gender difference in either publication number or
citation frequency.  The Boston Globe accepted the IWF report's
conclusions and published two articles describing the IWF alternate
hypothesis that MIT men deserved their higher salaries and University
resources because the women had demonstrably weaker publication and
citation records.  The IWF authors found that papers published by
younger female faculty are cited at a higher rate than those of their male
colleagues.  To most objective scientists, this would seem to pose a
problem for the IWF hypothesis that salary differences are due to
differences in productivity and stature.   The IWF authors found a way
to deal with this, stating "several females had more citations per paper
than most males.  It seems to us that this would be an unlikely outcome,
if as commonly claimed, the work of females were truly devalued (p. 8 of
IWF report)."  Wow, what chutzpah!

--
Eugene D. Gallagher
ECOS, UMASS/Boston


Sent via Deja.com
http://www.deja.com/





Re: The number of DOE runs for variable factors

2001-02-09 Thread Bob Wheeler

Rather than use experimental designs which are
constructed to meet criteria irrelevant to your
problem, you might consider Latin hypercubes or a
numerical sequence such as a Faure sequence. These
are space-filling structures that might make good
starting sets. Latin hypercubes may be easiest for
you. The columns of a Latin hypercube are
permutations of the same set of values, say the
integers 1 to N. Each column uses a different
permutation. See McKay et al. (1979), "A
comparison of three methods for selecting values
of input variables in the analysis of output from a
computer code," Technometrics 21(2), 239-245.
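
A minimal sketch of that construction (each column is an independent
permutation of 1..N; for the two-level bit case in the original question
one could simply threshold each column at N/2):

# Latin hypercube: N runs, k factors, one permutation of 1..N per column.
import numpy as np

def latin_hypercube(n_runs, n_factors, seed=0):
    rng = np.random.default_rng(seed)
    cols = [rng.permutation(n_runs) + 1 for _ in range(n_factors)]
    return np.column_stack(cols)

lhc = latin_hypercube(n_runs=8, n_factors=4)
print(lhc)         # every value 1..8 appears exactly once in each column
bits = (lhc > 4).astype(int)   # optional two-level (bit) version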

"L. D. Marks" wrote:
> 
> I am looking for software (GNU preferable) that will give possible DOE
> runs for N factors (all of which have 2 levels). I want some flexibility
> in how many runs I choose to make. Any suggestions?
> 
> N.B. N will be anything from 20-120, and the number of runs should
> ideally be in the range 1000-4000. The levels are discrete (bits in
> fact) and would be used as initial input for a genetic search algorithm.
> I don't need the "best", rather something better than choosing values
> randomly. (We have confirmed in test cases that a DOE or
> error-correcting
> code starting point is better than random values.)
> 
> --
> Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> Tel: (847) 491-3996 Fax: (847) 491-7820
> mailto:[EMAIL PROTECTED]  http://www.numis.nwu.edu
> 
> Workshop at LBL May 17-19 2001 "New approaches to the Phase Problem"
> http://xraysweb.lbl.gov/esg/phasing/index.html

-- 
Bob Wheeler --- (Reply to: [EMAIL PROTECTED])
ECHIP, Inc.

