Avoiding Linear Dependencies in Artificial Data Sets

2001-03-12 Thread jim clark

Hi

I like to use small, artificially generated data sets with
integer parameters to introduce analyses.  Often, however, I find
it difficult to avoid undesirable contingencies among the scores
(e.g., linear dependencies in within-subject designs).  Is there
an algorithmic way to generate such scores and avoid such
dependencies?  Here is a small example with 4 scores for each of
5 subjects.  The following analysis reveals the undesirable
linear dependencies.  I'm assuming the dependencies arise from
the noise vectors that I used to generate the cell scores by
adding them to the main effect of the factor and the subject
effects.  Is there a systematic way to create such noise vectors
to avoid linear dependencies?
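
In code, the generation scheme is roughly the following, together with
the brute-force check I have in mind (redraw the noise until the
within-subject contrast covariance has full rank).  A rough sketch in
Python (numpy assumed; the effect sizes and noise range are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
n_subj, n_cond = 5, 4
grand = 6
cond_eff = np.array([-2, -1, 1, 2])           # main effect of the factor
subj_eff = rng.integers(-2, 3, size=n_subj)   # subject effects in [-2, 2]

# orthonormal polynomial contrasts for 4 levels (linear, quadratic, cubic)
C = np.array([[-3, -1,  1, 3],
              [ 1, -1, -1, 1],
              [-1,  3, -3, 1]], dtype=float)
C /= np.linalg.norm(C, axis=1, keepdims=True)

while True:
    noise = rng.integers(-2, 3, size=(n_subj, n_cond))  # integer noise
    Y = grand + subj_eff[:, None] + cond_eff[None, :] + noise
    T = Y @ C.T                     # within-subject contrast scores
    E = np.cov(T, rowvar=False)     # the within-cells error matrix
    if np.linalg.matrix_rank(E) == n_cond - 1:  # full rank: no dependency
        break

print(Y)

The SPSS example and the analysis that reveals the problem: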

data list free / subj vl lo hi vh
begin data
1  3  3  5  5
2  1  3  7  9
3  6  8  8 10
4  7  8  6  7
5  3  3  9  9
end data
manova vl lo hi vh /wsf = conc(4) /print = cell
  /contr(conc) = poly

 Cell Means and Standard Deviations

 Variable            Mean   Std. Dev.   N   95 percent Conf. Interval
 VL                 4.000       2.449   5        .959         7.041
 LO                 5.000       2.739   5       1.600         8.400
 HI                 7.000       1.581   5       5.037         8.963
 VH                 8.000       2.000   5       5.517        10.483
 (each row: statistics for the entire sample)

Tests of Between-Subjects Effects.

 Tests of Significance for T1 using UNIQUE sums of squares
 Source of Variation        SS   DF       MS       F   Sig of F

 WITHIN CELLS            40.00    4    10.00
 CONSTANT               720.00    1   720.00   72.00       .001

 Estimates for T1
 --- Individual univariate .9500 confidence intervals

 CONSTANT

  Parameter    Coeff.   Std. Err.   t-Value    Sig. t   Lower -95% CL- Upper

      1        12.00      1.41421   8.48528    .00106     8.07351    15.92649

* * * * *  A n a l y s i s   o f   V a r i a n c e -- Design 1  * * * * *

Tests involving 'CONC' Within-Subject Effect.

 Mauchly sphericity test, W =  .00000
 Chi-square approx.         =  .      with 5 D. F.
 Significance               =  .

 Greenhouse-Geisser Epsilon =  .40650
 Huynh-Feldt Epsilon        =  .49123
 Lower-bound Epsilon        =  .33333

AVERAGED Tests of Significance that follow multivariate tests are equivalent to
the univariate or split-plot or mixed-model approach to repeated measures.
Epsilons may be used to adjust d.f. for the AVERAGED results.

 * * *  W A R N I N G  * * *
 The WITHIN CELLS error matrix is SINGULAR.
 These variables are LINEARLY DEPENDENT on preceding ones: T3.
 Multivariate tests will be skipped.


IRT/Rasch Modeling with SAS?

2001-03-12 Thread Lee Creighton

Hi all,

I am working on a dissertation that analyzes some international tests of
mathematics achievement. I need to use the responses (which can be
considered "correct"/"incorrect") and estimate an IRT (Item Response Theory)
model to describe the test.

In a nutshell, assume the test measures a single trait in an individual.
Then the IRT curve that I am looking for (something they call a 3-parameter
logistic, which I think is not a 100% correct name) is described by the
following function:

    P(T) = c + (1 - c) / (1 + exp(-1.7 * a * (T - b)))

This curve takes a person's ability T and produces the probability that a
person with that trait will answer the question correctly.
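
In code the curve is a one-liner.  A sketch in Python (numpy assumed;
the function name is mine):

import numpy as np

def p_correct(theta, a, b, c):
    # 3PL item response function: c is the guessing floor,
    # a the discrimination (slope), b the difficulty (location)
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))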

The main problem, of course, is that T is unknown, as well as the three
parameters a, b, and c. So the estimation problem is quite tricky. I can't
find a reference that tells me exactly the recipe for finding it, but the
best I can tell is that the algorithm would start with an initial guess for
T, fit the curve parameters a, b, and c, then use this curve to re-estimate
T. The process repeats until some convergence criterion is reached.
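
That alternating scheme is what the IRT literature calls joint maximum
likelihood.  A bare-bones sketch of it in Python (scipy assumed; resp is
a persons-by-items 0/1 matrix, and all names and starting values are
mine):

import numpy as np
from scipy.optimize import minimize

def p3pl(theta, a, b, c):
    return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

def nll(p, y):
    # negative log-likelihood of 0/1 responses y given probabilities p
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_jml(resp, n_iter=20):
    n_persons, n_items = resp.shape
    theta = np.zeros(n_persons)
    a = np.ones(n_items); b = np.zeros(n_items); c = np.full(n_items, 0.1)
    for _ in range(n_iter):
        # step 1: re-estimate each ability with item parameters fixed
        for i in range(n_persons):
            theta[i] = minimize(lambda t: nll(p3pl(t[0], a, b, c), resp[i]),
                                [theta[i]], method="Nelder-Mead").x[0]
        # step 2: re-estimate each item's a, b, c with abilities fixed
        # (no bounds here; a real program would constrain c to [0, 1))
        for j in range(n_items):
            res = minimize(lambda q: nll(p3pl(theta, q[0], q[1], q[2]),
                                         resp[:, j]),
                           [a[j], b[j], c[j]], method="Nelder-Mead")
            a[j], b[j], c[j] = res.x
        theta -= theta.mean()  # pin down the origin of the ability scale
    return theta, a, b, c

Be warned that joint ML for the 3PL is known to behave badly (abilities
for perfect scores diverge, and c is unconstrained here), and this
person-by-person loop would be painfully slow for 89,000 students; the
commercial programs (BILOG and friends) fit the marginal likelihood
instead, and I believe SAS's PROC NLMIXED can be coaxed into fitting the
marginal version.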

Does anyone know if SAS will do this? I have found a piece of software that
claims to fit "Rasch models", but the classical Rasch model is a
one-parameter version of what I'm looking for (fix a at 1 and set c to
zero, and you have a Rasch model). Plus, the software costs about $1000,
and I don't have that to spare. The software (one called "BIGSTEPS" is
the only one I can find that will handle the 89,000 students I have to
deal with) is not exactly "Microsoft Bob" in its ease of use.

This whole IRT/Rasch area is brand new to me, so I may be asking the wrong
crowd, but if anyone has any SAS code or guidance, I'd sure like to hear it.


--
Lee Creighton
SAS Statistical Instruments
[EMAIL PROTECTED]
5275R SAS Campus Drive
(919) 531-3755
Statistical Discovery Software








Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-12 Thread jim clark

Hi

On 12 Mar 2001, Radford Neal wrote:
 Yes indeed.  And the context in this case is the question of whether
 or not the difference in performance provides an alternative
 explanation for why the men were paid more (one supposes, no actual
 salary data has been released).
 
 In this context, all that matters is that there is a difference.  As
 explained in many previous posts by myself and others, it is NOT
 appropriate in this context to do a significance test, and ignore the
 difference if you can't reject the null hypothesis of no difference in
 the populations from which these people were drawn (whatever one might
 think those populations are).

Personally, I am not interested in the question of statistical
testing to dismiss the alternative explanation being proposed;
indeed, I suspect that the original claim about gender being the
cause of salary differences would not stand up very well either
to statistical tests.  But there does seem to me to be more than
just saying ... "see there is a difference" and that statistical
procedures would have a role to play.  For example, wouldn't the
strength and consistency of the differences influence your
confidence that this was indeed the underlying factor?  The same
difference in means due to one or two outliers would surely not
mean the same thing as a uniform pattern of productivity
differences, would it?  And wouldn't you want to demonstrate that
there was a significant and ideally strong within-group
relationship between productivity and salary before claiming that
it is a reasonable alternative for the between-group differences?  
Or at least, wouldn't that strengthen the case?  I appreciate
that in some domains (e.g., intelligence testing), people are
reluctant to make inferences about between-group differences on
the basis of within-group correlations, but that is the basic
logic of ANCOVA and related methods.

Best wishes
Jim


James M. Clark  (204) 786-9757
Department of Psychology(204) 774-4134 Fax
University of Winnipeg  4L05D
Winnipeg, Manitoba  R3B 2E9 [EMAIL PROTECTED]
CANADA  http://www.uwinnipeg.ca/~clark







Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-12 Thread dennis roberts

At 02:25 PM 3/12/01 +, Radford Neal wrote:


In this context, all that matters is that there is a difference.  As
explained in many previous posts by myself and others, it is NOT
appropriate in this context to do a significance test, and ignore the
difference if you can't reject the null hypothesis of no difference in
the populations from which these people were drawn (whatever one might
think those populations are).

the problem with your argument is this ...

now, whether or not formal inferential statistical procedures are called 
for ... if there is a difference in salary ... and differences in any OTHER 
factor or factors ... one is in the realm of SPECULATION as to what may or 
may not be the "reason" or "reasons" for THAT difference

in other words ... any way you say that the difference "may be explained 
by"  is a hypothesis you have formulated ...

so, in this general context ... it still is a statistical issue ... that 
being, what (may) causes what ... and, this calls for some model 
specification ... that links difference in salaries TO differences in other 
factors/variables

if we do not view it as some kind of a statistical model ... then we are in 
no position to really talk about this case ... not in any causal or quasi 
causal way ... and, i thought that was the main purpose of this entire 
matter ... what LED to the gap in salaries?? ... was it something based on 
merit? or something based on bias?

i don't see how else we could check up on these kinds of issues other than 
some statistical questions being asked ... then tested in SOME fashion 
(though i am not specifying exactly how)








_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






One tailed vs. Two tailed test

2001-03-12 Thread auda

Hi, all,
We are testing a group of subjects on their performance in two different
conditions (say, A and B), and we are testing them individually. We have an
alternative hypothesis that reaction time in condition A should be longer
than in condition B, so we perform a one-tailed t test. However, some
subjects showed the pattern opposite to our alternative hypothesis--RT
B > RT A--and the p value is significant under a one-tailed test.

Could we claim that these "reversed" subjects showed "significant" results
in the opposite direction, or should we treat them as non-significant
results?

Thanks,
Erik







Re: Avoiding Linear Dependencies in Artificial Data Sets

2001-03-12 Thread Bob Wheeler

It isn't actually that easy, in the sense that
most data humans make up has a low efficiency with
respect to design criteria -- the determinant of
the cross-product matrix tends to be small. The
simplest way is to use a computer program that
calculates algorithmic designs. 
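
Lacking a real algorithmic-design program (those use exchange-type
algorithms), a crude stand-in is to generate many random integer
candidates and keep the one that scores best on that determinant
criterion.  A sketch in Python (numpy assumed; the score range is
arbitrary):

import numpy as np

rng = np.random.default_rng(0)

best, best_det = None, -np.inf
for _ in range(10000):
    X = rng.integers(0, 11, size=(5, 4)).astype(float)  # candidate scores
    Xc = X - X.mean(axis=0)              # center the columns
    d = np.linalg.det(Xc.T @ Xc)         # cross-product determinant
    if d > best_det:
        best, best_det = X, d

print(best)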

jim clark wrote:
 
 Hi
 
 I like to use small, artificially generated data sets with
 integer parameters to introduce analyses.  Often, however, I find
 it difficult to avoid undesirable contingencies among the scores
 (e.g., linear dependencies in within-subject designs).  Is there
 an algorithmic way to generate such scores and avoid such
 dependencies?  Here is a small example with 4 scores for each of
 5 subjects.  The following analysis reveals the undesirable
 linear dependencies.  I'm assuming the dependencies arise from
 the noise vectors that I used to generate the cell scores by
 adding them to the main effect of the factor and the subject
 effects.  Is there a systematic way to create such noise vectors
 to avoid linear dependencies?
snip
-- 
Bob Wheeler --- (Reply to: [EMAIL PROTECTED])
ECHIP, Inc.





Re: One tailed vs. Two tailed test

2001-03-12 Thread Thom Baguley

auda wrote:
 
 Hi, all,
 We are testing a group of subjects on their performance in two different
 conditions (say, A and B), and we are testing them individually. We have an
 alternative hypothesis that reaction time in condition A should be longer
 than in condition B, so we perform a one-tailed t test. However, some
 subjects showed the pattern opposite to our alternative hypothesis--RT
 B > RT A--and the p value is significant under a one-tailed test.
 
 Could we claim that these "reversed" subjects showed "significant" results
 in the opposite direction, or should we treat them as non-significant
 results?

If you do a one-tailed test, no. The fact that you are entertaining this
possibility suggests you should be using a two-tailed test. The one-tailed
test has no power to detect differences in the discounted (non-predicted)
direction, and hence should only be used when you would discount such a
finding a priori.

I'm a bit puzzled as to why you are testing each participant individually.
You'd expect (unless the effect is huge) some participants to go against the
average pattern. If you do need to test each person individually, you need to
use a two-tailed non-directional test and a correction for multiple
testing (e.g., Bonferroni or similar); a rough sketch follows.
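
A sketch in Python (scipy assumed; the data layout -- one array of trial
RTs per condition for each subject -- is hypothetical):

import numpy as np
from scipy import stats

def per_subject_tests(rt_a_list, rt_b_list, alpha=0.05):
    # rt_a_list[i], rt_b_list[i]: subject i's trial RTs in conditions A, B
    n = len(rt_a_list)
    for i, (rt_a, rt_b) in enumerate(zip(rt_a_list, rt_b_list)):
        t, p = stats.ttest_ind(rt_a, rt_b)  # two-tailed by default
        # Bonferroni: test each subject at alpha / (number of subjects)
        print(f"subject {i + 1}: t = {t:.2f}, p = {p:.4f}, "
              f"significant = {p < alpha / n}")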

Thom





Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-12 Thread Irving Scheffe

Jim:

I agree with Radford Neal's comments,
and urge careful reconsideration of the
foundation behind some of the comments
made. 

For example, suppose you had a department
in which the citation data were

      Males   Females
      12220      1298
       2297      1102

The male with 12220 is, let's imagine, a Nobel Prize
winner. The salaries for the 4 people are 

      Males    Females
    156,880    121,176
    112,120    114,324

The females approach the dean of science and declare that
there is discrimination against them. They've measured
the labs, and the men have more space. Moreover, they
feel marginalized and depressed, as their status has
been slowly slipping in the department. Moreover, they
are paid less than men of the same age.

Careful examination of mean salary shows that the mean 
salaries are 134,500 for men and only 117,750 for women.

With great brouhaha, the administration, without
publishing the above data, declares that there
was a discrimination problem, and it was addressed
by giving both the women a 16,000 raise.

As Radford Neal has pointed out succinctly, the argument about
outliers is irrelevant, and I want to emphasize with this example
that it is irrelevant on numerous levels. First of all,
it is not necessarily clear whether, and in which of several
senses, our Nobel Prize winner is an outlier in his group.
Second, even if he is -- so what? Surely you would not argue
that this means he didn't deserve his salary!

In fact, careful examination of the salary data [never
made public by the administration] together with the
performance data might well have led to the conclusion
that it is the male faculty who are underpaid.

Although, as Dr. Neal pointed out, it is not logically
relevant to the issue, I would like to
explore your notion, echoed without
justification by Rich Ulrich, that the
huge difference in citation performance between
MIT senior men and women might be due
to "one or two outliers."

Take a look at the data again, and tell me
which male data you consider to be outliers
within the male group, and why. For example, 
are the men with 2133 and
893 "outliers," or those with 12830 and 11313?

The data for the senior men and women:

12 year citation counts:

     Males    Females
   -------    -------
     12830       2719
     11313       1690
     10628       1301
      4396       1051
      2133        935
       893
   -------

As for the notion of exploring the relationship between
salary, gender, and performance -- I'd be more than happy
to examine any data that MIT would make available. They
will, of course, not make such data available. It is too
private, they say.


Best regards,

Jim Steiger

--
James H. Steiger, Professor
Dept. of Psychology
University of British Columbia
Vancouver, B.C., V6T 1Z4
-


Note: I urge all members of this list to read
the following and inform themselves carefully
of the truth about the MIT Report on the Status
of Women Faculty. 

Patricia Hausman and James Steiger Article,
"Confession Without Guilt?" :
  http://www.iwf.org/news/mitfinal.pdf  

Judith Kleinfeld's Article Critiquing the MIT Report:
 http://www.uaf.edu/northern/mitstudy/#note9back

Original MIT Report on the Status of Women Faculty:
 http://mindit.netmind.com/proxy/http://web.mit.edu/fnl/


On Mon, 12 Mar 2001 08:55:17 -0600, jim clark [EMAIL PROTECTED]
wrote:

Hi

On 12 Mar 2001, Radford Neal wrote:
 Yes indeed.  And the context in this case is the question of whether
 or not the difference in performance provides an alternative
 explanation for why the men were paid more (one supposes, no actual
 salary data has been released).
 
 In this context, all that matters is that there is a difference.  As
 explained in many previous posts by myself and others, it is NOT
 appropriate in this context to do a significance test, and ignore the
 difference if you can't reject the null hypothesis of no difference in
 the populations from which these people were drawn (whatever one might
 think those populations are).

Personally, I am not interested in the question of statistical
testing to dismiss the alternative explanation being proposed;
indeed, I suspect that the original claim about gender being the
cause of salary differences would not stand up very well either
to statistical tests.  But there does seem to me to be more than
just saying ... "see there is a difference" and that statistical
procedures would have a role to play.  For example, wouldn't the
strength and consistency of the differences influence your
confidence that this was indeed the underlying factor?  The same
difference in means due to one or two outliers would surely not
mean the same thing as a uniform pattern of productivity
differences, would it?  And wouldn't you want to demonstrate that
there was a significant and ideally strong within-group
relationship between productivity and salary before claiming that
it is a reasonable alternative for the between-group differences?  
Or at least, wouldn't that strengthen the case?

Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-12 Thread jim clark

Hi

On Mon, 12 Mar 2001, Irving Scheffe wrote:
 Jim:
 For example, suppose you had a department
 in which the citation data were
 
     Males   Females
     12220      1298
      2297      1102

When I said outlier, I had in mind hypothetical data of the
following sort (it doesn't matter to me whether it is the
salaries or the citation rates):

    Males   Females
    17000      1000
     1000      1000
     1000      1000
     1000      1000

Avg  5000      1000

vs.

    Males   Females
     5000      1000
     5000      1000
     5000      1000
     5000      1000

Avg  5000      1000

I would view the latter somewhat differently than the former with
respect to differences between these samples of males and
females, and with respect to the kinds of explanations I would
seek (e.g., something general to males, something specific to
male 1).

 The male with 12220 is, let's imagine, a Nobel Prize
 winner. The salaries for the 4 people are 
 
       Males    Females
     156,880    121,176
     112,120    114,324

Of course if the salaries were:
      Males    Females
    112,120    121,176
    156,880    114,324

You probably would not want to promote the hypothesis of
productivity differences explaining the gender differences.  That
was the point of one of my later comments.

 As Radford Neal has pointed out succinctly, the argument about
 outliers is irrelevant, and I want to emphasize with this example
 that it is irrelevant on numerous levels. First of all,
 it is not necessarily clear whether, and in which of several
 senses, our Nobel Prize winner is an outlier in his group.
 Second, even if he is -- so what? Surely you would not argue
 that this means he didn't deserve his salary!

Assuming a correlation between productivity and salary (or
winning of Nobel prizes).

 In fact, careful examination of the salary data [never
 made public by the administration] together with the
 performance data might well have led to the conclusion
 that it is the male faculty who are underpaid.

I'm in perfect agreement with this, although I still think that
statistics would play a positive role in identifying the
determinants of salary.

 Although, as Dr. Neal pointed out, it is not logically
 relevant to the issue, I would like to
 explore your notion, echoed without
 justification by Rich Ulrich, that the
 huge difference in citation performance between
 MIT senior men and women might be due
 to "one or two outliers."

I don't remember making any such attribution.  I asked a question
about whether detractors of statistical testing would view
equivalently differences due to some outliers and more consistent
results, in the sense I showed above.  I'm not sure it is any
more palatable to have one's motives misconstrued by people
arguing against gender-related bias than to have them
misconstrued by people arguing for gender-related bias.

 Take a look at the data again, and tell me
 which male data you consider to be outliers
 within the male group, and why. For example, 
 are the men with 2133 and
 893 "outliers," or those with 12830 and 11313?

Not having taken any position on it, I am not too sure I feel any
compulsion to answer your question.  I guess I would turn it
around and say, would you interpret your results exactly the same
as the modified results that I have presented below?

 The data for the senior men and women:
 12 year citation counts:
      Males    Females
    -------    -------
      12830       2719
      11313       1690
      10628       1301
       4396       1051
       2133        935
        893
    -------

 Average  7032      1539

Modified (Hypothetical ... for pedagogical purposes only ... no
hidden agenda results ...)

    Males   Females
    34500      1500
     1500      1500
     1500      1500
     1500      1500
     1500      1500
     1500

 Avg 7000      1500

To me, these data are much less suggestive of general differences
in productivity between males and females, would not be an
adequate account of widespread (i.e., consistent or uniform
across individuals) differences in salaries, and so on.  Am I
correct to assume that for you the consistency of the differences
between the groups (which is what a statistical test measures) is
completely irrelevant?  Or are you implicitly engaging in
inferential-like thinking when you examine the actual
distributions?

 As for the notion of exploring the relationship between
 salary, gender, and performance -- I'd be more than happy
 to examine any data that MIT would make available. They
 will, of course, not make such data available. It is too
 private, they say.

But were the data made available to you, would you use any
statistical procedures in the examination?  Would you care
whether the differences in salary were significant?  The
differences in productivity?  The differences in any number of
potential confounding variables?  What about the significance and
strength of the relationships between predictors and
salary?  What about whether the gender difference was significant
after productivity was controlled?

Re: One tailed vs. Two tailed test

2001-03-12 Thread Jerry Dallal

auda wrote:
 
 Hi, all,
 We are testing a group of subjects on their performance in two different
 conditions (say, A and B), and we are testing them individually. We have an
 alternative hypothesis that reaction time in condition A should be longer
 than in condition B, so we perform a one-tailed t test. However, some
 subjects showed the pattern opposite to our alternative hypothesis--RT
 B > RT A--and the p value is significant under a one-tailed test.
 
 Could we claim that these "reversed" subjects showed "significant" results
 in the opposite direction, or should we treat them as non-significant
 results?
 

Don't do one-tailed tests.





Re: Avoiding Linear Dependencies in Artificial Data Sets

2001-03-12 Thread Elliot Cramer

I'm not clear on what your design is, but it seems that
the problem is in the between-S effect, not within.  Note that you only
have 4 df within and 4 dependent variables.
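
For what it's worth, the dependency in the posted data is easy to
localize: the quadratic trend score is identically zero for every
subject, so the error matrix of the three polynomial contrasts has rank
2 -- which is exactly the T3 warning MANOVA printed.  A quick check in
Python (numpy assumed):

import numpy as np

# the 5 subjects x 4 conditions from the original post
Y = np.array([[3, 3, 5,  5],
              [1, 3, 7,  9],
              [6, 8, 8, 10],
              [7, 8, 6,  7],
              [3, 3, 9,  9]], dtype=float)

# orthonormal polynomial contrasts for 4 levels
C = np.array([[-3, -1,  1, 3],
              [ 1, -1, -1, 1],
              [-1,  3, -3, 1]], dtype=float)
C /= np.linalg.norm(C, axis=1, keepdims=True)

T = Y @ C.T
print(T.round(3))  # the middle (quadratic) column is all zeros
print(np.linalg.matrix_rank(np.cov(T, rowvar=False)))  # prints 2, not 3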







Speaking of ANOVA in SPSS...

2001-03-12 Thread Will Hopkins

I'm trying to reduce all stats to a few simple procedures that 
students can do EASILY with available stats packages.  A two-way 
ANOVA or an ANCOVA is as complex as I want to go. I thought SPSS 
would do the trick, but I was amazed to discover that it can't.

Here's the example.  I want students to convert repeated-measures 
data into unpaired t tests or non-repeated measures ANOVA, by using 
change scores between the time points of interest.  That's no problem 
when there is just the group effect:  the analysis becomes a simple 
unpaired t test.  But when you have an extra between-subjects effect 
(e.g. males and females in the treatment and control groups) it 
becomes a two-way ANOVA.  You make a column of change scores between 
the time points of interest (e.g., post and pre), and that's your 
dependent variable.  The two independent effects are group (exptal 
and control, say) and sex (male and female).  The group term gives 
the effect of the treatment averaged for males and females.  Again, 
no problem there, but what I want is an appropriate customized 
contrast of the interaction term, which yields the difference in the 
overall effect between males and females.  SPSS version 10 can't do 
it.  I checked the on-line help, and it looks like you have to use 
the command language.  Well really, what student is going to manage 
that?  It's out of the question.  Sure, you can get a p value for the 
interaction, but I want confidence limits for the difference between 
males and females.  I've got my students to convert the p value, the 
degrees of freedom, and the observed value of the effect into 
confidence limits, but I shouldn't have to resort to that.
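
In the meantime, the contrast itself is easy to compute by hand: on
change scores it is just a difference of differences, with a standard
error from the pooled within-cell variance.  A sketch in Python (scipy
assumed; the cell labels are made up):

import numpy as np
from scipy import stats

def interaction_contrast(cells, level=0.95):
    # cells: dict mapping (group, sex) -> array of change scores
    # estimate = (exptal - control) for males
    #            minus (exptal - control) for females
    keys  = [("exptal", "m"), ("control", "m"),
             ("exptal", "f"), ("control", "f")]
    signs = [1, -1, -1, 1]
    est = sum(s * cells[k].mean() for s, k in zip(signs, keys))
    # pooled within-cell variance and its degrees of freedom
    ss = sum(((cells[k] - cells[k].mean()) ** 2).sum() for k in keys)
    df = sum(len(cells[k]) - 1 for k in keys)
    se = np.sqrt((ss / df) * sum(1.0 / len(cells[k]) for k in keys))
    half = stats.t.ppf(1 - (1 - level) / 2, df) * se
    return est, est - half, est + half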

I'd also like SPSS to do an ANCOVA, but again I want to do contrasts 
for the interaction, and again, they ain't there.  Or did I miss 
something?  If so, please let me know.  And can you let me know of 
any simple, and preferably CHEAP or FREE, packages that will do what 
I want?

Will
-- 
Will G Hopkins, PhD FACSM
University of Otago, Dunedin NZ
Sportscience: http://sportsci.org
A New View of Statistics: http://newstats.org
Sportscience Mail List:  http://sportsci.org/forum
ACSM Stats Mail List:  http://sportsci.org/acsmstats

Be creative: break rules.






Re: One tailed vs. Two tailed test

2001-03-12 Thread Will Hopkins

At 7:34 PM + 12/3/01, Jerry Dallal wrote:
Don't do one-tailed tests.

If you are going to do any tests, it makes more sense to do one-tailed 
tests.  The resulting p value actually means something that folks can 
understand:  it's the probability that the true value of the effect is 
opposite in sign to what you have observed.

Example:  you observe an effect of +5.3 units, one-tailed p = 0.04. 
Therefore there is a probability of 0.04 that the true value is less 
than zero.

There was a discussion of this notion a month or so ago.  A Bayesian 
on this list made the point that the one-tailed p has this meaning 
only if you have absolutely no prior knowledge of the true value. 
Sure, no problem.

But why test at all?  Just show the 95% confidence limits for your 
effects, and interpret them:  "The effect could be as big as <upper 
confidence limit>, which would mean...  Or it could be <lower 
confidence limit>, which would represent...  Therefore... "  Doing it 
in this way automatically addresses the question of the power of your 
study, which reviewers are starting to ask about. If your study turns 
out to be underpowered, you can really impress the reviewers by 
estimating the sample size you would (probably) need to get a 
clear-cut effect.  I can explain, if anyone is listening...
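
Since someone might be listening: the conversion I mentioned in my other
post (p value, df, and observed effect into confidence limits) boils
down to backing out the standard error from the implied t value.  A
sketch in Python (scipy assumed; the df here is made up):

from scipy import stats

def ci_from_p(effect, p_two_tailed, df, level=0.95):
    t_obs = stats.t.ppf(1 - p_two_tailed / 2, df)  # |t| implied by p
    se = abs(effect) / t_obs                       # implied standard error
    half = stats.t.ppf(1 - (1 - level) / 2, df) * se
    return effect - half, effect + half

print(ci_from_p(5.3, 0.08, df=28))  # the +5.3 example; two-tailed p = .08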

Will
-- 
Will G Hopkins, PhD FACSM
University of Otago, Dunedin NZ
Sportscience: http://sportsci.org
A New View of Statistics: http://newstats.org
Sportscience Mail List:  http://sportsci.org/forum
ACSM Stats Mail List:  http://sportsci.org/acsmstats

Be creative: break rules.






Re: One tailed vs. Two tailed test

2001-03-12 Thread Donald Burrill

On Tue, 13 Mar 2001, Will Hopkins wrote in part:

 Example:  you observe an effect of +5.3 units, one-tailed p = 0.04. 
 Therefore there is a probability of 0.04 that the true value is less 
 than zero.

Sorry, that's incorrect.  The probability is 0.04 that you would find an 
effect as large as +5.3 units (or more), if (a) the true value is zero 
and (b) the sampling distribution of the test statistic is what you think 
it is.  (The probability of finding an effect this large, in this 
direction, is less than 0.04 if the true value is less than zero (and 
your sampling distribution is correct).)
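
This reading is easy to check by simulation: when the true value is
zero, one-tailed p values are uniformly distributed, so about 4% of
samples give p <= .04.  A sketch in Python (numpy and scipy assumed; the
sample size is arbitrary):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 20, 100000
x = rng.normal(0.0, 1.0, size=(reps, n))          # true effect is zero
t = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))
p_one = 1 - stats.t.cdf(t, df=n - 1)              # one-tailed p values
print((p_one <= 0.04).mean())                     # close to 0.04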

  snip  

 But why test at all?  Just show the 95% confidence limits for your 
 effects, and interpret them:  "The effect could be as big as upper 
 confidence limit, which would mean  Or it could be lower 
 confidence limit, which would represent...  Therefore... "  Doing it 
 in this way automatically addresses the question of the power of your 
 study, which reviewers are starting to ask about. If your study turns 
 out to be underpowered, you can really impress the reviewers by 
 estimating the sample size you would (probably) need to get a 
 clear-cut effect.  I can explain, if anyone is listening...

You had in mind, I trust, the _two-sided_ 95% confidence interval!
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128


