Heaven help us all!

2000-08-22 Thread Jerry Dallal

Want a nice classroom exercise?

If I weren't giving you the CNN URL, I wouldn't
blame you for accusing me of making this up!

http://www.cnn.com/2000/HEALTH/diet.fitness/08/21/fat.supplement/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: 2-level factorial design manual analysis

2000-08-22 Thread Donald Macnaughton

I give a tutorial (in the form of two heavily annotated computer
programs) that illustrates a simple general matrix algebra approach
to computing sums of squares in unbalanced (and balanced)
analysis of variance.  The tutorial is in terms of Yates' approach
to visualizing such computations.  (The programs are written
in SAS IML, but one need not understand SAS or IML to understand
the tutorial.)  The tutorial is available at

  http://www.matstat.com/ss/

---
Donald B. Macnaughton   MatStat Research Consulting Inc
[EMAIL PROTECTED]  Toronto, Canada
---


Brian A. Bucher wrote (on 00/8/22)
 Bob Wheeler ([EMAIL PROTECTED]) wrote:
: If you just want an answer, use a statistical
: package such as MiniTab. This ensures that the
: computations will at least be correct, and that
: you will be provided with the appropriate error
: estimates.

 Well, at the same time I want an answer I also want
 to learn the basic mechanics and have a general
 understanding of what I'm doing.  Since there's a
 tradeoff between spending my time learning details
 about DOEs/stats and working on my other projects,
 it becomes an optimization task in itself! :)

: If you must do it yourself, perhaps the best thing
: for you and for this particular problem is to use
: Yates' algorithm. It should be described in BHH,
: but if not, you will find it in many other
: statistical texts.

 Thanks for the info!

 Brian

: Brian A Bucher wrote:
: 
: I'd like to learn how to analyze 4-factor, 2-level
: full-factorial (and maybe fractional factorial) designs.
: My options at this moment are:
: 1. Use Excel
: 2. Learn and use R
: 
: Could someone give me an estimate on how much time it will
: take for me to accomplish either of these?  I'm presently
: reading/reviewing Statistics for Experimenters by Box,
: Hunter, Hunter.
: 
: Thanks for any info,
: Brian
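
As a minimal sketch of the Yates' algorithm suggested above for doing the analysis by hand, here is one way to write it in Python. The function name `yates`, the factor letters A, B, C, ... and the example numbers are illustrative assumptions, not taken from BHH. The responses are assumed to be listed in standard (Yates) order: (1), a, b, ab, c, ac, bc, abc, ...

```python
import string


def yates(responses):
    """Yates' algorithm for an unreplicated 2-level full factorial.

    responses: one response per run, in standard (Yates) order.
    Returns (grand_mean, effects), where effects maps a label such as
    "A", "B" or "AB" to the estimated effect.
    """
    n = len(responses)
    k = n.bit_length() - 1          # number of factors if n is a power of 2
    if n < 2 or 2 ** k != n:
        raise ValueError("number of runs must be a power of 2 (at least 2)")

    col = list(responses)
    for _ in range(k):              # one pass per factor
        # Top half: sums of consecutive pairs; bottom half: differences
        # (second member of each pair minus the first).
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs

    grand_mean = col[0] / n
    effects = {}
    for i in range(1, n):
        # Row i corresponds to the factors whose bits are set in i
        # (bit 0 -> A, bit 1 -> B, ...), which matches standard order.
        label = "".join(
            string.ascii_uppercase[bit] for bit in range(k) if (i >> bit) & 1
        )
        effects[label] = col[i] / (n // 2)   # contrast divided by 2^(k-1)
    return grand_mean, effects
```

For a quick check on a 2x2 design with responses 10, 20, 30, 40 in standard order, `yates([10, 20, 30, 40])` gives a grand mean of 25 and effects A = 10, B = 20, AB = 0. A 4-factor design works the same way with 16 responses. Note that this sketch only produces the effect estimates; it does not supply the error estimates that a statistical package would.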





Re: Student fears (was: Histograms etc.)

2000-08-22 Thread Herman Rubin

In article 41890A69A4EAD211A74600805F9F86F90276@ESC-S1,
Olsen, Chris [EMAIL PROTECTED] wrote:
Herman and All --


 On the other hand, the type of clear measurements and
 formulation in these fields are not generally available
 in psychology.  So they make the massive mistake of 
 letting statistics do their thinking for them.  A typical
 example of this is to convert their data to normality,
 and to scale it to a given mean and variance.  This means
 that the data is a descriptive measure on this ONE group
 of individuals, or possibly even only of the sample taken,
 and that no conclusions should be drawn (although they
 often are) about others.  It is especially bad when
 they allow their moral judgments to cloak their models.



  If you will pardon the interloping of a lurker in this thread, I don't
quite understand the above paragraph, or at least I don't see how the
elements fit into a cohesive whole.  I certainly do not wish to defend
psychology, but am wondering why psych is singled out for these alleged
sins.

1)  I am unaware that there is a "natural" metric, ordained by mother
nature, that would govern the choice of distribution for a set of data.

There can be many natural metrics, but they are all determined
without regard for a frequency distribution.  In most cases,
metrics are determined by additivity properties.  Most of the
units I am aware of in physics either have this property, or
the reciprocals do, or they appear as coefficients in equations.

The others are logarithmic metrics, with the scale so that 
equal differences correspond to equal ratios on an additive
scale.  Such is the case with scales for sound level, stellar
magnitude, and earthquake intensity.

There are some, such as hardness of minerals, but I doubt
that anyone would use Mohs' scale in a regression equation.

 If
normal works for the task at hand, I don't see the problem.

The choice of origin and size of the unit, such as temperature,
is still the specification of a linear scale.  At least it
was intended to be that way.  

 It would seem
to me that transformations of variables, while perhaps clouding and making
more difficult the interpretation of units, is practiced in the "hard"
sciences as well.

The transformations are natural transformations.  There are
very few kinds of transformations considered at all, and
these all are related to absolute considerations, not some
probability distribution of the results.  The only exception
which comes to mind offhand is "half life", and here it is
still a value (time) on an absolute scale.  The type of
values encountered may influence choice of base points and
units, but the type of scale is not so affected.

 In psychology one would be hard pressed to justify many
of the quantities as possessing units, but presuming such units existed
(perhaps time-to-proficiency on some discrimination task) I don't know why
the psychology types are performing any worse a scaling sin than, say, using
logs to convert concentrations of hydrogen ions into pH.  

The psychology types set up their scales so that a certain
proportion of the group for norming lies in a certain
interval; this is quite different than the choice of linear
or logarithmic scale.

A chemist is given the one piece of information that the pH
scale gives the negative logarithm of the hydrogen ion
concentration, and now knows the meaning of any pH value.
He will know how much to use to titrate a solution, for
example.  If a scientist is told that three measurements
are such that the middle one is equidistant from the other
two, one knows what this means in absolute terms.  If a
psychologist is told that one score on a scale determined
by probabilities is between two others, this does not give
that type of information; see further.

2)  I don't understand how the conversion to normality "means" that "the
data is a descriptive measure on this ONE group..."  It would seem to me
that generalizability is a function of sampling or experimental design, not
scaling and transforming.  There would be no more or less generalizability
of a conclusion by virtue (or lack thereof) of a re-scaling or reexpression
of the data.

Suppose psychologist A sets up a scale by studying college
graduates, and psychologist B by studying elementary school
students.  There will not be a simple relationship between
these scales.  On the other hand, if one geographer measures
length in miles, another in kilometers, another in versts,
there is no major problem; twice as long is still twice as
long.  If we encounter extraterrestrial physicists or chemists
or astronomers, setting up the relations between the scales
will be relatively simple.  With one psychologist setting up
a scale based on the normal distribution in China and another
in Algeria, who knows?
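
A small sketch of the contrast drawn above, using made-up numbers: a score scaled by forcing a norming group onto a normal distribution depends on which group was used, while a change of physical units preserves ratios. The function names, the two norming samples, and the mean-100, SD-15 target scale are all hypothetical choices for illustration.

```python
from statistics import NormalDist


def normed_score(raw, norming_sample, mean=100.0, sd=15.0):
    """Scale a raw score by its percentile rank in a norming sample.

    The percentile rank (mid-rank for ties) is pushed through the inverse
    normal CDF, so the result describes position within that sample only.
    """
    n = len(norming_sample)
    below = sum(x < raw for x in norming_sample)
    ties = sum(x == raw for x in norming_sample)
    p = (below + 0.5 * ties) / n
    # Keep p strictly inside (0, 1) so the inverse CDF is defined.
    p = min(max(p, 0.5 / n), 1.0 - 0.5 / n)
    return NormalDist(mean, sd).inv_cdf(p)


def miles_to_km(miles):
    """A linear change of units: twice as long stays twice as long."""
    return miles * 1.609344


if __name__ == "__main__":
    group_a = list(range(20, 61))   # hypothetical norming sample A
    group_b = list(range(5, 36))    # hypothetical norming sample B
    print(normed_score(30, group_a), normed_score(30, group_b))
    print(miles_to_km(6.0) / miles_to_km(3.0))
```

The same raw score of 30 lands below 100 on the scale normed on sample A and above 100 on the scale normed on sample B, and nothing in either scaled value says how to convert one into the other without going back to both samples. The unit conversion, by contrast, needs only one constant.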

3)  I don't quite understand how the psychologists are "letting the
statistics do their thinking for them," or at least how this might
distinguish psychologists from, say, geologists or 

Re: Which statistical test?

2000-08-22 Thread Donald Burrill

On Sun, 20 Aug 2000, jkroger wrote in part:

 I want to show that in some conditions, the difference between the length
 of A's response and B's response is greater than in other conditions:
 duration(A) - duration(B) is significantly greater in some conditions.
 
 I tried a t-test for each condition, subtracting B from A at each interval
 and using a t-test to determine if the resulting sample differed from 0.

Yes, but this does not address what you said you want to show,
which is not that  d(A) - d(B)  differs from zero, but that
 (d(A) - d(B)) in condition 1 (say)  >  (d(A) - d(B)) in condition 2.

(As an aside, using a t-test would be arguably appropriate for a planned 
comparison;  but it is much too sensitive for pursuing comparisons that 
were suggested by the fall of the data, so to speak, which I gather is 
the case in the present instance.)

Presumably you have the mean durations for each cell of the design from 
the ANOVA you mentioned in a subsequent post, and appropriate error mean 
squares for testing assorted null hypotheses (or constructing confidence 
intervals, or both).  Plug these into a post hoc contrast analysis (I'd 
recommend the method of Scheffe', since the phenomenon appears to be one 
you noticed in analyzing the data, not one you anticipated) for the 
contrast 
d(A1) - d(B1) - d(A2) + d(B2)

(where for the hypotheses the d's represent population means, and for 
the analysis one would substitute the observed sample means), for which 
the null hypothesis is that the value of the contrast is zero and your 
conjecture is that the value is positive (although, since it IS a post 
hoc contrast, you should test it against a two-sided alternative 
hypothesis). 

You may in fact have a number of such conjectures that you want 
to pursue;  the virtue of the Scheffe' method (and criterion) is that the 
Type I error rate is "experimentwise".
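
A minimal sketch of the arithmetic for such a post hoc contrast, in Python. It assumes a simple layout with independent cell means and one common error mean square, which may not match the actual (possibly repeated-measures) design; the function name and arguments are hypothetical. The critical F value is supplied by the user from a table or a package rather than computed here.

```python
import math


def scheffe_contrast(means, coeffs, ns, mse, f_crit):
    """Estimate a contrast among cell means with a Scheffe interval.

    means  : observed cell means, e.g. [d(A1), d(B1), d(A2), d(B2)]
    coeffs : contrast coefficients (must sum to zero), e.g. [1, -1, -1, 1]
    ns     : number of observations behind each mean
    mse    : error mean square from the ANOVA (assumed common to all cells)
    f_crit : critical value of F with (k - 1, error df) degrees of freedom,
             where k is the number of cell means in the family
    Returns (estimate, standard_error, (lower, upper)).
    """
    if not (len(means) == len(coeffs) == len(ns)):
        raise ValueError("means, coeffs and ns must have the same length")
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")

    k = len(means)
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = math.sqrt(mse * sum(c * c / n for c, n in zip(coeffs, ns)))
    # Scheffe's criterion: the multiplier is sqrt((k - 1) * F), which keeps
    # the error rate experimentwise over all possible contrasts.
    margin = math.sqrt((k - 1) * f_crit) * se
    return estimate, se, (estimate - margin, estimate + margin)
```

If the resulting interval excludes zero, the contrast is significant by the Scheffe criterion (a two-sided test). With all eight conditions in play, the family of means and the list of coefficients would be longer, with zeros for the cells not involved in a given contrast.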

On Sun, 20 Aug 2000, jkroger wrote:

 I have two timecourse measures, A and B. At 20 consecutive intervals, A
 and B are measured, and the results are plotted. Both signals rise quickly
 to about the same height, then fall. Sometimes A stays elevated longer.
 
 There are eight separate trials (representing eight conditions), producing
 eight pairs of curves.
 
 I want to show that in some conditions, the difference between the length
 of A's response and B's response is greater than in other conditions:
 duration(A) - duration(B) is significantly greater in some conditions.
 
 I tried a t-test for each condition, subtracting B from A at each interval
 and using a t-test to determine if the resulting sample differed from 0.
 Unfortunately, in a couple of conditions it appears the A response is
 about the same as the B response, but the t-test is so sensitive that even
 small differences between A and B produce significance. The t value for
 the condition (#1) which it is important to demonstrate has a longer A
 duration (as is clearly obvious on inspection) is over 38. The conditions
 in which A - B is minimal still have significant t's of 5 or 8 (when a p
 of .05 requires a t of around 2).
 
 So according to the test I've chosen, A-B in almost all of the conditions
 is significant. What test will allow me to reveal the much greater
 significance of condition #1 relative to the others? I thought of
 chi-square (sum(A), sum(B) for all intervals; crossed with 1-8), but as
 chi-square is for frequency data, I'm not sure if it's applicable here.
 
 Thanks for any guidance,
 Jim

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  






Re: Plotting Distribution!!!

2000-08-22 Thread Donald Burrill

For openers, you're going to have to describe your problem with a good 
deal more precision, in order for anyone to provide any kind of useful 
help. 

On Fri, 18 Aug 2000, Veeral Patel wrote:

 I have data whose histogram exhibits a unique distribution.
 I am trying to fit different curves to the data and to see which 
 one has the best fit. 

How did you have in mind assessing goodness of fit?  None of your 
subsequent remarks address this point.

 The first one I am trying is gamma, i got my optimum alpha and beta 
 values.  And then simply fed my data (x values) into the gamma 
 distribution function and got my f(x) values. 

At this point it would be reasonable to ask how well f(x) for each  x  
agrees with the data  h(x)  [for your initial histogram of  x ].

 Now the question is how do I plot these. 

It is not entirely clear what you want to mean by "these".  What 
information do you want to display, and what utility do you want the 
display to have?  If you mean "plot f(x) vs. x", what do you expect 
the plot to tell you?

 I looked at books and for the distribution plots they like have  f(x) 
 on the vertical axis and quantile values on the horizontal.  Now how do 
 I obtain the quantiles or is there another way to do the plot of f(x) 
 and  x??  because if i plot f(x) and x i get weird looking lines on the 
 graph. 

What do you mean by "plot f(x) and x"?  If you were 
plotting  f(x) vs. x  (as one would expect), you should produce a plot 
of the gamma density function.  On the other hand, such a plot would 
provide no obvious information about how well the gamma density fits 
the original histogram.  Perhaps there are some salient features of 
your analysis that you haven't yet told us.  
It is impossible to diagnose problems that are described merely 
as "weird looking".
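
One way to put f(x) and the histogram h(x) on the same footing, sketched in Python under stated assumptions: alpha is taken to be the shape and beta the scale of the gamma distribution (the original post does not say which parameterization was used), and the function names and bin edges are hypothetical. The histogram is rescaled to a density so that its heights are directly comparable with f(x).

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta (an assumption)."""
    if x <= 0:
        return 0.0
    return (x ** (alpha - 1) * math.exp(-x / beta)
            / (math.gamma(alpha) * beta ** alpha))


def histogram_density(data, edges):
    """Histogram heights h(x) scaled so the total area is 1.

    edges must be increasing; each bin is [lo, hi).
    """
    n = len(data)
    heights = []
    for lo, hi in zip(edges, edges[1:]):
        count = sum(lo <= x < hi for x in data)
        heights.append(count / (n * (hi - lo)))
    return heights


def compare_fit(data, edges, alpha, beta):
    """Rows of (bin midpoint, h(x), f(x)) for side-by-side inspection."""
    rows = []
    for (lo, hi), h in zip(zip(edges, edges[1:]), histogram_density(data, edges)):
        mid = (lo + hi) / 2
        rows.append((mid, h, gamma_pdf(mid, alpha, beta)))
    return rows
```

The rows returned by `compare_fit` can be printed as a table or plotted as bars (h) with a curve (f) over the bin midpoints. Because the midpoints are in increasing order, a line drawn through the f values traces the density from left to right. This only lays the two side by side; it is not itself a goodness-of-fit criterion.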

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  






Re: Student fears (was: Histograms etc.)

2000-08-22 Thread dennis roberts

At 02:22 PM 8/22/00 -0500, Herman Rubin wrote:

No geographer would take the heights of mountains and
convert them to a probability scale.


i beg to differ ... for, it is not totally an uninteresting question that 
someone might ask ... for all mountains ... what is the p value for 
selecting at random from all mountains ... one that is a height of 10,000 
feet or more ...

or, just to ask: what would a frequency distribution look like for all 
mountains ... say ... where the scale on the baseline goes from 0 to 2500, 
2501 to 5000, etc. ...

good geographers ... would/should have some knowledge of this ... not that 
they would spend their lives doing these tabulations but, it is part of the 
knowledge base in which they work

if you have lived ONLY  in pennsylvania ... some mountains look pretty TALL 
while others seem rather short ... while, to those who have lived in 
florida all their lives (and never seen pictures or surfed the WWW) ... ALL 
of these look like the alps ... but, if you lived in the alps ... the 
mountains of pennsylvania, even the tallEST ones,  look like little bumps 
on the horizon








teaching software for stats/maths

2000-08-22 Thread Andrew McLachlan

Hi All

Here at my university, the Student Learning Centre (SLC) provides
additional maths and statistics tuition for those struggling with
first year courses. (They teach many other things as well.)  

The SLC wants to know if there are any software teaching packages that
they could get for students, for example on a CD, that the student
could take away and work on themselves (or run from the university
network).  In particular, the SLC is interested in packages or
tutorials that are useful for bridging from high school level (years
11 and 12) to first year university level. 

Of particular interest are the subjects calculus and statistics.
Price is not important at this stage in the search.

Does anyone have experience using such software?
Can anyone suggest products or URLs?

Any help would be greatly appreciated.
Thanks in advance.

--
Andrew McLachlan,  PhD Student.   [EMAIL PROTECTED]
Ecology and Entomology Group,  P.O. Box 84,  Lincoln University, 
Canterbury,  New Zealand.  ph. +64 3 325-2811,  fax. +64 3 325-3844.





lowess

2000-08-22 Thread Ming Hsu

Hi, can someone explain to me the strengths and weaknesses of
lowess regression?  In particular, what have its applications been in the
biological sciences?

Also, how good are the bootstrap methods of computing confidence
regions?

Thanks,
Ming Hsu






Re: Book for second course in undergrad stats.

2000-08-22 Thread Ken Kelley

From experience, the Edwards textbooks are quite appropriate for your needs
(at least I believe they are).  My choice would be the Fifth Edition of his
Experimental Design.  I am not, however, familiar with how readily available
the books are.  Further, it does not cover MANOVA, but it does cover all
other topics you wanted and is quite good in doing so.
K2

"Paul W. Jeffries" wrote:

 Dear list members,

 I would appreciate recommendations for a text to use in an advanced stats
 course.  The students are undergraduate psychology majors who have taken
 the department's intro stats course.  Since their math background is
 limited, I would like a book that develops ideas intuitively rather than
 mathematically.  I would like to cover at least multiple
 regression/correlation, factorial ANOVA, repeated-measures ANOVA, MANOVA,
 ANCOVA/MANCOVA.

 Any suggestions for textbooks are welcome.

 Thank you,
 Paul W. Jeffries
 Department of Psychology
 SUNY--Stony Brook
 Stony Brook NY 11794-2500







Re: teaching software for stats/maths

2000-08-22 Thread T.S. Lim

In article [EMAIL PROTECTED],
  [EMAIL PROTECTED] (Andrew McLachlan) wrote:
 Hi All

 Here at my university, the Student Learning Centre (SLC) provides
 additional maths and statistics tuition for those struggling with
 first year courses. (They teach many other things as well.)

 The SLC wants to know if there are any software teaching packages that
 they could get for students, for example on a CD, that the student
 could take away and work on themselves (or run from the university
 network).  In particular, the SLC is interested in packages or
 tutorials that are useful for bridging from high school level (years
 11 and 12) to first year university level.

 Of particular interest are the subjects calculus and statistics.
 Price is not important at this stage in the search.

 Does anyone have experience using such software?
 Can anyone suggest products or URLs?

 Any help would be greatly appreciated.
 Thanks in advance.

 --
 Andrew McLachlan,  PhD Student.   [EMAIL PROTECTED]
 Ecology and Entomology Group,  P.O. Box 84,  Lincoln University,
 Canterbury,  New Zealand.  ph. +64 3 325-2811,  fax. +64 3 325-3844.


You could try "Fathom: Dynamic Statistics Software". The URL is listed
at

   http://www.kdcentral.com

--
T.S. Lim
[EMAIL PROTECTED]
www.Recursive-Partitioning.com
_
Get paid to write reviews! http://recursive-partitioning.epinions.com







within group agreement for nominal/ordinal data

2000-08-22 Thread Ken Reed

I'm trying to test whether a variable measures a group-level property, and
so I'm looking for an analog to eta-squared, intra-class correlation etc for
nominal or ordinal data.

I have data comprising 2000 workplaces, with samples of individuals drawn
from each (n=20,000).

One variable has 4 categories (agree-neutral-disagree, don't know).

1. How can I estimate how much of the total variability derives from between
groups (workplaces) and within groups?

2. Is there a rule-of-thumb for what would be evidence of strong
within-group agreement?

3. Can I do this in SPSS?



