Re: One-tailed, two-tailed

2001-12-30 Thread Donald Burrill

On Sun, 30 Dec 2001, Stan Brown wrote in part:

 A. G. McDowell [EMAIL PROTECTED] wrote:

 The significance value associated with the one-tailed test will always
 be half the significance value associated with the two-tailed test,
 
 For means, yes.  Not for proportions, I think. 

Oh?  Why not?  Is there something about proportions that militates 
against assigning 1/2 alpha to each tail of the sampling distribution? 
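One point Brown may have had in mind:  the binomial is discrete and (for
p other than .5) asymmetric, so the two-tailed p-value need not be
exactly twice the one-tailed one.  A small Python sketch, with invented
counts:

    from scipy.stats import binomtest

    # 19 successes in 50 trials, testing H0: p = .3  (counts invented)
    one_sided = binomtest(19, 50, 0.3, alternative="greater").pvalue
    two_sided = binomtest(19, 50, 0.3, alternative="two-sided").pvalue
    print(one_sided, two_sided, 2 * one_sided)   # not exactly double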

  {  snip, the rest  }

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Missing data cell problem

2001-12-30 Thread Donald Burrill

In trying to clear out my e-mail inbox, I came across this post, for 
which there seemed not to have been any responses.

On Fri, 2 Feb 2001, Caroline Brown wrote:

 I have an analysis problem, which I am researching solutions to, and 
 David Howell of UVM suggested I mail the query to you.  
 My problem is how to deal with a two-way repeated-measures design, 
 in which one cell could not be measured:
A1  A2  A3
 B1 ok  ok  ok
 B2 -   ok  ok
 B3 ok  ok  ok
 B4 ok  ok  ok
 
 There is a good theoretical reason for this absence, as levels of 
 factor A are set sizes, and A1 is one item, Factor B is cueing to 
 spatial location and in the 1 item set size, there are no other items 
 competing for 'encoding' resources (thus there can be no INVALID cue).
 
 If you know of any texts or papers on this issue, or have any thoughts 
 as to its solution, I would be most grateful.

One approach is to estimate the cell mean in the A1B2 cell, under the 
constraint that it not contribute to the AxB interaction;  and then 
carry out the usual 2-way ANOVA (but with one fewer d.f. for 
interaction). 

If we use the following two contrasts, one for main effects in A and one 
for main effects in B, their product represents a contrast involving the 
12 cells.  Set that contrast equal to zero, so that it doesn't contribute 
to the interaction SS.  (All other interaction contrasts orthogonal to 
this one will not involve the missing cell.)

For A:  2A1 - A2 - A3.  For B:  -B1 + 3B2 - B3 - B4.  Product contrast:
  -2A1B1 + A2B1 + A3B1 + 6A1B2 - 3A2B2 - 3A3B2
 - 2A1B3 + A2B3 + A3B3 - 2A1B4 + A2B4 + A3B4  =  0, whence

 A1B2 = (2A1B1 - A2B1 - A3B1 + 3A2B2 + 3A3B2 + 2A1B3 - A2B3 - A3B3
  + 2A1B4 - A2B4 - A3B4)/6

(where 2A1B3 = twice the cell mean in the (A1,B3) cell, etc.)

You now have cell means for each cell and can carry out the usual ANOVA. 
Because the estimated value of A1B2 infects your A1 average and your B2 
average, the row and column effects (sources A and B in the ANOVA) 
are not, strictly speaking, independent;  although the A2:A3 contrast is 
independent of contrasts involving only B1, B3, B4.
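Numerically (a Python sketch; the cell means below are invented purely
for illustration):

    # Invented cell means for the 3 (A) x 4 (B) design; A1B2 is missing.
    m = {
        ('A1','B1'): 10.0, ('A2','B1'): 12.0, ('A3','B1'): 13.0,
                           ('A2','B2'): 14.0, ('A3','B2'): 15.0,
        ('A1','B3'): 11.0, ('A2','B3'): 13.5, ('A3','B3'): 14.5,
        ('A1','B4'): 10.5, ('A2','B4'): 12.5, ('A3','B4'): 13.0,
    }
    a1b2 = (2*m['A1','B1'] - m['A2','B1'] - m['A3','B1']
            + 3*m['A2','B2'] + 3*m['A3','B2']
            + 2*m['A1','B3'] - m['A2','B3'] - m['A3','B3']
            + 2*m['A1','B4'] - m['A2','B4'] - m['A3','B4']) / 6
    print(a1b2)   # the constrained estimate of the A1B2 cell mean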

Hope this helps (if belatedly!).

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Statistical illiteracy

2001-12-29 Thread Donald Burrill

On Wed, 26 Dec 2001 [EMAIL PROTECTED] wrote (edited):

   I came across a table of costume jewelry at a department store with 
   a sign that said "150% off."   I asked them how much they would 
   pay me to take it all off of their hands.  I had to explain to them 
   what 150% meant, and they then explained to me how percentages are 
   computed in the retail trade:  "first we cut the price in half 
   (50%).  Then we cut it in half again.  Now we have cut it in half 
   a third time.  50% + 50% + 50% = 150% off."
 ...
  ...  if they advertise a 150% discount directly, without referring 
  to the sequence of three 50% discounts, might they not be liable to 
  legal action for misrepresentation?

 I would tell the clerk in the store, "Ah, you get 150% off by taking 
 75%-off of 75%-off.  I'll take it."  (1/16 price vs. 50%-off 50%-off 
 50%-off = 1/8 price.)

Why settle for 1/16?  Take 60% off after 90% off.  Or 55% after 95%. 
Or 50% after 100%, which ought to underline the illogic even for 
arithmetically illiterate retailers.
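The arithmetic, as a Python sketch:

    def price_after(*discounts):
        # Final price fraction after applying each percent discount in turn.
        price = 1.0
        for d in discounts:
            price *= 1 - d / 100
        return price

    print(price_after(50, 50, 50))  # 0.125: the clerk's "150% off" = 1/8
    print(price_after(75, 75))      # 0.0625: 1/16 of the original price
    print(price_after(90, 60))      # 0.04
    print(price_after(100, 50))     # 0.0: free, underlining the illogic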

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Logarithms (was: When to Use t and When to Use z Revisited)

2001-12-25 Thread Donald Burrill

On Tue, 11 Dec 2001, Vadim and Oxana Marmer wrote:

 besides, who needs those tables? we have computers now, don't we?
 I was told that there were tables for logarithms once. I have not seen 
 one in my life. Isn't it the same kind of stuff?

If you _want_ to see one, you have no farther to go than to Sterling 
Library and look up what there is under "mathematical tables."  (Unless, 
in the years since I worked there as an undergraduate, they've thrown 
them all out, which I would hope to be unlikely.)

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: analysis of criterion test data

2001-12-24 Thread Donald Burrill

On 24 Dec 2001, Carol Burris wrote:

 I am a doctoral student who wants to use student performance on a
 criterion test, a state Regents exam, as a dependent variable in a
 quasi-experimental study.  The effects of previous achievement can be 
 controlled for using a standardized test, the Iowa Test of Basic 
 Skills.  What kind of an analysis can I use to determine the effects
 of a particular treatment on Regents exam scores?  The Regents exam
 produces a percentage correct score, not a standardized score,
 therefore it is not interval data, 

Non sequitur, and probably not true.  Percentage correct, if it means 
what it says, is the same variable as number of items correct (merely 
reduced to a percentage by dividing by the total number of items and 
multiplying by 100%), which is about as "interval" as you can expect to get 
in this business.  (Even a standardized score is often but a linear 
transformation of the number of items right.)  

Of course, if you have mis-stated things (I don't have personal knowledge 
of the Regents exams and the marking thereof) and what is produced is a 
set of percentiles rather than percentage correct, THAT variable is not 
interval (although it can be converted to an interval score fairly 
readily, by making some assumptions about the form of the distribution).

 and I can not use analysis of covariance (or at least that is what I 
 surmise from my reading).  Any suggestions?

The phrase "can not" should not even be in your vocabulary in this 
context.  You can ALWAYS carry out an analysis of covariance (or a 
multiple regression analysis, or an analysis of variance;  and any of 
these in their univariate or multivariate forms).  Whether the results 
mean what you would like them to mean is another matter, of course, and 
that depends to some degree on what assumptions you are willing (and what 
assumptions you are UNwilling!) to make about the variables you have and 
about the models you are entertaining.  First carry out your analyses 
(several of them, if you're unsure, as most of us are at the outset, 
which one is "best" in some useful sense);  then look for ways in which 
the universe may be misleading you (or ways in which you may be deceiving 
yourself).  If several analyses seem to be telling you much the same 
thing (at least in a general way), then that thing is probably both 
believable and reliable.  If they tell you different things, you know the 
data isn't different, so the differences must be reflecting differences 
(possibly subtle ones) in the questions being addressed by the several 
analyses:  which in turn means that something interesting is going on, 
and it may repay you well to find out what that something is.

However, if the analysis you think you want is analysis of covariance, 
I'd strongly urge you to carry it out as a multiple regression problem, 
or as a general linear model problem;  analysis of covariance programs 
often do not permit the user to examine whether the slope of the 
dependent variable on the covariate interacts with the treatment variable 
(that is, whether the slopes are different in different groups, thus 
contradicting the too-facile assumption of homogeneity of regression).
Such an interaction does not invalidate the analysis;  it merely makes 
the interpretation more challenging.  And if such an interaction is 
visibly present, the analysis that assumes its absence will in general 
have less power to detect _other_ interesting things.
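In practice (a sketch using Python's statsmodels; the file and variable
names here are invented), the comparison is straightforward:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv('regents.csv')   # hypothetical: regents, itbs, treatment
    full = smf.ols('regents ~ C(treatment) * itbs', data=df).fit()
    common = smf.ols('regents ~ C(treatment) + itbs', data=df).fit()
    print(full.compare_f_test(common))  # F-test of the slope-by-treatment term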

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Is this how you would have done it?

2001-12-23 Thread Donald Burrill

On Sat, 22 Dec 2001, Ralph Noble asked:

 How would you have done this?
 
 A local newspaper asked its readers to rank the year's Top 10 news stories
 by completing a ballot form.  There were 10 choices on all but one ballot
 (i.e. local news, sports news, business news, etc.), and you had to rank
 those from 1 to 10 without duplicating any of your choices.  One was their
 top pick, 10 their lowest.  Only one ballot had more than 10 choices, 
 because of the large number of local news stories you could choose from.
 
 I would have thought if you only had 10 choices and had to rank from 1 to
 10, then you'd count up all the stories that got the readers' Number One
 vote and which ever story got the most Number One votes would have been
 declared the winner.

That is certainly one way of determining a winner.  But if one were 
going to do this in the end, there is not much point to asking for ranks 
other than 1, because that information is not going to be used at all.  
(Unless, of course, one uses a variant of this method for the breaking of 
ties, or for obtaining a majority of votes cast for the winner.)

 Not so in the case of this newspaper.  So maybe I do not understand
 statistics. 

Non sequitur.  You are not discussing statistics, you are discussing the 
choice of methods of counting votes.

 The newspaper told the readers there were several ways it could have 
 tallied the rankings. 

This is true.  "Several" may be an understatement.

 The newspaper decided to weight everybody's responses and gave each 
 first place vote a value of 10, each second place nine, each third place 
 eight, and so on. They then added together the values for each story and 
 then ranked the stories by point totals.
 
 So is this an accurate way to have tallied the votes? 

Why not, assuming they didn't err in their arithmetic?  In what sense do 
you want to mean "accurate"?  I would use the word to describe the care 
with which the chosen method was carried out, not the choice of method, 
as you appear to mean.  "Accurate" ordinarily refers, at least by 
implication, to how closely some standard or other is being met:  what 
standard did you have in mind?

 And why weight them since the pool in all but one category only had 10 
 items to choose from?

One answer is, precisely because all categories (but the one, and you 
haven't quite described what happened to the one, but I'll assume that 
only the ranks 1 to 10 were used in that case) had 10 items.  If you add 
up all the 1st, 2nd, 3rd, etc. votes _without_ weighting them (that is to 
say, weighting them equally instead of unequally), you get the same total 
for each item, and have no way of declaring a winner.  (This may not be 
true for the one category, since there are more than 10 items but only 10 
ranks to be apportioned among them.)

One could, of course, have weighted them according to their ranks
 (1st = 1, 2nd = 2, etc.) and chosen the one with the _lowest_ point 
total.  (This of course is equivalent to what the newspaper actually 
did:  on each ballot an item's points equal 11 minus the newspaper's 
points, so you get the same winners this way.)  Or one could have 
weighted them according to the reciprocals of their ranks (1st = 1, 
2nd = 1/2, 3rd = 1/3, etc.), added those up, and taken the highest 
score.  This is not equivalent to the method actually used, although 
sometimes the results are not different.  Etc.
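A Python sketch of the three tallies, on invented ballots (three items
ranked 1 to 3, for brevity):

    from collections import defaultdict

    ballots = [{'a': 1, 'b': 2, 'c': 3},
               {'b': 1, 'a': 2, 'c': 3},
               {'a': 1, 'c': 2, 'b': 3}]
    n_items = 3

    def tally(weight):
        totals = defaultdict(float)
        for ballot in ballots:
            for item, rank in ballot.items():
                totals[item] += weight(rank)
        return dict(totals)

    print(tally(lambda r: n_items + 1 - r))  # newspaper: highest total wins
    print(tally(lambda r: r))                # rank sums: lowest total wins
    print(tally(lambda r: 1 / r))            # reciprocals: can differ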

If you conclude from all this that the choice of counting method for 
tallying votes is an arbitrary one, you are quite right.  It is.

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: s-function in SPSS curve estimation

2001-12-21 Thread Donald Burrill

On Thu, 20 Dec 2001, Johannes Hartig wrote:

 Does anyone know the original applications
 or the meaning of the S-function in SPSS?
 I know the function itself:
 Y = e**(b0 + (b1/t)) or
 ln(Y) = b0 + (b1/t)
 and I know what the curve looks like, but I am wondering in which 
 fields of research this function is typically used and which empirical 
 relations it describes?

You may find it looks a little more like other functions you have seen 
somewhere if you rewrite it as 
Y = a*e**(b1/t), or equivalently
ln(Y) = ln(a) + b1/t, 
where the value of a is simply e**(b0), from your equation above.

In biological contexts, this describes an exponential growth curve 
(which applies to some period of almost any organism's life, usually 
its extreme youth, before environmental constraints restrict its growth 
rate).  Then the parameter b1 is positive and is intimately connected 
to "doubling time," the length of time during which the organism doubles 
in size.  I suspect that this is why your original formulation had b1/t 
in the exponent.

If b1 is negative, then the equation models exponential decay, and the 
parameter b1 is connected (in exactly the same way as above) to 
"half-life."  Applications include (perhaps obviously) the diminution 
over time of the radioactivity of a radioactive substance.
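A quick way to see the shape (Python; the parameter values are
arbitrary, for illustration only):

    import numpy as np
    import matplotlib.pyplot as plt

    b0, b1 = 2.0, -5.0            # b1 < 0: curve rises toward exp(b0)
    t = np.linspace(0.1, 20, 200)
    y = np.exp(b0 + b1 / t)       # the SPSS "S" function

    plt.plot(t, y)
    plt.axhline(np.exp(b0), linestyle='--')  # horizontal asymptote
    plt.xlabel('t'); plt.ylabel('Y')
    plt.show()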

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






RE: Statistical illiteracy

2001-12-17 Thread Donald Burrill

On Fri, 14 Dec 2001, Wuensch, Karl L wrote:

 I came across a table of costume jewelry at a department store with a 
 sign that said "150% off."   I asked them how much they would pay me to 
 take it all off of their hands.  I had to explain to them what 150% 
 meant, and they then explained to me how percentages are computed in 
 the retail trade:  "first we cut the price in half (50%).  Then we cut 
 it in half again.  Now we have cut it in half a third time. 
  50% + 50% + 50% = 150% off."

Interesting.  Not altogether surprising, though.  In a conversation with a
local bank mortgage person, I explained that part of my income is in
Canadian funds, deposited into my bank in Toronto, and the current
exchange rate is (approximately) 1.50 (Canadian $ for each US $).  She
then wanted to calculate the equivalent US income by discounting the
Canadian value by 50%.  I pointed out that this was incorrect:  one would
discount the Canadian value by 33%.  She said "I hear what you're 
saying," but went on to indicate that it somehow wasn't relevant.  I 
could not tell whether (a) she didn't believe me, (b) she didn't know 
how to deal with the arithmetic of exchange rates, (c) "this is the way 
we do it here," (d) something else, or (e) a combination of the above. 
Whatever the case, I 
decided it would be the better part of valor to deal with another bank. 
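The arithmetic, with an invented figure:

    cad_income = 30_000        # hypothetical amount in Canadian dollars
    rate = 1.50                # CAD per USD
    print(cad_income / rate)          # 20000.0: a 33 1/3 % "discount"
    print(cad_income * (1 - 0.50))    # 15000.0: the banker's 50% discount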

But back to your retail trade:  if they advertise a 150% discount 
directly, without referring to the sequence of three 50% discounts, might 
they not be liable to legal action for misrepresentation?

-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: When to Use t and When to Use z Revisited

2001-12-09 Thread Donald Burrill

On Sun, 9 Dec 2001, Ronny Richardson wrote in part:

 Bluman has a figure (2, page 333) that is supposed to show the student
 "When to Use the z or t Distribution."  I have seen a similar figure in
 several different textbooks. 

So have I, sometimes as a diagram or flow chart, sometimes in paragraph 
or outline form.

 The figure is a logic diagram and the first question is "Is sigma
 known?"  If the answer is yes, the diagram says to use z. I do not 
 question this;  however, I doubt that sigma is ever known in a business 
 situation and I only have experience with business statistics books. 

Depends partly on what parameter one is addressing (either as a 
hypothesis test or as a confidence interval).  For the mean of an unknown 
empirical distribution, I expect you're right.  But for the proportion of 
persons in a population who would want to purchase (for a currently 
topical example) a Segway, the population variance is a known function of 
the proportion (the underlying distribution being, presumably, binomial), 
and for this case the t distribution is simply inappropriate, and one 
ought to use either the proper binomial distribution function, or else 
the normal approximation to the binomial (perhaps after satisfying 
oneself that N is sufficiently large for the approximation to be credible 
with the hypothesized (or observed) value of the proportion;  various 
textbook authors offer assorted recipes for this purpose).
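A sketch of both calculations (Python; the survey counts are invented):

    from math import sqrt
    from scipy.stats import binom, norm

    n, x, p0 = 200, 117, 0.5                  # invented counts
    print(binom.sf(x - 1, n, p0))             # exact one-sided P(X >= x)
    z = (x/n - p0) / sqrt(p0*(1 - p0)/n)      # sigma known under H0
    print(norm.sf(z))                         # normal approximation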

{  Snip, discourse on N = 30, although I'd 
   think it were rather on  df = 30.  }

 However, other authors go well beyond 30.  Aczel (3, inside cover) has
 values for 29, 30, 40, 60, and 120, in addition to infinity.  Levine 
 (4, pages E7-E8) has values for 29-100 and then 110 and 112, along with 
 infinity.  I could go on, but you get the point.  If you always switch 
 to z at 30, then why have t tables that go above 28?  Again, the 
 infinity entry I understand, just not the others. 

{  Snip, assorted quotes ...  }

 So, Berenson seems to me to be saying that you always use t when you
 must estimate sigma using s.  Levine (4, page 424) says roughly the 
 same thing, ...

 So, I conclude  {slightly edited -- DB}

 1) we use z when we know the sigma and either the data are normally
 distributed or the sample size is greater than 30 so we can use the
 central limit theorem. 

I would amend this to "the sample size is large enough that we can...". 
Whether 30 is in fact large enough or not depends rather heavily on what 
the true shape of the parent population actually is.  (If it's roughly 
symmetrical and bell-shaped, 30 may be O.K.)

 2) When n < 30 and the data are normally distributed, we use t. 

 3) When n is greater than 30 and we do not know sigma, we must estimate 
 sigma using s so we really should be using t rather than z. 

 Now, every single business statistics book I have examined, including 
 the four referenced below, use z values when performing hypothesis 
 testing or computing confidence intervals when n > 30. 

 Are they 

 1. Wrong 
 2. Just oversimplifying it without telling the reader 

 or am I overlooking something? 

I vote for both 1. and 2., since 2. is in my view a subset of 1, although 
others may not share this opinion.  I would add 

  3.  Outdated.

on the grounds that when sigma is unknown, the proper distribution is t 
(unless N is small and the parent population is screwy) regardless how 
large the sample size may be.  The main (if not the only) reason for the 
apparent logical bifurcation at N = 30 or thereabouts was that, when 
one's only sources of information about critical values were printed 
tables, 30 lines was about what fit on one page (plus maybe a few extra 
lines for 40, 60, 120 d.f.) and one could not (or at any rate did not) 
expect one's business students to have convenient access to more 
extensive tables of the t distribution.  And, one suspects latterly, 
authors were skeptical that students would pay attention to (or perhaps 
be able to master?) the technique of interpolating by reciprocals between 
30 df and larger numbers of df (particularly including infinity). 

But currently, _I_ would not expect business students to carry out the 
calculations for hypothesis tests, or confidence intervals, by hand, 
except maybe half a dozen times in class for the good of their souls:  
I'd expect them to learn to invoke a statistical package, or else 
something like Excel that pretends to supply adequate statistical 
routines.  And for all the packages I know of, there is a built-in 
function for calculating, or approximating, the cumulative distribution 
of t for ANY number of df.  The advice in any _current_ business-
statistics text ought to be, therefore, to use t _whenever_ sigma is not 
known.  And if the textbook isn't up to that standard, the instructor 
jolly well should be.
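For example (a Python sketch):

    from scipy.stats import t, norm

    for df in (10, 30, 100, 1000):
        print(df, t.ppf(0.975, df))   # two-sided 5% critical values
    print('z:', norm.ppf(0.975))      # 1.9600, the limiting value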

{  Snip, references.  See the original post for more details.  }

-- DFB.
 

Re: What usually should be done with missing values ...

2001-12-02 Thread Donald Burrill

On 1 Dec 2001, jenny wrote:

 What should I do with the missing values in my data.  I need to 
 perform a t test of two samples to test the mean difference between 
 them. 
 How should I handle them in S-Plus or SAS?

1.  What do S-Plus and/or SAS do with missing values by default?  
(All packages have defaults, and sometimes they're even sensible 
ones.  If your package(s) do what you want done, or at least do something 
you can live with, that's probably the most comfortable resolution of 
your question.)

2.  Why are there missing values?  And what do these reasons imply (if 
anything) about the values themselves? 
 There are essentially two choices available: 
  (a) Treat the values as missing:  that is, discard each of the cases 
for which the variable in question is missing, for the duration of the 
analysis of that variable, and retrieve those cases again when dealing 
with some other variable for which their value is not missing.  This is 
the default in MINITAB and SPSS, although for some analyses (in both 
packages) the missing cases are deleted listwise (in multiple 
regression, for example, if any of the variables in the model is 
missing, the whole case is deleted from the analysis) and for some the 
missing cases are deleted pairwise (in reporting a correlation matrix, 
for example, a case is deleted from the computation of a correlation 
coefficient if either of the two variables is missing, but is retained 
for other correlation coefficients for which both variables are 
non-missing in this case).
  (b) Impute some value to the missing variable for this case.  There are 
a great variety of imputation schemes, all of them (so far as I know) 
suffering from the logical defect that one must assume something about 
the missing value, and the assumption may not only be untrue, it may be 
wildly in error.  One approach is to substitute the mean of this variable 
for the missing value;  but if the _reason_ the value is missing implies 
that the actual value is likely to be extremely high or extremely low, 
this is evidently not a good strategy.  Another approach is to use some 
variant of multiple regression to predict the missing value from the 
existing values of other variables;  again, this assumes that the missing 
value would be close to the regression line (or surface), and if the 
_reason_ implies an extreme value or outlier, this is not particularly 
likely to yield a realistic value.

This is of course a simplified account (some might say oversimplified) of 
the problem of missing-ness, but may suggest some useful ideas.  

Personally, I generally prefer to acknowledge that I don't know the value 
that's missing, and let the case be temporarily discarded, at least for a 
first run at an analysis (or series of analyses);  most of the time.
  And if I chose to use a method of imputation, I'd usually want to 
report results both of analyses in which the missing data are honestly 
missing, and analyses in which imputed values are used, so that I (and 
my readers) could see the effect(s) of the imputation.

And since you want to test for differences between means, you almost 
certainly should NOT substitute a _mean_ for any missing value.  If you 
substitute the overall mean, you will tend to diminish the real 
difference, if any, between the two sample means, and if there's a lot of 
missing data you could end up not finding differences where they would 
have been evident if you'd permitted the missing cases to be discarded. 
If you substitute the mean of this subgroup, you will not change the 
apparent difference between the means, but you WILL reduce the 
within-group (pooled or not) variance, so that you will have spuriously 
high sensitivity to differences between the means.
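A toy simulation of that last point (Python; all numbers invented):

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    a = rng.normal(0.0, 1.0, 50)
    b = rng.normal(0.5, 1.0, 50)
    miss = rng.random(50) < 0.3       # 30% of group b missing at random

    print(ttest_ind(a, b[~miss]))     # honest: discard the missing cases
    b_imp = b.copy()
    b_imp[miss] = b[~miss].mean()     # substitute the subgroup mean
    print(ttest_ind(a, b_imp))        # same difference, shrunken variance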

Whether there is an argument that would support any other method of 
imputation in your case, I cannot tell.  I'm inclined to doubt it, but 
that may be merely a reflection of my usual skepticism (or, perhaps, 
curmudgeonliness).

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Evaluating students: A Statistical Perspective

2001-11-27 Thread Donald Burrill

On Tue, 27 Nov 2001, Thom Baguley wrote in part:

 Donald Burrill wrote:
  
  On Fri, 23 Nov 2001, L.C. wrote:
  
   The question got me thinking about this problem as a
   multiple comparison problem.  Exam scores are typically
   sums of problem scores.  The problem scores may be
   thought of as random variables.  By the central limit theorem,
   the distribution of a large number of test scores should look
   like a Normal distribution,
  
  Provided, of course, that the test scores in question are iid.  Now 
  it is possible to imagine that test scores for different persons are 
  measured independently (although I am aware of skepticism in the 
  ranks on this point!), but that they are identically distributed 
  seems unlikely at best.
 
 I'd argue that they probably aren't that independent.  If I ask three 
 questions all involving simple algebra and a student doesn't
 understand simple algebra they'll probably get all three wrong. 

True.  But this does not seem to me to speak to the issue of 
independence, which as I understand it is an assumption that responses 
made by student A to items on a test are unrelated to (i.e., do not 
affect and are not affected by) the responses made by student B to those 
items.  Surely student A, who has not (let us suppose) adequately 
remembered what s/he needs to know of simple algebra, is not to be held 
responsible for the fact that student B doesn't remember any either? 

 In my experience most statistics exams are better represented by a
 bimodal (possibly a mix of two skewed normals) than a normal
 distribution. Essay based exams tend to end up with a more unimodal
 distribution (though usually still skewed).

Interesting.  Scores on my exams tend to be negatively skewed in general, 
and to show evidence of several clusters (that may or may not show up as 
apparent modes):  the several persons at the bottom, often clustered at 
some little distance from their nearest neighbor(s), who almost seem 
determined to fail;  and two to four clusters moving up the scale from 
there, which sometimes fall into ranges useful for grades of D, C, B. 
Sometimes, but not always, there are another few students clustered at 
the top.

-- Don.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Evaluating students: A Statistical Perspective

2001-11-24 Thread Donald Burrill

On Sat, 24 Nov 2001, L.C. wrote:

 Thanks for the reply!
 
 As for the iid, it's reasonable to believe the questions could be 
 drawn from some population.  Why not the answers? 

If the questions are selected in accordance with some table of 
specifications, they are not from _a_ population, but from many;  
and there is no _a priori_ reason I can think of to suppose that 
their item characteristics are iid.

As for the answers, the usual reason for wanting to evaluate students 
is precisely because they are (or one hopes they are!) different in 
their levels of skill (or whatever):  the task is to assess these skill 
levels, and it is nonsense to assume that all the persons are id on the 
measure on which one hopes to identify differences.

 (Hey! I've heard much worse justifications for
 statistical assumptions! :) At any rate, bell curves do
 arise often enough in this context to be written about.

Of course, "bell curve" does not necessarily imply "normal distribution." 
You can get quite nice bell curves from binomial distributions, e.g.
 Also of course, any real data must be discrete, not continuous, so 
cannot technically be normally distributed anyway. 
 (It is possible that the distribution may be more or less well 
approximated by a normal distribution with the same mean & variance, 
but that's not the same thing.) 
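A quick check of the binomial point (Python; n and p arbitrary):

    import numpy as np
    from scipy.stats import binom, norm

    n, p = 40, 0.5
    k = np.arange(n + 1)
    pmf = binom.pmf(k, n, p)                          # a discrete "bell"
    approx = norm.pdf(k, n*p, np.sqrt(n*p*(1 - p)))   # same mean & variance
    print(np.abs(pmf - approx).max())                 # close, but not equal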

 As for wanting gaps in the resulting distribution... That
 was my point.  When you do have a bell curve, it shouldn't
 be satisfying;  it should be disturbing. 

Depends on how bell-like the curve is.  For almost any interesting 
variable that can be measured on humans, one expects rather a lot of 
people in the middle, and progressively fewer toward the extremes, of 
the distribution;  doesn't one?  (And if not, why not?)

 This is the maddening
 aspect of psychometry - they engineer these nice normal
 distributions on which to base their diagnoses. You'd think
 they'd *want* bimodal, discrete, or mixed continuous/discrete
 distributions, but no.  They diagnose by Z scores (thereby
 defining their own prevalences :) and assert that they are
 discovering diseases, and not punishing unusual people.
 
 Best Regards,
 -Larry (And they get to testify in court) C.

Hmm.  This thread started out as evaluating students, in the context of 
classes and teacher-made tests, as I recall.  Not exactly the same thing 
as diagnosing (in a quasi-medical sense) or discovering diseases, I 
shouldn't think.
 One wonders, then, why you aren't posting these complaints in a 
newsgroup of psychometricians, rather than one of statistics teachers?


 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Need help with a probability problem

2001-11-20 Thread Donald Burrill

On 20 Nov 2001, J. Peter Leeds wrote:

 I'm working on a formula for measuring decision making skill and am 
 trying to estimate the probability that a person of known skill can 
 distinguish among different response option contrasts and avoid a type 
 II error. 
One effective way of avoiding a Type II error is to 
reject the hypothesis being tested.  Of course, this entails a non-zero 
probability of making a Type I error...  :-)
Seriously, though, I believe it is not possible to _avoid_ a 
Type II error in the process of accepting the hypothesis being tested; 
one can only [attempt to] control the probability of such an error.  
Perhaps this is what you meant, but it isn't exactly what you wrote.

 The problem actually breaks down to a rather simple analogy:
 
 Imagine that a man has been sentenced by court to run a gauntlet
 composed of four club-wielding executioners.  The court places the best 
 execution 
 You mean "executioner," surely?

 at the beginning of the gauntlet followed by the second, third and 
 fourth best.  Based on past performance the first executioner has a 
 .90 probability of striking the man, while the remaining executioners 
 have .50, .30, and .20 respectively.  What is the man's probability of  
 being struck by at least one of the executioners and how is this 
 calculated? 
 
 Notice that the events are not independent because if the man is fast 
 (or lucky, or skillful?) enough to make it past the first executioner 
 his odds of making it past the rest are improved since he will have 
 survived the best executioner.

In other words, the probabilities associated with the other three 
executioners are NOT .50, .30, and .20 as advertised, but some 
(presumably) smaller values?  In other words, the probability of being 
struck by the second executioner is .50 only if one has already been 
struck by the first executioner?  This doesn't seem very sensible... 
And what model have you (if any) for recalculating the other three 
probabilities for those who manage to escape the first (and then the 
second, and then the third) executioner?  I do not see why you quote 
values of alleged probabilities, only to say in the next breath that 
those probabilities are false.  Nor do I quite believe your assertion 
of non-independence:  seems to me they might very well BE independent, 
if only one knew what the REAL probabilities were.  No?
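For what it's worth, if the quoted values really were independent strike 
probabilities, the at-least-one calculation would be the familiar 
complement rule (a Python sketch):

    p_strike = [0.90, 0.50, 0.30, 0.20]

    p_missed_by_all = 1.0
    for p in p_strike:
        p_missed_by_all *= 1 - p

    print(1 - p_missed_by_all)   # 1 - (.1)(.5)(.7)(.8) = 0.972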

 What is this sort of problem called? (e.g., conditional probability,
 joint probability, Bayesian probability, etc.).  Please excuse the
 inanity of the example but it is much easier than trying to explain my 
 research.

Easier it may be, but one can't help suspecting that some aspects of the 
inanities evident are not paralleled by structures or relationships in 
whatever your real problem is;  which rather vitiates the underlying 
(if unstated) assumption that analysis of the inane example will be in 
some way helpful in analyzing the real circumstances.  Or, to put it 
another way, the inane example may be wholly inadequate as a model for 
whatever phenomenon you're really trying to deal with.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: F distribution

2001-11-18 Thread Donald Burrill

On 17 Nov 2001, Myles Gartland wrote:

 In an F distribution, the critical value for the lower tail is the
 reciprocal of the critical value of the upper tail (with the
 degrees of freedom switched).
 
 Why?  I understand how to calculate it, but do not get why the math
 works.

Essentially for the same reason that in the normal distribution the 
critical value for the lower tail is the negative of the critical value 
for the upper tail.

Think about it.  For F = V1/V2, where V1 and V2 are two variance 
estimates with numbers of degrees of freedom n1 and n2 respectively, 
the relevant F distribution is said to have n1 and n2 degrees of 
freedom, naming the numerator first and then the denominator. 

For F = V2/V1, the relevant F distribution has n2 and n1 d.f. (hence the 
interchange of the numbers of degrees of freedom to which you allude).

Notice that V2/V1 is the reciprocal of V1/V2.  If V1/V2 is sufficiently 
larger than 1 that the hypothesis of equal variances in the populations 
can be rejected, then V2/V1 must be sufficiently smaller than 1 to permit 
rejection.  Hence the critical value for V2/V1 must be the reciprocal of 
the critical value for V1/V2, and the d.f. are interchanged simply by the 
choice of which direction to divide.
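A numerical check (Python; the d.f. chosen arbitrarily):

    from scipy.stats import f

    n1, n2, alpha = 8, 15, 0.05
    lower = f.ppf(alpha, n1, n2)        # lower-tail critical value
    upper = f.ppf(1 - alpha, n2, n1)    # upper tail, d.f. interchanged
    print(lower, 1 / upper)             # the same number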

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: regression on timeseries data: differencing ?

2001-11-15 Thread Donald Burrill

  On Tue, 13 Nov 2001, Wendy (alias Eric Duton?) wrote:
  
   When applying multiple regression on timeseries data, should I check
   (similarly to ARIMA-models) for unit roots in the dependent variable
 
   and the predictor variables and perform the necessary differencing
   
   OR
   
   could I simply start the multiple regression analysis on the pure
   timeseries and  check the residuals on the general assumptions of 
   regression analysis (esp. autocorrelation) ?

and I replied:

  1.  Why do you write as though these were mutually exclusive options?

to which Eric Duton responded:

 Actually I'm a bit confused. When looking at a timeseries course they 
 stress the need for stationarity of the series. 

Courses always simplify, and sometimes oversimplify.

 On the other hand in Multiple regression theory they stress the errors 
 should be iid N(0,constant var). 

I don't know about "should."  It is often convenient if this is true, and 
in the nature of things the observed residuals ("errors") always have 
mean 0 anyway.

 So strictly speaking it seemed to me I shouldn't
 worry about preliminary stationarity tests in multiple regression
 between timeseries and just check the residuals afterwards. But then I
 saw a paper where they did check for stationarity before estimating the
 parameters ... And of course another where they didn't ... Therefore I'm
 totally lost whether I should or should not carry over the preliminary
 stationarity testing into multiple regression theory when confronted
 with timeseries for Y and X's. 

"Should" and "should not" have no meaning to me in the absence of any 
context that would indicate the value system, or perhaps the theology, 
that specifies the nature of "should."  I do not understand why you 
waffle around worrying about "should" when you could have been carrying 
out BOTH analyses, after which you would know if the difference in 
analytical approach entails any difference(s) in results, and whether any 
such difference(s) be interesting enough to pursue and attempt to explain 
(via future research).
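Carrying out both is cheap; a sketch with Python's statsmodels (the file 
and column names are invented):

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.tsa.stattools import adfuller
    from statsmodels.stats.stattools import durbin_watson

    df = pd.read_csv('series.csv')    # hypothetical columns: y, x1, x2

    # Option 1: unit-root test first; difference the series if needed.
    print(adfuller(df['y'])[1])       # p-value of the ADF test

    # Option 2: regress the raw series, then inspect the residuals.
    fit = sm.OLS(df['y'], sm.add_constant(df[['x1', 'x2']])).fit()
    print(durbin_watson(fit.resid))   # near 2 = little autocorrelation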

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: When one should stop sampling?

2001-11-13 Thread Donald Burrill

On 12 Nov 2001, Niko Tiliopoulos wrote:

 I am acting as the stats advisor for my unit in the psychology 
 department of the University of Edinburgh, UK.  Last week a colleague 
 of mine presented me with the following issue, and I am not quite sure 
 how to respond:

 She is running a psychological experiment, in which she a priori 
 specified her sample size as 200 people. 

_Intended_ sample size, surely?  Where was this specified?  In a proposal 
for funding the research in question?  There may be a non-statistical 
question lurking in the underbrush here, about the degree to which she is 
(or thinks she is, or may be made to appear as though she is) committed 
to actually _using_ 200 people, and what the penalty (if any) is for not 
using that many.

 She has already sampled 40 participants and a preliminary effect size 
 (ES) analysis suggests an almost zero effect.  Based on previous 
 research, she was expecting a detectable effect even with 40
 subjects - though I suspect she was not expecting enough power to get 
 a significant result at that stage.  In addition, it appears that the 
 reason the ES she gets is nowhere close to the expected figure may be 
 because of a design flaw. 

Do you mean a flaw in the original design (that is, in the logic) of the 
study, or a glitch in carrying out some intended aspect of the design 
(that is, in the implementation of a perfectly adequate design)?  And is 
there evidence on the alleged flaw, persuasive enough to convince T.C. 
Mits (or equivalent!) that this isn't just a weak excuse invented to 
conceal some more culpable defect?  (Although I have difficulty imagining 
what I might have meant by that phrase!)

 So she asked me whether it is justified to go up to, say, 100 
 participants, check again her ES and if it's still near zero, stop 
 sampling, 

Why waste the time and energy of _another_ 60 subjects, if one really 
believes [part of] the problem is a design flaw?  The more obvious 
approach would be to fix the (newly-discerned?) problem, and start over 
with a new batch of Ss.  Is there a justifiable reason for NOT fixing 
defects when they've been found?

 or whether she had to sample all 200 people because she had said so in 
 her protocol?

Her protocol (by which I suppose you mean her research proposal?) surely 
embodied, at least implicitly, the conditions that the research design was 
competently done, and that the procedures were being (or, were to be) 
carried out consistently with the design.  The presence of a design flaw, 
even if it's only a putative one, denies one or both of these conditions, 
and therefore logically revokes any implicit responsibilities to carry 
out the entire protocol as originally specified.  However, I can imagine 
scenarios in which a legal (as distinct from logical) responsibility may 
exist that would need to be addressed in legal terms.  (I can't tell 
whether any such thing applies in this case, of course.)

 I do think it would be foolish to keep sampling when one has grounds to 
 believe that there is no effect or that there is a flaw in the study.

Right.  That's called sending good money after bad, and (at least 
according to North American folklore) Scots are noted for their antipathy 
toward any such activity.

 I believe that if the plot of subjects versus power, suggested that 
 the power curve levelled after a given sample size, that would be enough 
 justification to stop sampling (needless to say that participants that 
 satisfy her protocol are precious and hard to find). 

Probably overkill, and quite possibly impossible.  (After all, we keep 
being reminded that the null hypothesis is never actually true, which 
implies that the ES is not exactly zero, which implies that with a 
sufficient sample size (maybe ten million or so?) the power curve would 
indeed level out -- near power = 1.0.)  If one wanted to invoke a 
statistical argument (in the face of whatever logical argument and/or 
evidence exists of a design flaw and/or of an ES an order of magnitude 
smaller than one had reason to expect in the beginning), it might be more 
persuasive to show that an upper bound on ES (say, the top of a 95% 
confidence interval) would imply no practical value whatever for so small 
an ES.  (Presumably the presence of an interestingly large ES would have 
implied some change, or recommendation for change, in practice 
somewhere.) 

 Her query though sounds to me more like a methodological (if not 
 ethical) one, rather than a true statistical problem, and thus this 
 bottom-up justification may not suffice.

Dunno.  Attempts to identify the pure effect of any problem or 
condition (e.g., to distinguish between inherited and environmental 
influences in the presence of both) are usually doomed to failure by 
their very nature.  And are you suggesting that there IS, or could be, 
some justification for imposing a failed experimental protocol on a 
group of innocent bystanders (the additional Ss whose time  

Re: regression on timeseries data: differencing ?

2001-11-13 Thread Donald Burrill

On Tue, 13 Nov 2001, Wendy (alias Eric Duton?) wrote:

 When applying multiple regression on timeseries data, should I check 
 (similarly to ARIMA-models) for unit roots in the dependent variable 
 and the predictor variables and perform the necessary differencing
 
 OR
 
 could I simply start the multiple regression analysis on the pure 
 timeseries and  check the residuals on the general assumptions of 
 regression analysis (esp. autocorrelation) ?
 
 Wendy

1.  Why do you write as though these were mutually exclusive options?

2.  Why did you send three (!) copies to the list?
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Evaluating students

2001-11-13 Thread Donald Burrill

On Wed, 14 Nov 2001, Alan McLean wrote in part:

 Herman Rubin wrote:
  
  A good exam would be one which someone who has merely
  memorized the book would fail, and one who understands
  the concepts but has forgotten all the formulas would
  do extremely well on.
 
 Since to understand the concepts almost always means understanding 
 (and hence knowing) the formulas, I would interpret someone who has
 'forgotten all the formulas' as understanding the concepts only in 
 the most superficial manner, and so should do badly!

Non sequitur.  To know formulas (in a deep sense of understanding them) 
is one thing;  to be able to write them verbatim is another thing 
altogether (and something that xerographic copiers do better than people 
do, by and large).  Of course, it is easier to ask questions about the 
details of formulas than to probe a student's deeper understandings...

 Overall, the evaluation of students is driven mostly by budget,
 (lecturers') time, lecturers' interest, the number of students, 
 politics - the best one can do is to assess students as honestly as 
 possible within the range allowed by these factors!

Sadly, this is true;  and not infrequently exacerbated by administrative 
rulings (not to say interference!).  At the university where I teach 
part-time, for example, course marks are to be submitted within 72 hours 
of the final examination.  Not a circumstance that encourages (let alone 
rewards) setting the kinds of exams that Herman describes.

-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Z Scores and stuff

2001-11-10 Thread Donald Burrill

You persist in repeating your original request in your original phrasing, 
with no elaboration(s) that might resolve the ambiguities therein.

On Sat, 10 Nov 2001, Mark T wrote:

 On Fri, 09 Nov 2001  Rich Ulrich [EMAIL PROTECTED] wrote:
 
  On Thu, 8 Nov 2001  Mark T [EMAIL PROTECTED] wrote:

   What are the formulae for calculating the mean to z, larger
   proportion and smaller proportion of a z-score (standardised score) 
   on a standard normal distribution?  I know about tables listing 
   them all, but I want to know how to work it out for myself :o)

  Do you want the calculus, or just a numerical approximation?
  
  For starters, in my stats-FAQ, see
  
  http://www.pitt.edu/~wpilib/statfaq/gaussfaq.html

 Thanks for your reply.  Ummm, unfortunately I don't understand this :o) 

Not surprising.

 I am by no means a mathematician.  I am studying psychology and 1/4 of 
 my course is statistics *for psychology*, ie it's pretty basic without 
 any of the advanced stuff (I hope!). 

A pity, if true.  Adequate practice of psychology requires considerably 
more than a minimum knowledge -- and understanding! -- of statistics.

 All I want to know, for interest's sake, is how one calculates the 
 mean to z, 

Yes, you said that before.  In the same words.  For the sake of 
(possibly) furthering the conversation, I will assume that what you 
meant was something like "Given a value x of a variable X, which has a 
known mean, how does one convert x to z?"  (Your language admits of 
several other possible meanings, but I'll leave it to you to clarify what 
you intended, if it wasn't what I've conjectured (and if you can).)

The formula you request, for this purpose, converts x to z:

z = (x - mean)/sd

where sd is the known standard deviation of the variable X.
 Now, I'm sure your statistics instruction includes this equation;  it 
follows that the question you really want to ask is (probably) something 
else.  In which case we all await with interest your clarification. 

 larger proportion and smaller proportion of a standardised score, 
 without having to read through a long list of numbers. 

Hmm.  Numbers scare you, do they?  

There are essentially three ways 
of going about this part:

1.  Look the proportions up in a table of the standard normal 
distribution, which by your account you are apparently too lazy to do. 
Sounds as though you're being inefficient, by the way:  there's no need 
to "read through a long list of numbers," only to look up a single 
number in the table (the other proportion you can get by subtracting 
from 1). 

2.  Use convenient statistical software (MINITAB, SAS, SPSS, a TI-83 
calculator, etc.) to calculate the proportions by numerical 
approximation.  This of course does not satisfy your request for 
the formulae.

3.  Start with the mathematical expression for the density function of a 
standard normal distribution, and integrate it from minus infinity to z. 
Which is what Rich was referring to when he asked if you wanted the 
calculus.  Again, by your account you haven't the mathematics for this; 
especially as the integral in question does not exist in closed form.
(Which, of course, is precisely the reason why tables were constructed 
in the first place, to avoid a _very_ tedious computational chore every 
time one had a value of z for which proportions, or probabilities, were 
needed.)
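For completeness, a sketch of option 2 (Python):

    from scipy.stats import norm

    z = 1.37                          # any z of interest
    print(norm.cdf(z) - 0.5)          # "mean to z"
    print(norm.cdf(z))                # larger proportion (for positive z)
    print(norm.sf(z))                 # smaller proportion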

 Forgive me if that was covered in your FAQ, but I couldn't see
 it!  Perhaps you could point me in the direction of the formulae? 

Forgive me if my candour is uncomfortable, but this sounds to me very 
like asking a sorcerer for the spell(s) you think he uses.  Do you want a 
magic wand also, and perhaps a cloak of invisibility?

I am reminded of the time, years ago, when the mother of a high-school 
student telephoned me for assistance in a problem the boy had been set by 
his math teacher.  (I noticed at the time that it wasn't the _boy_ who 
called me.)  He'd been asked to figure out the possible scores one could 
get in a hand at cribbage (or perhaps to explain why a score of 19 is not 
possible -- I don't remember precisely).  Mother was sure there must be a 
formula for doing this (she evidently looked on mathematics as you do, 
as a domain wholly of magic and populated by sorcerers), and was audibly 
disappointed to be told "The only way to do this is to enumerate the 
possible hands."
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Can I Use Wilcoxon Rank Sum Test for Correlated Clustered Data??

2001-11-01 Thread Donald Burrill

On Thu, 1 Nov 2001, Chia C Chong wrote:

 I am a beginner in statistical analysis and hypothesis testing.  I have 
 2 variables (A and B) from an experiment that was observed for a 
 certain period of time.  I need to form a statistical model for these 
 two variables. 

Seems to me you're asking in the wrong place.  The _model_ cannot be 
determined statistically, nor (in general) by statisticians.  It arises 
from the investigator's knowledge of the substantive area in which the 
experiment was carried out, and of the reasons why the experiment was 
designed & conducted in the first place.  Given a model, or, better, a
series of more or less complex models, a statistician can help you decide 
among them, and can help you arrive at numerical values for (at least 
some of) the parameters of the models.

 As an initial step, I plot the histograms of A & B separately to
 see how the data were distributed. 

How would you (or the investigator) expect them to be distributed?  
In particular, why would you think they might follow any of the usual 
theoretical distributions?  (In other words, what's the theory behind 
your expectations -- or your lack of expectations?)

 However, it seems that both A & B can't be easily described by simple 
 statistical distributions like Gaussian, uniform etc via visualisation.  
 Hence, I proceeded to plot the Quantile-Quantile plot (Q-Q plot) 

What did you think this would tell you?

 and tried to fit both A and B with some theoretical distributions 
 (all distributions available in Matlab!!).  Again, none of the 
 distributions seems able to describe them completely.  Then I was 
 trying to perform the Wilcoxon Rank Sum test. 

What hypothesis were you testing, and why was the Wilcoxon test relevant 
to it?

 From the data, it seems that A & B might be correlated in some sense.

You have not described a scatterplot of A vs. B (or B vs. A, whichever 
pleases you).  Why not?

 My question is:  can I rely purely on the Wilcoxon Rank Sum Test to 
 find the parameters of the distributions that can describe A & B??

Since the Wilcoxon is allegedly a distribution-free test, I'm quite 
bemused by the idea that it might help one _find_ parameters...

 How do I perform a test to see whether A & B are really correlated??  

Practically all pairs of variables are correlated, to one degree or 
another.  What will it signify to you if A and B are (or are not) 
"really correlated" (whatever "really" is intended to mean)?
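The first two things to look at, as a Python sketch (data invented):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import pearsonr, spearmanr

    rng = np.random.default_rng(1)
    a = rng.normal(size=100)
    b = 0.4 * a + rng.normal(size=100)

    plt.scatter(a, b); plt.xlabel('A'); plt.ylabel('B'); plt.show()
    print(pearsonr(a, b))    # linear correlation, with a p-value
    print(spearmanr(a, b))   # rank-based (distribution-free) version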

 What if A and/or B are an overlay of two or more distributions?? 

Hmm.  By "overlay," do you mean "mixture," perhaps?

 Can this test tell me??  What makes things more tricky is that 
 clustering was also observed in both A & B.
 
At the same times, or in the same places?

 I really hope to get an idea how to start with the statistical analysis 
 for this kind of problem... 

I'm sorry, but I don't yet perceive precisely what the problem is that 
the data were intended (or designed?) to address.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: What is a confidence interval?

2001-10-30 Thread Donald Burrill

In reviewing some not-yet-deleted email, I came across this one, and have 
no record of its error(s) having been corrected.

On Sat, 29 Sep 2001, John Jackson wrote:

 How do you describe the data that do not reside in the area
 described by the confidence interval?
 
 For example, you have a two tailed situation, with a left tail of .1, a 
 middle of .8 and a right tail of .1, the confidence interval for the 
 middle is 90%.

Well, no.  You describe an 80% C.I., not a 90% C.I.

 Is it correct to say with respect to a value falling outside of the 
 interval in the right tail:
 
 For any random interval selected, there is a .05% probability that the 
 sample will NOT yield an interval that yields the parameter being 
 estimated, and additionally such interval will not include any values 
 in the area represented by the left tail. 

If you're still referring to the 80% C.I. introduced above, .05% 
probability is not applicable.  [Not even if you had stated it 
correctly, either as .05 probability or as 5% probability.  ;-) ]

 Can you make different statements about the left and right tail?

Not for the case you have described.  Had you chosen to compute an 
asymmetric C.I. (perfectly possible in theory, hardly ever done, so far 
as I am aware, in practice) it would be otherwise.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Comparing percent correct to correct by chance

2001-10-28 Thread Donald Burrill

On Sun, 28 Oct 2001, Melady Preece wrote:

 Hi.  I want to compare the percentage of correct identifications (taste 
 test) to the percentage that would be correct by chance, 50% (only two
 items being tasted).  Can I use a t-test to compare the percentages?  
 What would I use for the s.d. for the by-chance percentage?  (0?)

Standard comparison would be the formal Z-test for a proportion;  see 
any elementary stats text.  If you have a reasonably large sample size, 
use the normal approximation to the binomial;  if you have a small 
sample, it may be necessary to use the binomial distribution itself, 
which is considerably more tedious unless you have comprehensive tables.

Sounds as though you'd wish to test  H0: P = .50  vs.  H1:  P > .50.
For the Z-test, use the S.D. of a proportion associated with the 
hypothesized value (.5):  SD = SQRT(pq/n)  where  p = the hyp. value 
(.5 in this case),  q = 1-p,  n = sample size.
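
For concreteness, a minimal sketch of that Z-test in Python (the scipy 
import and the sample numbers are illustrative assumptions, not from the 
original post):

  from math import sqrt
  from scipy.stats import norm

  n, correct = 40, 27                 # hypothetical taste-test results
  p0 = 0.5                            # chance proportion under H0
  p_hat = correct / n
  se = sqrt(p0 * (1 - p0) / n)        # SD of a proportion under H0
  z = (p_hat - p0) / se
  p_value = 1 - norm.cdf(z)           # one-tailed, H1: P > .50
  print(z, p_value)                   # 2.21..., 0.013...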

You may want to examine the translation of chance into a proportion of 
.5.  I don't think I know what by chance means in the context of your 
investigation;  certainly .5 is a possible interpretation, but I can 
imagine situations where it would be incorrect.  (For example, if the two 
items are always presented in the same order, and there is a predilection 
in your population to identify the first correctly more frequently than 
the second, just because they're first and second, the chance 
hypothesis might be more properly represented by a number > .5.  This 
problem might be countered if the items were presented in counterbalanced 
order.)
Also, if the respondents know beforehand what the two items are 
(just not which one is which), the situation is different from one in 
which the two items might (so far as the respondents know) come from a 
long-ish array of items.  Thus if the task were to decide between 
chocolate and strawberry, the latter might be mis-identified more 
often if raspberry were [thought to be] a possible alternative.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Graphics CORRESPONDENCE ANALYSIS

2001-10-24 Thread Donald Burrill

On Wed, 24 Oct 2001, Rich Ulrich wrote in part:

 It has been my impression (from google) that CA is more popular 
 in European journals than in the US, so there might be better
 sites out there in a language I don't read.

(CA = correspondence analysis, 
 ou en francais  analyse des correspondances)

In Canada, and to a lesser extent in the U.S., correspondence analysis is 
also known under the name dual scaling.  For references consult 
Professor Emeritus Shizuhiko Nishisato of the University of Toronto:
Shizuhiko Nishisato [EMAIL PROTECTED].
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Final Exam story

2001-10-13 Thread Donald Burrill

The story is about six students who ...   The instructor ... tells them 
to report the next day for an exam with only one question.  If they all 
get it right they all pass. They were seated at corners of the room and 
could not communicate. 

Must have been an interesting room, with six corners :)

The one question was, Which tire?  I remember that the likelihood of 
all four picking the same tire was quite small, but I forgot how to 
calculate it explicitly.

Assuming an ordinary vehicle with 4 tires, and that the students' 
responses are independent:  the probability that all six name one 
_particular_ tire is (1/4)^6 = 1/4096;  the probability that they merely 
all agree on _some_ tire (which is what the instructor checks) is 
4 x (1/4)^6 = (1/4)^5 = 1/1024.

I would particularly appreciate a general solution (N students, M 
tires).

For a room with N corners?
The generalization ought to be obvious:  with N students and M tires, 
(1/M)^N for a designated tire, and M x (1/M)^N = (1/M)^(N-1) for mere 
agreement.
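
In code, a minimal sketch of both versions (Python;  the function names 
are mine, not the poster's):

  def p_all_name_given_tire(n_students, m_tires):
      # every student names one designated tire
      return (1 / m_tires) ** n_students

  def p_all_agree(n_students, m_tires):
      # every student merely gives the same answer, whichever tire it is
      return m_tires * (1 / m_tires) ** n_students    # = (1/m)^(n-1)

  print(p_all_name_given_tire(6, 4))   # 0.000244... = 1/4096
  print(p_all_agree(6, 4))             # 0.000976... = 1/1024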

On 12 Oct 2001, Dubinse wrote:

 I had promised a colleague a story that illustrates probability and 
 now I forgot how to solve it formally.  The story is about six
  students who go off on a trip and get drunk the weekend before
 their statistics final.  They return a few days late and beg for a
 second chance to take the final exam.  They tell a story about how
 they were caught in a storm and their car blew a tire and ended up
 in a ditch and they needed brief hospitalization etc.  The instructor 
 seems very easy going about the whole thing and tells them to report
 the next day for an exam with only one question.  If they all get it 
 right they all pass.  They were seated at corners of the room and could 
 not communicate.  The one question was, Which tire?  I remember that
 the likelihood of all four picking the same tire was quite small, but I
 forgot how to calculate it explicitly (except for listing all the 
 possible outcomes).  
 
 I would particularly appreciate a general solution (N students, M 
 tires). 
 Thanks.
 Stephen Dubin VMD
 http://www.hometown.aol.com/dubinse
 [EMAIL PROTECTED]

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Bimodal distribution

2001-10-13 Thread Donald Burrill

On Fri, 12 Oct 2001, Desmond Cheung (of Simon Fraser University, 
Vancouver, BC) wrote:

 Is there any mathematical analysis to find how much the two peaks stand 
 out from the other data?  

Hard to answer, not knowing where you're coming from with the question. 
Any answer depends on the model(s) you wish to entertain that would 
generate a bimodal distribution.  The more usual question, I believe, 
is how much separation there is between the modes (peaks), which is 
a horizontal distance, rather than how much the modes stand out from the 
other data, which rather sounds like a vertical distance.
One suspects that you might usefully begin by consulting the 
literature on mixtures of normal distributions, or perhaps on mixtures 
more generally.
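
If a mixture model is entertained, fitting one is straightforward;  a 
minimal sketch in Python (scikit-learn assumed;  the data are simulated 
stand-ins, since none were posted):

  import numpy as np
  from sklearn.mixture import GaussianMixture

  rng = np.random.default_rng(0)
  # simulated bimodal data: a mixture of two normal components
  x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 200)])

  gm = GaussianMixture(n_components=2).fit(x.reshape(-1, 1))
  print(gm.means_.ravel())                  # the two estimated peak locations
  print(np.sqrt(gm.covariances_.ravel()))   # component SDs
  print(gm.weights_)                        # mixing proportions

The distance between the two estimated means is then one natural measure 
of the separation between the peaks.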

 Is there any formulas to find the variance/deviation/etc that's 
 similar to the unimodal distribution case? 

Formulas for variance, std. deviation, etc., do not depend on the shape 
of the distribution, except insofar as the functional form of the 
distribution may lead to a simpler formula, as in the case of a binomial 
distribution.  Otherwise, if you want/need the variance (etc.) of a bimodal 
distribution, use the same formulas you use for any other empirical  
distribution.

Incidentally, you write the unimodal distribution case as though there 
were only one unimodal distribution.  There are lots.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: ranging opines about the range

2001-10-06 Thread Donald Burrill

 William B. Ware [EMAIL PROTECTED] wrote:

 Anyway, more to the point... the add one is an old argument based on 
 the notion of real limits.  Suppose the range of scores is 50 to 
 89.  It was argued that 50 really goes down to 49.5 and 89 really 
 goes up to 89.5.  Thus the range was defined as 89.5 - 49.5... thus 
 the additional one unit...

I recall textbooks (in the late 1960s and 1970s) that defined both an 
exclusive range (= max - min) and an inclusive range 
(= max - min + 1), the latter invariably being illustrated with examples 
of data that came in integers.  (In fact, the examples _may_ always have 
been of variables that were counts.)

On Sat, 6 Oct 2001, Stan Brown replied:

 Perhaps a better argument is that if you count the numbers you get 
 forty of them: 50, 51, 52, ..., 59 makes ten, and similarly for the 
 60s, 70s, and 80s.

I see the argument, but I don't know as I'd call it better.  Seems to 
be confusing apples with oranges.  By the idea of range, does one want 
to mean the _distance_ between the largest and smallest values in the 
data, or the _number_of_different_values_ between those two extremes? 
(These are NOT equivalent concepts!)
 And if the latter is of interest, does one want the number of different 
values _in_this_data_set_, or the number of _possible_ different values 
that might have been observed (under what hypothetical conditions?)?
The inclusive range rule supplies the latter (under the assumption 
that the possible values can only be integers, which is an interesting 
restriction in itself) -- but not for all imaginable variables.
 [Counterexample:  What's the range of possible values of a hand in 
cribbage?  The smallest possible value is 0, the largest is 29.  The 
exclusive range (in a possibly artificial data set that includes all 
possible hands, or at least all possible values) is 29-0 = 29.  The 
inclusive range is 30, which is the number of integers between 0 and 
29 inclusive.  The number of _actual_values_ that can possibly be 
observed is 26 (of the integers from 0 to 29, the scores 19, 25, 26, 
and 27 are not possible values for a cribbage hand).]

Anyway:  one justification for arguing about how to calculate the range 
lies in not having decided whether one wants to mean range in the 
sense of distance in the measured variable, or range in the sense of 
number of [possible?] different values of the measured variable, and 
indeed in not having perceived that there _is_ such a distinction to be 
made.
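
The distinction is easy to make concrete;  a minimal sketch in Python 
(the scores are invented for illustration):

  data = [50, 52, 52, 57, 63, 71, 89]           # hypothetical integer scores

  exclusive_range = max(data) - min(data)       # distance: 39
  inclusive_range = max(data) - min(data) + 1   # count of possible integers: 40
  n_distinct = len(set(data))                   # values actually observed: 6

  print(exclusive_range, inclusive_range, n_distinct)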

As William Ware reminds us, in the idea of range as distance, there 
may still be a distinction to be made based on the size of the units of 
measurement to which the measured variable is reported, and on whether 
one wishes to include the (presumed) half-units at either end of the 
empirical distribution (or, for variables like age that are customarily 
truncated rather than rounded, the (presumed) whole unit at the right 
end).  The inclusive argument seems essentially to require 
(i) that the latent variable being measured be continuous, 
   (ii) that one knows the precision of measurement to which the measured 
variable is being reported, and 
  (iii) that one wishes not so much to describe the (empirical) sample in 
hand as to make inferences to the population from which one conceives it 
to have been drawn, under a specific (but usually only IMplicit) model 
under which the observed values are thought to have been derived from the 
latent values.

Hmph.  Didn't intend to be quite so long-winded.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128





Re: Help with Minitab Problem?

2001-09-30 Thread Donald Burrill

Turns out the method I originally suggested is unnecessarily cumbersome. 
A more elegant method is described below.

On Sat, 29 Sep 2001, Donald Burrill wrote in part:

   COPY c1-c35 to c41-c75;   #  Always retain the original data
   OMIT c1 = '*';
   OMIT c2 = '*';
   . . . ;
   OMIT c35 = '*'.
 
 There is probably a limit on the number of subcommands that MINITAB 
 can handle (or on the number of OMIT subcommands that COPY can handle), 
 but I don't know offhand what it is.  

Well, the limit is one:  only one OMIT subcommand per COPY command. 
That makes this procedure distinctly tedious, for 35 columns.

A more efficient method:
ADD c1-c35 c36
 This puts the sum of c1-c35 in c36, but if any one (or more) of c1-c35 
are missing, the result is missing:  so c36 has '*' for every row where 
there is a missing datum in some column(s).  A reasonable next step is
to see how much data is left:
N c36
 reports the number of non-missing values in c36.  If that value is zero, 
or some other very small number, you might want to re-think your 
strategy before proceeding:
COPY c1-c35 c41-c75;
OMIT c36 = '*'.
 Columns c41-c75 now contain only rows of the original c1-c35 for which 
all of the values are NON-missing.

 snip, the rest 
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: E as a % of a standard deviation

2001-09-29 Thread Donald Burrill

On Sun, 30 Sep 2001, John Jackson wrote:

 Here is my solution using figures which are self-explanatory:
 
 Sample Size Determination
 
 pi = 50%
 confidence level = 99%   (central area 0.99;  each tail area 0.005)
 sampling error E = 2%
 z = 2.58   (Excel function for the central-interval critical value:
 NORMSINV($B$10+(1-$B$10)/2) = 2.5758, with $B$10 = 0.99)
 n1 = 4,146.82
 n = 4,147
 
 The algebraic formula for n was:  
 
   n = pi(1-pi)*(z/e)^2
 
 
 It is simply amazing to me that you can do a random sample of 4,147 
 people out of 50 million and get a valid answer. 

It is not clear what part of this you find amazing.  
(Would you otherwise expect an INvalid answer, in some sense?)
The hard part, of course, is taking the random sample in the first 
place.  The equation you used, I believe, assumes a simple random 
sample, sometimes known in the trade as a SRS;  but it seems to me 
VERY unlikely that any real sampling among the ballots cast in a 
national election would be done that way.  I'd expect it to involve 
stratifying on (e.g.) states, and possibly clustering within states; 
both of which would affect the precision of the estimate, and therefore 
the minimum sample size desired.
As to what may be your concern, that 4,000 looks like a small 
part of 50 million, the precision of an estimate depends principally 
on the amount of information available -- that is, on the size of the 
sample;  not on the proportion that amount bears to the total amount 
of information that may be of interest.  Rather like a hologram, in 
some respects;  and very like the resolving power of an optical 
instrument (e.g., a telescope), which is a function of the amount of 
information the instrument can receive (the area of the primary lens 
or reflector), not on how far away the object in view may be nor what 
its absolute magnitude may be.
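
As an arithmetic check on the quoted computation, a minimal sketch in 
Python (scipy assumed):

  from math import ceil
  from scipy.stats import norm

  conf, e, pi = 0.99, 0.02, 0.5
  z = norm.ppf(conf + (1 - conf) / 2)   # as in Excel's NORMSINV: 2.5758...
  n = pi * (1 - pi) * (z / e) ** 2      # n = pi(1-pi)*(z/e)^2
  print(z, n, ceil(n))                  # 2.5758..., 4146.82..., 4147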

 What is the reason for taking multiple samples of the same n - 
 to achieve more accuracy? 

I, for one, don't understand the point of this question at all.
Multiple samples?  Who takes them, or advocates taking them?

 snip, the rest 

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Help with Minitab Problem?

2001-09-29 Thread Donald Burrill

I second Dennis' question.  While indeed MINITAB recognizes the missing
values, what it does with them depends on the procedure being used: 
e.g., for CORRelation it uses all cases for which each pair of variables
is complete (pairwise deletion of missing data), and therefore, for a
data set like yours, the numbers of cases (as well as the particular set
of cases) used for each correlation coefficient are possibly different; 
whereas for REGRession, where any of the variables named on the REGRession
command is missing, the case is deleted (listwise deletion).  Whether it
is even useful to construct a subset of the data for which all variables
are non-missing depends on how badly infected the variables are with
missing data, and on whether the missing data occur in (useful?) patterns. 
If you have about 10% missing in each column, unsystematically spread
through the set of columns, you could end up with a subset containing zero
cases. 
To answer your question however, on the (possibly unjustified) 
assumption that it's a useful thing to do:

COPY c1-c35 to c41-c75;   #  Always retain the original data
OMIT c1 = '*';
OMIT c2 = '*';
. . . ;
OMIT c35 = '*'.

There is probably a limit on the number of subcommands that MINITAB 
can handle (or on the number of OMIT subcommands that COPY can handle), 
but I don't know offhand what it is.  (It is also imaginable that the 
OMIT subcommand permits naming more than one column, which would greatly 
simplify things, but I am inclined to suspect not.)  If 35 subcommands 
are too many, proceed in batches of, say, 10 (or whatever):  
copy c1-c35 to c41-c75, omitting '*' in c1-c10;  
then copy c41-c75 to c81-c115, omitting '*' in c51-c60;  
then copy c81-c115 back to c41-c75, omitting '*' in c101-c110; 
then copy c41-c75 to c81-c115, omitting '*' in c71-c75.
 Finally, to check that no missing values have been retained, count the 
number of missing values in that set of columns:
NMISS c81
NMISS c82
. . . 
NMISS c115
To avoid having to inspect the result for each column, store the NMISSes 
in 35 constants:
NMISS c81 k1
NMISS c82 k2
. . .
NMISS c115 k35
 copy them into an unused column somewhere (e.g., c116):
COPY k1-k35 c116
 and verify that they're all zero by  
SSQ c116  
which will return 0 iff all values in the column are 0.

An easier way of verifying that there are no missing values in c81-c115 
is to call for the INFO window (or give the INFO command:
INFO c81-c115 )
which will report, inter alia, the number of missing values in each 
column.  (I prefer the command in this situation, to avoid being 
confused by information about columns not relevant to the question.)

On Fri, 28 Sep 2001, John Spitzer wrote:

 I have a dataset which has about 35 column.  Many of the cells have
 missing values.  Since MINITAB recognizes the missing values, I can
 perform the statistical work I need to do and don't need to worry 
 about the missing values. 
Perhaps you don't need to, but you probably should.

 However, I would like to be able to obtain the subset of observations 
 which MINITAB used for its calculations. 
As remarked above, this subset may vary from one pair of columns 
to another, or from one list of columns to another, depending on the 
calculations being performed.  Yes, you definitely should worry about 
the missing values.

 I would like to be able to create a worksheet with only the rows from 
 my dataset which do NOT contain any missing values.
Which may or may not correspond to any particular subset of the 
data that MINITAB defined for its work.

 snip, hypothetical example 

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: E as a % of a standard deviation

2001-09-29 Thread Donald Burrill

On Fri, 28 Sep 2001, John Jackson wrote in part:

 My formula is a rearrangement of the confidence interval formula shown 
 below for ascertaining the maximum error.
   E = Z(a/2) * SD / SQRT(n)
 The issue is you want to solve for N, but you have no standard 
 deviation value.
Oh, but you do.  In the problem you formulated, unless I 
misunderstood egregiously, you are seeking to estimate the proportion of 
defective (or pirated, or whatever) CDs in a universe of 10,000 CDs. 
There is then a maximum value for the SD of a proportion:  
SD = SQRT[p(1-p)/n]
where  p  is the proportion in question,  n  is the sample size.
This value is maximized for  p = 0.5  (and it doesn't change much 
between  p = 0.3  and  p = 0.7 ).  If you have a guess as to the value 
of  p,  you can get a smaller value of  SD,  but using  p = 0.5  will 
give you a conservative estimate.
You then have to figure out what that 5% error means:  it might 
mean +/- 0.05 on the estimated proportion p (but this is probably not a 
useful error bound if, say, p = 0.03), or it might mean 5% of the 
estimated proportion (which would mean +/- 0.0015 if p = 0.03). 
(In the latter case, E is a function of p, so the formula for n 
can be solved without using a guesstimated value for p until the last 
step.) 
Notice that throughout this analysis, you're using the normal 
distribution as an approximation to the binomial b(n,p;k) distribution 
that presumably really applies.  That's probably reasonable;  but the 
approximation may be quite lousy if  p  is very close to 0 (or 1).
The thing is, of course, that if there is NO pirating of the CDs, p=0, 
and this is a desirable state of affairs from your clients' perspective. 
So you might want to be in the business of expressing the minimum  p 
that you could expect to detect with, say, 80% probability, using the 
sample size eventually chosen:  that is, to report a power analysis.

 The formula then translates into n = (Z(a/2)*SD/E)^2   
   Note: ^2 stands for squared.
 
 You have only the confidence interval, let's say 95% and E of 1%.  
 Let's say that you want to find out how many people in the US have 
 fake driver's licenses using these numbers.  How large (N) must your 
 sample be?

Again, you're essentially trying to estimate a proportion.  (If it is 
the number of instances that is of interest, the distribution is still 
inherently binomial, but instead of  p  you're estimating  np,  with 
SD = SQRT[np(1-p)]
 and you still have to decide whether that 1% means +/- 0.01 on the 
proportion p or 1% of the value of np.
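
The two readings of E lead to very different sample sizes;  a minimal 
sketch in Python (scipy assumed;  the guessed p is an illustrative 
assumption):

  from math import ceil
  from scipy.stats import norm

  z = norm.ppf(0.975)                 # 95% confidence
  p = 0.03                            # guesstimated proportion

  e_abs = 0.01                        # E = +/- 0.01 on the proportion itself
  n_abs = p * (1 - p) * (z / e_abs) ** 2

  e_rel = 0.01 * p                    # E = 1% of the proportion
  n_rel = p * (1 - p) * (z / e_rel) ** 2

  print(ceil(n_abs), ceil(n_rel))     # 1118 versus 1242078
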
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: One more time--Two Factor Kruskal-Wallis

2001-09-22 Thread Donald Burrill

Hi, Carol.  I'm taking the liberty of posting this to the Edstat 
(statistical education) list as well as the Minitab list.

On Fri, 21 Sep 2001, Carol DiGiorgio wrote:

 My question is:  I would like to run 2-way ANOVA on my data.  
 Unfortunately it doesn't meet the assumptions of normality or 
 homogeneity of variance.  I've worked with the data to find a 
 transformation, but have been unable to find one.

1.  Which assumption of normality?  The only one that comes close to 
being _required_ is the assumption that the _residuals_ from the model 
are normally distributed.  (I ask, because it seems often to be believed 
that the raw variable itself, infected by possible effects of the design 
factors, should be normally distributed;  this is not the case.)

2.  How badly unequal are your cell variances?  Unless they vary by at 
least an order of magnitude, unequal variances won't much affect your 
conclusions, and if your cell n's are equal (or if not, if the cells 
with the larger variances have the larger n's), the size of the test 
(that is, the empirical P-level) will be not far from the nominal value.

3.  Unequal variances will affect the sensitivity of post hoc 
comparisons, however. 

 I want to run a non-parametric 2-way ANOVA using Minitab, and determine
 whether the factors or the interaction are significant (I'm guessing a 2
 factor Kruskal-Wallis, but I don't know what tests exist).  If any of
 the factors were significant I would like to run a non-parametric
 multiple comparison test to determine where there are significant
 differences.  Is it possible to do this in Minitab (or any other
 statistical program)? 

If I were doing it, I'd run an ordinary two-way ANOVA, using either 
TWOWAY  or  ANOVA;  or, if the design were unbalanced, using GLM (since 
neither TWOWAY nor ANOVA will handle unbalanced data).  Then inspect the 
pattern(s) among the means, probably displaying them graphically, with an 
eye toward possible useful interpretations.

If I were really concerned that the unequal variances might represent 
something real in the population of interest (rather than an 
inconvenience of sampling, in this particular sample), I'd convert the 
dependent variable to ranks (in another column of the worksheet!) and 
repeat the two-way analysis on the ranks.  This would give you the 
equivalent of a two-way Kruskal-Wallis, or a Friedman, test.
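
Outside MINITAB, the same rank-transform idea takes only a few lines;  a 
minimal sketch in Python (statsmodels assumed;  the file and column 
names are invented):

  import pandas as pd
  from scipy.stats import rankdata
  import statsmodels.formula.api as smf
  from statsmodels.stats.anova import anova_lm

  df = pd.read_csv("cells.csv")        # hypothetical columns: y, factorA, factorB
  df["ry"] = rankdata(df["y"])         # rank the response; ties get mid-ranks

  fit = smf.ols("ry ~ C(factorA) * C(factorB)", data=df).fit()
  print(anova_lm(fit, typ=2))          # two-way ANOVA on the ranks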

You haven't described your data well enough for me to tell whether a 
Friedman test is appropriate (see FRIEDMAN in the MINITAB Reference 
Manual).  If it is not, you can ALWAYS simulate a two-way analysis in the 
framework of a one-way analysis by identifying each cell separately:  
e.g., a 3x4 two-way ANOVA can be analyzed as a one-way ANOVA with 12 
levels.  (This would apply to KRUSKAL-WALLIS (q.v.) as well as to 
ONEWAY.)  You just have to be clever, afterwards, in defining the 
particular contrasts (or sets of contrasts) that identify what a two-way 
analysis would have reported as main effects and interactions -- but, 
again, that's just a matter of displaying the cell means (or medians) 
in the form of a two-way layout.

 Thank you in advance.  Carol

HTH.-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






RE: effect size/significance

2001-09-13 Thread Donald Burrill

On Thu, 13 Sep 2001, Paul R. Swank wrote in part:

 Dennis said
 
 other than being able to say that the experimental group ... ON AVERAGE ...
 had a mean that was about 1.11 times (control group sd units) larger than
 the control group mean, which is purely DESCRIPTIVE ... what  can you say
 that is important?
 
 However, can you say even that unless it is ratio scale?

Yes, well, Dennis was referring to a _difference_.  When the underlying 
scale is interval, differences ARE ratio scale:  zero means zero.
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Definitions of Likert scale, Likert item, etc.

2001-09-07 Thread Donald Burrill

On Sat, 8 Sep 2001, Magenta wrote in part:

  (responding to Rich Ulrich's remark:)
  Michelle, I hope that  you now know that you got  tangled up in
  hypothetical illustrations which you now regret.
 
 Sure do, I think that if you redid it so that the scale was now:
 
 don't agreestrongly agree
  |___|
 
 that would give you a ratio scale between no agreement and strong 
 agreement. 

Well, in SOME circumstances, perhaps it might;  but I don't see a 
persuasive rationale for it WOULD give you a ratio scale [emphasis, 
obviously, added].

 You would then be able to use, e.g. ANOVA, on your test results, which 
would be numeric in millimeters.

Or other units of length -- sixteenth-inches, micro-furlongs, etc.
But really, you don't need a ratio scale for ANOVA, you know.  
At most you need an interval scale, and even then approximately 
(that is, approximately interval) works very well much of the time.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






RE: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-30 Thread Donald Burrill

On Tue, 28 Aug 2001, Dennis Roberts wrote in part:

 however ... the flagging of outliers is totally arbitrary ... i 
 see no rationale for saying that if a data point is 1.5 IQRs away from 
 some point ... that there is something significant about that

If the data are normally distributed (or even approximately so, what 
seems to be called empirically distributed these days), the 3rd 
quartile + 1.5 IQR locates a point about 2.7 std. devs. above the mean;  
symmetrically, the 1st quartile minus 1.5 IQR gets you about 2.7 SDs 
below the mean.  Those fences bracket roughly the central 99.3% of a 
normal distribution, so the rule flags well under 1% of well-behaved 
data as potential outliers.

A cutoff far enough out that only a fraction of a percent of ordinary 
normal data would be flagged was, I believe, the underlying rationale 
for Tukey's choice of the region box +/- 1.5 IQR as a rule-of-thumb 
(or convention) for initial identification of potential outliers.

On the question of whether the whiskers of a box--whisker plot should 
be made to cease at box +/- 1.5 IQR, note that some current 
undergraduate textbooks distinguish between a quick boxplot which 
shows the range but not outliers, and a full boxplot which uses the 
box +/- 1.5 IQR rule.  (Of course, if there are no outliers -- by that 
definition -- the two are identical.)
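
The arithmetic behind the fences, as a minimal sketch in Python (scipy 
assumed):

  from scipy.stats import norm

  q1, q3 = norm.ppf(0.25), norm.ppf(0.75)    # -0.6745, +0.6745 for standard normal
  iqr = q3 - q1                              # 1.349 SDs
  upper_fence = q3 + 1.5 * iqr               # about 2.70 SDs above the mean
  outside = 2 * (1 - norm.cdf(upper_fence))  # fraction of normal data flagged
  print(upper_fence, outside)                # 2.698..., about 0.007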

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: a problem.

2001-08-27 Thread Donald Burrill

On Sun, 26 Aug 2001 [EMAIL PROTECTED] wrote:

 I have trouble to solve this probability problem.  Hope get help here. 
 
 There are N balls.  Pick up M1 balls with replacement from them.
 What is the expected value of different balls we pick up?

Expected value of what characteristic of the balls?  
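
If the intended question is the expected _number_ of distinct balls 
drawn, the standard answer (by summing indicator variables over the 
balls) is N(1 - (1 - 1/N)^M1);  a minimal sketch in Python checks this 
by simulation:

  import random

  def expected_distinct(n_balls, m_draws):
      # each ball is missed by all m draws with probability (1 - 1/n)^m
      return n_balls * (1 - (1 - 1 / n_balls) ** m_draws)

  def simulate(n_balls, m_draws, reps=20000):
      total = 0
      for _ in range(reps):
          total += len({random.randrange(n_balls) for _ in range(m_draws)})
      return total / reps

  print(expected_distinct(10, 7))   # 5.2170...
  print(simulate(10, 7))            # close to the same value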

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: adjusted r-square

2001-08-21 Thread Donald Burrill

On 21 Aug 2001, Atul wrote:

 How do we calculate the adjusted r-square when the error degrees of 
 freedom are zero ?  (Or in other words, number of samples is equal to 
 the number of regression terms including the constant.)
 Such a situation leads to a zero in the denominator in the expression
 for calculating adjusted r-square.

Depends in part on the expression you use, but in any case you also get 
a zero in the numerator.  Cf. Draper & Smith, Eq. (2.6.11b):  the 
right-hand expression indeed contains (n-p) in the denominator, but it 
also includes (1-R^2) in the numerator, which produces the indeterminate 
quotient 0/0.  In the middle expression of that equation, the quotient 
(residual SS)/(n-p) appears, which is also 0/0.

All of which only emphasizes that the result of any analysis for which 
the error d.f. = 0 is meaningless:  whether r-square, or regression 
coefficients, or error mean square, ... .  Statistical conclusions 
cannot, in general, be drawn from such an analysis.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: large N, categorical outcomes, significance?

2001-08-18 Thread Donald Burrill

One approach:  (I assume that by residual you mean (O-E)/sqrt(E) for 
each cell of a two-way frequency table, where O=observed frequency and 
E=expected frequency under the null hypothesis).  For the several (or 
the single) largest residual(s), report O and E as proportions (of total 
N).  Express the residual in terms of proportions, which will turn out 
to include N (or its square root) as a factor.  Show that the residual 
can be whatever it was (105.6, say) only if N is as large as it is in 
your dataset, and that the same proportions for some smaller (more 
reasonable?) N would _not_ produce a significant residual.

For purposes of this exercise, you could express the total chi-square 
in terms of proportions and N, and show that for the observed proportions 
only values of N larger than some value would produce a significant 
result;  or you could take, for any single cell, a critical value for 
chi-square with one d.f.  
 (One could argue for d.f. = (r-1)(c-1)/(rc), since the table has rc 
cells but only (r-1)(c-1) d.f., but 1 d.f. is arguably conservative, 
and finding critical values for fractional d.f. may be difficult.) 
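
To see the N-dependence explicitly, rewrite the cell residual in 
proportion terms:  (O-E)/sqrt(E) = sqrt(N)(p_o - p_e)/sqrt(p_e).  A 
minimal sketch in Python (the proportions are invented):

  from math import sqrt

  p_o, p_e = 0.052, 0.050    # hypothetical observed and expected cell proportions

  def residual(n):
      # (O - E)/sqrt(E), rewritten as sqrt(N)(p_o - p_e)/sqrt(p_e)
      return sqrt(n) * (p_o - p_e) / sqrt(p_e)

  for n in (500, 5000, 50000, 500000):
      print(n, round(residual(n), 2))
  # the same small difference in proportions exceeds 1.96 only when N is huge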

On 17 Aug 2001, JDriscoll wrote:

 I have a large dataset (N can be 2,000-9,000) with
 mostly categorical outcome variables.  Any
 chi square is significant with residuals of 100+
 for tiny differences.  I  know one can determine
 effect size for continuous variables and show
 result is sign only due to size of the N, but...how
 do I do this for categorical outcome variables?
 Thanks!

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Presenting results of categorical data?

2001-08-15 Thread Donald Burrill

On 14 Aug 2001, Nolan Madson wrote:

 I have a data set of answers to questions on employee performance. 
 The answers available are:
 
 Exceeded Expectations
 Met Expectations
 Did Not Meet Expectations
 
 The answers can be assigned weights  [that is, scores -- DFB]
 of 3,2,1 (Exceeded, Met, Did Not Meet).
 
 Our client wants to see the results averaged, so, for example, we see 
 that all employees in all Ohio offices for the year 2001 have an
 average performance rating of 1.75 while all employees in all Illinois 
 offices have an average performance rating of 2.28.
 
 One of my colleagues says that it is not valid to average categorical
 data such as this.  His contention is that the only valid form of
 representation is to say that 75% of all respondents ranked Ohio
 employees as having Met Expectations or Exceeded Expectations.

Your colleague is correct about categorical data.  It is not clear 
whether he be correct about data such as this.  Your responses are 
clearly at least ordinal (in the order you gave them, from most effective 
to least effective).  The question is whether the differences between 
adjacent values are both approximately equal:  that is, whether 
Exceeded Expectations is roughly the same distance (in some 
conceptual sense) from Met Expectations as Did Not Meet Expectations 
is.  (And whether this be the case for all the variables in question.) 
These are difficult questions to argue in the abstract, either on 
theoretical or empirical grounds -- although for empirical data you 
could always carry out a scaling analysis and see if the scale values 
thus derived are approximately equidistant.

Probably more important than arguing about whether your data are only 
nominal (i.e., categorical), or only ordinal or of interval quality 
is, what do your clients (and/or the publics to whom they report) 
understand of various styles of reportage?  I suspect that some folks 
would be much happier with 75% of respondents in Ohio met or exceeded 
expectations, while only 60% of respondents in Illinois did so, 
together with a statement that the difference is significant (or not), 
than with a statement like all employees in all Ohio offices ... had an
average performance rating of 1.75 while all employees in all Illinois 
offices had an average performance rating of 2.28, also with a statement 
about the statistical value of the distinction.  OTOH, some people prefer 
the latter.  No good reason not to report in both styles, in fact.

 Can anyone comment on the validity of using averages to report on
 categorical data?  

Well, now, as the question is put, the answer is (of course!) that 
averages are NOT valid for categorical data (unless the categories are 
at least ordinal and more or less equally spaced).  But that begs the 
question of whether categorical data be an adequate description of YOUR 
data.  I'd judge it is not:  it appears to be at least ordinal.  The 
question whether it be also interval, at least approximately, depends on 
the internal representations your respondents made of the questions and 
the possible responses, which is a little hard to find out at this point. 
However, if (as is often the case) the response medium depicted the three 
possible responses on a linear dimension and at equal intervals, it's a 
reasonably good bet that most of your respondents internalized that 
dimension accordingly.

 Or point me to reference sources which would help
 clarify the issue?--  Nolan Madson

I doubt that references would help much in dealing with the facts of the 
matter, although they might provide you some information and help you to 
sound more erudite to your clients...  This is essentially a measurement 
issue, so appropriate places to look are in textbooks on educational or 
psychological measurement.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: sampling/subsample query

2001-08-01 Thread Donald Burrill

Some clarification would help.  See below.

On Wed, 1 Aug 2001, Teen Assessment Project wrote:

 I have an overall sample of 5000+  from 40+ different towns and 6 
 different grades. 

In approximately equal numbers per town/grade, or not?  
Are all 6 grades (which grades?) represented in each town?
Do they always coexist within schools, or are they divided (e.g., 
between junior-high schools and high schools)?  (Etc.)
How were these cases sampled from the population?  
(And possibly relevant:  how large is the population?)
[Bluntly:  How do you know the overall sample is worth using as a 
standard of comparison, as appears to be desired?]

 One person wants to look at a subsample of 200 from specific towns and 
 grades and compare this subsample with the rest of the group on outcome 
 variables.   Advice appreciated here.

Why 200?  (Arbitrary round number?  Result of power calculation?  
 Maximum size dictated by constraints you haven't bothered to mention? 
 Outcome of consulting the entrails of a NH Red chicken?)
 Why not _all_ the respondents from the specified towns?  

 The only demographics are self-reported family structure and
 maternal/paternal education. 

If you know the names of the towns (which seems to be implied by the 
description of the desired subsample), you also know the population of 
the towns and some (admittedly rather general) other demographic 
information:  e.g., whether the school is located in a community (and 
what kind of community, e.g., Manchester HS West) or in a wilderness 
(Vox clamantis in deserto, as they say at Dartmouth, which would also 
be appropriate for John Stark Regional HS, in the wilds of western 
Weare).  In the larger towns, do you also know the names of the schools 
containing the members of your sample?  That might provide additional 
detail.  (Or not, for city schools that draw students from outlying more 
rural or suburban areas.)

 Ideas:  1) I could try to match the demographics/grades of the 
 selected 200 with 200/5000 other subjects.

Why would you wish to do that?  You write above, One person wants to 
look at a subsample of 200 ... and compare this subsample with the rest 
of the group on outcome variables.  The matching you propose would 
seem, on the face of it, to invalidate any comparison between the groups 
to be compared. 

 2) I could randomly  select 200/5000 other subjects and test to see if
 there is a sign difference in the demographics.  
[One presumes sign here is a contraction of significant,
 and does not (necessarily) imply a sign test.]

True.  This does not appear to be what One person wants to do, though, 
which is to compare (= test?) for differences in the outcome variables.
Something's missing here.  What does One person really want to do (or 
say s/he wants to do), when not constrained to speak in a kind of 
pseudo-statistish language?  What theory informs the intent of the 
proposed study (or, if no theory, what kinds of practical decisions 
might it be reasonably expected to lead to?)?

 3)??  4)??  Alternative sampling procedures aren't useful to 
contemplate in the absence of design or purpose information.

 Outcome variables are all categorical --

By this do you mean that they are all of yes/no or true/false form 
(or equivalent)?  Or are some of them a choice of one from among several 
named categories?  Or multiple choices among multiple categories?  
 Are any of these sets of ordered categories (such as one might elicit 
from Likert-type items)?
 Do the variables come in sets (or dimensions) that lend themselves to 
any kind of summary scoring?  (E.g., total # of categories of this kind 
that are true or yes (or whatever) for this particular case.)
 Ought you to be doing some sort of scaling analysis on the categories, 
to produce interval-level scaled variables?  (Search on dual scaling 
and correspondence analysis.)

 assuming chi-squares testing here. 

There are a variety of kinds of chi-square tests.  If you are (as one 
suspects) referring to two-dimensional cross-classification tables, and 
testing the independence of classifications, this is of course possible.  
It may not be optimal:  depending in large part on what the _real_ 
questions are that One person wants to address, and on the nature(s) 
of the variables of interest.  Scaling of the category systems would 
yield variables you could subject to various linear models -- multiple  
regression, analysis of variance/covariance, Hotelling's T-square, etc. 


 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128




Re: log

2001-07-31 Thread Donald Burrill

On 31 Jul 2001, ToM wrote:

 what is the opposite of a log?[logarithm]

An antilog [properly, antilogarithm].  Equivalently, 10 to that power 
(if, as in your example, you are taking logarithms to the base 10);  or 
e to that power (if you are taking natural logarithms), which is also 
called the exponential function, exp( ).

 If you do lg10 of 3 in spss, it gives you a number.  how can i take
 this number and have as a solution the initial one (3)?

lg10(3) = 0.47712.  In SPSS-speak, 10**(0.47712) = 2.99996, i.e. 3 up to 
rounding error in the logarithm.

For natural logarithms,  ln(3) = 1.0986,  exp(1.0986) = 2.99996, again 3 
up to rounding.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: web page to help use normal table

2001-07-28 Thread Donald Burrill

Use the table twice -- for P(0 < Z < z1) and P(0 < Z < z2) -- and then subtract 
or add, depending on whether the desired signs of z1 and z2 are the same 
or different.   -- DFB.
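
In code form, the add-or-subtract rule looks like this;  a minimal 
sketch in Python (scipy assumed):

  from scipy.stats import norm

  def p_between(z1, z2):
      # P(z1 < Z < z2) via a 0-to-z table: the entry for z is Phi(|z|) - 0.5
      a1 = norm.cdf(abs(z1)) - 0.5        # table entry for z1
      a2 = norm.cdf(abs(z2)) - 0.5        # table entry for z2
      if (z1 < 0) == (z2 < 0):            # same side of 0: subtract
          return abs(a2 - a1)
      return a1 + a2                      # opposite sides of 0: add

  print(p_between(-1.0, 2.0))   # 0.8186...
  print(p_between(1.0, 2.0))    # 0.1359...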

On Sat, 28 Jul 2001, Cantor wrote:

 I did not try to examine your work thoroughly, but at the very beginning 
 I tried to compute P(z1 < Z < z2);  there is only z1 which can be 
 changed.  What about z2?

in response to EAKIN MARK E [EMAIL PROTECTED], who had written:

  I have just finished creating an ASP web page that will help students 
  use a normal table that gives probabilities for ranges of the 
  standard normal that start at 0 up to a Z value. If you wish to try 
  it, go to 
  http://www2.uta.edu/eakin/busa3321/normaltable/p2.asp

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: confidence interval

2001-07-28 Thread Donald Burrill

If you don't happen to have a convenient  r -- Z  conversion table 
handy, it may be helpful to know, for step 1. below, that
Z = 0.5 log((1+r)/(1-r))  or, equivalently, 
Z = tanh^(-1)r = the hyperbolic arctangent of r.
 (log is the natural logarithm.)

It follows that, given a value of  Z  (for step 4.), 
r = (exp(2Z)-1)/(exp(2Z)+1)  
  where exp(2Z) is  e  to the power  2Z,  or, equivalently,
r = tanh(Z) = the hyperbolic tangent of Z.

The standard error of  Z  is 1/sqrt(n-3)  (for step 2.),
  where sqrt(n-3) is the square root of (n-3).

On Sat, 28 Jul 2001, dennis roberts wrote:

 one way is:
 
 1. convert sample r to Fisher's BIG Z (consult conversion table)
 2. find standard error of Fisher's Z ... (find formula in good stat book)
 3. for 95% CI ... go 1.96 standard error (from #2) units on either side of 
 Z (from #1)
 4. convert EACH end of the CI in Fisher Z units back to r values (use table 
 from #1 in reverse)
 
 At 05:28 AM 10/22/99 -0200, Alexandre Moura wrote:

 how can I construct a confidence interval for a Pearson correlation?
 
 Thanks in advance.
 
 Alexandre Moura.
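
Putting the four steps and the formulas together, a minimal sketch in 
Python (the r and n are invented for illustration):

  from math import atanh, tanh, sqrt

  r, n = 0.45, 50               # hypothetical sample correlation and sample size
  z = atanh(r)                  # step 1: Fisher's Z = 0.5 log((1+r)/(1-r))
  se = 1 / sqrt(n - 3)          # step 2: standard error of Z
  lo, hi = z - 1.96 * se, z + 1.96 * se   # step 3: 95% CI on the Z scale
  print(tanh(lo), tanh(hi))     # step 4: back to r: about (0.196, 0.647)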

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: need help with SAS

2001-07-27 Thread Donald Burrill

On Fri, 27 Jul 2001, Nadine Wells wrote in part:

 Does anyone know what the power link function does in SAS?  [...] when 
 I plot the equation based on the parameter estimates, the model doesn't 
 seem to look like I want it to.  [...]  I am trying to get SAS to run a 
 model that resembles exponential growth.  Any suggestions would be 
 greatly appreciated.

Nadine, if you want to model exponential growth, why are you trying to 
use a power function?  For simple exponential growth, 

Y = a e^(bT)   where T is time (years, in your case?) and  Y  is
the variable whose growth is modelled.  Then

log(Y) = log(a) + bTwhich is a simple linear regression.

It is easy to show that the doubling time is  log(2)/b  (all logs are 
natural logs, of course).  It then remains to adapt the model to include 
your habitat complexity.  How best to do this is not clear to me;  but 
at least you could start by breaking that variable into (probably 
ordered?) categories, and try fitting a separate exponential function for 
each such category (like an ANOVA), perhaps subject to one or more 
constraints (e.g., a common value of  b  for all habitats -- that 
analysis would resemble, formally, an analysis of covariance but you 
might well prefer to model it in multiple regression terms). 
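
The log-linear fit itself is short;  a minimal sketch in Python rather 
than SAS (numpy assumed;  the data are invented):

  import numpy as np

  t = np.array([0, 1, 2, 3, 4, 5], dtype=float)      # hypothetical years
  y = np.array([3.1, 6.0, 12.4, 23.9, 49.0, 97.1])   # hypothetical growing counts

  b, log_a = np.polyfit(t, np.log(y), 1)   # fit log(Y) = log(a) + b*T
  print(np.exp(log_a), b)                  # estimates of a and b
  print(np.log(2) / b)                     # doubling time, log(2)/b
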
Hope this helps, some.
--Don.
Nadine's complete post:

Does anyone know what the power link function does in SAS?  I have to 
provide a parameter estimate in parentheses after the link=power command. 
I've been using -1 but when I plot the equation based on the parameter 
estimates, the model doesn't seem to look like I want it to.  Does anyone 
know exactly what the power link function does?  More specifically, I am 
trying to get SAS to run a model based on proportion data.  That is, my 
dependent variable is a proportion (# of beach seine hauls that catch 
fish over total # of beach seines hauled), my explanatory variable is a 
measure of habitat complexity.  I am also using year as a categorical 
variable.  I am trying to get SAS to run a model that resembles 
exponential growth.  Any suggestions would be greatly appreciated.

Nadine

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128







Re: vote counting

2001-07-25 Thread Donald Burrill

The answers to your questions depend heavily on structural information 
that you almost certainly don't have, else one would not bother to have 
arranged a voting process.  But consider two very different cases:
  A.  Voters are absolutely indifferent to candidates:  that is, all the 
candidates are equally attractive, or equally preferred by the voters. 
Then the identity of the candidate with the most votes is purely random, 
and the probability that the counted top N will correspond to the real 
top N will be very low indeed (in part because there IS no real top 
N;  but even in the sense that another vote taken tomorrow would be 
very unlikely to reproduce the same set of top N, let alone in the 
same order). 
  B.  Some candidates are strongly preferred to others (by the voters as 
a whole, that is, as a population), and exactly N such candidates are so 
preferred.  About the rest the voters are indifferent, on the whole.  In 
these circumstances, one would expect a large difference between the 
number of votes cast for the least of the N and the number of votes cast 
for the greatest of the remaining candidates, and the probability that
the counted top N will correspond to the real top N would be rather 
high (depending in part on how large 'p' is).
I do not see how to estimate such a probability in the absence 
of any information about the distribution of preferences.
I've assumed that by counting votes you mean that each voter 
casts exactly one ballot for (at most?) one candidate.  For other voting 
schemes (e.g., vote for K candidates, K .LE. N, and specify one's 
preferences among them by assigning each candidate a preference from 1 
(most favored) to K (least favored)) it is imaginable that answers to 
your questions might not differ, but showing that to be the case (or 
not) is another matter entirely.
It also occurs to me that a single probability 'p' of error in 
voting must be a global average and is an oversimplification almost 
certainly.  In case A above, the results of an election might be 
dominated by voters whose personal 'p' is large;  although, again, it is 
not clear to me how one might show such a thing formally.
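
Absent real distributional information, simulation at least gives a feel 
for the two cases;  a minimal sketch in Python (the preference weights, 
p, and sizes are all invented, and ties are broken arbitrarily):

  import random
  from collections import Counter

  def top_n_match(n_voters, weights, p, top_n, reps=200):
      cands = list(range(len(weights)))
      hits = 0
      for _ in range(reps):
          cast = random.choices(cands, weights, k=n_voters)    # ballots as cast
          counted = [v if random.random() >= p else random.choice(cands)
                     for v in cast]                            # miscount w.p. p
          real = {c for c, _ in Counter(cast).most_common(top_n)}
          seen = {c for c, _ in Counter(counted).most_common(top_n)}
          hits += (real == seen)
      return hits / reps

  # case A: indifferent voters; case B: exactly 3 strongly preferred candidates
  print(top_n_match(10000, [1] * 20, 0.01, 3))
  print(top_n_match(10000, [5, 5, 5] + [1] * 17, 0.01, 3))
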
-- DFB.

On Wed, 25 Jul 2001, Sanford Lefkowitz wrote:

 In a certain process, there are millions of people voting for thousands
 of candidates. The top N will be declared winners. But the counting
 process is flawed and with probability 'p', a vote will be miscounted.
 (it might be counted for the wrong candidate or it might be counted for
 a non-existent candidate.)

The latter would constitute a spoiled ballot, or not?

 What is the probability that the counted top N will correspond to the
 real top N?
 (there are actually two cases here: 1 where I want the order of the top
 N to be in the correct order and the other where I don't care if the
 order is correct)
 
 Thanks for any ideas,
 Sanford Lefkowitz

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: SRSes

2001-07-24 Thread Donald Burrill

Hi, Dennis!
Yes, as you point out, most elementary textbooks treat only SRS 
types of samples.  But while (as you also point out) some more realistic 
sampling methods entail larger sampling variance than SRS, some of them 
have _smaller_ variance -- notably, stratified designs when the strata 
differ between themselves on the quantity being measured.

On Tue, 24 Jul 2001, Dennis Roberts wrote:

 most books talk about inferential statistics ... particularly those 
 where you take a sample ... find some statistic ... estimate some error 
 term ... then build a CI or test some null hypothesis ...
 
 error in these cases is always assumed to be based on taking AT LEAST a 
 simple random sample ... or SRS as some books like to say ...
 
 but, we KNOW that most samples are drawn in a way that is WORSE than SRS 

I don't think _I_ know this.  I know that SOME samples are so drawn;  
but (see above) I also know that SOME samples are drawn in a way that 
is BETTER than SRS (where I assume by worse you meant with larger 
sampling variance, so by better I mean with smaller sampling 
variance).

 thus, essentially every CI ... is too narrow ... or, every test 
 statistic ... t or F or whatever ... has a p value that is too LOW  
 
 what adjustment do we make for this basic problem?

I perceive the basic problem as the fact that sampling variance is 
(relatively) easily calculated for a SRS, while it is more difficult 
to calculate under almost _any_ other type of sampling.  
 Whether it is enough more difficult that one would REALLY like to avoid 
it in an elementary course is a judgement call;  but for the less 
quantitatively-oriented students with whom many of us have to deal, we 
_would_ often like to avoid those complications.  Certainly dealing with 
the completely _general_ case is beyond the scope of a first course, so 
it's just a matter of deciding how many, and which, specific types of 
cases one is willing to shoehorn into the semester (and what previews 
of coming attractions one wishes to allude to in higher-level courses). 

Seems to me the most sensible adjustment (and of a type we make at 
least implicitly in a lot of other areas too) is 
 = to acknowledge that the calculations for SRS are presented 
   (a) for a somewhat unrealistic ideal kind of case,
   (b) to give the neophyte _some_ experience in playing this game,
   (c) to see how the variance depends (apart from the sampling scheme)
on the sample size (and on the estimated value, if one is 
estimating proportions or percentages),
   (d) in despite of the fact that most real sampling is carried out 
under distinctly non-SRS conditions, and therefore entails 
variances for which SRS calculations may be quite awry;  and
 = to have yet another situation for which one can point out that for 
actually DOING anything like this one should first consult a 
competent statistician (or, perhaps, _become_ one!).

Some textbooks I have used (cf. Moore, Statistics:  Concepts  
Controversies (4th ed.), Table 1.1, page 40) present a table giving the 
margin of error for the Gallup poll sampling procedure, as a function of 
population percentage and sample size.  Such a table permits one to show 
how Gallup's precision varies from what one would calculate for a SRS, 
thus providing some small emphasis for the cautionary tale one wishes to 
convey.
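
To make the point that stratification can beat SRS concrete, a minimal 
sketch in Python (numpy assumed;  the two-stratum toy population is 
entirely invented):

  import numpy as np

  rng = np.random.default_rng(1)
  # toy population: two equal strata with different means
  s1 = rng.normal(10, 2, 50000)
  s2 = rng.normal(20, 2, 50000)
  pop = np.concatenate([s1, s2])

  def srs_mean():
      return rng.choice(pop, 100, replace=False).mean()

  def strat_mean():   # proportional allocation: 50 per stratum
      return 0.5 * rng.choice(s1, 50, replace=False).mean() \
           + 0.5 * rng.choice(s2, 50, replace=False).mean()

  srs = [srs_mean() for _ in range(2000)]
  strat = [strat_mean() for _ in range(2000)]
  print(np.std(srs), np.std(strat))   # the stratified SD is visibly smaller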

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: Multiple measurements

2001-07-20 Thread Donald Burrill

Hi, Ivan.
I think your problem may not be so simple as you've described it. 
But to begin with the simplest:  In terms of area in mm^2, simply 
multiplying length x width, all of the ultrasound (US) samples except one 
have smaller areas than any of the high-speed drill (AR) samples;  6 of 
the 10 AR samples have larger areas than the largest US sample, and if 
that sample were ignored (3.5 x 2.2 = 7.70 mm^2) ALL of the AR samples 
have larger areas than the remaining nine US samples.  This pattern would 
be significant (p < .001) by Tukey's Compact test (1959).
A similar pattern is true of the widths;  the pattern for the 
lengths is less compelling, but would still be significant (p < .01). 

Similar results would be expected from the parametric methods you mention 
in your message (quoted below).

But do you really desire to compare AR with US only on the raw dimensions 
of the cavities?  One could define some degree of departure from the 
nominal dimensions (2.0 mm x 3.0 mm), and one might even specify 
acceptable and unacceptable ranges of values for this measure.

I do not know what would be unacceptable for this exercise.  But when 
one is preparing a cavity in a person's tooth, the prepared cavity would 
be unacceptably small if some of the decayed matter remained in the 
tooth;  and the cavity would be unacceptably large if so much of the 
tooth had been removed that what remained was too weak to hold the dental 
filling.

You might also ask how far each prepared cavity departed from the 
intended rectangular shape.  But this may not be a realistic question.
(I've had dentists working on my teeth since about 1945, and I think 
that _none_ of the fillings they prepared were rectangular in shape!) 

In quoting your original message below, I have taken the liberty of 
supplying corrected English, in [square brackets].

On Fri, 20 Jul 2001, Ivan Balducci wrote:

 Dear members,
 I am an engineering brazilian. My job is to help researches in Dental
[ I am a Brazilian engineer.  My job is to help researchers ... ]
 School about Statistics.
 My doubt is...
[ My concern is: ]
 
 How can I to comaparing two instruments:
 Ultra Som ...versus...Alta Rotação (High Sound  High Rotation)
[ How can I compare two instruments:
  Ultra Som  versus  Alta Rotação (ultrasound vs. high-speed drill) ]
 
 Theses instruments are used in Operative Dentistry
 to perform  preparos cavitarios (cavity prepair) 
 [(cavity preparation)] 
 The shape of  the prepair is rectangule
[ The shape of the cavity is rectangular. ]
 
 WellThe situation isThe specificated area = 6mm2  (= 2mm x 3mm)
[ ... The specified area = ... ]
 width = 2mm;  length = 3mm
 
 Two samplessize sample is 10 (n = 10) for each instrument
 
 How can I aproach this problem?
 I can to do an Analysis Multivariate (T2 Hotteling) : instrument US x
 instrument AR ?
[ I can do a multivariate analysis (Hotelling's T^2) ... ]
Yes, this is possible.
 I can to do a IC (95%), or t-test, separately for each  variable (width
 and length)  and instrument ?
[ I can do a confidence interval (CI), or t-test ... ]
These are also possible.
 I can to compare the areas (width x length)...for instrument US against
 instrument AR ?
[ I can compare the areas ... ]
And so is this.
 Well...
 Which is the best, the correct way to approach a problem of this kind? 

Any of the ways mentioned above are possible and correct.  It is not 
clear whether any of them is best, because it is not clear how best 
may usefully be defined.  It is also not clear what the specific 
questions are that you really desire to address.  I have tried to 
indicate some of the range of interesting questions that you might be 
interested in.
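
And for the Hotelling's T^2 route, a sketch (mine again, same caveats) of 
the standard two-sample statistic on the (length, width) pairs:

  import numpy as np
  from scipy import stats

  US = np.array([[2.8, 1.9], [2.9, 2.0], [2.9, 1.9], [3.0, 1.9], [3.0, 2.0],
                 [3.1, 2.0], [2.7, 2.0], [2.5, 1.9], [3.5, 2.2], [3.2, 2.0]])
  AR = np.array([[3.2, 2.3], [3.3, 2.1], [3.5, 2.1], [3.2, 2.2], [3.5, 2.7],
                 [3.6, 2.6], [3.5, 2.5], [3.7, 2.4], [4.1, 2.0], [3.4, 2.5]])

  n1, p_dim = US.shape
  n2 = AR.shape[0]
  d = US.mean(axis=0) - AR.mean(axis=0)
  # pooled within-group covariance matrix
  S = ((n1 - 1) * np.cov(US, rowvar=False)
       + (n2 - 1) * np.cov(AR, rowvar=False)) / (n1 + n2 - 2)
  T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S, d)
  F = (n1 + n2 - p_dim - 1) / (p_dim * (n1 + n2 - 2)) * T2
  print("T^2 =", T2, " F =", F,
        " p =", stats.f.sf(F, p_dim, n1 + n2 - p_dim - 1))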

 Data:[In the data below, I think you have interchanged the 
   labels width and length.]
 US:
 width:  2.8   2.9   2.9   3.03.0   3.12.7   2.5   3.5   3.2
 length: 1.9   2.0   1.9   1.92.0   2.02.0   1.9   2.2   2.0
 
 AR:
 width:  3.2   3.3   3.5   3.23.5   3.6   3.5   3.7   4.13.4
 length: 2.3   2.1   2.1   2.22.7   2.6   2.5   2.4   2.02.5
 
 very thanks for the attention and sorry my english
[ Thank you very much for your attention to my problem. ]
  (Alternatively, you could simply write "TIA", for "Thanks in advance".)

-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: statistical similarity of two text

2001-07-18 Thread Donald Burrill

On Tue, 17 Jul 2001, Cantor wrote:

 Does anybody know where I can find program on the website which [can] 
 compare two texts/articles and settle whether or not they are similar 
 assuming any significant level.

Sorry, Cantor:  this is not possible, in general.  
 One can discover whether two (or more) things _differ_ (on some 
quantitative measure) at a specified significance level (when this is a 
reasonable thing to do -- it isn't always reasonable), but the formal 
definition of significant in statistical analysis does not permit 
discovering whether two (or more) things are _similar_.  
 However, it may suffice for your purposes to discover that two things 
are not different enough that you can tell them apart (which is not the 
same thing as discovering that they are the same), on whatever measure 
(or set of measures) you choose to analyze.  Whether this be a useful 
outcome or not depends heavily on how much information you have (that 
is, on the size of the sample available) on the things being compared. 

In any case, the hard part is defining the characteristics, or  
properties, or measures, on which the two texts/articles are to be 
compared. 
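
By way of illustration only (this is my sketch, not a recommendation of 
any particular measure), here is one crude operationalization in Python: 
compare the frequencies of the commonest words in the two texts with a 
chi-square test -- which, note well, tests for difference, not similarity.

  from collections import Counter
  from scipy.stats import chi2_contingency

  def compare_word_profiles(text_a, text_b, top=20):
      # crude measure: frequencies of the 'top' commonest words overall
      ca = Counter(text_a.lower().split())
      cb = Counter(text_b.lower().split())
      words = [w for w, _ in (ca + cb).most_common(top)]
      table = [[ca[w] for w in words],
               [cb[w] for w in words]]
      chi2, p, dof, _ = chi2_contingency(table)
      return chi2, p, dof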

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Interpreting effect size.

2001-07-15 Thread Donald Burrill

On Sun, 15 Jul 2001, Melady Preece wrote:

 I have done a paired t-test on a measure of self-esteem before and 
 after a six-week group intervention.
 
 There is a significant difference (in the right direction!) between 
 the means using a paired t-test, p=.009.  The effect size is .29 if I 
 divide by the standard deviation of the pre-test mean, and .33 if I 
 divide by the pooled standard deviation.

This implies that the effect size would be larger than .33 if you were to 
divide by the s.d. of the post-test mean:  which is evidently smaller 
(although probably not significantly so?) than the s.d. of the pre-test 
mean. 

But if you have paired pre/post values, you are essentially calculating 
the difference score (post minus pre), and constructing a  t  ratio using 
the s.d. of those differences.  This would ordinarily be expected to be 
noticeably smaller than the s.d. of either pre-test or post-test means. 
Do you have a reason for not using _that_ s.d.?

 Question 1:  Which is the correct standard deviation to use? 
Well, you have a choice of four:  the s.d. of the pre-test mean, 
the s.d. of the post-test mean, the s.d. of the difference, and the 
pooled s.d. (resulting from pooling together the variances pre and post). 
The pooled s.d. would be (at least possibly) appropriate if you were 
performing a t-test for independent groups, but I cannot see how it could 
be thought suitable for paired differences (unless, perhaps, you and I 
mean different things by pooled s.d.).
Of the other three, and in the absence of other considerations 
which may apply to your situation that you haven't told us about, I'd be 
inclined to report all three;  unless circumstances (among the other 
considerations) led me to prefer one of them in particular.  Using the 
pre-test s.d. may make it possible for your readers to estimate what 
differences they might expect to find, based on pre-test information, 
before getting to the post-test stage;  this might be of value to some 
readers.  Similar interpretations can be made of effect sizes calculated 
from the other s.d.s.
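
To make the menu of choices concrete, a small Python sketch (my own 
illustration;  pre and post stand for your paired score vectors, which 
of course I have not seen):

  import numpy as np

  def candidate_effect_sizes(pre, post):
      pre, post = np.asarray(pre, float), np.asarray(post, float)
      diff = post - pre
      pooled_sd = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2)
      return {"d / sd(pre)":   diff.mean() / pre.std(ddof=1),
              "d / sd(post)":  diff.mean() / post.std(ddof=1),
              "d / sd(diff)":  diff.mean() / diff.std(ddof=1),
              "d / pooled sd": diff.mean() / pooled_sd}
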
I would also want to report the raw difference in means, if the 
raw scores are (as I assume to be the case) values that are more or less 
understood (e.g., number of right answers out of the number of items), 
since it provides something like a common-sensical measure...  I'd also 
be interested (as a potential reader) in some summary information about 
the difference scores, like what proportion were negative... 

 Question 2:  Can an effect size of .29 (or .33) be considered 
 clinically significant?

Not enough information for me to tell.  (And I just discovered my watch 
had stopped -- forgot to wind it this morning -- and am in danger of 
being late for today's next agendum.  Good luck!)
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: EdStat: Triangular coordinates

2001-07-11 Thread Donald Burrill

On Tue, 10 Jul 2001, Alex Yu wrote:

 I am trying to understand Triangular coordinates -- a kind of graph 
 which combines four dimensions into 2D 

You meant, "condenses four dimensions into 3D", didn't you?  Your 
subsequent description indicates three dimensions all together, two 
of them used to represent 3 variables:

 by joining three axes to form a triangle while the Y axis stands up. 
 The Y axis can be hidden if the plot is depicted as a contour plot or a 
 mosaic plot rather than a surface plot.
 
 I have a hard time to follow how a point is determined with the three 
 axes as a triangle. 

There must be constraints on the values of the three variables.  
Commonly used for situations like a chemical mixture of 3 components. 
Each component can have a relative concentration between 0% and 100%, 
but if component A is at 100%, components B and C must both be at 0%, 
and the point (100%, 0%, 0%) falls at one apex of the triangle.  The 
formal restriction, of course, is that the sum of all three 
concentrations equals 100%, so that there are really only two dimensions' 
worth of information available:  (A, B, (100%-A-B)),  (A, (100%-A-C), C), 
or  ((100%-B-C), B, C).  Since there is usually no reason to treat any 
component as more (or less) important than any other, triangular 
coordinates are often displayed on an equilateral triangle, and special 
graph paper can be purchased that has such a grid.  In the absence of 
such paper, one can plot, say, A and B at right angles to each other and 
let the 45-degree line from (100,0) to (0,100) represent the C axis (and 
the upper boundary of the space of possible points).
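
The arithmetic of the mapping is simple;  a sketch in Python (my own, 
with vertices chosen at A = (0,0), B = (1,0), C = (1/2, sqrt(3)/2)):

  import numpy as np

  def ternary_xy(a, b, c):
      # assumes a + b + c = 100% (the defining constraint)
      total = a + b + c
      a, b, c = a / total, b / total, c / total
      x = b + c / 2
      y = c * np.sqrt(3) / 2
      return x, y

  print(ternary_xy(100, 0, 0))   # apex A -> (0.0, 0.0)
  print(ternary_xy(0, 100, 0))   # apex B -> (1.0, 0.0)
  print(ternary_xy(0, 0, 100))   # apex C -> (0.5, 0.866...)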

When there is not some such constraint on the values of the three 
variables, triangular coordinates don't make a whole lot of sense and 
may be extremely misleading.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: SPSS

2001-07-06 Thread Donald Burrill

On Sat, 7 Jul 2001, David Schaefer wrote:

 My Stats professor is having us run some correlations and what not 
 through SPSS.  She has asked us to transform some raw scores to 
 z-scores for a reading achievement test.  The commands she has asked 
 us to type in the syntax editor is: 
   COMPUTE zread = (reading-52.23)/10.25.
   EXECUTE.
 
 The 52.23 and 10.25 are the mean and standard deviation of the data, 
 respectively.  Absolutely nothing happens when I highlight and Run 
 these commands. 

What were you expecting to happen?  If these are the only commands 
that you asked to be carried out, there would be no visible happening, 
because no output has been called for.  There will have been a variable 
named zread created by the COMPUTE/EXECUTE sequence and stored in the 
active data file, but if you have not asked for output you won't get any. 

You might have asked, for example, for the mean(s) and standard 
deviation(s) of this new variable (and perhaps other extant variables); 
or for a correlation matrix among several variables, including this 
variable;  or for a listing of the values of this variable (if the 
number of cases is not prohibitively large).

 Any slight alterations of them result in a variety of error messages.  

Yes, that sounds reasonable, since any alteration would probably result 
in misspelling one or more command name(s) or variable name(s).

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Help with stats please

2001-06-24 Thread Donald Burrill

On Sun, 24 Jun 2001, Melady Preece wrote in part:

 I am teaching educational statistics for the first time, and although I 
 can go on at length about complex statistical techniques, I find myself 
 at a loss with this multiple choice question in my test bank.  I 
 understand why the range of (b) is smaller than (a) and (c), but I 
 can't figure out how to prove that it is smaller than (d).
 
 1.  Which of the following classes had the smallest range in IQ scores? 
 
  A)  Class A has a mean IQ of 106 and a standard deviation of 11.
  B)  Class B has an IQ range from 93 to 119.
  C)  Class C has a mean IQ of 110 with a variance of 200.
  D)  Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110.
 
 The test bank says the answer is b.

Right.  Since you're happy that  range(B) < range(A)  and 
range(B) < range(C),  I'll focus on  (B) vs. (D).
In (B), the entire _range_ is from 93 to 119:  26 (or 27, 
depending on how you choose to define range) points.
In (D), the central half of the distribution is from 90 to 110: 
the interquartile range (IQR) is 20 points, symmetric about the median;  
the full range must therefore be greater than 20.  Now, _if_ the 
distribution is normal (which may be what we were to assume from the 
allegation that these are IQ scores;  although as Dennis has pointed out, 
ille non sequitur -- unless these are rather large classes AND NOT 
SELECTED BY I.Q. (or by any variable strongly related to I.Q.)), then 10 
points from Q1 to median (or from median to Q3) represents 0.67 standard 
deviation, which implies a standard deviation of about 15, which is 
larger than the standard deviation in (A) and slightly larger than that 
in (C).
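
A two-line check of the normal-theory step above, for anyone with scipy 
at hand:

  from scipy.stats import norm

  z75 = norm.ppf(0.75)   # quartile deviation in s.d. units: about 0.6745
  print(10 / z75)        # implied s.d. for class D: about 14.8
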
However, we need not invoke the normal distribution.  We observe 
that the distribution in (D) is at least approximately symmetric (insofar 
as the quartiles are equidistant from the median).  If we may assume also 
that the distribution is unimodal (which I should think reasonable), it 
then follows (from the tailing off of distributions as one approaches 
the extremes) that the distance from minimum to Q1 (and the distance from 
Q3 to maximum) is greater than the distance from Q1 to median (or median 
to Q3).  This implies that the range of the distribution exceeds twice 
the interquartile range:  that is,  range(D) > 2*20 = 40.  Since the 
range in (B) is only 26, clearly the range of (B) is less than the range 
of (D).

If any part of this argument remains unclear, I'd be happy to attack it 
again.  A rough sketch should make things pretty obvious, but it's a bit 
of a nuisance to draw pictures in ASCII characters!
--DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: meta-analysis

2001-06-22 Thread Donald Burrill

On Fri, 22 Jun 2001, Marc Esser wrote:

 After a closer look at the trials which I want to summarize, I noticed 
 that not the means are reported, but the medians.
 Do you have an idea how to calculate an effect size with this 
 information, e.g. median change of hospitalization time.
 The p-values reported in the trials are derived from Mann-Whitney-Tests.

Sorry, Marc;  I don't know offhand how to deal with this situation.
Perhaps someone else on the list can help.
-- Don.

 On 17 Jun 2001, Marc wrote (edited):
 
  I have to summarize the results of some clinical trials.
  The information given in the trials contain:
 
  Mean effects (days of hospitalization) in treatment & control groups;
  numbers of patients in the groups;  p-values of a t-test (of the
  difference between treatment and control) .
  My question:  How can I calculate the variance of the treatment
  difference, which I need to perform meta-analysis?  Note that the
  numbers of patients in the groups are not equal.
  Is it possible to do it like this:
 
  s^2 = (difference between contr and treatm)^2/ ((1/n1+1/n2)*t^2)
 
  How exact would such an approximation be?

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: meta-analysis

2001-06-17 Thread Donald Burrill

On 17 Jun 2001, Marc wrote (edited):

 I have to summarize the results of some clinical trials.
 The information given in the trials contain:
 
 Mean effects (days of hospitalization) in treatment & control groups; 
 numbers of patients in the groups;  p-values of a t-test (of the 
 difference between treatment and control) .
 My question:  How can I calculate the variance of the treatment 
 difference, which I need to perform meta-analysis?  Note that the 
 numbers of patients in the groups are not equal.  
 Is it possible to do it like this:
 
 s^2 = (difference between contr and treatm)^2/ ((1/n1+1/n2)*t^2)

Yes, if you know t.  If all you know is that p < alpha for some alpha, 
you then know only that t > the t corresponding to alpha (AND you need to 
know whether the test had been one-sided or two-sided -- of course, you 
need to know that in any case);  you can then substitute that corresponding 
t to obtain an upper bound on s^2 -- ASSUMING that the t was calculated 
using a pooled variance (your s^2), not using the expression for separate 
variances in the denominator:  (s1^2/n1 + s2^2/n2).

Note that this s^2 is NOT the variance of the treatment difference, 
which you said you wanted to know;  it is the pooled variance estimate 
of the variance within each group.  
 The variance of the difference in treatment means, which _may_ be what 
you are interested in, would be 

(difference)^2 / t^2 

with the same caveats concerning what you know about t.
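
In code, the two back-calculations look like this (a minimal Python 
sketch;  diff is the reported difference in means):

  def pooled_variance_from_t(diff, t, n1, n2):
      # inverts t = diff / sqrt(s2 * (1/n1 + 1/n2))
      return diff**2 / ((1/n1 + 1/n2) * t**2)

  def variance_of_mean_difference(diff, t):
      # inverts t = diff / se(diff)
      return diff**2 / t**2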

 How exact would such an approximation be?

Depends on the precision with which  p  was reported.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: individual item analysis

2001-06-15 Thread Donald Burrill

In response to Doug Sawyer's post:

 I am trying to locate a journal article or textbook that addresses 
 whether or not exam quesitons can be normalized, when the questions 
 are grouped differently.  For example, could a question bank be 
 developed where any subset of questions could be selected, and the 
 assembled exam is normalized?

on Fri, 15 Jun 2001, dennis roberts wrote in part:

 also, you can normalize a distribution that is not so normal but, i 
 would ask ... how come you want to do that? 

which is a good question.  But I would ask the prior question:  
What, precisely, does Doug want to mean by normalize?  And is that 
meaning congruent with Dennis's understanding of the word?
(THEN I would ask Dennis's question!)

(I note in passing that Doug is in a department of physical science, 
and in physical sciences "normal" often has the meaning "perpendicular 
to a (line or) plane";  while Dennis is in a department of educational 
psychology, where "normal" nearly always refers to the probability 
distribution that in physics is often called "Gaussian".  I can't tell 
from the rather spare context whether any such misunderstanding or 
miscommunication applies to the conversation, but if it does, 'twere 
better sorted out sooner than later.)
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: multivariate techniques for large datasets

2001-06-12 Thread Donald Burrill

On 11 Jun 2001, srinivas wrote:

   I have a problem in identifying the right multivariate tools to 
 handle [a] dataset of dimension 100,000 x 500.  The problem is still
 complicated [by a] lot of missing data.

So far, you have not described the problem you want to address, nor the 
models you think may be appropriate to the situation.  Consequently, 
no-one will be able to offer you much assistance. 

 Can anyone suggest a way out to reduce the data set and also to 
 estimate the missing value. 

There are a variety of ways of estimating missing values, all of which 
depend on the model you have in mind for the data, and the reason(s) you 
think you have for substituting estimates for the missing data.

 I need to know which clustering tool is appropriate for grouping the
 observations ( based on 500 variables ).

No answer is possible without context.  No context has been supplied.

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Correction procedure

2001-06-03 Thread Donald Burrill

On 3 Jun 2001, Bekir wrote, in part:

 My aim was to compare groups 2, 3, 4, 5 with control (group 1). ... 
 
 The reviewer had written me:  "Accordingly, a statistical penalty 
 needs to be paid in order to account for the increased risk of a Type 
 1 error due to multiple comparisons.  The easiest way to achieve this
 goal [is] to adjust the P value require[d] to declare significance using the
 Bonferroni correction."
 
 1.  What is the correct meaning of the last sentence?  What must I do? 
 As you wrote, must I find the adjusted p values or declare the
 adjusted significance level alpha?

Your choice:  the two ways of approaching the problem are equivalent. 
Either divide the criterion significance level (alpha) by the number 
of comparisons, as Duncan Smith recommended, and compare the p-values 
reported by your statistical routine to this adjusted value;  or 
adjust the p-values by multiplying the reported values by the number of 
comparisons.  Thus p = 0.02 > adjusted alpha = 0.0125 for one of your 
comparisons, or adjusted p = 0.08 > nominal alpha = 0.05.  If I 
understand your reviewer correctly, (s)he seems to be requesting the 
latter:  adjusting the p-value.  
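
In case it helps to see the arithmetic laid out, a tiny Python sketch, 
using your two p-values and four comparisons:

  def bonferroni_adjust(pvals, m):
      # multiply each p by the number of comparisons, capping at 1
      return [min(1.0, p * m) for p in pvals]

  print(bonferroni_adjust([0.008, 0.02], m=4))   # [0.032, 0.08]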

 2.  There are apparently and exactly three groups; groups 1, 3, 5 that 
 had the same proportions of translocation. 

Wasn't it groups 1, 4, 5 that had the same proportions?

 Therefore, to compare only the group 2 and 3 with the control can be 
 appropriate, can it be?  

Such a comparison may be appropriate (but see below);  but this does not 
change the situation.  Had groups 4 and 5 NOT had proportion equal to 
group 1, you would surely have wanted to make those two comparisons also. 
The question is not, how many comparisons were useful or significant;  
but how many comparisons would you have chosen to consider before you 
observed the results of this particular experiment.  By your description, 
you certainly considered AT LEAST the four comparisons mentioned in your 
first paragraph above.

 Thus, there would be two comparisons and the p values 0.008 (0.008 x 2 
 = 0.016) and 0.02 (0.02 x 2 = 0.04) would be significant.  Is it right?

As explained above, and as Duncan Smith responded, No.
Duncan mentioned Dunnett's test.  This might indeed be appropriate for 
your design, but not for the analyses you have so far done.  Dunnett's 
test would normally follow the finding of a significant F value in a 
one-way analysis of variance (testing the formal hypothesis that the 
true (population) proportions in the five groups are all identical).  
Such an analysis could be undertaken with your data, but some persons 
(possibly including your reviewer?  I don't know) would object to 
carrying out an analysis of variance (ANOVA) with dichotomous data.

One advantage to ANOVA is the possibility of drawing conclusions more 
complex, and possibly more interesting, than the pairwise comparisons 
that you had originally envisioned.  In particular, you could test the 
contrast between Groups 2 and 3 combined, with Groups 1, 4, and 5 
combined;  since it seems clear that this is the only thing that is 
going on in your data.  Testing that contrast by the Scheffe' method, 
which offers experimentwise protection against the null hypotheses for 
any imaginable contrast, might be useful:  that contrast, involving all 
100 cases, is more powerfully tested than the series of pairwise 
comparisons, and may well be significant even against the conservative
Scheffe' criterion.  
  Whether that is useful _for_your_purposes_ is another matter entirely. 
If there is some useful meaning and interpretation to be gained in 
observing that only groups 2 and 3 differ from the control group and 
that groups 4 and 5 are indistinguishable from the control group, then 
this contrast would be useful to test formally.  If that outcome does 
not lend itself to useful interpretation (and the advance of knowledge 
in the field), you would probably be better off staying with the four 
pairwise comparisons you started with.

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: correction procedures

2001-06-02 Thread Donald Burrill

On 2 Jun 2001, Bekir wrote in part:

 I performed a study on  different enteral nutrients and bacterial
 translocation in experimental obstructive jaundice.
  
 There was 5 groups of rats. Each group consists of 20 rats. Occurred
 Translocation incidences in mesenteric lymph nodes were shown in
 following table. My aim was to compare groups 2, 3, 4 with
 control(group 1)
Data table deleted;  see the original posting. 
Summary of group definitions and comparison results:
 
 Group 1 sham ligation of bile duct (fed rat chow)
 Group 2 bile duct ligated (fed rat chow)   *p = 0.008
 Group 3 bile duct ligated (fed enteral diet)   **   p = 0.02
 Group 4 bile duct ligated (fed enteral diet 2)
 Group 5 bile duct ligated (fed enteral diet 3)

 By chi squared test I calculated [these] p values.

You did not specify, but presumably the chi-square test in question was 
of a series of 2x2 tables, comparing the numbers of translocations that 
occurred (vs. the numbers that didn't) in Group 1 (your control group) 
with each of the other groups.
 
 The reviewer commented that I should do bonferroni correction, find
 adjusted p value and according to this adjusted value, I should say
 significant or not. However, in no study have I read that the authors
 had written that they had adjusted bonferroni correction, especially
 in a comparision by chi square test.

The 1-degree-of-freedom chi-square test described above is exactly 
equivalent to a z-test comparing the proportion of translocations in 
Group 1 with the proportion of translocations in the other group, for 
each conmparison of interest.  You may perhaps find references to 
Bonferroni adjustments in studies where z- or t-tests were used.
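
The equivalence is easy to verify numerically.  A Python sketch with 
made-up counts (your actual table was deleted above, so these numbers 
are purely illustrative):

  import numpy as np
  from scipy.stats import chi2_contingency

  # hypothetical: 4/20 translocations in control, 12/20 in a treated group
  x1, n1, x2, n2 = 4, 20, 12, 20
  p1, p2 = x1 / n1, x2 / n2
  p = (x1 + x2) / (n1 + n2)                      # the usual "weighted" p
  z = (p2 - p1) / np.sqrt(p * (1 - p) * (1/n1 + 1/n2))

  table = np.array([[x1, n1 - x1], [x2, n2 - x2]])
  chi2 = chi2_contingency(table, correction=False)[0]
  print(z**2, chi2)   # identical, apart from rounding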

 If bonferroni was performed then adjusted p value 
 [Here you must mean the adjusted significance level alpha,
  not the p-value?  -- DFB.]
 would be 0.05/10 = 0.005, 10 = n(n-1)/2 in our study. 

I do not think so.  The number of comparisons you say you were 
interested in is three, not ten:  
 Group 2 vs. Group 1, Group 3 vs. Group 1, and Group 4 vs. Group 1.
If indeed these are the only comparisons of interest, and in the sense 
that these comparisons (and no others!) were planned from the beginning, 
then the adjusted p-values would be 0.02*3 = 0.06 and 0.008*3 = 0.024.

But I do not believe this, either.  If these were the only three 
comparisons of interest, you would not have bothered to include Group 5 
in the experiment.  It looks to me as though the original design had 
envisioned comparisons of Groups 2, 3, 4, 5 vs. Group 1, and may also 
have intended comparisons of Groups 3, 4, 5 vs. Group 2;  so that the 
number of comparisons for the Bonferroni correction would be either 4, 
or 4+3 = 7.  The corresponding adjusted p-values would be 0.02*4 = 0.08 
and 0.008*4 = 0.032;  or 0.02*7 = 0.14 and 0.008*7 = 0.056.

 Thus our results would not be significant. 
 Is it appropriate to make bonferroni correction or simes correction in
 this situation?
 Indeed I want to compare groups 2, 3, 4 with group 1. So there would
 be 4 comparisons.

Then you must mean "compare groups 2, 3, 4, 5 with group 1"?

 Is simes procedure is correct?  How can I make Simes correction?

Sorry, I'm not familiar with this procedure, at least not by that name. 
I hope this has been helpful.

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Ninety Percent above Median

2001-05-31 Thread Donald Burrill

On Thu, 31 May 2001, W. D. Allen Sr. wrote:

 Only from the education field do we hear the statement that over ninety 
 percent of students ranked above the median!  The statement was made on 
 TV. 

(1)  I take it that it was the keyword students that led you to suppose 
that the statement had anything to do with the education field (rather 
than, say, the field of study the students were pursuing). 

(2)  The statement appears, however, not to have been made by any agency 
of the education field, but on TV -- by which one supposes you mean 
broadcast television.  That's not education:  that's entertainment.  
Or, possibly, news, or the deliberate distortion thereof.

(3)  A couple of colleagues have already pointed out how the statement 
you so scornfully cite might in fact be true;  although whether in fact 
any such interpretation can be believed is impossible to tell, in the 
absence of any context.

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Variance in z test comparing percenteges

2001-05-18 Thread Donald Burrill

On Sat, 12 May 2001, RD wrote, inter alia:

 The only approach to deal with z test for means that I have seen so 
 far was using  s^2 = s1^2/n1 + s2^2/n2 formula. 
 t test is always using pooled variance. 
I think not _always_.  _Usually_, because (i) there is seldom 
a strong need to insist that the two [sub]population variances be 
different, (ii) the distribution of the t statistic is easier to find 
(no fractional numbers of degrees of freedom, e.g.), and (iii) the 
computations are easier.  But if one were concerned about (i), as for 
instance when the two sample variances are quite different, one might 
take the alternative approach.  (But see below.)

 Both  z test and percentages comparison test are using normal 
 distribution.  Thus, intuitively I was considering them as basically 
 the same with only difference in variance calculations.
 My problem is that using weighted p for one and not using pooled s^2 
 for another seemed inconsistent with that idea.
This is where you begin to go astray.  In the z test for means, 
the sampling distribution of the sample means (or of their mean 
difference) is (at least approximately) normal with mean mu and standard 
deviation sigma;  and mu and sigma are mutually independent, either 
because that's true of normal distributions or because that tends to be 
true of empirical data (more or less regardless of the empirical 
distribution).  But in the case of proportions (or, equivalently, of 
percentages) the underlying distribution is binomial:  and the mean and 
standard deviation of a binomial distribution are NOT independent, being 
(for the simple count of the event in question) np and SQRT(np(1-p)), or 
(for the proportion) p and SQRT(p(1-p)/n).  The fact that for  n  large 
enough the binomial distribution may be well approximated by a normal 
distribution with the same mean and variance does not alter the fact that 
the true distribution IS binomial, and thus has this direct connection 
between mean and standard deviation.
It follows that in an ordinary z-test (or t-test), one can make 
whatever assumption one finds useful, desirable, or convenient with 
respect to the variance of the difference, without affecting the truth 
value of the null hypothesis about the mean (or the difference in means, 
etc.).  But in dealing with proportions, if the null hypothesis specifies 
that P = a given value, that hypothesis ALSO specifies what the variance 
must be.  Hence a null hypothesis that P1 = P2, or equivalently that 
P1-P2 = 0, specifies that the variance of the observed difference must be 
based on the assumed common P in the population.  And the best estimate 
available for that common P is the usual weighted P, as you put it.

 Now you are saying that pooled variance may be used in z test. 
Sometimes, anyway.  Admittedly, the point is debatable:  if one 
is using a z test at all, one is implicitly claiming to know what the 
corresponding variances are, and if they're different, they're different. 
But if one is skeptical about the state of one's knowledge (as one 
probably ought to be, else why test an hypothesis about means at all?), 
one may suspect that one's knowledge of variances is imperfect in some 
degree.  Then if the variances in question are not very far apart, it may 
be desirable to average them in some way, such as the usual pooling (or 
equivalently weighting by numbers of degrees of freedom).  But this does 
not really change anything except the particular mechanics of finding an 
average variance.  Summing the two sampling variances of the respective 
means and taking the square root of the sum produces an averaged standard 
error of the mean difference.  Pooling the two variances to obtain an 
average variance, then multiplying by the sum (1/n1 + 1/n2) and taking 
the square root of that sum, produces another averaged standard error of 
the mean difference.  The two averages are unlikely to differ much 
(except in pathological circumstances, perhaps), so it's rather splitting 
hairs to argue which one is proper.  (And there's always the question, 
proper for what purpose or circumstances?)
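
For concreteness, the two averaged standard errors side by side 
(a Python sketch;  s1 and s2 are the sample standard deviations):

  import math

  def se_separate(s1, n1, s2, n2):
      return math.sqrt(s1**2 / n1 + s2**2 / n2)

  def se_pooled(s1, n1, s2, n2):
      sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
      return math.sqrt(sp2 * (1/n1 + 1/n2))

  # with moderately unequal variances the two hardly differ:
  print(se_separate(10, 30, 12, 25), se_pooled(10, 30, 12, 25))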

 When would you use pooled variance in z test instead of sum and vice 
 versa? 
I wouldn't bother to prescribe.  If the separate variances were 
different enough to worry about, I'd probably want to use both a standard 
formula (pooled or sum, I don't care which) AND a test using the LARGER 
variance, to be able to assert (if it be true) that the null hypothesis 
can be rejected even under quite conservative assumptions.  I can imagine 
wanting also to use the SMALLER variance, so as to produce a range of 
standardized effect sizes that one might reasonably believe to cover the 
true effect size.

 What are we really testing:  just two means or whether those two 
 samples come from the same population? 
Precisely.  What we are really testing, if we are testing at 
all, may very well differ from what we nominally set out to test.

Re: Interpreting MANOVA and legitimacy of ANOVA

2001-05-18 Thread Donald Burrill

On Fri, 18 May 2001, auda wrote (slightly edited):

 In my experiment, [when] two dependent variables DV1 and DV2 [were] 
 analyzed separately with ANOVA, the independent variable [IV (with ]
 two levels IV_1 and IV_2) modulated DV1 and DV2 differentially:
 
 mean DV1 in IV_1 > mean DV1 in IV_2
 mean DV2 in IV_1 < mean DV2 in IV_2
 
 If analyzed with MANOVA, the effect of IV was significant, Rao
 R(2,14)=112.60, p < .001.  How to interpret this result of MANOVA? 

Not enough information to tell.  If, for example, DV2 = -DV1 + C, 
C a constant, you would get results of the kind you describe above. 
The question unanswered as yet is whether the second DV adds any 
information to the system.  It's been a longish while since I did any 
MANOVAs, but I seem to recall a section of output showing step-down 
analyses for each formal effect of the ANOVA structure, in which each 
DV was reported in the order in which it had been considered, and a test 
reported as to the degree to which the effect on this DV was implied by 
the effect on previous DVs.

You haven't mentioned anything about interpreting the significant 
univariate effects, which leads one to suspect that they are 
interpretable enough.  What more do you think you want from MANOVA?

 Can I go ahead to claim IV modulated DV1 and DV2 differentially based 
 [on] the result from MANOVA?  Or I have to do other tests?

THAT you can claim based on the univariate results, unless DV1 and DV2 
are so closely (if negatively!) related that there is only one phenomenon 
occurring, rather than two:  which would be one possible reason for 
carrying out a MANOVA.

 Moreover, can I treat DV1 and DV2 as two levels of a factor, say, 
 type of dependent variable, and then go ahead to test the data with
 repeated-measures ANOVA and see if there is an interaction between  
 IV and type of dependent variable?

Certainly.  Of course, this is not testing the same set of hypotheses as 
MANOVA, so the results might be somewhat different;  and you have (as you 
have in any case) the problem of explaining (if it needs explaining) why 
it is reasonable for the effect of IV to be in opposite directions on the 
two DVs.  It might be informative to repeat some of your analyses after 
transforming one of the DVs to  (constant - old DV) .  Then a 
repeated-measures ANOVA would tell you whether the interaction effect, 
present with the original DVs, involved a difference in magnitude as 
well as a difference in sign.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: A regressive question

2001-05-16 Thread Donald Burrill

If the mean of the predictor X is zero, the intercept is equal to the 
mean of the dependent variable Y, however steep or shallow the slope 
may be.  And as Jim pointed out, the standard error of a predicted value 
depends on its distance from the mean of X (being larger the farther 
away it is from the mean, the confidence band being described by a 
hyperbola).  It would seem to follow that a test such as Alan asks about 
would be unusable if the mean of X is too close to 0, and would be (too?) 
insensitive if the mean of X is too far from 0.  An intermediate region, 
where a test of intercept vs. mean Y might be useful, might perhaps be 
defined in terms of the coefficient of variation of X (or perhaps its 
reciprocal, if the mean of X were in danger of actually BEING zero). 

One rather suspects that any such test would be less powerful than the 
usual test of the hypothesis that the true slope is zero, which might 
be an interesting proposition (for someone else!) to pursue.
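
A quick numerical illustration of the first point (a Python sketch, with 
made-up data):

  import numpy as np

  rng = np.random.default_rng(0)
  x = rng.normal(5.0, 2.0, 50)                  # mean of X well away from 0
  y = 1.5 + 0.8 * x + rng.normal(0.0, 1.0, 50)

  slope, intercept = np.polyfit(x, y, 1)
  xc = x - x.mean()                             # centre X: its mean is now 0
  slope_c, intercept_c = np.polyfit(xc, y, 1)

  print(intercept_c, y.mean())   # identical: intercept = mean of Y
  print(slope, slope_c)          # the slope is unchanged by centring
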
-- Don.

On Wed, 16 May 2001, Alan McLean wrote:

 The usual test for a simple linear regression model is to test whether
 the slope coefficient is zero or not. However, if the slope is very
 close to zero, the intercept will be very close to the dependent
 variable mean, which suggests that a test could be based on the
 difference between the estimated intercept and the sample mean.
 
 Does anybody know of a test of this sort?

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  




=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Question

2001-05-10 Thread Donald Burrill

On Thu, 10 May 2001, Magill, Brett wrote, inter alia:

 How should these data be analyzed?  The difficulty is that the data 
 are cross level.  Not the traditional multi-level model however.  

Hi, Brett.  I don't understand this statement.  Looks to me like an 
obvious place to apply multilevel (aka hierarchical) modelling.  
(Have you read Harvey Goldstein's text on the method?)  You have persons 
within organizations (just as, in educational applications of ML models, 
one has pupils within schools for a two-level model, and pupils within 
schools within districts for a three-level model), and apparently want to 
carry out some estimation or other analysis while taking into account the 
(possible) covariances between levels.
If you want a simpler method than ML modelling, the method Dennis 
proposed at least lets you see some aggregate effects.  (This does, 
however, put me in mind of a paper of (I think) Brian Joiner's whose 
temporary working title was To aggregate is to aggravate -- though it 
was published under another title.)  ;-)
Along the lines of Dennis' suggestion, you could plot Y vs X2 
(or X2 vs Y) directly, which would give you the visual effect Dennis 
showed while at the same time showing the scatter in the X2 dimension 
around the organization average.  For larger data sets with more 
organizations in them (so that perhaps several organizations would have 
the same (or at any rate indistinguishable, at the resolution of the 
plotting device used) turnover rate), you could generate a letter-plot 
(MINITAB command:  LPLOT), using the organization ID in X1 as a labelling 
variable.
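
For those without Minitab, the same letter-plot idea can be sketched in 
Python/matplotlib (my illustration, using the data quoted below):

  import matplotlib.pyplot as plt

  org = [1, 1, 1, 2, 2, 3, 4, 4, 4]
  x2  = [0.70, 0.80, 0.65, 1.20, 1.10, 0.90, 0.50, 0.60, 0.70]
  y   = [0.40, 0.40, 0.40, 0.25, 0.25, 0.30, 0.50, 0.50, 0.50]

  fig, ax = plt.subplots()
  for o, xi, yi in zip(org, x2, y):
      ax.annotate(str(o), (xi, yi))       # label each point by organization
  ax.set_xlim(0.4, 1.3); ax.set_ylim(0.2, 0.55)
  ax.set_xlabel("X2: percent of market salary")
  ax.set_ylabel("Y: organizational turnover rate")
  plt.show()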

Brett's original post presented this data structure:

 A colleague has a data set with a structure like the one below:
 
 IDX1  X2  Y
 1 1   0.700.40
 2 1   0.800.40
 3 1   0.650.40
 4 2   1.200.25
 5 2   1.100.25
 6 3   0.900.30
 7 4   0.500.50
 8 4   0.600.50
 9 4   0.700.50
 
 Where X1 is the organization.  X2 is the percent of market salary an
 employee within the organization is paid -- i.e. ID 1 makes 70% of the 
 market salary for their position and the local economy.  And Y is the 
 annual overall turnover rate in the organization, so it is constant 
 across individuals within the organization.  There are different 
 numbers of employee salaries measured within each organization.  The 
 goal is to assess the relationship between employee salary (as percent 
 of market salary for their position and location) and overall 
 organizational turnover rates.

 How should these data be analyzed?  The difficulty is that the data are 
 cross level.  Not the traditional multi-level model however.  That 
 there is no variance across individuals within an organization on the 
 outcome is problematic.  Of course, so is aggregating the individual 
 results.  How can this be modeled both preserving the fact that there is 
 variance within organizations and between organizations?

As I understand it (as implied above), this is exactly the kind of 
structure for which multilevel methods were invented.

 I suggested that this was a repeated measures problem, with repeated 
 measurements within the organization, my colleague argued it was not. 

This strikes me as a possible approach (repeated measures can be treated 
as a special case of multilevel modelling).  But most software that I 
know of that would handle repeated-measures ANOVA would tend to insist 
that there be equal numbers of levels of the repeated-measures factor 
throughout the design, and this appears not to be the case (your sample 
data, at any rate, have different numbers of individuals in the several 
organizations).

 Can this be modeled appropriately with traditional regression models at 
 the individual level?  That is, ignoring X1 and regressing Y ~ X2. 

That was, after a fashion, what Dennis illustrated.  In a formal 
regression analysis, I should think it unnecessary to ignore X1;  
although it would doubtless be necessary to recode it into a series of 
indicator-variable dichotomies, or something equivalent.

 It seems to me that this violates the assumption of independence. 

Not altogether clear.  By this do you mean regression analysis?  
Or, perhaps, the particular analysis you suggested, ignoring X1?  Or...? 
And what assumption of independence are you referring to?  (At any 
rate, what such assumption that would not be violated in other formal 
analyses, e.g. repeated-measures ANOVA?)

 Certainly, the percent of market salary that an employee is paid is 
 correlated between employees within an organization (taking into 
 account things like tenure, previous experience, etc.).

Well, would the desired model take such things into account? 
(If not, why not?  If so, where is the problem that I rather vaguely 
sense lurking between the lines here?)
-- Don.
 

Re: A question

2001-05-04 Thread Donald Burrill

On Fri, 4 May 2001, Alan McLean wrote:

 Can anyone tell me what is the distribution of the ratio of sample
 variances when the ratio of population variances is not 1, but some
 specified other number?

Depends.  If the two samples on which the variances are based are 
_independent_, s^2(1)/s^2(2) is distributed as (Var(1)/Var(2)) times the
usual F distribution. 
 (My reference for this is Glass & Stanley (1970), pp. 303-306.)

If the sample variances are based on so-called dependent (= correlated) 
samples, the problem is, apparently, much more difficult ("beyond the 
scope of this textbook," G&S write).

 I want to be able to calculate the probability of getting a sample ratio
 of 1 when the population ratio is, say, 2.

As the above remarks imply, if the samples are independent, that 
probability is the same as the probability of getting a sample ratio of 
0.5 when the population variances are equal (population ratio = 1).
 (Since the distribution is continuous, the probability that the sample 
ratio _equals_ 1 -- or 0.5 -- is zero;  but presumably your interest 
would actually be in, e.g., the probability that the sample ratio lies 
in the interval from 0 to 1 (or its complement, the interval from 1 to 
infinity);  or in some other interval with 1 at one end.)

Actually doing the calculation would require either F tables rather more 
extensive than the usual abbreviated versions that have only six to ten 
cumulative relative frequencies, or software like Minitab that can 
calculate probabilities for the standard F distribution.  
 (Take your sample ratio, divide it by the hypothesized population ratio, 
and ask Minitab to evaluate the quotient as an F value.)
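
Or, without Minitab, a Python sketch (the sample sizes here are 
hypothetical, since none were given):

  from scipy.stats import f

  df1 = df2 = 15   # hypothetical: two independent samples of n = 16 each
  # P(sample ratio <= 1 | population ratio = 2) = P(F <= 1/2)
  print(f.cdf(1/2, df1, df2))   # about 0.10 with these d.f.
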
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Please help

2001-05-04 Thread Donald Burrill

I rather think the problem is not adequately defined;  but that may 
merely reflect the fact that it's a homework problem, and homework 
problems often require highly simplifying assumptions in order to be 
addressed at all.  See comments below.

On Fri, 4 May 2001, Adil Abubakar wrote:

 My name is Adil Abubakar and i am a student and seek help.  snip
   if anyone can help, please respond to [EMAIL PROTECTED]
 
 Person A did research on a total of 4500 people and got the following
 results:

 Q. 1.  How many hours do you spend on the web?
 0-7 8-15  15+
 18% 48%   34%

 Q. 2.  Do you read a privacy policy before signing on to a web site?
 
  1=Strongly Agree 2=Agree 3=Neutral 4=disagree 5=strongly disagree 
 9%  17% 20%   32%22% 

If this were a research situation, or intended to reflect practical 
realities, there would also be information about the relationship between 
the answers to Q. 1 and the answers to Q. 2.  This information might be 
in the form of a two-way table of relative frequencies, or (with suitable 
simplifying assumptions on the variables represented by Q.1 and Q.2) as a 
correlation coefficient.  Without _some_ information about the joint 
distribution, I do not see how one can hope to address the questions 
posed below.
 
 Another person asked the same questions of 100 people and got the same 
 results in % terms.  Can it be shown via CI that the result is
 consistent with the expectations created by the previous survey?

If the % results were indeed the same (so that all differences in 
corresponding %s were zero), it would not be necessary to use a CI (by 
which I presume you mean confidence interval) to show consistency. 
(HOWEVER, even identical % results do not imply consistency, unless at 
the same time the joint distribution were ALSO identical;  and you do 
not report information on this point.)

OTOH, if the results were merely similar but not identical, you would 
want some means of assessing the strength of evidence that resides in the 
empirical differences.  That in turn depends on the assumptions you're 
willing to make about the two variables:  do you insist on treating the 
responses as (ordered) categories, or would you be willing, at least pro 
tempore, to assign (e.g.) codes 1, 2, 3 to the responses to Q. 1, use the 
codes 1, 2, 3, 4, 5 supplied for Q. 2, and treat those values as though 
they represented approximately equal intervals?
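
If one did take that route, the natural machinery would be a chi-square 
goodness-of-fit test of the 100-person counts against the proportions 
from the larger survey.  A sketch (assuming scipy), using the Q. 2 
figures above:

  from scipy.stats import chisquare

  expected_props = [0.09, 0.17, 0.20, 0.32, 0.22]  # from the 4500-person survey
  observed = [9, 17, 20, 32, 22]                   # the 100-person sample
  expected = [100 * p for p in expected_props]

  print(chisquare(observed, expected))  # chi2 = 0 here: results identical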

 Also can it be argued that the subjects have been subjected to the
 questions before?

Not sure what you mean by this question.  If you know that the Ss have 
indeed been asked these questions previously (are they perhaps a subset 
of the original 4500?), no arguing is needed;  although what this would 
imply about the results is unclear.  If you mean, do the identical (or at 
least consistent) results imply that the Ss must have encountered these 
same questions previously, I do not see how that can be argued, at least 
not without more information than you've so far provided.  Perhaps more 
to the point, why would such an argument be of interest?

 Can it be asserted with statistical significance, that if the survey 
 is repeated on at least 100 people the result will [be] in the same 
 proximity of the above survey??

No.  I suggest you look closely at the definition of statistical 
significance:  the term is quite incompatible with the assertion you 
propose.  (If you don't see that, you might bring a focussed version of 
the question back to the list.  If you do see that, you may still have 
some question that is more or less in the same ball-park as the question 
you've asked here, and you may wish to bring the revised question to our 
attention.)

  any help ... will be appreciated.  Just need the different 
 methodologies. 

Yes;  but for which questions, exactly?
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Orthogonality of Designs for Experiments

2001-05-04 Thread Donald Burrill

Short answers below;  which may or may not adequately address the lurking 
questions you had in mind.

On Fri, 4 May 2001, Jeff wrote:

 Would like to ask [for] help with the following questions:
 
 1. why designs for experiments should be orthogonal ?

So that results for each factor, and each interaction between factors, 
will be mutually independent.

 2. which problems may I encounter if I use non-orthogonal design ?

Same kinds of problems you encounter in the general multiple regression 
situation:  apparent size of effect of any predictor (or factor) will 
depend on the presence or absence of other predictors in the model, and 
also on the sequence in which the several predictors (factors and their 
interactions) are considered in the statistical model.
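
A small simulation makes the point (a Python sketch of my own, not part 
of the original question):

  import numpy as np

  rng = np.random.default_rng(1)
  n = 200
  x1 = rng.normal(size=n)
  x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # correlated: non-orthogonal
  y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

  X_both = np.column_stack([np.ones(n), x1, x2])
  X_one  = np.column_stack([np.ones(n), x1])
  b_both = np.linalg.lstsq(X_both, y, rcond=None)[0]
  b_one  = np.linalg.lstsq(X_one,  y, rcond=None)[0]

  # x1's apparent effect depends on whether x2 is in the model:
  print(b_both[1], b_one[1])   # about 1.0 vs. about 1.8
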
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: (none)

2001-05-03 Thread Donald Burrill

Thanks, Rich.  My semi-automatic crap detector hits DELETE when it sees 
things like this anyway;  but...  did you notice that although SamFaz 
(or whoever, really) claims to cite a bill passed by the U.S. Congress 
he she or it is actually writing from Canada?
I'm not quite sure what to make of that...

On Wed, 2 May 2001, Rich Ulrich wrote:

 On 1 May 2001 16:14:28 -0700, [EMAIL PROTECTED] (SamFaz Consulting)
 wrote:
 
 
 Under the Bill s. 1618 title III passed by the 105th US congress
 this letter cannot be considered SPAM as long as the sender includes
 contact information and a method of removal. To be removed, hit reply
 and type "remove" in the subject line.
 
 
 Here was a message posted, that my reader saw as an attachment.
 The lines above were at the start of the SPAM.
 
 Ahem.  I am about 100% sure that the above is a lie.  In multiple
 ways.  For instance:  Is there a legal definition of SPAM?

snip, useful advice, because you've all already read it
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: probability and repeats

2001-05-01 Thread Donald Burrill

On Tue, 1 May 2001, Dale Glaser wrote in part:

  a colleague just approached me with the following problem at 
 work: he wants to know the number of possible combinations of boxes, 
 with repeats being viable ... so, e.g., if there are 3 boxes then
 what he wants to get at is the following answer (i.e., c = 10):

Let me try to rephrase this.  We have a store of boxes.  There are  k 
(say;  here k = 3) different kinds of boxes, and we have a sufficiently 
large supply of each kind that when we amass  m  (say) different boxes, 
they may all be of the same kind.  In the example, m = k = 3.  It is not 
clear whether m = k is necessary, or whether (e.g.) one might have 3 
types of boxes, but be selecting groups of 4 boxes (or some other number).

As you point out, for m = k = 3, the number  n  (say) of different 
collections (or combinations?) of boxes is 10;  when m = k = 4,  n = 35.

Since you specify that the collection 331 is the same as 133, I'd want  
to report each collection in monotonic increasing order of box number, 
and list the collections in lexicographic order, thus (2nd column): 

 111 111 This way I'd be less likely to omit
 222 112 one or more collections inadvertently.
 333 113 (I think.)  
 112 122 If k = 3 and m = 4, we have  n = 15:
 221 123
 332 133  1133 
 113 222 1112 1222 2223
 223 223 1113 1223 2233
 331 233 1122 1233 2333
 123 333 1123 1333 
 
 ...so there are 10 possible combinations (not permutations, since 331 = 
 133)...however, when I started playing around with various
 combinations/factorial equations, I realized that there really isn't a 
 pool of 3 boxes ... there has to be a pool of 9 numbers, in order to 
 arrive at combinations such as 111 or 333  so any assistance would 
 be most appreciated as I can't seem  to find an algorithm in any of my
 texts..thank you.dale glaser

I can't offer you a convenient algorithm for calculating n for given m 
and k, but the following line of thought may perhaps suggest something to 
you.  For m = k = 3, we have 1 combination [123] with no repeats, 6 
combinations with one pair [112, 113, 122, 133, 223, 233], and 3 triplets 
[111, 222, 333].  The 6 can be arrived at by taking the k = 3 pairs and 
multiplying by the 2 possible odd singles, and of course the number of 
triplets (or in general m-tuplets) is k = 3.

For m = k = 4, there is again 1 combination with no repeats (because 
m = k);  and now 12 combinations involving 1 pair (there are k = 4 pairs, 
and for each pair there are 3C2 = 3 odd pairs [e.g., 1123, 1124, 1134]); 
12 combinations involving 1 triplet (k = 4 triplets, and for each there 
are k-1 = 3 odd singletons);  6 combinations involving 2 pairs (4C2, I 
think);  and k = 4 quadruplets.  In lexicographical order, these 35 
combinations are:
  1111 1123 1222 1244 2222 2244 3333
  1112 1124 1223 1333 2223 2333 3334
  1113 1133 1224 1334 2224 2334 3344
  1114 1134 1233 1344 2233 2344 3444
  1122 1144 1234 1444 2234 2444 4444
In lexicographical order within numbers of repeats as listed above:
  1234 1224 2234 1114 2224 1122 3344
  1123 1233 2334 1222 2333 1133 1111
  1124 1244 2344 1333 2444 1144 2222
  1134 1334 1112 1444 3334 2233 3333
  1223 1344 1113 2223 3444 2244 4444
but they may make better sense in an order that emphasizes the repeats:
  no repeats:   1234
  one pair:     1123 1124 1134 1223 1224 1233 1244 1334 1344 2234 2334 2344
  one triplet:  1112 1113 1114 1222 1333 1444 2223 2224 2333 2444 3334 3444
  two pairs:    1122 1133 1144 2233 2244 3344
  quadruplets:  1111 2222 3333 4444

Anyway, good luck in finding an appropriate algorithm or formula for  n 
in terms of  m  and  k  (or just in terms of  k,  if the conditions of 
the problem require that m = k).
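
For what it's worth, a brute-force check is easy in Python;  and the 
counts found above (10, 35, 15) agree with the standard multiset 
coefficient C(k+m-1, m) -- the "stars and bars" count -- which may be 
the formula being sought:

  from itertools import combinations_with_replacement
  from math import comb

  def n_collections(k, m):
      # enumerate all size-m multisets drawn from k kinds of box
      return sum(1 for _ in combinations_with_replacement(range(k), m))

  for k, m in [(3, 3), (4, 4), (3, 4)]:
      print(k, m, n_collections(k, m), comb(k + m - 1, m))
  # prints: 3 3 10 10 / 4 4 35 35 / 3 4 15 15
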
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=

Re: Joining edstat

2001-04-28 Thread Donald Burrill

On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote:

 I just joined the listserv.  Our professor is giving us extra credit if 
 we join an email list re: stats.  I was able to pull up one of his 
 messages from last year.  Pretty cool.  Have a great day!

You might ask him whether additional extra credit is awarded if you also 
learn about the usual rules of conduct, sometimes called netiquette. 
One of them is that persons posting to the listserv are expected to 
include their proper names at least, preferably accompanied by their 
affiliations (e.g., college or place of employment or home address, or 
combinations of these).  You might start by visiting the web site 
mentioned in the trailer automatically appended to this message by 
edstat.
Your e-mail program almost certainly has the facility to include 
a signature file (sometimes called a .sig) automatically;  and even if 
you think you have valid reason(s) for not doing that as a routine 
courtesy for all your e-mail, you can easily import such a file into 
your message for polite communication with listservs and other 
correspondents, and ought to do so.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Help me an idiot

2001-04-28 Thread Donald Burrill

On Sat, 28 Apr 2001, Abdul Rahman wrote:

 Please help me with my statistics.
 
 If you order a burger from McDonald's you have a choice of the 
 following condiments:  ketchup, mustard , lettuce. pickles, and 
 mayonnaise.  A customer can ask for all these condiments or any subset 
 of them when he or she orders a burger.  How many different 
 combinations of condiments can be ordered?  No condiment at all counts 
 as one combination. 

 Your help is badly needed.

Why?  All you have to do is construct all the possibilities and count 
them.  Shouldn't be that hard.  If you want a method for dealing with 
more general cases, that might be another matter, of course.  But even 
that would yield to the same procedure, if you went about it in a 
systematic enough fashion.
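
For the general case, the systematic procedure is a few lines of Python 
(a sketch only;  the condiment list is the one in the question):

  from itertools import combinations

  condiments = ["ketchup", "mustard", "lettuce", "pickles", "mayonnaise"]

  # the systematic construction:  list every subset, empty set included
  subsets = [c for r in range(len(condiments) + 1)
               for c in combinations(condiments, r)]
  print(len(subsets))    # 2**5 = 32, counting "no condiment at all"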

So how have you approached the problem so far? 
 (I'm a New Englander, and we tend to disapprove of laziness.  If you 
haven't even tried to solve it yourself [and problems like this are 
almost certainly dealt with in your textbook!], I'm not interested in 
providing any help at all.) 
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: A disarmingly simple conjecture

2001-04-28 Thread Donald Burrill

On Wed, 18 Apr 2001, Giuseppe Andrea Paleologo wrote:

 I am dealing with a simple conjecture. Given two generic positive 
 random variables, is it always true that the sum of the quantiles (for 
 a given value p) is greater than or equal to the quantile of the sum? 
 
 snip, technical translation of the question into algebra 
 
 Any insight or counterexample is greatly appreciated. I am sure this 
 is proved in some textbook, but independently from that, I think this
 should be doable via elementary methods...

If this were a theorem, perhaps it should be.  But it does not seem 
inherently reasonable to me.  (Herman Rubin has provided a mathematical 
response denying the conjecture;  but I'd like to look at it from a 
different perspective.  I'd be interested in opinions whether this line 
of reasoning is valid.)  
If I understand you correctly, you conjecture that for two random 
variables (X and Y, say) and their sum (Z, say, = X + Y), the sum of the 
third quartile of X and the third quartile of Y would be greater than or 
equal to the third quartile of Z.  But this would seem to imply, by 
symmetry, that the sum of the _first_ quartile of X and the first 
quartile of Y should be LESS than or equal to the first quartile of Z.  
There being nothing especially magical about quartiles (whether first, 
second, or third), these two statements together would imply that the 
sum of a quantile of X and the corresponding quantile of Y must be BOTH
less than or equal to, AND greater than or equal to, the corresponding 
quantile of Z:  that is, the sum of the quantiles must always EQUAL the 
corresponding quantile of the sum.  But for this proposition, I believe 
there exist lots of counterexamples.
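
A quick numerical check supports the denial.  A sketch, using one 
convenient two-point counterexample (the distributions are made up):

  # X and Y i.i.d. two-point variables:  P(0) = 0.9, P(10) = 0.1
  def quantile(dist, p):
      # smallest value whose cumulative probability reaches p
      cum = 0.0
      for value, prob in sorted(dist.items()):
          cum += prob
          if cum >= p:
              return value

  xy = {0: 0.9, 10: 0.1}
  z = {0: 0.81, 10: 0.18, 20: 0.01}    # exact distribution of Z = X + Y
  p = 0.85
  print(quantile(xy, p) + quantile(xy, p))    # 0 + 0 = 0
  print(quantile(z, p))                       # 10:  the quantile of the sum
                                              # exceeds the sum of quantiles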
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: errors in journal articles

2001-04-27 Thread Donald Burrill

On Fri, 27 Apr 2001, Lise DeShea wrote in part:

 I teach statistics and experimental design at the University of 
 Kentucky, and I give  journal articles to my students occasionally with 
 instructions to identify what kind of research was conducted, what the 
 independent and dependent variables were, etc.  For my advanced class, 
 I ask them to identify anything that the researcher did incorrectly.
snip, description of defective article

 One of my students wrote on her homework, It is especially hard to 
 know when you are doing something wrong when journals allow bad 
 examples of research to be published on a regular basis.

Mmmm.  It isn't really any harder to _know_ when you're doing something 
wrong;  it may be somewhat more disheartening to realize that there may 
be no adequate check on one's own silly mistakes, later.
I'd have pointed out to your student that one instance (possibly 
selected by her professor with malice aforethought? -- and even if not, 
the student wouldn't necessarily know that) hardly supports the phrase 
published on a regular basis.  Just emphasizes the need to maintain a 
healthy skepticism, and to be prepared to proofread with a critical eye. 
(Just 'cause it's printed doesn't mean it's true...)

 I'd like to hear what other list members think about this problem and 
 whether there are solutions that would not alienate journal editors. 

Not to mention one's (you should pardon the expression) colleagues. 
Depends partly on sensitivity of editors and/or authors to criticism. 
Mainly, as TR once put it, speak softly (i.e., politely) and carry a 
big stick (i.e., evidence that, even if politely phrased, clearly 
illuminates the fact of an error).  But it is worth remembering that 
journal editors (at least, the ones I've known) are editors only for 
limited terms:  three years is not unusual, I think, and while an editor 
may be reappointed for a subsequent (second, third, ...) term, it seems 
to be more usual to serve for two terms and then let somebody else do it. 

So even if you get off on a wrong foot with one editor, that misfortune 
needn't carry over to the next editor.

Some years back I encountered a systematic error in a journal article.
The author had reported total scores from a series of Likert-like items, 
and showed a histogram.  The histogram displayed decided spikes, about 
twice as high as the surrounding landscape, at regular intervals:  scores 
of 20, 25, 30, 35, apparently.  (Maximum score was 40, minimum 10.) 
These were so interesting that the author spent a page or more 
interpreting them (as the results of patterned responses by the 
respondents, by which was meant responding with all 3's (e.g.) to all 
items).  And indeed, if such patterning were present to any great degree, 
it would have showed up in just this way.
Only thing was, the histogram program used had been allowed to 
set its own parameters, and in the range of, say, 20 to 30, where there 
should have been ten scores, there were only eight histogram bars.  The 
spikes were of course the bars that contained two scores:  20 and 21, 
25 and 26, 30 and 31, etc.
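
The artifact is easy to reproduce.  A sketch with made-up scores (numpy 
assumed;  24 bins over the 10-to-40 range gives the 1.25-unit bar width 
implied above):

  import numpy as np

  rng = np.random.default_rng(0)
  scores = rng.integers(10, 41, size=2000)    # integer totals, 10 through 40

  # 24 bins over [10, 40]:  each bar is 1.25 score-units wide
  counts, edges = np.histogram(scores, bins=24, range=(10, 40))
  print(np.round(edges, 2))
  print(counts)    # bars spanning two integers (10-11.25, 15-16.25, ...)
                   # come out roughly twice as tall as their neighbours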
First thing I did was write to the author.  Wasn't polite enough, 
I guess (although I was trying to be), because he never acknowledged my 
letter.  Then I e-mailed the editor, who wanted a response from the 
author before he took any action (which I thought reasonable enough), 
and suggested that I write a letter to the editor identifying the 
problem, which he'd then ask the author to reply to.  Various things 
intervened about then, and I never got that letter written, I'm afraid.

But I've frequently used that article as an example in class (usually 
presenting it as a puzzle, to see if anyone is sharp-eyed enough to see 
what's wrong, and usually presenting only the histogram and the relevant 
paragraph or two in the article).  Helps to illustrate the points 
reported above:  be skeptical, and sharp-eyed.  And I take the 
opportunity to point out that this error, obvious as it is once one has 
seen it, eluded the author, the audience at the AERA session where the 
paper was presented, the audience at a European meeting where it was 
presented, at least two associate editors (that journal routinely farms 
papers out to at least two readers before publishing), and the journal 
editor himself.  (And, presumably, most of the journal readership -- 
I never saw a critical letter from anyone else on this point.)

snip, various economic concerns

(Of course, you could always suggest that your _student_ write a naive 
little letter to the author, asking naive little questions...)
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264  

Re: ANCOVA vs. sequential regression

2001-04-23 Thread Donald Burrill

On Mon, 23 Apr 2001, jim clark wrote:

 On 22 Apr 2001, Donald Burrill wrote:
  If I were doing it, I'd begin with a full model (or augmented model, 
  in Judd  McClelland's terms) containing three predictors:
  y  =  b0 + b1*X + b2*A + b3*(AX) + error
   where A had been recoded to (0,1) and (AX) = A*X.[1]
 
 A number of sources (e.g., Aiken & West's Multiple regression:
 testing and interpreting interactions) would recommend centering X 
 first (i.e., subtracting out its mean to produce deviation scores). 

Yes, this is always an option.  Usually recommended to avoid certain 
computational problems that may arise if the distribution of X has a 
particularly low coefficient of variation, for example, and if the model 
contains many variables (and in particular interactions among them).  
Such problems are unlikely to arise in so simple a model as [1], and are 
more effectively dealt with when they do arise by deliberately
orthogonalizing the predictors.  I've never quite understood why 
deviations from a sample mean, which is after all a random function of 
the particular sample one has, should be preferred either to the original 
values of X (unless there ARE distributional problems) or to deviations 
from some value inherently more meaningful than a sample mean.

 You might also consider whether dummy coding (0,1), as recommended by 
 Donald, would be best or perhaps effect coding (-1, 1).

Also a possibility, of course.  Note that the interpretations of the 
several coefficients (b0, b2, and b3 in particular) change with changes 
in coding of the dichotomy A.
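
A small simulation makes the point.  A sketch with made-up data and 
effects (only numpy's least squares is assumed):

  import numpy as np

  rng = np.random.default_rng(1)
  n = 200
  x = rng.normal(50.0, 10.0, n)               # continuous predictor X
  a01 = rng.integers(0, 2, n).astype(float)   # dichotomy A, dummy-coded (0,1)
  y = 2 + 0.5*x + 3*a01 + 0.2*a01*x + rng.normal(0.0, 1.0, n)

  def fit(a):
      # least-squares fit of  y = b0 + b1*X + b2*A + b3*(A*X) + error
      X = np.column_stack([np.ones(n), x, a, a * x])
      b, *_ = np.linalg.lstsq(X, y, rcond=None)
      return b

  print(fit(a01))          # coefficients under dummy (0,1) coding
  print(fit(2*a01 - 1))    # effect (-1,+1) coding:  identical fitted
                           # values, but the b's change their meanings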
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: normal approx. to binomial

2001-04-10 Thread Donald Burrill

On Tue, 10 Apr 2001, Gary Carson wrote:

 It's the proportion of successes (x/n) which has approximately a normal
 distribution for large n, not the number of successes (x).

Both are approximately normal.  

(If the r.v. W = (x/n) is (approximately) normally distributed, then 
the r.v. V = x = n*W must also be;  only with a mean and standard 
deviation each  n  times as large as for W.)
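
A numerical illustration (a sketch;  SciPy assumed, and n, p, and the 
cutoff are arbitrary):

  from scipy.stats import binom, norm

  n, p = 400, 0.3
  mu = n * p                        # mean of the count x
  sd = (n * p * (1 - p)) ** 0.5     # standard deviation of the count x

  print(binom.cdf(130, n, p))             # exact P(x <= 130)
  print(norm.cdf(130, mu, sd))            # normal approximation for x
  print(norm.cdf(130/n, mu/n, sd/n))      # the same z, on the x/n scale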
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



SAT z >= 3 (Was: Re: (no subject))

2001-04-02 Thread Donald Burrill

Everything you need is in what you wrote.

You do understand that "z" is the usual shorthand for "a standard score", 
and that a standard score is the representation of a given raw score as 
its deviation from the population mean in standard-deviation units? 

The rest is merely a lookup in a table of the standard normal 
distribution.  (I find it to be somewhat less than 0.15%, though.)
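
For the record, the lookup in one line (a sketch;  SciPy assumed):

  from scipy.stats import norm

  z = (800 - 500) / 100    # 3 standard deviations above the mean
  print(norm.sf(z))        # P(Z >= 3) = 0.00135..., i.e. about 0.13%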
-- DFB.

On Mon, 2 Apr 2001, Jan Sjogren wrote:

 SAT scores are approximately normal with mean 500 and a standard
 deviation 100.  Scores of 800 or higher are reported as 800, so a 
 perfect paper is not required to score 800 on the SAT.  What percent 
 of students who take the SAT score 800?
 
 The answer to this question shall be: SAT scores of 800+ correspond 
 to z >= 3; this is 0.15%.
 
 Please help me understand this.  I don't understand how I get that 
 z >= 3, and that it is 0.15%?

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: convergent validity

2001-03-30 Thread Donald Burrill

On Fri, 30 Mar 2001 [EMAIL PROTECTED] wrote:

 Donald Burrill writes:
 
  On Thu, 29 Mar 2001, H.Goudriaan wrote in part:
  
   - my questionnaire items are measured on 5- and 7-point Likert scales,  
  
   and consequently not (bivariate) normally distributed;
  
  Real data hardly ever is.  Do you need it to be?  Usually the 
  question of interest is whether it's close enough to be an adequate 
  approximation for guv'mint work.
 
 Ok, I understand and agree.  But isn't it a bit naive to think that a 
 group of variables with 5 categories may result in a good factor 
 analysis (or whatever other parametric analyses)? 

I frankly don't see the relevance of naivete to the question at 
hand.  It isn't, one gathers, as though you had any choice in the matter:  
neither in the number of points on each item scale (since this is all, as 
you told Dennis, an existing scale) nor in the bivariate distribution of 
the two constructs in which (one gathers) you are interested.  (And you 
haven't said why you think you want these two constructs to be bivariate 
normal -- rather than, say, linearly related and unimodal.  Nor, for that 
matter, have you indicated whether you have examined the bivariate 
distribution in question and actually found it to depart worrisomely from 
a reasonable distribution.)
You also replied to Dennis that you have 16 items, 11 of which 
are alleged to measure one construct and 5 measure another.  That sounds 
to me like two variables, one with a potential range of 11 to 55 and the 
other with a potential range of 5 to 25 (for the 5-point scales;  where 
you have 7-point scales the potential range will be somewhat wider).  I 
should think that your interest would then lie in the validity of these 
two variables, not in the individual items that contribute to them;  
unless you want to do an item analysis of one kind or another.
You write also, "with 5 categories".  If you insist that the item 
responses must be treated as _categories_, rather than ordered points on 
a scale, then you ought, one would think, to be applying the methods of 
dual scaling (also known as correspondence analysis).  Or, if you allow 
that the responses are ordered, use the variation of dual scaling that 
applies to ordered categories.  (All this for dealing with data at the 
item level, of course.)  
You haven't explicitly said (that I recall), but you seem to be 
unwilling to treat the item responses as of approximately interval 
scale.  Why not?  Do you have evidence that the scale intervals are 
grossly unequal?  (That seems to me unlikely.)  Or are the distributions 
of responses for some items so peculiar as to generate serious doubt 
about the intervals?  (If so, you might wish to convert any such item to 
a series of 0/1 categories -- which brings us back to dual scaling.)
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  




=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: convergent validity

2001-03-29 Thread Donald Burrill

On Thu, 29 Mar 2001, H.Goudriaan wrote in part:

 - my questionnaire items are measured on 5- and 7-point Likert scales,  
 so they're not measured on an interval level

Non sequitur.

 and consequently not (bivariate) normally distributed;

Real data hardly ever is.  Do you need it to be?  Usually the question of 
interest is whether it's close enough to be an adequate approximation for 
guv'mint work.
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: statistical errors

2001-03-23 Thread Donald Burrill

On Thu, 22 Mar 2001, Paul R Swank wrote:

 I prefer the ocular test myself.

Were you referring to the intraocular traumatic test? 
(It strikes you between the eyes.)
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Was: MIT Sexism statistical bunk

2001-03-15 Thread Donald Burrill

On Thu, 15 Mar 2001, dennis roberts wrote in part:

 ps ... a conclusion that lots of people don't agree with one another 
 will not be too helpful

Maybe not, but it sure would be realistic -- which might be reassuring 
to some of our students who have their own doubts on that score about our 
discipline.
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: One tailed vs. Two tailed test

2001-03-12 Thread Donald Burrill

On Tue, 13 Mar 2001, Will Hopkins wrote in part:

 Example:  you observe an effect of +5.3 units, one-tailed p = 0.04. 
 Therefore there is a probability of 0.04 that the true value is less 
 than zero.

Sorry, that's incorrect.  The probability is 0.04 that you would find an 
effect as large as +5.3 units (or more), if (a) the true value is zero 
and (b) the sampling distribution of the test statistic is what you think 
it is.  (The probability of finding an effect this large, in this 
direction, is less than 0.04 if the true value is less than zero (and 
your sampling distribution is correct).)

  snip  

 But why test at all?  Just show the 95% confidence limits for your 
 effects, and interpret them:  "The effect could be as big as upper 
 confidence limit, which would mean  Or it could be lower 
 confidence limit, which would represent...  Therefore... "  Doing it 
 in this way automatically addresses the question of the power of your 
 study, which reviewers are starting to ask about. If your study turns 
 out to be underpowered, you can really impress the reviewers by 
 estimating the sample size you would (probably) need to get a 
 clear-cut effect.  I can explain, if anyone is listening...

You had in mind, I trust, the _two-sided_ 95% confidence interval!
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Regression with repeated measures

2001-03-08 Thread Donald Burrill

Hi, Rich.  The only answer I recall having seen on the listserve was one 
suggesting multilevel (aka "hierarchical") modelling.  If one wanted to 
address the problem without ML modelling, I'd be inclined to proceed as 
follows:
  (1)  I assume, in the absence of commentary to the contrary, that the 
"strong spatial correlations" among the values in the 6x6 grids have much 
the same structure from grid to grid and from respondent to respondent.  
(Even if this is an oversimplification, it's a starting point.) 
  (2)  Use the 6 rows and the 6 columns of the grid as categorical 
variables in an ANOVA-like approach;  the contents of each cell being, as 
you write, the dependent variable.  You don't mention what variable(s), 
nor how many of them, you're using as predictor(s);  but specify the 
analysis as an ANCOVA, in a GLM routine if you're using MINITAB, with the 
predictor(s) as covariates and the rows & columns as ANOVA factors. 
You'll get a 5-df measure of "row" effect, a 5-df "column" effect, and a 
25-df "interaction" effect;  if these are large enough to be interesting, 
you can try your hand at fitting various models to the pattern of results 
using whatever you think is going on in the "spatial correlations".
  If the structure of the spatial correlations is not replicated across 
grids, or at least across SOME grids, this approach may not be fruitful; 
but it can't hurt to try it, in any case.
  I'm not sure what to make of the 14 grids.  They might represent 
another (14-level) ANOVA factor, but I can't tell from your description.

Hope this is helpful.  As you probably recognized, it's essentially the 
same kind of approach I suggested to Mike Granaas for his repeated 
measures problem.
-- Don.

On Wed, 28 Feb 2001, Rich Strauss wrote:

 I don't have an answer, but I'm very glad this question was asked because
 I'm having a similar problem.  I have 14 grids, values from which are to be
 used as the dependent variable in a regression.  Each 6x6 grid consists of
 36 observation points.  There are some fairly strong spatial correlations
 among the values at each grid, so I certainly can't treat them as if they
 were independent, yet reducing each grid to a single mean value (the other
 extreme) seems like a foolish waste of power.  I'm trying to figure out how
 to use all of the observations, but also use the estimated spatial
 autocorrelations to weight them in the regression.  (The design was
 originally created to answer a very different question, which is how I got
 into this mess.)

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Patenting a statistical innovation

2001-03-07 Thread Donald Burrill

In response to dennis roberts, who wrote in part:

  i see "inventing" some algorithm as  snip  not quite in the same 
  genre of developing a process for extracting some enzyme from a 
  substance ... using a particular piece of equipment specially 
  developed for that purpose 
  i hope we don't see a trend IN this direction ...

On Wed, 7 Mar 2001, Paige Miller replied:

 If it so happens that while I am in the employ of a certain company, I 
 invent some new algorithm, then my company has a vested interest in
 making sure that the algorithm remains its property and that no one
 else uses it, especially a competitor.  Thus, it is advantageous for my 
 employer to patent such inventions.  In this view, mathematical
 inventions are no different than mechanical, chemical or other
 inventions.

Yes.  And in another domain of discourse, statistical methods invented 
by statisticians like Abraham Wald, who worked on military problems 
during WWII, were military secrets until the war ended. 
 "Official secret" is the governmental/military equivalent of "patent".
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: norm curve template

2001-03-06 Thread Donald Burrill

Dennis also included  [EMAIL PROTECTED]  among his addressees,
but I am not on that list and therefore cannot reply to them...

On Tue, 6 Mar 2001, dennis roberts wrote:

 many eons ago ... 1974 to be precise ... i had this idea of making a 
 small plastic normal and skewed curve template ... that would help 
 students draw both types ... with information about the distributions 
 on the template ... that would help them work with problems by being 
 able to make a nice sketch ...

Yes.  I had such a template for years, and found it very useful, both for 
preparing handouts and overheads.  I don't know where it is now -- it got 
lost at some point -- and I don't remember where I came across it in the 
first place, nor exactly when (about 1980, I would guess).  Both curves 
side by side comprising most of the long edge of a template about 7 
inches long, an internal straight line in two segments representing the 
usual X-axis, with printed marks for mean and +/- 1, 2, 3, s.d.'s for 
both distributions.  

 if anyone is interested in a historical artifact (relic?) ... 
 have a look at
 
 http://roberts.ed.psu.edu/users/droberts/statmat.jpg
 
 i still think it WAS a good idea ... just didn't have the right 
 "marketing" team in place

Yes, it was.  But somebody evidently did.
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: power,beta, etc.

2001-03-05 Thread Donald Burrill

In response to Dennis's earlier statement, 
"that is ... power in many cases is a highly overrated CORRECT decision"

I wrote:

 Well, no.  Overrated it may be (that lies, I think, in the eye of the 
 beholder);  but a _decision_ it is definitely not.  Power is the
 _probability_ of making a particular decision -- which, of course, 
 like all decisions, may or may not be correct.

On Mon, 5 Mar 2001, dennis roberts replied:

 sorry  we don't MAKE this decision ... the only decision we make 
 in this case is to reject the null ...

Precisely.  We DECIDE to reject the null hypothesis.  Why do you say, of 
this decision, "we don't MAKE this decision" ???

 it is only the statisticians who overlay on TOP of this ... the 
 consequence OF that reject decision ... saying that IF the null had 
 been false (of which the S has no clue about) ... THEN the consequence 
 of that reject decision is called power 

Sorry, Dennis.  Power is defined AS A PROBABILITY.  In particular, it is 
the probability of rejecting the [null] hypothesis being tested.  Some 
people prefer to define it as a conditional probability (we've been over 
this once already), the condition being that the hypothesis being tested 
is false.  
But power is not a _consequence_ of a decision, in any sense of 
"consequence" that I know about.  If you know a sense in which 
"consequence" applies, I'd be interested in the formal definition, and 
where that definition can be found in a standard reference.
As you pointed out in your earlier note, power is sometimes 
defined as 1 - beta.  This definition, when used, can make sense only if 
power is understood as a probability, since beta is a probability.
(At least, I understand beta to be a probability.  Are you going to 
argue the contrary?)  

 this is one reason i raised this issue ... because, we only make 2 possible 
 decisions with respect to our investigation ... we retain ... we reject ... 
 we DON'T determine the consequence of that decision ... so, in this sense 
 ... saying that there is a consequence associated with a particular act ... 
 retaining or rejecting ... "power is the probability of MAKING (emphasis 
 added from don's comment) ... a particular decision ... " ... sounds like 
 WE did this ... when we did NOT DO this

Dennis:  the decision in question was to reject the null hypothesis.  If 
"we did NOT DO this" (your emphasis), who, pray tell, did??
  And if the decision you're trying to talk about is not "to reject", 
what IS that decision that, you claim, "we did NOT DO"?

 all we did was to reject the null
Yes.  Precisely.  That was a 
decision, and we made it.  And the probability of our making that 
decision, like all probabilities, depends on the state of nature -- in 
particular, on the value of the parameter in question -- when we made it. 
That probability is called "power".

 i still think there would be value ... in:
 
 1. making it clear that the S only makes decisions of the retain kind 
 ... and reject kind ... that's it!

But is this somehow not clear from the beginning?  Who alleges that any 
other kind of decision is made?  If one is testing an hypothesis, the 
result of the test is a decision to reject or to retain (or, in the 
classical mode, to fail to reject).  ... 
 (I do hope that your "S" is not one of the experimental subjects in the 
study whose data are being analyzed (although that's what "S" usually 
refers to).  Those decisions are made by the investigator (or analyst), 
not by any participating Subject.)
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: power,beta, etc.

2001-03-04 Thread Donald Burrill

On Sun, 4 Mar 2001, dennis roberts wrote in part:

 i know that sometimes power is "defined" as 1 - beta ... but, beta 
 could therefore (algebraically and logically) be defined as 1 - power 

Only for the conditional definition of power;  I would wish to add the 
conditional clause "when the null hypothesis is false".

 so, these are circular in a way 

Yes, of course:  in the same sense that a glass may be said to be 
one-third full or two-thirds empty.

 that is ... power in many cases is a highly overrated CORRECT decision 

Well, no.  Overrated it may be (that lies, I think, in the eye of the 
beholder);  but a _decision_ it is definitely not.  Power is the 
_probability_ of making a particular decision -- which, of course, like 
all decisions, may or may not be correct.
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Fisher's z-transformation

2001-03-03 Thread Donald Burrill

On Sat, 3 Mar 2001, Arenson, Ethan wrote:

 Would someone please remind me the formula for Fisher's 
 z-transformation of correlation coefficients? 

Z = 0.5 log[(1 + r)/(1 - r)]   (using the natural logarithm).

Its standard error is   1/sqrt(n - 3)  ("sqrt" = "square root of").

To convert back:r = (exp(2Z) - 1)/(exp(2Z) + 1)
 ("exp(2Z)" is the natural antilogarithm of 2Z, aka  e to the power 2Z).

Equivalently,   Z = arctanh(r)   and   r = tanh(Z)
 ("arctanh" = inverse hyperbolic tangent, "tanh" = hyperbolic tangent).
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Trend analysis question

2001-03-03 Thread Donald Burrill

On Sun, 4 Mar 2001, Philip Cozzolino wrote in part:

 However, after the cubic non-significant finding, the 4th and 5th 
 order trends are significant. 
 
 Intuitively, it seems that if there is no cubic trend of significance, 
 there will not be any higher order trend, but this is relatively new 
 to me.
Your intuition is, in this case, incorrect.  The five 
trends are mutually independent in the sense that any combination of them 
may be operating.  (I am for the moment accepting the implied premise 
that a power function of the IV is a reasonable function to try to fit to 
your data.  In most instances I know of, this is not "really" the case, 
and the power function is more usefully thought of as an approximation 
to whatever the "real" functionality is.)  This may be seen by 
considering the following relationships between Y and X (think of them as 
DV and IV if you wish):

I. +   * *
   -*   *
   Y   -
   -*   *
   -
   + *  *
   -
   -   *  *
   - *
   -
   +-+-+-+-+-+-  X

II.+   *
   -  * **
   -
Y  -  **   *
   -
   +   * *   *
   -
   - *  * *
   -
   -   * *
   +-+-+-+-+-+-  X

In I. above, the linear trend is approximately zero, and the quadratic 
component of X accounts for nearly all the variation in Y.  A "rule" 
that claimed "If the linear trend is insignificant there can be no 
significant quadratic trend" is clearly false in this case.
 In II. above, both the linear and quadratic components of trend are 
virtually zero -- certainly insignificant -- and the cubic component 
accounts for nearly all the variation in Y.  Similar situations can be 
imagined, where only the quartic, or only the quintic, or only the 
linear, quadratic, and quartic, or any other arbitrary combination of 
the basic trends are significant, and other components are not.

If you are carrying out your trend analysis by using orthogonal 
polynomials (as you probably should be), try constructing the model 
derived from your linear + quadratic fit only, and plot those as 
predicted values against X;  then construct the model derived from linear 
+ quadratic + quartic + quintic, and plot those predicted values against 
X.  You may find it illuminating also to plot the residuals in each case 
against X, especially if you force the same vertical scale on the two 
sets of residuals.
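
A rough sketch of that compare-the-fits exercise, with plain powers of 
centered X standing in for proper orthogonal polynomials, on made-up 
data (numpy assumed):

  import numpy as np

  x = np.arange(1.0, 11.0)
  rng = np.random.default_rng(2)
  y = 1 + 0.8*(x - 5.5)**2 + 0.02*(x - 5.5)**5 + rng.normal(0, 1, x.size)

  def fitted(powers):
      # least squares on an intercept plus the chosen powers of centered x
      X = np.column_stack([(x - x.mean())**p for p in (0, *powers)])
      b, *_ = np.linalg.lstsq(X, y, rcond=None)
      return X @ b

  for powers in [(1, 2), (1, 2, 4, 5)]:
      resid = y - fitted(powers)
      print(powers, round(float(np.sum(resid**2)), 2))    # residual SS
  # plotting fitted values and residuals against x, as suggested above,
  # shows where the higher-order terms are earning their keep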

I note in passing that you haven't stated how much of the variance of Y 
is accounted for by each of the significant components, nor how much 
residual variance there is after each component is entered.  That also 
might be illuminating.
-- DFB.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: power,beta, etc.

2001-03-03 Thread Donald Burrill

On Sat, 3 Mar 2001, dennis roberts wrote:

 when we discuss things like power, beta, type I error, etc. ... we 
 often show a 2 by 2 table ... similar to

            null true        null false

 retain     correct          type II, beta

 reject     type I, alpha    power

Similar, but not the same.
I usually present a table of "states of affairs", without 
probabilities:

        correct           error:  Type II

        error:  Type I    correct
(And usually with the rows interchanged, so that "Type I error" LOOKS 
like the first kind of error one encounters.)  It seems to me that to 
include the probabilities in the same 2x2 table as the "states of 
affairs" would be actively to invite rampant (or at least, and more 
alliteratively, couchant) confusion of the concepts.

I have another problem with writing "power" in the lower right cell, 
apart from the fact that it's a probability and not a state of affairs. 
I'm aware that many people think of power as a conditional probability 
(of rejecting the null when it's false);  but I came to understand it as 
an UNconditional probability (of rejecting the null, period).  This 
definition permits drawing power curves that include the parameter value 
specified by the null hypothesis:  the power at that point (or, in that 
case) is alpha.  For a symmetric two-sided alternative, this is also the 
minimum value of power.  Since the value of power approaches alpha as the 
parameter value approaches the value specified in the null hypothesis, it 
seems a little silly to omit that one point from the continuous curve.

 i think that we need a bit of overhaul to this typical way of doing 
 things ... 

 1. each cell needs to have a name ... label ... that reflects the
 consequence of the decision (retain, reject) that was made

 i propose something along the lines of

            null true             null false

 retain     type I correct, 1C    type II error, 2E

 reject     type I error, 1E      type II correct, 2C

I've long been persuaded of the need to distinguish between the two 
different kinds of errors.  That there are two distinct kinds is not at 
all obvious, evidently;  some folks seem never to master the distinction. 
But I am not convinced that we need to distinguish between two kinds of 
correct decision.  After all, the decisions themselves are different:  
to reject, or to retain (though some folks prefer "accept" to "retain"). 
Knowing the decision, and that it is (at least hypothetically) correct, 
is surely all one needs to know.  "Correct rejection" or "correct 
retention" (or "acceptance") of the hypothesis being tested seems to me 
easier to handle and apprehend than "a Type I correct decision" or "a 
Type II correct decision".

 then, we have names or symbols for probabilities attached to each cell

            null true                      null false

 retain     WHAT NAME/SYMBOL FOR THIS??    beta

 reject     alpha                          power

If you want to construct such a table, I'd recommend including the 
marginal row, showing the column totals to be 1 (or, if one prefers, 
100%).  That helps to emphasize the conditional nature of the 
probabilities being displayed:  conditional on the state of nature, not 
on the decision.  And consistent with my understanding of power, I'd 
present such a table thus:

                     State of nature
               null true    null false

P{retain}      1 - alpha    beta

Power          alpha        1 - beta
               -----------------------
Total          1            1

Sometime along about now one really ought to point out that a 2x2 table 
like this is grossly oversimplified.  Beta (and therefore power) cannot 
be evaluated for "null false".  It can be evaluated only for a specified 
particular value of the parameter that is different from the value 
specified in the null hypothesis.  And, ceteris paribus, the farther that 
parameter value is from the null-hypothetical value, the smaller is beta 
(and the larger is power).  This leads more or less directly to the idea 
of a power curve, and then to the variations in such a curve as a 
function of alpha and sample size.
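
The point about the curve is easy to verify numerically.  A sketch for 
a two-sided z test of a mean (SciPy assumed;  alpha, n, and sigma are 
made up):

  import numpy as np
  from scipy.stats import norm

  alpha, n, sigma = 0.05, 25, 1.0
  crit = norm.ppf(1 - alpha/2)    # two-sided critical value, about 1.96
  se = sigma / np.sqrt(n)

  # unconditional power:  P(reject) as a function of the true mean shift
  for delta in np.linspace(0.0, 0.8, 5):
      z = delta / se
      power = norm.sf(crit - z) + norm.cdf(-crit - z)
      print(round(float(delta), 2), round(float(power), 3))
  # at delta = 0 the printed "power" is exactly alpha = 0.05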

 DOES ANYONE HAVE SOME SUGGESTION AS TO HOW THE UPPER LEFT CELL MIGHT BE 
 REFERRED TO via A SYMBOL??? OR, SOME NAME THAT IS DIFFERENT FROM POWER 
 BUT ... STILL GIVES THE FLAVOR THAT A CORRECT DECISION HAS BEEN MADE 
 (better than making an error)?

Do you have a reasoned objection to "1 - alpha"?  In other contexts we 
routinely use, e.g., "1 - Rsq" for the proportion of variance unexplained 
by the model being considered.  The "1 minus" construction shows the 
logical and arithmetical connection between two quantities, which can 
easily get lost if one uses very different-looking terms for those 
quantities.

 2. i think it would be helpful to first identify each cell with a
 

Re: Post-hoc comparisons

2001-03-02 Thread Donald Burrill

Hi, Esa!
You've had a couple of responses;  here's another. 
 You state "pairwise comparisons";  but it strikes me as at least 
possible that you might want (or might _also_ want) to consider more 
complex comparisons if any such comparisons seemed to offer a more 
parsimonious (or, perhaps, more theory-related?) explanation of the 
differences among the four conditions.  (E.g., conditions A & B vs. 
conditions C & D;  or, condition B vs. conditions A & C & D;  or,
condition A vs. conditions B & D and condition C vs. conditions B & D.)
I would ordinarily think of using the Scheffe' method (or the
Tukey method, if the sample sizes were equal in each condition and one's
interest really were _only_ in pairwise comparisons):  its experimentwise 
Type I error rate means no need for Bonferroni or similar calculations; 
just convert your binary response to a proportion passed (or proportion 
failed, if that be easier to interpret) and do a one-way ANOVA on that 
proportion in the four treatments. 
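
A sketch of that last step (SciPy assumed;  the pass/fail counts are 
made up):

  from scipy.stats import f_oneway

  # 0/1 pass-fail outcomes in the four conditions
  a = [1]*18 + [0]*12
  b = [1]*25 + [0]*5
  c = [1]*10 + [0]*20
  d = [1]*16 + [0]*14
  print(f_oneway(a, b, c, d))    # one-way ANOVA on the 0/1 responses;
                                 # Scheffe contrasts then use its MSE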
-- Don.

On Fri, 2 Mar 2001, Esa M. Rantanen wrote:

 I have a question concerning pairwise comparisons between four 
 treatment conditions.   snip   I have a single factor experiment with 
 four levels of the factor (treatment conditions) and a discrete 
 dependent measure (pass/fail), resulting in a 2 x 4 contingency table.  
 ... Chi-Sq. analysis [has found] a statistically significant difference 
 between  the (treatment) groups (all 4!).  snip  

 I would appreciate [it] if anyone would confirm my reasoning above and 
 offer any advice on how to proceed with the analysis of pairwise 
 differences in the case of categorical (dichotomous) data.  References 
 to relevant literature would also be welcome!

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Regression with repeated measures

2001-03-01 Thread Donald Burrill

On Wed, 28 Feb 2001, Mike Granaas wrote in part (and 2 paragraphs of 
descriptive prose quoted at the end):

 ...  is there some method that will allow him to get the prediction
 equation he wants?

Probably the best approach is the multilevel (aka hierarchical) modelling 
advocated by previous respondents.  Possible problems with that approach: 
(1) you'll need purpose-built software, which may not be conveniently 
available at USD;  (2) the user is usually required (as I rather vaguely 
recall from a brush with Goldstein's ML3 a decade ago) to specify which 
(co)variances are to be estimated in the model, both within and between 
levels, and if your student isn't up to this degree of technical skill, 
(s)he may not have a clue as to what the output will be trying to say. 

For a conceptually simpler, if less rigorous, approach, the problem could 
be addressed as an analysis of covariance (to use the now old-fashioned 
language), using the intended predictor as the covariate and the 10 (or 
whatever number of) trials for each S as a blocking variable (as in 
randomized blocks in ANOVA).  This would at least bleed off (so to write) 
some of the excess number of degrees of freedom;  especially if one also 
modelled interaction between predictor and blocking variable (which might 
well require a GLM program, rather than an ANCOVA program), as in testing 
homogeneity of regression.  The blocking variable itself might be 
interpretable (if one were interested) as an (idiosyncratic?) amalgam of 
practice/learning and fatigue.
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128
 --

The situation as Mike desribed it:

 I have a student coming in later to talk about a regression problem.
 Based on what he's told me so far he is going to be using 
 inter-response intervals to predict inter-stimulus intervals (or vice
 versa).
 
 What bothers me is that he will be collecting data from multiple trials
 for each subject and then treating the trials as independent replicates.
 That is, assuming 10 trials/S and 10 S he will act as if he has 100
 independent data points for calculating a bivariate regression.



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: basic stats question

2001-02-26 Thread Donald Burrill

Perhaps this is too superficial -- no time to think more deeply just 
now.  But I suspect the difference between your two scenarios below is 
that with exactly 5 computers to deal with (i.e., population size = 5) 
you are sampling without replacement (which is only sensible, for the 
background scenario!);  whereas with the textbook problem you are 
assuming that the probabilities do not change (and in any case they 
aren't the probabilities that correspond to your N=5 situation!), which 
is equivalent to sampling _with_ replacement (or, what is much the same 
thing, assuming the number of entities available to sample from is 
infinite -- which is probably _not_ sensible for any real-life 
scenario!). 
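
The two calculations side by side (a sketch, using the numbers from the 
two problems):

  from math import comb

  # without replacement, from a population of 5 (3 good, 2 defective):
  print(comb(3, 2) / comb(5, 2))    # P(two good) = 3/10 = 0.30

  # with replacement -- equivalently, an effectively infinite population
  # with P(good) = 0.9 on every draw:
  print(0.9 ** 5)                   # P(all five good) = 0.59049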
-- Don.

On Mon, 26 Feb 2001, James Ankeny wrote in part:

 ... consider a problem where a manufacturer has five seemingly 
 identical computers, though two are really defective and three are 
 good.  ... we want the probability of the event A="order is filled with 
 two good computers." ... then
 S={D1D2,D1G1,D1G2,D1G3,D2G1,D2G2,D2G3,G1G2,G1G3,G2G3}. Thus, P(A)= 0.30.

  snip  

 ...  Yet, another similar problem in my textbook states that the 
 probabilities of a computer being good and defective (from a particular 
 manufacturer) are 0.90 and 0.10, respectively.  Then, if we want to test 
 five computers, we may construct the sample space S=S1xS2xS3xS4xS5, 
 where Si={G,D} for i=1,...,5. Hence, if A="all five computers tested are 
 good," P(A)=(0.90)^5.  Why is that we can use the Cartesian product in 
 this case but not in the other case? 

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: pizza

2001-02-24 Thread Donald Burrill

On Sat, 24 Feb 2001, Mike Granaas wrote:

 Interesting point.  Yes, if the Ss do something other than a random guess
 the binomial model would be violated.  The question then becomes what
 would they do if they are uncertain?  I suspect that they would fall back
 on visual inspection...which piece appears to be different than the others
 (less green pepper, more browned, etc)  Such information is probably
 relevant often enough that "guessing" would be well above 1/3.

So what you would then have is evidence that Ss can in fact do better 
than "chance", but you might NOT know whether that improvement is due to 
their actually being able to perform as claimed, or to some other 
factor(s) relevant to identifying the "odd pizza out":  a human-cum-pizza 
version of "Clever Hans", perhaps?

 Using blindfolded Ss will deal with that problem, and gets us back to
 the question that Dennis is asking.  I'm guessing that rather than going
 through some sort of a systematic process (e.g. binary decision for the
 first piece, progress to second piece only if first piece was judged
 "same".) 
Umm:  Logical problems here. 
 (1) How can _first_ piece be judged "same"?  Same as what? 
 (2) Why would Ss not taste all three pizzas, given the ground rules 
Dennis specified (or implied) at the outset? 

 ... Ss will in fact do something more like guessing.  Only they
 will condition their guesses such that if they picked slice A as different
 on the previous trial they will first consider slices B and C on the
 current trial (they will actually avoid selecting the same slice position
 on sequential trials). 
How did "sequential trials" get into the 
scenario?  As I read Dennis' description, each S was to taste the three 
pizzas presented (perhaps tasting each more than once, but not attacking 
a whole 'nother SET of pizzas).

 Furthermore they will try to equalize the number
 of position choices they make across the experiment so that they choose
 each of A, B, and C three times and one of those a fourth time.

This sounds as though you thought each S were going to have ten separate 
trials at identifying the "odd pizza out", with a different set of three 
pizzas each time.  I don't see how "choosing each of A, B, and C 
three times and one of those a fourth time" could mean anything else; 
but if I've misunderstood, doubtless your reply will explain.  
However interesting such an experiment might be, it's not the experiment 
that I thought Dennis described.
 
  snip,  the rest  
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: intermediate stats textbook?

2001-02-20 Thread Donald Burrill

A quick reply.  Looks somewhat like the second course ("Intermediate 
Statistics and Research Design") I taught for some years at OISE, 
Toronto, which was (and is) the Graduate Department of Education for 
the University of Toronto.  Ask for more later if you want...

On Tue, 20 Feb 2001, Lise DeShea wrote:

 I am looking for a textbook to use in a second-semester stats course in 
 a College of Education.  ...  Material covered in the class includes:
 
 -- Review of kinds of research; kinds of variables; hypothesis testing;
 errors; one- and two-sample stats; and simple regression
 -- one-way and two-way fixed-effects ANOVA
 -- Multiple comparison procedures (usually I provide supplemental
 material instead of relying on the text)
 -- Intro to power
 -- simple repeated measures design
 -- split-plot design
 -- Intro to multiple regression

I'd summarize this as ANOVA, multiple comparisons, and multiple 
regression.  I could usually find a textbook suitable for ANOVA (more on 
that below), which I could supplement with an intro to MR I'd compiled 
that was rather heavily based on Bottenberg & Ward;  or a textbook good 
for MR, which I could supplement with the "Rules of Thumb for Writing the 
ANOVA Table" (originally Millman & Glass, J. Educ. Meas. 1967, and 
reproduced in Glass & Stanley 1971 and Glass & Hopkins 1984:  "Stat'l 
Methods in Educ. & Psychology").

Textbooks for ANOVA:  Glass & Stanley (later Glass & Hopkins, 2nd ed.), 
esp. when I'd used it for the first course ("Elements of Statistics").
Keppel:  Design & Analysis:  A Researcher's Handbook.

Textbooks for MR:  Darlington:  Regression & Linear Models.
Judd & McClelland:  Data Analysis/A Model-Comparison Approach.
Pedhazur:  Multiple Regression in Behavioral Research.

Never did find a text that combined BOTH multiple regression & ANOVA 
with enough depth for my purposes.  (Good luck!)

 I am looking for a book that is conceptual so that my generally
 math-phobic grad students in Education don't freak out with
 "symbol-shock," yet I want careful coverage of assumptions and
 robustness. 

Your "yet" suggests you're aware of the inherent logical contradiction in 
that sentence :-).  In my view it is not useful to pander to math-phobia; 
one must deal with it, but in a strategy that helps the feckless student 
to cope with the phobia and, just maybe, eventually overcome it.  One of 
my stategies was to emphasize at the outset that this is NOT a course in 
mathematics (nor, really, a course in statistics, though I didn't usually 
say THAT), but a course in several foreign languages (algebra, computing, 
statistics, research come to mind) and in a quite foreign (or at least 
unaccustomed) style of thinking.  
(Just for one example:  for students like yours, it's virtually 
certain that no one has ever told them that algebra is mostly about using 
pronouns instead of proper nouns for talking about numbers;  so that 
treating the particular forms of algebra used in statistics as a kind of 
grammarian's approach to quantitation is wholly new, and might get their 
minds off the phobic stuff for a minute or two.  Thus "X" is a pronoun 
for "a particular value of a variable", where "variable" itself is a 
pronoun for "performance on the English test" (or whatever!);  the 
subscript "i" hung onto the "X" is a pronoun for "the individual who 
supplied that particular datum";  so "X_i" is "that particular datum 
whose value is 17.5 when the individual referred to is no. 4, who is Mary 
Smith".  Similarly with symbols for operations:  you want to add up X_1 
and X_2 and ... and X_n to get the Sum of all the X_i, or perhaps more 
simply "Sum(X_i)", but "Sum" is a long word (3 letters!), so we 
substitute its initial "S", but spell it in Greek (Sigma producing the 
same sound in Greek as S does in English) in our ongoing campaign to 
help convince the uninitiated that, really, it's all Greek to us...)

Incidentally, on ANOVA:  I've never been convinced that all that 
agricultural terminology was much help to anyone except agronomists and 
the like, so I _never_ used terms like "split plot".  By starting off 
with a generalizable symbology, one can focus on the _ideas_ of the 
details, and the carrying out of the details, without having to know 
rather artificial labels just to be able to look up the relevant design. 
Using  AxB  for two crossed factors,  C(D)  for a factor C nested in 
another factor D,  superscripts "r" and "f" for "random" and "fixed" 
(or just an asterisk * for "random", if you prefer), and subscripts for 
the number of levels of a factor, one can represent any complete 
balanced design in a single formula like
R*(S*(CxG)xM)
 for "Replications" (what someone has called "the ubiquitous nested 
factor", which in a design like this one might only have one level, so it 
has zero SS and zero df, but its presence helps one see what the proper 
denominator mean squares would be if one had them available) nested 
within "Subjects", which 

Re: citations journals (satire)

2001-02-17 Thread Donald Burrill

I note that in the literature cited, the word "nauseam" (in the Latin
phrase "ad nauseam") is misspelled both times it appears. 
-- DFB.

On Sat, 17 Feb 2001, Jeff Rasmussen wrote:

 a spoof on the glut of journals:
 
 http://psychology.iupui.edu/skew/milestn.htm
 
 "Writing a scientific paper and expecting an effect is like dropping a 
 lotus petal into the Grand Canyon and waiting to hear an echo" 

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Two sided test with the chi-square distribution?

2001-02-10 Thread Donald Burrill

On Thu, 8 Feb 2001, jim clark wrote in part:

 We all agree that it is confusing, but I do believe that the use
 of one-tailed and two-tailed to refer to directional vs.
 non-directional hypotheses (rather than uniquely to one or two
 tails of a distribution) is very wide-spread and quite common.  

There would not be a problem if the hypotheses in question were STATED.  
It's this sloppy habit of saying "F test" or "chi square test", with no 
hint of WHICH "F test" or "chi square test"  one is talking about, that 
impedes communication.

 That is probably what led to the posting that initiated this
 thread.  

Yes.  "I thought the chi-square test was always two-sided", or words to 
that effect, the querent wrote.  He, she, or they have not, in all the 
correspondence since, said what the hypothesis being tested was.

I had written:
  It is still possible to use the F _statistic_ to test the null 
  hypothesis that Var1 = Var2, in circumstances where it is entirely 
  possible that Var1 < Var2, Var1 = Var2, or Var1 > Var2.  In such 
  cases _both_ tails of the F distribution are of interest, not just 
  the upper tail.
--- and Jim replied:

 Right, but if one calculates F_larger/F_smaller, then one is only
 looking at the upper tail of the F distribution even though one
 is doing a non-directional test (i.e., two-tailed in the
 vernacular).  The appropriate critical value for a
 non-directional test would be F_.05. 

Whoops!  Not if you want to test at the usual 5% level!  For a 
non-directional test of the null hypothesis that two variances are 
equal, the critical value would be  F_(alpha/2).
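
For concreteness, a small scipy sketch of such a non-directional test of 
Var1 = Var2 (the variances, degrees of freedom, and alpha are all invented):

  from scipy.stats import f

  s1sq, s2sq, df1, df2 = 4.2, 2.1, 14, 11    # made-up sample variances, dfs
  F = s1sq / s2sq
  # both tails matter:  double the smaller of the two tail probabilities
  p_two_sided = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))
  # at level alpha, the rejection region uses both tails of F(df1, df2):
  alpha = 0.05
  lower = f.ppf(alpha / 2, df1, df2)
  upper = f.ppf(1 - alpha / 2, df1, df2)   # the F_(alpha/2) critical value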

 If you made a directional hypothesis and predicted which variance was 
 going to be larger (as implied in F's use for anova and regression), 
 then you would compare the obtained value of F to F_.10, not F_.05.  

I'll agree with you if you halve those subscripts!  (Or acknowledge that 
you wanted to test at the 10% level...)

You state that using F in ANOVA and regression imply that one had a 
_prediction_ of which variance would be larger.  This is not how I 
understand the idea of "predicting", which I take to imply that one could 
have predicted something in the opposite direction.  In ANOVA the null 
hypothesis _of interest_ is commonly expressed as "all the means are 
equal" (in some language or other), vs. "some of the means differ", and 
the alternative hypothesis is indeed non-directional -- in the metric of 
the subgroup means.  But the hypothesis actually _tested_ (using F) is 
the null hypothesis that a particular variance component is zero, vs. the 
alternative that it isn't, and since a variance component cannot be 
negative, the alternative really is that the variance component in 
question is positive:  thus in the metric of variances the alternative 
hypothesis is one-sided.  This is a matter of algebra, not of 
"predicting" the direction of an effect.  
However, perhaps others are more willing to use "predict" in 
this rather sloppy (from my point of view ;-) fashion.

 You are using the upper tail (i.e., one-tail) of the distribution to 
 test a directional (i.e., "one-tailed") hypothesis.

Yes.  Because a result in the _lower_ tail would tend to confirm the 
null hypothesis, not reject it.

 Like Don, I hope that language can become clearer on these
 issues, but my suspicion is that it will be a long, long time
 before one- vs. two-tailed stops meaning directional
 vs. non-directional alternative hypotheses for most people.

I have no problem with that.  I just wish that people would say what 
they're talking about:  if it's a hypothesis test that is of concern, 
what is the hypothesis and what is the test statistic, for example. 
To say only "chi-square test" or "F test" or "z test" is simply 
insufficient. 
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




Re: ANOVA : Repeated Measures?

2001-02-09 Thread Donald Burrill

If for each Subject you have 4 Measures in each of the 3 Conditions, then 
both Conditions and Measures are repeated-measures factors:  your design 
may be symbolized as   S x C x M  -- that is, Subjects (5 levels) are 
crossed with both Conditions and Measures.  This design is equivalent to 
  R(SxCxM)  where R (Replications) is "the ubiquitous nested factor" as 
one author has put it, random with one level.  (And since it only has one 
level, it has zero degrees of freedom and zero sum of squares;  but using 
it formally often helps one to see what the proper error mean square 
would be for each effect modelled in the design, even if no such mean 
square is actually available in the data.)
Your choices then are, for each of the three factors, whether to 
treat it as fixed or random.  Conditions are presumably fixed -- they 
usually are, because they usually represent all the conditions one is 
interested in considering.  (I can imagine wanting to treat them as a 
random sample of 3 drawn randomly from a population of possible 
experimental conditions, but that seems to me very unlikely.)
Measures might go either way.  If what they represent is a series 
of opportunities to observe the subjects' response to each condition, one 
might treat the factor as fixed, the levels representing the sequence 
(1st, 2nd, 3rd, 4th) in which the opportunities are presented.  This 
would permit examining differences among the 4 levels as possibly 
reflecting learning (one becomes a little more skilled each time one is 
asked to respond to a condition, perhaps?), or fatigue (after one has 
done it once, the action starts to become boring or otherwise wearisome), 
or a kind of resultant between learning and fatigue.  Or, if you really 
think it reasonable to model each encounter as equivalent to each other 
encounter (in the same Condition), and the only variation among levels of 
Measure is random replication variance, Measure might be treated as 
random. 
Subjects are usually treated as random, because one usually wants 
to generalize to a population of subjects "like these", and one may even 
have selected the Ss randomly from a pool of potential Ss for the 
experiment.  But you haven't very many Subjects, and perhaps you want to 
model individual differences between them of some kind or other;  or, 
for some as yet unspecified reason, you are interested only in these 
particular Ss and not in a population of Ss which they might be argued to 
represent;  in either of which cases you may wish to treat Ss as fixed. 
Of course, to carry out _any_ tests of hypotheses, at least one 
of the three factors must be declared random, or you will have no 
legitimate error mean square against which to test the hypothesis mean 
square for any of the possible effects.
In terms of your three possibilities:
 (a) has C and S fixed, M random;
 (b) has C and M fixed, S random (although I don't think it correct to 
describe S as a "repeated-measure" factor:  in my lexicon, a "repeated 
measure" factor is any factor in a design that is _crossed with_  S);
 (c) has C fixed, S and M random.

It may be informative to carry out more than one formal analysis, 
using different fixed/random choices.  This would tell you what results 
are robust with respect to those choices, and what results depend on how 
you choose to treat one or another of the formal factors.  In case it's 
useful, here is a table of the proper error mean squares for each effect: 

                  Error mean square under
   Source     (a)      (b)      (c)
     C        CM       CS       (CS + CM - CSM)
     S        SM       --       SM
     M        --       SM       SM
     CS       CSM      --       CSM
     CM       --       CSM      CSM
     SM       --       --       CSM
     CSM      --       --       --

(Where the entry is "--", the proper error mean square would be R(SCM), 
if it were available.  In its absence, one could use the mean square for 
CSM, making the assumption that there is no 3-way interaction -- that may 
or may not be a reasonable assumption to make.)
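
In case it helps, here is a minimal sketch of analysis (b) -- C and M 
fixed, S random -- using statsmodels' AnovaRM in Python (all names and 
scores below are invented):

  import numpy as np
  import pandas as pd
  from statsmodels.stats.anova import AnovaRM

  rng = np.random.default_rng(0)
  rows = [(s, c, m, rng.normal())   # fake performance scores
          for s in range(5)         # 5 Subjects
          for c in range(3)         # 3 Conditions
          for m in range(4)]        # 4 Measures per condition
  df = pd.DataFrame(rows, columns=['subj', 'cond', 'meas', 'y'])

  # Subjects supply the random error terms;  C and M are within-subject.
  print(AnovaRM(df, depvar='y', subject='subj',
                within=['cond', 'meas']).fit())

This tests C against CS, M against SM, and CM against CSM, matching 
column (b) of the table.
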
-- DFB.

On Fri, 9 Feb 2001, Sylvain Clément wrote:

 We have data from an experiment in psychology of hearing. There are 3
 experimental conditions (factor C). We have collected data from 5
 subjects (factor S). For each subject we get 4 measures of performance
 (M for Measure factor) in each condition. What is the best way to
 analyse these data?
 
 We've seen these possibilities :
 
 a)  ANOVA with repeated measures with 2 fixed factors: subjects &
 conditions, and the different measures as the repeated measure factor
 (random factor).
 
 b) ANOVA with two fixed factors (condition & measure) and a random
 factor (repeated measure- subject factor).
 
 c) ANOVA with one fixed factor (condition) and the other two as
 random.
  snip, arguments in favor of one or another of these

Re: Two sided test with the chi-square distribution?

2001-02-08 Thread Donald Burrill

On Tue, 6 Feb 2001, jim clark wrote in part:

 The problem is that one-tailed test is taken as synonymous with
 directional hypothesis (e.g., Ha: Mu1 > Mu2).  This causes no
 confusion with distributions such as the t-test, because
 directional implies one-tailed.  This correspondence does not
 hold for other statistics, such as the F and Chi2.  

The statement is not correct.  The correspondence certainly holds for 
F and chi-square _statistics_.  What it seems not to hold for is 
certain particular hypothesis tests for which those statistics are the 
commonly used test statistics.  The "large F" Jim speaks of below clearly 
refers to an analysis of variance (and one with only two groups, at 
that!).  In that context, while the hypotheses _of interest_ are the 
null hypothesis that the several means Mu_j are identical, vs. the 
two-sided alternative hypothesis that some of them are different, the 
formal hypothesis tested by the F statistic is the null hypothesis that a 
certain variance component equals zero, vs. the alternative hypothesis 
that it does not equal zero;  and since a variance component cannot be 
negative, the _test_ is one-sided, in the metric of variances:  one 
rejects only for F sufficiently greater than 1 for the result to be 
improbable under the null hypothesis. 

It is still possible to use the F _statistic_ to test the null hypothesis 
that Var1 = Var2, in circumstances where it is entirely possible that 
Var1 < Var2, Var1 = Var2, or Var1 > Var2.  In such cases _both_ tails of 
the F distribution are of interest, not just the upper tail.

Similarly, one may use Chi-square to test the null hypothesis that a 
variance has a specified value, and wish to reject if the evidence leads 
one to believe that the true value is LESS, OR if the true value is 
GREATER, than the value specified.
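
A minimal scipy sketch of that two-sided chi-square test (the sample 
size, sample variance, and hypothesized value are all invented):

  from scipy.stats import chi2

  n, s2, sigma0_sq = 25, 3.4, 2.0          # made-up data summary
  stat = (n - 1) * s2 / sigma0_sq          # chi-square with n-1 df under H0
  p = 2 * min(chi2.cdf(stat, n - 1), chi2.sf(stat, n - 1))   # both tails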

 One can get a large F by either Mu1 > Mu2 or Mu1 < Mu2 (or by positive or 
 negative R, ...).  Therefore the one-tail of the distribution 
 corresponds (normally) to a two-tailed or non-directional test.  
 However, there is absolutely nothing wrong with making the necessary
 adjustment to make the test directional (i.e., equivalent to the
 one-tailed t-test), and therefore referring to it (confusingly,
 of course) as a one-tailed test. 

On this point, one must agree with Thom:  such a use of language can only 
be confusing, as you acknowledge.  "Newspeak", it was called in "1984".

-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




Re: unequal n's: quadratic weights

2001-01-31 Thread Donald Burrill

On Tue, 30 Jan 2001, Kathleen Bloom wrote:

 If you have unequal n's, and want to determine linear parameters, you can
 develop new coefficients by taking the normal unweighted coefficients
 (e.g., -1, 0, +1, for three group design) and the formula:
 (n1(X1) + n2(X2) + n3(X3)) / (n1+n2+n3)   where the X's are 1, 2, and 3
 because you have 3 groups.  This gives you a new mean of the Xs... (i.e., no
 longer (1+2+3)/3 = 2), and from there you calculate the new coefficients
 (e.g., 1 - ?, 2 - ?, 3 - ?, gives you the new linear coefficients) for the
 3-group design with unequal n's. 

You get the same results if you use X's corresponding to the unweighted 
coefficients (-1, 0, +1).  I should suppose that for quadratic estimates 
you'd play the same game with the quadratic unweighted coefficients 
 (+1, -2, +1).  However, I've never played around much with weighted 
trend analyses, so my supposition may possibly be incorrect.  It may be 
better to retain the grand mean calculated from (-1,0,+1), equal to your 
"?" minus 2 (let's call that ""), and generate coefficients from the 
unweighted quadratic coefficients as  1-, -2-, 1-.

I note in passing that your decision to pursue weighting-by-sample-size 
implies that you have decided to assign equal weight to individual cases 
and NOT to assign equal weight to each subgroup.  (Had you chosen equal 
weight for each subgroup, you'd use the "unweighted" (they're not really 
UNweighted, they're _equally_ weighted) coefficients directly.)  I've not 
encountered situations where it seemed necessary to give equal importance 
to each individual case, enough to make it worth the extra effort to 
weight the coefficients -- and think about what it means, that the "grand 
mean" for the data depends on which trend you're currently pursuing, and 
that the several trends (linear, quadratic, cubic, ...) are explicitly 
NOT orthogonal.
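
To make the bookkeeping concrete, a small numpy sketch of the recentring 
game (group sizes invented), which also exhibits the non-orthogonality 
just mentioned:

  import numpy as np

  n = np.array([10, 15, 25])       # hypothetical unequal group sizes

  def recenter(coef, n):
      # subtract the n-weighted mean, so that sum(n * result) == 0
      return coef - np.dot(n, coef) / n.sum()

  lin  = recenter(np.array([-1.0,  0.0, 1.0]), n)   # weighted linear coding
  quad = recenter(np.array([ 1.0, -2.0, 1.0]), n)   # same game, quadratic

  print(np.dot(n, lin), np.dot(n, quad))   # both ~0 by construction
  print(np.dot(n, lin * quad))             # nonzero:  trends NOT orthogonal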

 From there you can do things like determine
 the the weighted linear estimated parameters.  They are given in the spss
 oneway printout... as I understand it... i.e., the weighted (for sample
 size) beta for the linear contrast.

Notice that all this does is change the distance between each group mean 
and the grand mean;  it does not change the relative distances between 
groups, which are still equally spaced.  It has never been very clear to 
me what advantage one gets from the weighted parameters, especially as 
those estimates depend on the accident of how many observations you were 
able to find for each group.  For this reason (among others) I am 
inclined to favor equal weighting in general.  If it turns out that the 
choice of weighting influences the conclusion(s) to be drawn, one has 
compelling arguments for repeating the experiment, this time with a 
proper (equal-numbers-of-cases) design and carefully random selection of 
cases.

  snip  

 ...  My means are 2.05, 6.38, and 12.08 for the three groups 
 respectively. In other words.. what does one calculate and how?

You might reasonably try using the regression module (rather than one-way 
anova) to compare output:  predict Y from X1 (X1 = 1,2,3 for the three 
groups, and X1 = -1,0,+1 if you want to confirm that the results are the 
same for this coding);  and for an alternate (quadratic) model, predict 
 Y from  X1 and X2 (X1 = -1,0,+1;  X2 = +1,-2,+1).
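
If your package were Python rather than SPSS, the comparison might be 
sketched as below (only the three means come from your post;  the group 
sizes and error variance are invented):

  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(1)
  means, ns = [2.05, 6.38, 12.08], [12, 20, 15]    # ns are made up
  y, x1, x2 = [], [], []
  for m, n, c1, c2 in zip(means, ns, [-1, 0, 1], [1, -2, 1]):
      y.extend(m + rng.normal(0.0, 1.0, n))        # fake observations
      x1.extend([c1] * n)
      x2.extend([c2] * n)

  print(sm.OLS(y, sm.add_constant(np.array(x1))).fit().params)
  print(sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().params)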

You have not, by the way, said what you're doing this analysis FOR, so 
it's a bit difficult to know whether one is offering useful advice.  Or 
not. 
-- Don.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




Re: Margin Analysis Qstn

2001-01-30 Thread Donald Burrill

On Mon, 29 Jan 2001, Chris wrote in part:

 My current job requires me to analyze margins from the sales of various
 products and provide an average for each during the quarter. I am using a
 very large sample of all product sales by month. (Margin, i.e. not markup.
 For those not familiar, markup is what a business does to receive Margin.
 Margin is a measure of profitability.  A typical calculation for margin is,
 (Unit Resale Price - Unit Cost) / Unit Resale Price ).

  snip, sample data  
 Assuming a normal distribution, what method should I use to calculate 
 my averages?  Should I simply take the sample mean?  Should I remove 
 anomalies like 112% margins?  Should I calculate upper and lower 
 control limits and place my data into a normal curve?

As Rich Ulrich implied, why would you wish to assume a normal 
distribution? 

As to what kind of average to compute, what will you (or your superior, 
or client, as the case may be) _do_ with the average once you have it? 
If it is to be related to anything like total profit (or revenue?) during 
the period represented in the data, you'd about have to multiply by sales 
volume before averaging, for instance.  

As to the 112% margin, I take it that you don't have the underlying 
resale value or cost, else you could calculate the margin directly. 
Basically, you have two choices:  (1) discard the anomaly whenever you 
encounter one;  (2) guess what error in logic or arithmetic led to the 
anomalous value, and correct it (112% might not be so unreasonable if 
the denominator had been unit cost instead of unit resale price, for 
example).  Option (2) is _always_ chancy, but may be viable if you have 
something better than a guess to go on.
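
A two-line illustration (prices invented) of why margins above 100% are 
suspect:  with the definition quoted above, margin stays below 100% 
whenever cost is positive, while the same numbers expressed as 
markup-on-cost can easily exceed it.

  def margin(price, cost):
      return (price - cost) / price    # the definition quoted above

  def markup(price, cost):
      return (price - cost) / cost     # denominator is cost instead

  print(margin(10.0, 4.7))   # 0.53 -> a 53% margin
  print(markup(10.0, 4.7))   # about 1.13 -> a "113% margin" if mislabeled
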
-- DFB.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




Re: change scores

2001-01-28 Thread Donald Burrill

On Fri, 26 Jan 2001, Rich Ulrich quoted me:

 DB:  What most people who use "ordinal" and "disordinal" seem to mean 
  is a plot of the cell means (or of regression lines), with no 
  adjustment for main effects:  so, a display that includes the 
  interaction AND the main effects.  I take it that's what you mean 
  here.  
and replied:

 Yes.  Just like "most people,"  I use the definition that draws a 
 distinction, instead of the one that does not.  Why do you prefer the
 one that does not?

Mostly because that was the formal definition given in the textbooks I 
learned from, donkey's years ago...  and because I think it useful to be 
able to distinguish between main effects and interactions (an interaction 
being a systematic effect among cell means (in ANOVA) that is not 
accounted for by the main effects in the design;  a corresponding 
definition can be written, mutatis mutandis, for regression contexts).

 DB:  Then:  a disordinal display -- of what plot?  (As remarked in a 
  thread a year or two ago, an interaction (displayed as a plot of cell 
  means or of regression lines) may appear ordinal from one direction 
  and disordinal from the other.)
 
  - I remember someone claimed that.   (Oui, moi.  -- DB)
 I remember an example that failed to make the point.  I don't 
 remember a valid example, or that the point was generally accepted.
  - I hope this is not a failure of my memory.  But if it's my problem,
 I hope you will reproduce the illustration, or cite it somewhere.

As requested.  Consider the two-way table of cell means below:  

         B1   B2
    A1   10   20
    A2   40   30

40 -     1         40 -  2
   -                  -
30 -     2         30 -     2
   -                  -
20 -  2            20 -     1
   -                  -
10 -  1            10 -  1
   -                  -
 0 ---+--+---       0 ---+--+---
      A1 A2              B1 B2

Plotting Y-bar vs. A, we have the left-hand diagram (plotting symbols 
are levels of B);  plotting Y-bar vs. B, we have the right-hand diagram 
above (symbols are levels of A).  The left-hand plot is disordinal 
 (B2 > B1 at A1, but B1 > B2 at A2), the right-hand plot is ordinal 
 (A1 < A2 at both levels of B).
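
(The same pair of pictures in matplotlib, for anyone who prefers pixels 
to ASCII;  the four cell means are those tabled above:)

  import matplotlib.pyplot as plt

  means = {('A1','B1'): 10, ('A1','B2'): 20, ('A2','B1'): 40, ('A2','B2'): 30}
  fig, (left, right) = plt.subplots(1, 2, sharey=True)
  for b in ('B1', 'B2'):   # Y-bar vs A:  the lines cross (disordinal)
      left.plot(['A1', 'A2'], [means[('A1', b)], means[('A2', b)]],
                'o-', label=b)
  for a in ('A1', 'A2'):   # Y-bar vs B:  no crossing (ordinal)
      right.plot(['B1', 'B2'], [means[(a, 'B1')], means[(a, 'B2')]],
                 'o-', label=a)
  left.legend();  right.legend();  plt.show()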

Rich continues:
 The only effect that is never potentially artifactual is the crossover
 of the means, the Disordinal interaction (as most of us define it).  

I take it you must mean, whenever an interaction plot is disordinal in 
any orientation?  I have some mild difficulty with this, since AFAIK the 
idea of "(dis)ordinal" has not been extended beyond two-way interactions, 
and more complex situations may well be of interest...

 That one that can't be explained as measurement error (such as, 
 strong regression owing to poor reliability);  scaling (such as,
 ceiling effects);  or "regression" towards the conditional expected
 values (such as, the real-life example I just cited).

 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128




Re: A-D in matlab

2001-01-27 Thread Donald Burrill

On Sun, 28 Jan 2001, Veeral Patel wrote in part:

 Out of curiosity I decided to write a small program to perform the A-D test in
 matlab for the Gumbel distribution.  Obtaining the Gumbel parameters is easy;
 however, the difficulty is in the actual A-D computation formula as stated by
 Stephens (1977):
 A2 = -[ Sum_{i=1..n} (2i-1) { log z_i + log(1 - z_{n+1-i}) } ]/n - n
 
 What's the point of /n and -n, since they both cancel out anyway?

Uh-huh.  Perhaps you need to consider the difference between
A2 = - [long expression]/n - n  
and 
A2 = - [long expression]/(n - n).

The latter of course is infinitely large.

In your program, you appear not to have divided by  n,  which surely 
means that your value of  A2  will be  n  times too large, even if you 
have correctly computed  [long expression].
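
For reference, a small numpy sketch of the statistic as written above 
(assuming the Gumbel location mu and scale beta have already been 
estimated):

  import numpy as np

  def ad_gumbel(x, mu, beta):
      x = np.sort(np.asarray(x, dtype=float))
      n = len(x)
      z = np.exp(-np.exp(-(x - mu) / beta))    # Gumbel (maximum) CDF at each x
      i = np.arange(1, n + 1)
      s = np.sum((2*i - 1) * (np.log(z) + np.log(1 - z[::-1])))
      return -s/n - n                          # divide by n first, THEN subtract n
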
-- DFB.
 --
 Donald F. Burrill[EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 (603) 535-2597
 Department of Mathematics, Boston University[EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110  (603) 471-7128


