Re: about a problem of khi2 test

2001-07-03 Thread Rich Ulrich

On Sun, 01 Jul 2001 14:19:31 +0200, Bruno Facon
[EMAIL PROTECTED] wrote:
 I work in the area of intelligence differentiation. I would like to know
 how to use the khi2 statistic to determine whether the number of
 statistically different correlations between two groups is due or not to
 random variations. In particular I would like to know how to determine
 the expected numbers of statistically different correlations due to
 “chance”.
 Let me take an example. Suppose I compare two correlations matrices of
 45 coefficients obtained from two independent groups (A and B). If there
 is no true difference between the two matrices, the number of
 statistically different correlations should be equal to 1.25 in favor of

Yes, that is the number.   But there is not a legitimate test that I
know of, unless you are willing to make a strong assumption that 
no pair of the variables should be correlated.
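
For concreteness, here is a minimal sketch -- my own illustration, not
part of Bruno's posting -- of the expected counts under that strong
independence assumption, using Python/scipy:

# Expected number of "significant" differences among 45 comparisons
# at alpha = .05, assuming the comparisons are independent (the
# strong assumption noted above): 45 * .05 = 2.25 significant,
# 42.75 nonsignificant.  binom.sf gives the chance of seeing k or
# more "significant" differences when every null is true.
from scipy.stats import binom

n_pairs, alpha = 45, 0.05
print("expected significant by chance:", n_pairs * alpha)
for k in (3, 5, 7):
    print(k, binom.sf(k - 1, n_pairs, alpha))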

I never heard of the khi2 statistic before this.  I searched with
google, and found a respectable number of references, and here
is something that I had not seen with a statistic:  khi2 appears to be
solely French in its use.  Of the first 50 hits, most were in French,
at French ISPs (.fr).  The few that were in English were also from
French sources.  

One article had a reference (not available in my local libraries):
Freilich MH and Chelton DB, J Phys Oceanogr  16, 741-757. 


 
 group A and equal to 1.25 in favor of group B (in case of  alpha = .05).
 
 Consequently, the expected number of nonsignificant differences should
 be 42.75. Is my reasoning correct?

It would be nice to test the numbers, but I don't credit that reference
as a good one, yet.  

I don't remember for sure, but I think you might be able to compare
two correlation matrices with programs from Jim Steiger's site,

http://www.interchg.ubc.ca/steiger/multi.htm

On the other hand, you would be better off if you can compare 
the entire covariance structures, to keep from making accidental
assumptions about variances.  (Does Jim provide for that?)

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: cigs figs

2001-07-03 Thread Rich Ulrich

 - in respect of the up-coming U.S. holiday -

On Mon, 25 Jun 2001 11:49:47 GMT, mackeral@remove~this~first~yahoo.com
(J. Williams) wrote:

 On Sun, 24 Jun 2001 16:37:48 -0400, Rich Ulrich [EMAIL PROTECTED]
 wrote:
 
 
 What rights are denied to smokers?  
JW  
 Many smokers, including my late mother, feel being unable to smoke on
 a commerical aircraft, sit anywhere in a restaurant, etc. were
 violation of her rights.  I don't agree as a non-smoker, but that
 was her viewpoint until the day she died.

What's your point:  She was a crabby old lady, whining (or
whinging) about fancied 'rights'?  

You don't introduce anything that seems inalienable  or 
self-evident (if I may introduce July-4th language).
Nobody stopped her from smoking as long as she kept it away
from other people-who-would-be-offended.

Okay, we form governments to help assure each other of rights.   
Lately, the law sees fit to stop some assaults from happening, 
even though it did not always do that in the past.  The offender
still has quite a bit of leeway: if you don't cause fatal diseases,
you legally can offend quite a lot.  We finally have laws about
smoking.

But she wants the law to stop at HER convenience?

[ snip, various ]
JW  
 Talking about confused and/or politically driven,  what do Scalia and
 Thomas have to do with smoking rights?   Please cite the case law.

I mention 'rights' because that did seem to be an attitude you
mentioned that was (as you see) provocative to me.

I toss in S & T, because I think that, to a large extent, they
share your mother's preference for a casual, self-centered
definition of rights.  And they are Supreme Court justices.
[ Well, they don't say, "This is what *I* want"; these two
translate the blame/credit to Nature (euphemism for God).]

So: I don't fault your mother *too*  harshly, when Justices
hardly do better.  Even though a prolonged skew was needed,
to end up with two like this.


-- 
Rich Ulrich, [EMAIL PROTECTED]


http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Maximum Likelihood

2001-06-29 Thread Rich Ulrich

On 28 Jun 2001 20:39:18 -0700, [EMAIL PROTECTED] (Mark W. Humphries)
wrote:

 Hi,
 
 Does anyone have references to a simple/intuitive introduction to Maximum
 Log Likelihood methods.
 References to algorithms would also be appreciated.
 

Look on the Internet.

I used www.google.com to search on
"maximum likelihood tutorial"
(put the phrase in quotes to keep it together;
or you can use Advanced search)

There were MANY hits, and the second reference 
was in a tutorial that begins at
http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_2.html


The third reference was for some programs and examples in Gauss
(a programming language) by Gary King at Harvard, in his application
area.  If these aren't worthwhile (I did not try to download
anything),  there are plenty of other sites to check.

[ I am intrigued by G. King, a little.  This is the fellow who
putatively has a method, not Heckman's, for overcoming or
compensating for aggregation bias.  Which I never found available
for free.  But, too bad, the page says these programs go with 
his 1989 book, and I think his Method is more recent.]

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: cigs figs

2001-06-24 Thread Rich Ulrich

  - re: some outstandingly confused thinking.  Or writing.

On Sat, 23 Jun 2001 15:25:31 GMT, mackeral@remove~this~first~yahoo.com
(J. Williams) wrote:

[ snip;  Slate reference, etcetera ]
   ... My mother was 91 years
 old when she died  a year ago and chain smoked since her college days.
 She defended the tobacco companies for years saying, it didn't hurt
 me.  She outlived most of her doctors.   Upon quoting statistics and
 research on the subject, her view was that I, like other do gooders
 and non-smokers, wanted to deny smokers their rights.  

What statistics would her view quote?  to show that someone
wants to deny smokers 'their rights'?
[ Hey, I didn't write the sentence ]

I just love it, how a 'natural right'  works out to be *exactly*
what the speaker wants to do.  And not a whit more.
(Thomas and Scalia are probably going to give us tons 
of that bad philosophy, over the next decades.)

What rights are denied to smokers?  You know, you can't 
build your outhouse right on the riverbank, either.

Obviously,
 there is a health connection.  How strong that connection is, is what
 makes this a unique statistical conundrum.

How strong is that connection?  Well, quite strong.

I once considered that it might not be so bad to die 9 years
early, owing to smoking, if that cut off years of bad health 
and suffering.  Then I realized, the smoking grants you 
most of the bad health of old age, EARLY.  (You do miss 
the Alzheimer's.)  One day, I might give up smoking my pipe.

What is the statistical conundrum?  I can almost 
imagine an ethical conundrum.  (How strongly can
we legislate, to encourage cyclists to wear helmets?)
I sure don't spot a statistical conundrum.

Is this word intended?  If so, how so?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Marijuana

2001-06-22 Thread Rich Ulrich

On Fri, 22 Jun 2001 18:45:52 GMT, Steve Leibel [EMAIL PROTECTED]
wrote:

 In article [EMAIL PROTECTED],
  [EMAIL PROTECTED] (Eamon) wrote:
 
  (c) Reduced motor co-ordination, e.g. when driving a car
  
 
 Numerous studies have shown that marijuana actually improves driving 
 ability.  It makes people more attentive and less aggressive.  You could 
 look it up.

An intoxicant does *that*?  

I think I recall in the literature, that people getting 
stoned, on whatever, occasionally  *think*  that 
their reaction time or sense of humor or other 
performance is getting better.   

Improving your driving by getting mildly stoned 
(omitting the episodes of hallucinating)
seems unlikely enough, to me, 
that  *I*  think the burden of proof is on the stranger named Steve.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: a form of censoring I have not met before

2001-06-21 Thread Rich Ulrich

On 21 Jun 2001 00:35:11 -0700, [EMAIL PROTECTED] (Margaret
Mackisack) wrote:

 I was wondering if anyone could direct me to a reference about the 
 following situation. In a 3-factor experiment, measurements of a continuous 
 variable, which is increasing monotonically over time, are made every 2 
 hours from 0 to 192 hours on the experimental units (this is an engineering 
 experiment). If the response exceeds a set maximum level the unit is not 
 observed any more (so we only know that the response is > that level). If 
 the measuring equipment could do so it would be preferred to observe all 
 units for the full 192 hours. The time to censoring is of no interest as 
 such, the aim is to estimate the form of the response for each unit which 
 is the trace of some curve that we observe every 2 hours. Ignoring the 
 censored traces in the time period after they are censored puts a huge 

Well, it certainly *sounds*  as if the time to censoring should be 
of great interest, if you had an adequate model.

Thus, when you say that ignoring them gives  a huge 
downward bias,  it sounds to me as if you are admitting that 
you do not have an acceptable model.

Who can you blame for that?  What leverage do you 
have, if you try to toss out those bad results?  (Surely, 
you do have some ideas about forming estimates
that *do*  take the hours into account.  The problem
belongs in the hands of someone who does.)

 - maybe you want to segregate trials into the ones
with 192 hours, or less than 192 hours; and figure two 
(Maximum Likelihood) estimates for the parameters, which
you then combine.

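A rough sketch of what taking the censoring into account in the
likelihood might look like, for a single unit's trace, assuming a
simple linear-in-time model with normal errors and a known ceiling C.
All of those assumptions are mine, for illustration only; they are not
Margaret's model.

# Right-censored ML fit of y = a + b*t for one unit's trace:
# readings at or above the ceiling C contribute P(response > C)
# instead of being discarded.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_loglik(params, t, y, C):
    a, b, log_s = params
    s = np.exp(log_s)
    mu = a + b * t
    obs = y < C                              # fully observed readings
    ll = norm.logpdf(y[obs], mu[obs], s).sum()
    ll += norm.logsf(C, mu[~obs], s).sum()   # censored: P(y > C)
    return -ll

t = np.arange(0, 194, 2.0)                   # every 2 hours, 0 to 192
# Placeholder data; the real y and C come from the experiment.
y = 0.5 + 0.02 * t + np.random.default_rng(0).normal(0, 0.3, t.size)
C = 3.0
y = np.minimum(y, C)                         # what gets recorded
fit = minimize(neg_loglik, x0=[0.0, 0.01, 0.0], args=(t, y, C))
print(fit.x)                                 # a, b, log(sigma)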

 downward bias into the results and is clearly not the thing to do although 
 that's what has been done in the past with these experiments. Any 
 suggestions of where people have addressed data of this or related form 
 would be very gratefully received.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Help me, please!

2001-06-19 Thread Rich Ulrich

On 18 Jun 2001 01:18:37 -0700, [EMAIL PROTECTED] (Monica De Stefani)
wrote:

 1) Are there some conditions which I can apply normality to Kendall
 tau?

tau is *lumpy*  in its distribution for N less than 10.

And all rank-order statistics are a bit problematic when 
you try to use them on rating scales with just a few discrete
scores -- the tied values give you bad scaling intervals, 
and the estimate of variance won't be very good, either.

For correlations, your assumption of 'normality' is usually
applied to the values at zero.
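
A quick illustration (my own toy data) of both points, using scipy's
kendalltau -- the exact and normal-approximation p-values can differ
noticeably at small N, and a coarse scale full of ties is handled with
the asymptotic variance only:

# Exact vs. asymptotic p-values for Kendall's tau at small N.
from scipy.stats import kendalltau

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]
print(kendalltau(x, y, method="exact"))       # exact permutation p
print(kendalltau(x, y, method="asymptotic"))  # normal approximation

# A coarse rating scale: many ties degrade the variance estimate.
x2 = [1, 1, 2, 2, 2, 3, 3, 3]
y2 = [1, 2, 1, 2, 3, 2, 3, 3]
print(kendalltau(x2, y2))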

 I was wondering if x's observations must be
 independent and y's observations must be independent to apply
 asymptotically normal limiting
 distribution. 
 (null hypothesis = x and y are independent).
 Could you tell me something about?

 - Independence is needed for just about any test.

I started to say (as a minor piece of exaggeration) that 
independence is needed absolutely;  
but the correct statement, I think, is that independence
is always demanded  relative to the error term.

[ snip, non-linear?]

Monotonic is the term.


[ snip, T(z):  I don't know what that is.]

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Probability Of an Unknown Event

2001-06-18 Thread Rich Ulrich

On Sat, 16 Jun 2001 23:05:52 GMT, W. D. Allen Sr.
[EMAIL PROTECTED] wrote:

 It's been years since I was in school so I do not remember if I have the
 following statement correct.
 
 Pascal said that if we know absolutely nothing
 about the probability of occurrence of an event
 then our best estimate for the probability of
 occurrence of that event is one half.
 
 Do I have it correctly? Any guidance on a source reference would be greatly
 appreciated!

I did a little bit of Web searching and could not find that.

Here is an essay about Bayes, which (dis)credits him and his
contemporaries as assuming something like that, years before Laplace.

I found it with a google search on 
 "know absolutely nothing" probability.

 http://web.onetel.net.uk/~wstanners/bayes.htm

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: meta-analysis

2001-06-17 Thread Rich Ulrich

On 17 Jun 2001 04:34:26 -0700, [EMAIL PROTECTED] (Marc)
wrote:

 I have to summarize the results of some clinical trials.
 Unfortunately the reported information is not complete.
 The information given in the trials contain:
 
 (1) Mean effect in the treatment group (days of hospitalization)
 
 (2) Mean effect in the control group (days of hospitalization)
 
 (3) Numbers of patients in the control and treatment group
 
 (4) p-values of a t-test (between the differences of treatment
 and control)
 My question:
 How can I calculate the variance of treatment difference which I need
 to perform meta-analysis? Note that the numbers of patients in the

Aren't you going too far?  You said you have to summarize.
Well, summarize.  The difference is in terms of days.  
Or it is in terms of percentage of increase.

And you have the t-test and p-values.  

You might be right in what you propose, but I think
you are much more likely to produce a useful report 
if you keep it simple.

You are right; meta-analyses are complex.  And a 
majority of the published ones are (in my opinion) awful.
--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Marijuana

2001-06-17 Thread Rich Ulrich

On 15 Jun 2001 02:04:36 -0700, [EMAIL PROTECTED] (Eamon) wrote:

[ snip, Paul Jones.  About marijuana statistics.]

 
 Surely this whole research is based upon a false premise. Isn't it
 like saying that 90%, say, of heroin users previously used soft drugs.
 Therefore, soft-drug use usually leads to hard-drug use - which does
  not logically follow. (A => B  =/=>  B => A)
 
 Conclusions drawn from the set of people who have had heart attacks
 cannot be validly applied to the set of people who smoke dope.
 Rather than collect data from a large number of people who had heart
 attacks and look for a backward link, they should monitor a large
 number of people who smoke dope. But, of course this is much more
 expensive.

It is much more expensive, but it is also totally stupid to carry out
the expensive research if the *cheap* and lousy research didn't
give you a hint that there might be something going on.

The numbers that he was asking about do pass the simple
test.  I mean, there were not 1 million people contributing one
hour each, but we should still ask, *Would*  this say something?
If it would not, then the whole question is *totally*  arid.  The 2x2
table is approximately
(dividing the first column by 100; and subtracting from a total):
10687   and  124
   175   and  9

That gives a contingency test of 21.2 or 18.2, with p-values 
under .001.  The Odds Ratio on that is 4.4.
That is pretty convincing that there is SOMETHING
going on, POSSIBLY something that merits an explanation.  
The expectation for the cell with 9  is just 2.2 -- the tiny cell is
the cell that matters for contributions to the test -- which is why it
is okay to lop the hundreds  off the first column (to make it
readable).
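
For anyone who wants to check that arithmetic, a small sketch using
scipy on the table as printed above (the code is my addition; the
counts are the ones in the table):

# Chi-squared with and without the Yates correction, plus the odds
# ratio and the expectation for the small cell, for the 2x2 above.
from scipy.stats import chi2_contingency

table = [[10687, 124],
         [  175,   9]]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(chi2, p)               # about 21.2, p well under .001
chi2c, pc, _, _ = chi2_contingency(table, correction=True)
print(chi2c, pc)             # about 18.2 with the Yates correction
print(expected[1][1])        # about 2.2, the small cell's expectation

(a, b), (c, d) = table
print((a * d) / (b * c))     # odds ratio, about 4.4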

Now, you may return to your discussion of why the table is
not any good, and what is needed for a proper test.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: individual item analysis

2001-06-17 Thread Rich Ulrich

On 15 Jun 2001 14:24:39 -0700, [EMAIL PROTECTED] (Doug
Sawyer) wrote:

 I am trying to locate a journal article or textbook that addresses
 whether or not exam quesitons can be normalized, when the questions are
 grouped differently.  For example, could a question bank be developed
 where any subset of questions could be selected, and the assembled exam
 is normalized?
 
 What is name of this area of statistics?  What authors or keywords would
 I use for such a search?  Do you know whether or not this can be done?


I believe that they do this sort of thing in scholastic achievement
tests, as a matter of course.  Isn't that how they make the transition
from year to year?  I guess this would be norming.

A few weeks ago, I discovered that there is a whole series of
tech-reports put out by one of the big test companies.  I would 
look back to it, for this sort of question.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: multivariate techniques for large datasets

2001-06-14 Thread Rich Ulrich

On 13 Jun 2001 20:32:51 -0700, [EMAIL PROTECTED] (Tracey
Continelli) wrote:

 Sidney Thomas [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]...
  srinivas wrote:
   
   Hi,
   
 I have a problem in identifying the right multivariate tools to
   handle a dataset of dimension 1,00,000*500. The problem is still
   complicated with a lot of missing data. Can anyone suggest a way out to
   reduce the data set and also to estimate the missing values? I need to
   know which clustering tool is appropriate for grouping the
   observations( based on 500 variables ).
 
 One of the best ways in which to handle missing data is to impute the
 mean for other cases with the selfsame value.  If I'm doing
 psychological research and I am missing some values on my depression
 scale for certain individuals, I can look at their, say, locus of
 control reported and impute the mean value.  Let's say [common
 finding] that I find a pattern - individuals with a high locus of
 control report low levels of depression, and I have a scale ranging
 from 1-100 listing locus of control.  If I have a missing value for
 depression at level 75 for one case, I can take the mean depression
 level for all individuals at level 75 of locus of control and impute
 that for all missing cases in which 75 is the listed locus of control
 value.  I'm not sure why you'd want to reduce the size of the data
 set, since for the most part the larger the N the better.
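
[For concreteness, a minimal pandas sketch of the group-mean imputation
described above -- my own toy example, with invented column names;
whether the idea is wise is a separate question, taken up below.]

# Fill missing depression scores with the mean depression of cases
# having the same locus-of-control value.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "locus":      [75, 75, 75, 40, 40, 40],
    "depression": [30, 34, np.nan, 60, np.nan, 58],
})

group_means = df.groupby("locus")["depression"].transform("mean")
df["depression"] = df["depression"].fillna(group_means)
print(df)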

Do you draw numeric limits for a variable, and for a person?
Do you make sure, first, that there is not a pattern?

That is -- Do you do something different depending on
how many are missing?  Say, estimate the value, if it is an
oversight in filling blanks on a form, BUT drop a variable if 
more than 5% of responses are unexpectedly missing, since 
(obviously) there was something wrong in the conception of it, 
or the collection of it.  Psychological research (possibly) 
expects fewer missing than market research.

As to the N -  As I suggested before - my computer takes 
more time to read  50 megabytes than one megabyte.  But
a psychologist should understand that it is easier to look at
and grasp and balance raw numbers that are only two or 
three digits, compared to 5 and 6.

A COMMENT ABOUT HUGE DATA-BASES.

And as a statistician, I keep noticing that HUGE databases
tend to consist of aggregations.  And these are random
samples only in the sense that they are uncontrolled, and 
their structure is apt to be ignored.

If you start to sample, you are more likely to ask yourself about 
the structure - by time, geography, what-have-you.  

An N of millions gives you tests that are wrong; estimates 
ignoring relevant structure have a spurious report of precision.
To put it another way: the Error  (or real variation) that *exists*
between a fixed number of units (years, or cities, for what I
mentioned above) is something that you want to generalize across.  
With a small N, that error term is (we assume?) small enough to 
ignore.  However, that error term will not decrease with N, 
so with a large N, it will eventually dominate.  The test 
based on N becomes increasingly irrelevant.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: About kendall

2001-06-12 Thread Rich Ulrich

On 12 Jun 2001 08:43:53 -0700, [EMAIL PROTECTED] (Monica De Stefani)
wrote:

 When I aplly Kendall tau or Kendall's partial tau to a time series do
 I have to calcolate ranks or not?
 In fact a time series has a natural temporal order.

 ... but you are not partialing out time.  Surely.

Your program that does the Kendall tau must do some
ranking, as part of the algorithm.  Why do you think you 
might have to calculate ranks?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Diagnosing and addressing collinearity in Survival Analysis

2001-06-11 Thread Rich Ulrich

On 06 Jun 2001 06:46:55 GMT, [EMAIL PROTECTED] (ELANMEL) wrote:

 Any assistance would be appreciated:  I am attempting to run some survival
 analyses using Stata STCOX, and am getting messages that certain variables are
 collinear and have been dropped.  Unfortunately, these variables are the ones I
 am testing in my analysis!  
 
If there are 3 groups (classes), then you can have only
two dummy variables to refer to their degrees of freedom.
You can code those in the most convenient and 
informative way.
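
A tiny sketch of what that looks like in practice (made-up group
labels; pandas drops the reference level with drop_first):

# Three groups need only two dummy variables; the third is implied.
import pandas as pd

groups = pd.Series(["A", "B", "C", "A", "C"], name="group")
print(pd.get_dummies(groups, drop_first=True))
#    group_B  group_C      (group A is the reference category)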

If your problem arises otherwise, then you have a 
fundamental problem in the logic of what is being tested.
Google shows some examples of problems when I search
for "statistical confounding" (use the quotes for the search).
And "confounded designs" seems to turn up discussions.


 I would appreciate any information or recommendations on how best to diagnose
 and explore solutions to this problem.  

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: please help

2001-06-11 Thread Rich Ulrich

On 10 Jun 2001 07:27:55 -0700, [EMAIL PROTECTED] (Kelly) wrote:

 I have the gage repeatability & reproducibility (gage R&R) analysis
 done on two instruments; what hypothesis test can I use to test that the
 repeatability variances (expected sigma values of repeatability) of the
 two instruments are significantly different from each other, or to say
 one has a lower variance than the other.
 Any insight will be greatly appreciated.
 Thanks in advance for your help.

I am not completely sure I understand, but I will make a guess.

There is hardly any power for comparing two ANOVAs that are
done on different samples, until you make strong assumptions 
about samples being equivalent, in various regards.  

If ANOVAs are on the same sample,
then a CHOW test can be used on the improved prediction
if one hypothesis consists of an extra d.f.  of prediction.
If ANOVAs are on separate samples, I wonder if you could
compare the residual variances, by the simple variance 
ratio F-test -- well, you could do it, but I don't know what arguments
should be raised against it, for your particular case.

There are criteria resembling the CHOW test that are used less
formally, for incommensurate ANOVAs (not the same predictors)
 - AKAIKE and others.

If your measures are done on the same (exact) items, you 
might have a paired test.  Instrument A gets closer values on 
how many of the measurements that are done.

Finally, if you can do a bunch of separate experiments, you
can test whether  A  or B  does better in more than half of them.
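
For what it's worth, a bare-bones sketch of that simple variance-ratio
F test, assuming the two repeatability variances come from independent
samples with known degrees of freedom (the numbers are invented, and
the caveats above still apply):

# Variance-ratio F test for two repeatability variances.
from scipy.stats import f

s2_a, df_a = 0.040, 24
s2_b, df_b = 0.025, 24

F = s2_a / s2_b
p_one_sided = f.sf(F, df_a, df_b)
p_two_sided = 2 * min(p_one_sided, 1 - p_one_sided)
print(F, p_one_sided, p_two_sided)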

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Need Good book on foundations of statistics

2001-06-04 Thread Rich Ulrich

On 1 Jun 2001 19:07:31 GMT, [EMAIL PROTECTED] wrote:

 
 Can anyone refer me to a good book on the foundations of statistics?

Stigler's The History of Statistics  is the most widely read of
recent popular histories.  It covers pre-1900.  His newer
book is Statistics on the Table  and I enjoyed that one, too.
It includes the founding of *modern*  statistics in, say, the 1930s,
in addition to much older anecdotes.


 I want to know of the limitations, assumptions, and philosophy
 behind statistics. A discussion of how the quantum world may have
 different laws of statistics might be a plus.

That last sentence makes me think that you don't know any
answers to the sentence just previous to it.
 ... have different laws  is certainly not the way statisticians
would put it.  Leptons *obey*  different laws than baryons do
(I think), but the laws are descriptions that were imagined by 
human beings.  

I suppose one way to describe the dilemma of physics might
be, It is trying to force all of these particles into fitting
descriptions that are less than ideal (or, so it keeps working out).


I think it is curious and interesting that the physicists at the 
highest levels of abstraction -- cosmology; and high-energy
particles/relativity -- are beginning to use fairly ordinary 
'statistical tests' to judge whether they have anything.  
IS there oscillation in the measured background of stars, near 4
degrees kelvin, across the whole universe?
IF they continued CERN for another 18 months, would there have
been another dozen or so  *apparent*  particles of the right type, 
so they could conclude that the number observed was 'significant'  
at the one-in-a-million level, instead of just one-in-two-hundred?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: fit-ness

2001-06-03 Thread Rich Ulrich

On Thu, 31 May 2001 12:05:24 +0100, Alexis Gatt [EMAIL PROTECTED]
wrote:

 Hi,
 
 a basic question from a MSc student in England. First of all, yeah I read
 the FAQ and I didnt find anything answering my question, which is fairly
 simple: I am trying to analyse how well several mathematical methods perform
 to model a scanner. So I have, for every input datum, the corresponding
 output given by the scanner and the values given by the mathematical models
 I am using.
 First, given the distribution of the errors, I can use the usual mean-StdDev

I can think of two or 3 meanings of 'scanner'  and not a one of 
them would have a simple, indisputable measure of 'error.'
 1) Some measures would be biased toward one  'method'  
or another, so a winner would be obvious.
 2) Some samples to be tested would be biased (similarly)
toward a winner by one method or another.  So you select
your winner by selecting your mix of samples.

If you have fine measures, then you can give histograms of your
results (assuming  1-dimensional, as your alternatives suggest).

Is it enough to have the picture?
What would your audience demand?  What is your need?


 if the distro is normal, or median-95th percentile otherwise. Any other
 known methods to enhance the pertinence of the analysis? Any ideas welcome.

Average squared error (giving SD) is popular.  
Average absolute error de-emphasizes the extremes.
Count of errors beyond a critical limit sometimes fills a need.

A more complicated way is to build in a cost function.
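
A short sketch of those summary measures, on made-up errors
(errors = model output minus scanner output):

# Average squared error (as RMS), average absolute error, and a
# count of errors beyond a critical limit.
import numpy as np

errors = np.array([0.1, -0.3, 0.05, 0.7, -0.2, 0.15, -0.6, 0.02])
limit = 0.5

print("RMS error:", np.sqrt(np.mean(errors ** 2)))
print("mean absolute error:", np.mean(np.abs(errors)))
print("count beyond limit:", np.sum(np.abs(errors) > limit))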

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: ONLY ONE

2001-05-29 Thread Rich Ulrich

FYI  - that piece of HTML code is a SPAM advertisement, 
which does seem to evoke other Web addresses.

On 27 May 2001 18:51:32 -0700, [EMAIL PROTECTED]
([EMAIL PROTECTED]) wrote:

 
 HTML
 SCRIPT LANGUAGE=JavaScript
 window.location=http://www.moodysoft.com;
 /SCRIPT
 BODY
 FONT FACE=Verdana SIZE=1
 B
 Best screen capture on earth and in cyberspace.BRIn fact the only one.BRAnything 
else is just a long learning process.BRBR
 FONT COLOR=redSPX® v2.0/FONTBREverytime you need to select a portion of 
screen, hold right-click longer than usual until the cursor turns into the cross 
graphical cursor.Make your selection and as soon as you release the mouse, SPX® will 
send it to the destination of your choice: Clipboard, File, Mail, Printer/FaxBRBR
 Very useful, no?BRA HREF=http://www.moodysoft.com;www.moodysoft.com/A
 /B
 /FONT
 /BODY
 /HTML
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Standardized testing in schools

2001-05-25 Thread Rich Ulrich

On Thu, 24 May 2001 23:25:42 GMT, W. D. Allen Sr.
[EMAIL PROTECTED] wrote:

 And this proved to me , once again,
 why nuclear power plants are too hazardous to trust:...
 
 Maybe you better rush to tell the Navy how risky nuclear power plants are!
 They have only been operating nuclear power plants for almost half a century
 with NO, I repeat NO failures that has ever resulted in any radiation
 poisoning or the death of any ship's crew. In fact the most extensive use of
 Navy nuclear power plants has been under the most constrained possible
 conditions, and that is aboard submarines!
 
 Beware of our imaginary boogy bears!!

As I construct an appropriate sampling frame, one out of two 
nuclear navies has a good long-term record.  Admiral Rickover 
had a fine success.  The other navy was not so lucky, or suffered 
because it was more pressed for resources.

 
 You are right though. There is nothing really hazardous about the operation
 of nuclear power plants. The real problem has been civilian management's
 ignorance or laziness!
 [...]

I'm glad you see the problem - though I see it more as 'ordinary 
management'  than ignorance or laziness.  It might not even have
to be 'poor'  management by conventional terms; the conventions
don't take into account extraordinarily dangerous materials.  The
Japanese power plant's  nuke-fluke of last year was an illustration of
employee inventiveness and  'shop-floor innovation'.  Unfortunately
for them, they 'solved a problem'  that had been a (too-) cleverly
designed safety precaution.  

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: The False Placebo Effect

2001-05-25 Thread Rich Ulrich

On 24 May 2001 21:39:17 -0700, [EMAIL PROTECTED] (David Heiser) wrote:

 
 Be careful on your assumptions in your models and studies!
 ---
 
 Placebo Effect An Illusion, Study Says
 By Gina Kolata
 New York Times
 (Published in the Sacramento Bee, Thursday, May 24, 2001)
 
 In a new report that is being met with a mixture of astonishment and some
 disbelief, two Danish researchers say that the placebo effect is a myth.

Do you think they will not believe in voudon/ voodoo, either?
 
 The investigators analyzed 114 published studies involving about 7,500
 patients with 40 different conditions. They found no support for the common
 notion that, in general, about one-third of patients will improve if they
 are given a dummy pill and told it is real.
 [ ... ]
The story goes on.  The authors look at studies where the placebo
effect is probably explained by regression-to-the-mean.  
 - I was a bit surprised by the newspaper coverage.   I tend to 
forget that most people, including scientists, do *not*  blame
regression-to-the-mean, as the FIRST suspicious cause 
whenever there is a pre-post design:  because they have 
scarcely heard of it.

On the other hand, I have expected for a long time that 
the best that a light-weight placebo will do is a 
light-weight improvement.  



 ... 
 The researchers said they saw a slight effect of placebos on subjective
 outcomes reported by patients, like their descriptions of how much pain they
 experienced. But Hrobjartsson said he questioned that effect. "It could be a
 true effect, but it also could be a reporting bias," he said. "The patient
 wants to please the investigator and tells the investigator, 'I feel
 slightly better.'"

Pain  is a hugely subjective report.   It is notorious.
I would not want to do a summary across the papers of 
the whole field of pain-researchers, since -- based on
difficulty, and not on knowing those researchers -- I expect 
an enormous amount of bad research in that area.
 - I don't know if the researchers are quite unwise here, or
if they only seem that way because of bad news reporting.
 - Oh, I did read a meta-analysis a while ago, that one from
Steve Simon.  It was based on pain research (and, basically,
only relevant to pain research), and the authors insisted 
that the vast majority of studies were not very good.

About the studies these authors found, using 3 groups:

 They found 114, published between 1946 and 1998. When they analyzed the
 data, they could detect no effects of placebos on objective measurements,
 like cholesterol levels or blood pressure.

 - That is interesting.  114 is a big enough number.  Controlled
medical research, however, seemed to undergo big changes
across those decades.  I expect that double-blind and triple-blind
studies did not get much use until halfway through that interval.

If someone does look into the original publication, and will
tell us about it -- 
I am interested, especially, in what the authors say about 
pain studies, and what they say about time trends.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Standardized testing in schools

2001-05-24 Thread Rich Ulrich

Standardized tests and their problems?  Here was a 
problem with equating the scores between years.

The NY Times had a long front-page article on Monday, May 21:
"When a test fails the schools, careers and reputations suffer."
It was about a minor screw-up in standardizing, in 1999.  Or, since
the company stonewalled and refused to admit any problems,
and took a long time to find the problems, it sounds like it 
became a moderately *bad*  screw-up.

The article about CTB/McGraw-Hill starts on page 1, and covers
most of two pages on the inside of the first section.  It seems 
highly relevant to the 'testing' that the Bush administration 
advocates, to substitute for having an education policy.

CTB/McGraw-Hill  runs the tests for a number of states, so they
are one of the major players.  And this proved to me, once again,
why nuclear power plants are too hazardous to trust:  we can't
yet trust Managements to spot problems, or to react to credible problem
reports in a responsible way.

In this example, there was one researcher from Tennessee who
had strong longitudinal data to back up his protest to the company;
the company arbitrarily (it sounds like) fiddled with *his*  scores, 
to satisfy that complaint, without ever facing up to the fact that 
they did have a real problem.  Other people, they just talked down.

The company did not necessarily lose much business from the 
episode because, as someone was quoted, "all the companies
who sell these tests have histories of making mistakes."
(But, do they have the same history of responding so badly?)

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Variance in z test comparing purcenteges

2001-05-23 Thread Rich Ulrich

 - BUT, Robert, 
the equal N case is different from cases with unequal N -
 - or did I lose track of what the topic really is... -

On 22 May 2001 06:52:27 -0700, [EMAIL PROTECTED] (Robert J.
MacG. Dawson) wrote:

 and Rich Ulrich responded: 
  Aren't we looking at the same contrast as the t-test with
  pooled and unpooled variance estimates?  Then -
 
 Similar, but not identical. With the z-for-proportion we 
 have the additional twist that the amount of extra power
 from te unpooled test is linked to the size of the effect 
 we're trying to measure, in such a way that we get it 
 precisely when we don't need it. Or, to avoid being too 
 pessimistic, let's say that the pooled test only costs us 
 power when we can afford to lose some grin.
 

- Robert wrote on May 18, "And, clearly, the pooled
variance is larger; as the function is convex up, the
linear interpolation is always less."

Back to my example in the previous post:  Whenever you 
do a t-test, you get exactly the same t if the Ns are equal.
For unequal N, you get a bigger t when the group with the 
smaller variance gets more weight.  I think your z-tests
on proportions have to work the same way.

I can do a t-test with a dichotomous variable as the criterion, 
testing 1 of 100  versus 3 of 6:  2x2 table is (1+99), (3+3).
That gives me a pooled t of 6 or 7, that is p < .001; and a
separate-variance t that is p = 0.06.

 - I like that pooled test, but I do think that it has stronger
assumptions than the 2x2 table.
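
Here is that toy comparison run both ways in scipy; the exact p-values
may differ a little from the figures quoted above, depending on
rounding and the degrees-of-freedom convention:

# 1 of 100 versus 3 of 6, coded as 0/1 data, as a pooled and as a
# separate-variance (Welch) t-test.
from scipy.stats import ttest_ind

group1 = [1] + [0] * 99      # 1 "success" out of 100
group2 = [1, 1, 1, 0, 0, 0]  # 3 out of 6

print(ttest_ind(group1, group2, equal_var=True))   # pooled: huge t, tiny p
print(ttest_ind(group1, group2, equal_var=False))  # Welch: far weaker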

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Elementary cross-sectional statistics

2001-05-21 Thread Rich Ulrich

On Mon, 21 May 2001 13:41:16 GMT, Sakke [EMAIL PROTECTED] wrote:

 Hello Everybody!
 
 We have a probably very simple question. We are doing cross-sectional
 regressions. We are doing one regression per moth for a period of ten years,
 resulting in 120 regressions. As we understood, it is possible to just take
 a arithmetic average for every coefficient. 

Well, sure, it is possible to take an arithmetic average
and then you can tell people, Here is the arithmetic average.
It's a lot harder to have any certainty that the average of a time
series means much.

   What we do not know, is how to
 calculate the t-statistics for these coefficients. Can we just do the same,
 arithmetic average? Can anybody help us?

No, you certainly can't compute an average of some t-tests
and claim that it is a t-test.  What you absolutely have to have
(in some sense) is a model of what happens over 10 years.

For instance:  If it is the same experience over and over again
(that is your model of 'what happens'),
*maybe* it would be proper to average each Variable over the
120 time points;  and then do the regression.

That is the easiest case I can think of --the mean is supposed
to represent something, and you conclude that it represents
the whole thing.  

Otherwise:  What is there?  What are you
trying to conclude?  Why? (Who cares?)

Are the individual regressions 'significant'?  Highly?
Are there mean-differences over time?  
 - variations between years or seasons?  
Are the lagged correlations near zero?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Intepreting MANOVA and legitimacy of ANOVA

2001-05-18 Thread Rich Ulrich

The usual problem of MANOVA, which is hard to avoid, is that
even if a test comes out significant, you can't say what you have
shown except 'different.'  

You get a clue by looking at the univariate tests and correlations.
Or drawing up the interesting contrasts and testing them to see
if they account for everything.

I have a problem, here, that might be avoidable -- I can't tell
what you are describing.  Part of that is 'ugly abbreviations,' 
part is 'I do not like the terminology, DV and IV, abbreviated or
not'  so I will not take much time at it.

On Fri, 18 May 2001 14:57:49 -0500, auda [EMAIL PROTECTED] wrote:

 Hi, all,
 In my experiment, two dependent variables were measured (say, DV1 and DV2).
 I found that when analyzed sepeartely with ANOVA, independent variable (say,
 IV and had two levels IV_1 and IV_2) modulated DV1 and DV2 differentially:
 
 mean DV1 in IV_1  mean DV1 in IV_2
 mean DV2 in IV_1  mean DV2 in IV_2
 
 If analyzed with MANOVA, the effect of IV was significant, Rao
 R(2,14)=112.60, p<0.000. How to interpret this result of MANOVA? Can I go
 ahead to claim IV modulated DV1 and DV2 differentially based upon the result
 from MANOVA? Or I have to do other tests?
 
 Moreover, can I treat DV1 and DV2 as two levels of a factor, say, type of
 dependent variable, and then go ahead to test the data with
 repeated-measures ANOVA and see if there is an interaction between IV and
 type of dependent variable?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Variance in z test comparing purcenteges

2001-05-18 Thread Rich Ulrich

On 18 May 2001 07:51:21 -0700, [EMAIL PROTECTED] (Robert J.
MacG. Dawson) wrote:

 [ ... ] 
   OK, so what *is* going on here?  Checking a dozen or so sources, I
 found that indeed both versions are used fairly frequently (BTW, I
 myself use the pooled version, and the last few textbooks I've used do
 so).
 
   Then I did what I should have done years ago, and I tried a MINITAB
 simulation. I saw that for (say) n1=n2=10, p1=p2=0.5, the unpooled
 statistic tends to have a somewhat heavy-tailed distribution. This makes
 sense: when the sample sizes are small the pooled variance estimator is
 computed using a sample size for which the normal approximation works
 better.
 
   The advantage of the unpooled statistic is presumably higher power;
 hoewever, in most cases, this is illusory. When p1 and p2 are close
 together, you do not *get* much extra power.  When they are far apart
 and have moderate sample sizes you don't *need* extra power. And when
[ snip, rest]

Aren't we looking at the same contrast as the t-test with 
pooled and unpooled variance estimates?  Then -

(a) there is exactly the same  t-test value when the Ns are equal; 
the only change is in DF.
(b) Which test is more powerful depends on which group is 
larger, the one with *small*  variance, or the one with *large*
variance.   -- it is a large difference when Ns and variances
are both different by (say) a fourfold factor or more.

If the big N has the small variance, then the advantage
lies with 'pooling'  so that the wild, small group is not weighted
as heavily.  If the big N has the large variance, then the 
separate-variance estimate lets you take advantage of the
precision of the smaller group.  

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: bootstrap, and testing the mean of variance ratio

2001-05-16 Thread Rich Ulrich



On Wed, 16 May 2001 11:50:07 + (UTC), [EMAIL PROTECTED]
(rpking) wrote:

 For each of the two variance ratios, A=var(x)/var(y) and
 B=var(w)/var(z), I bootstrapped with 2000 replications to obtain
 confidence intervals.  Now I want to test whether the means are
 equal, ie. E(A) = E(B), and I am wondering whether I could just use
 the 2000 data points, calculate the standard deviations, and do a
 simple t test.

This raises questions, questions, questions.

What do you mean by a data point?  by bootstrapping?
Why do you want ratios of the variances?  If you are concerned with
variances, why aren't you considering the logs of V?  If you are
concerned with ratios, why aren't you considering the logs of the ratios?

With 2000 replications each, there would seem to be 4000 points.
Or, what relation is there among x-y-z-w?If these give you 2000
vectors, then why don't you have a paired comparison in mind?

Bootstrapping is tough enough to figure what's proper, that I
don't want to bother with it.  Direct tests are usually enough:  So,
if you were considering a direct test, What would you be testing?
(I figure there is really good chance that you are wrong in what you
are trying to bootstrap, or how you are doing it.)

 I have concerns because A and B are bounded below at 0 (but not
 bounded above), so the distribution may not be asymptotically
 normal. 
 ... and that is relevant to what?  Distributions of raw data are
seldom (if ever) asymptotically normal.

But I also found the bootstrapped A and B are well away from
 zero; the 1% percentile has a value of 0.78.  

 ... well, I should hope they are away from zero.  Relevance?

   So could t test be
 used in this situation?  Or should I do another bootstrapping for
 the test?

Take your original problem to a statistician.  Bootstrap is 
something to retreat to when you can't estimate error directly,
and you have given no clue why you might need it.

-- 
Rich Ulrich,[EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: (none)

2001-05-16 Thread Rich Ulrich

 [ note, Jay:  HTML-formatting makes this hard to read ]

On 11 May 2001 00:30:06 -0700, [EMAIL PROTECTED] (Jay Warner) wrote:
[snip, HTML header]

 I've had occasion to talk with a number of educator types lately, at different
 application and responsibility levels of primary & secondary Ed.
 Only one recalled the term, regression toward the mean.  Some (granted,
 the less analytically minded) vehemently denied that such could be causing
 the results I was discussing.  Lots of other causes were invoked.

 In an MBA course I teach, which frequently includes teachers wishing
 to escape the trenches, the textbook never once mentions the term.
 I don't recall any other intro stat book including the term, much less
 an explanation.  The explanation I worked out required some refinement
 to become rational to those educator types (if it has yet :).

 - I am really sorry to learn that -
Not even the texts!  that's bad.  
By the way, there are two relevant chapters in the 1999 history,
Statistics on the Table by Stephen Stigler (see pages 157-179).

Stigler documents a big, embarrassing blunder by a noted 
economist, published in 1933.  Horace Secrist wrote a book with
tedious detail, much of it being accidental repetitions of regression
fallacy.  Hotelling panned it in a review in JASA.  Next, Secrist
replied in a letter, calling Hotelling "wholly mistaken."  Hotelling
tromped back, "... and when one version of the thesis is interesting
but false and the other is true but trivial, it becomes the duty of
the reviewer to give warning at least against the false version."

Maybe Stigler's user-friendly anecdote will help to spread the
lesson, eventually.

 So I'm not surprised that even the NYT would miss it entirely.
 Rich, I hope you penned a short note to the editor, pointing out its presence.
 Someone has to, soon.

I did not write, yet.  But I see an e-mail address, which is not usual
in the NYTimes.  I guess they identify Richard Rothstein as
[EMAIL PROTECTED]  
because this article was laid out as a feature (Lessons) instead of an
ordinary news report.  I'm still considering what I should say, if 
someone else doesn't tell me that they have passed the word.


 BTW, Campbell's text, A Primer on Regression Artifacts, mentions a
 correction factor/method, which I haven't understood yet.  Does anyone
 in education and other social science circles use this correction, and
 may I have a worked-out example?

Since you mentioned it, I checked my new copy of the Campbell/ Kenny
book.  Are you in Chapter 5?  There is a lot going on, but I don't
grasp that there is any well-recommended correction.  Except, maybe, 
Structural-equations-modeling, and they just gesture vaguely in the
direction of that.  

Give me a page number?

I thought that they re-inforced my own prejudices, that when two
groups are not matched at Pre, you have a lot of trouble forming clear
conclusions.  You can be a bit assertive if one group wins by all
three standards (raw score, change score, regressed-change score), 
but you still can't be 100% sure.

When your groups don't match, you draw the graphs to help you 
clarify trends, since the eyeball is great at pattern analysis.
Then you see if any hostile interpretations can undercut your 
optimistic ones, and you sigh regrets when they do.

 Jay

 Rich Ulrich wrote:
[ snip, my earlier note, with HTML format imposed. ]

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Variance in z test comparing percenteges

2001-05-13 Thread Rich Ulrich

On 11 May 2001 22:29:37 -0700, [EMAIL PROTECTED] (Donald Burrill)
wrote:

 On Sat, 12 May 2001, Alexandre Kaoukhov (RD [EMAIL PROTECTED]) wrote:
 
  I am puzzled with the following question:
  In z test for continuous variables we just use the sum of estimated
  variances to calculate the variance of a difference of two means i.e.
 s^2 = s1^2/n1 + s2^2/n2.

[ snip, Q and A,  AK and DB ... ]

  On the other hand the chi2 is derived from Z^2 as assumed by first 
  approach.

DB
   Sorry;  the relevance of this comment eludes me.

Well -- every (normal) z score can be squared, to produce a
chi-squared score.  One particular formula for a z, when squared, matches the
Pearson chi-squared test statistic.

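A quick numerical check of that equivalence, with toy counts of my
own choosing:

# The pooled-variance z for two proportions, squared, equals the
# Pearson chi-squared statistic (no Yates correction).
from math import sqrt
from scipy.stats import chi2_contingency

x1, n1 = 30, 100
x2, n2 = 45, 120

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

table = [[x1, n1 - x1], [x2, n2 - x2]]
chi2, _, _, _ = chi2_contingency(table, correction=False)
print(z ** 2, chi2)          # the two numbers agree
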
  Finally, I would like to know whether the second formula is ever used
  and if so does it have any name.

DB
 Ever is a wider universe of discourse than I would dare pretend to. 
 Perhaps colleagues on the list may know of applications.
 I would be surprised if it had been named, though.

I don't remember a name, either.  I think I do remember seeing 
a textbook that presented that t as their preferred test for
proportions.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Question

2001-05-13 Thread Rich Ulrich

On 11 May 2001 07:34:38 -0700, [EMAIL PROTECTED] (Magill,
Brett) wrote:

 Don and Dennis,
 
 Thanks for your comments; I have some points and further questions on the
 issue below.
 
 For both Dennis and Don:  I think the option of aggregating the information
 is a viable one.  

I would call it unavoidable  rather than just viable.  The data
that you show is basically aggregated  already;  there's just one item
per-person.

  Yet, I cannot help but think there is some way to do this
 taking into account the fact that there is variation within organizations.
 I mean, if I have a organizational salary mean of .70 (70%) with a very tiny
 [ snip, rest]

 - I agree, you can use the information concerning within-variation.
I think it is totally proper to insist on using it, in order to
validate the conclusions, to whatever degree is possible.  
You might be able to turn around that 'validation'  to incorporate
it into the initial test;  but I think the role as validation  is
easier to see by itself, first.

Here's a simple example where the 'variance'  is Poisson.
(Ex.)  A town experiences some crime at a rate that declines 
steadily, from 20 000 incidents to 19 900 incidents, over a 5-year
period.  The linear trend fitted to the several points is highly
significant  by a regression test.  Do you believe it?

(Answer)  What I would believe is:  No, there is no trend, but it is
probably true that someone is fudging the numbers.  The 
*observed variation*  in means is far too small for the totals to
be seen be chance.  And the most obvious sources of error
would work in the opposite direction.  

[That is, if there were only a few criminals responsible for many
crimes each, and the number-of-criminals is what was subject 
to Poisson variation, THEN  the number-of-crimes should be 
even more variable.]
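
A sketch of the kind of check implied here -- compare the observed
spread of the yearly counts with what Poisson variation would produce.
The five counts are invented to match the example:

# Dispersion check: sum((O - mean)^2 / mean) should behave like
# chi-squared with 4 df under Poisson variation; a value far out in
# the lower tail says the counts are too smooth to be real.
from scipy.stats import chi2

counts = [20000, 19975, 19950, 19925, 19900]
m = sum(counts) / len(counts)
D = sum((c - m) ** 2 / m for c in counts)
print(D)                               # about 0.31
print(chi2.cdf(D, df=len(counts) - 1)) # lower tail, about 0.01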

In your present case, I think you can estimate on the basis of
your factory (aggregate) data, and then you figure what you 
can about how consistent those numbers are with the 
un-aggregated data, in terms of means or variances.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: (none)

2001-05-10 Thread Rich Ulrich


 - selecting from CH's article, and re-formatting.  I don't know if 
I am agreeing, disagreeing, or just rambling on.

On 4 May 2001 10:15:23 -0700, [EMAIL PROTECTED] (Carl Huberty)
wrote:

CH:   Why do articles appear in print when study methods, analyses,
results, and conclusions are somewhat faulty?

 - I suspect it might be a consequence of Sturgeon's Law, 
named after the science fiction author.  Ninety percent of 
everything is crap.  Why do they appear in print when they
are GROSSLY faulty?  Yesterday's NY Times carried a 
report on how the WORST schools have improved 
more than the schools that were only BAD.  That was much-
discussed, if not published.  One critique was the
absence of peer review.  There are comments from statisticians
in the NY Times article; they criticize, but (I thought) they
don't 'get it' on the simplest point.

The article, while expressing skepticism by numerous 
people, never mentions REGRESSION TOWARD the MEAN
which did seem (to me) to account for every single claim of the
original authors whose writing caused the article.


CH:  []  My first, and perhaps overly critical, response  is that
the editorial practices are faulty[ ... ] I can think of two
reasons: 1) journal editors can not or do not send manuscripts to
reviewers with statistical analysis expertise; and 2) manuscript
originators do not regularly seek methodologists as co-authors.  
Which is more prevalent?

APA Journals have started trying for both, I think.  But I think
that statistics only scratches the surface.  A lot of what arises
are issues of design.  And then there are issues of data analysis.

Becoming a statistician helped me understand those so that I could
articulate them for other people;  but a lot of what I know was never
important in any courses.  I remember taking just one course on
epidemiology, where we students were responsible for reading and
interpreting some published report, for the edification of the whole
class -- I thought I did mine pretty well, but the rest of the class
really did stagger through the exercise.  

Is this critical reading  something that can be learned, and
improved?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: 2x2 tables in epi. Why Fisher test?

2001-05-10 Thread Rich Ulrich


 - I offer a suggestion of a reference.

On 10 May 2001 17:25:36 GMT, Ronald Bloom [EMAIL PROTECTED] wrote:

[ snip, much detail ] 
 It has become the custom, in epidemiological reports
 to use always the hypergeometric inference test --
 The Fisher Exact Test -- when treating 2x2 tables 
 arising from all manner of experimental setups -- e.g.
 
 a.) the prospective study
 b.) the cross-sectional study
 3.) the retrospective (or case-control) study
  [ ... ]

I don't know what you are reading, to conclude that this
has become the custom.   Is that a standard for some
journals, now?

I would have thought that the Logistic formulation was
what was winning out, if anything.

My stats-FAQ  has mention of the discussion published in
JRSS (Series B)  in the 1980s.  Several statisticians gave 
ambivalent support to Fisher's test.  Yates argued the logic
of the exact test, and he further recommended the  X2 test
computed with his (1935) adjustment factor, as a very accurate 
estimator of Fisher's p-levels.

I suppose that people who hate naked p-levels will have to 
hate Fisher's Exact test, since that is all it gives you.

I like the conventional chisquared test for the 2x2, computed
without Yates's correction --  for pragmatic reasons.  Pragmatically,
it produces a good imitation of what you describe, a randomization
with a fixed N but not fixed margins.  That is ironic, as Yates
points out (cited above) because the test assumes fixed margins
when you derive it.
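
For a single 2x2 table the competing p-levels are easy to lay side by
side; a minimal sketch using scipy, with invented counts:

  # Fisher's exact p versus Pearson chi-square, with and without Yates's correction.
  from scipy.stats import fisher_exact, chi2_contingency

  table = [[12, 5], [7, 16]]                        # invented 2x2 counts
  odds, p_fisher = fisher_exact(table)              # hypergeometric, fixed margins
  chi2_y, p_yates, _, _ = chi2_contingency(table, correction=True)
  chi2_u, p_plain, _, _ = chi2_contingency(table, correction=False)
  print(p_fisher, p_yates, p_plain)   # Yates's p tracks Fisher's; the uncorrected p is smaller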

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Analysis of a time series of categorical data

2001-05-04 Thread Rich Ulrich

On 3 May 2001 09:46:12 -0700, [EMAIL PROTECTED] (R. Mark Sharp; Ext.
476) wrote:

 If there is a better venue for this question, please advise me.

 - an epidemiology mailing list?
[ snip, much detail ] 
  Time point 1Time point 2Time point 3Time point 4  Hosts
  Inf  Not-InfInf  Not-InfInf  Not-InfInf  Not-Inf  Tested
 
 G1-S11  14   11  4   11 1   13 2   57
 G1-S27   8   12  3   14 2   15 8   69
 G1-S31  246 18815915   95
 
 G2-S43  12   12  4   10 4   14 2   61
 G2-S55  105  68 7   1114   57
 G2-S62  26   12 12   1116   1412  105
 
 The questions are how can group 1 (G1) be compare to group 2 (G2) and 
 how can subgroups be compared. I maintain that the heterogeneity 
 within each group does not prevent pooling of the subgroup data 
 within each group, because the groupings were made a priori based on 
 genetic similarity.

Mostly, heterogeneity prevents pooling.  
What's an average supposed to mean?

Only if the Ns represent naturally-occurring proportions, 
and your hypothesis does too, MIGHT you want to
analyze the numbers that way.

How much do you know about the speed of expected onset,
and offset of the disease?  If this were real, it looks to me like you
would want special software, or a special evaluation of a likelihood 
function.  

I can put the hypothesis in simple ANOVA terms, comparing species
(S).  Then the within-group variability of G1 and G2 -- which is big --
would be used to test the difference between groups, according to some
parameter.   Would that be an estimate of the maximum number afflicted?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Simple ? on standardized regression coeff.

2001-04-18 Thread Rich Ulrich

On Tue, 17 Apr 2001 16:32:06 -0500, "d.u." [EMAIL PROTECTED]
wrote:

 Hi, thanks for the reply. But is beta really just b/SD_b? In the standardized
 case, the X and Y variables are centered and scaled. If Rxx is the corr matrix
 [ ... ]
No.  b/SD_b  is the t-test.
Beta is b, after it is scaled by the SD of X and the SD of Y.

Yes, beta is the b you would get if X and Y were each standardized ('scaled' to mean 0 and SD 1).
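
In symbols, for one predictor, beta = b*(SD_X/SD_Y), which equals the
Pearson r, while b over its own standard error (the SD_b above) is the
t statistic.  A minimal numeric sketch with invented data, using numpy:

  # Simple regression: beta = b * SD_X / SD_Y (= r here); t = b / SE_b is different.
  import numpy as np
  rng = np.random.default_rng(1)
  x = rng.normal(size=200)
  y = 3.0 * x + rng.normal(size=200)
  b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # raw slope
  beta = b * x.std(ddof=1) / y.std(ddof=1)             # standardized slope
  resid = y - (y.mean() + b * (x - x.mean()))
  se_b = np.sqrt(resid.var(ddof=2) / ((x - x.mean()) ** 2).sum())
  print(beta, np.corrcoef(x, y)[0, 1])                 # these two agree
  print(b / se_b)                                      # the t statistic, not beta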
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Simple ? on standardized regression coeff.

2001-04-17 Thread Rich Ulrich

On Mon, 16 Apr 2001 20:24:10 -0500, "d.u." [EMAIL PROTECTED]
wrote:

 Hi everyone. In the case of standardized regression coefficients (beta),
 do they have a range that's like a correlation coefficient's? In other
 words, must they be within (-1,+1)? And why if they do? Thanks!
 
There is no limit on the raw coefficient, b, so there is no limit on
beta = b*(SD_X/SD_Y).
In practice, b gets large when there is a suppressor relationship, so
that the x1-x2  difference is what matters, e.g.,  (10x1-9x2).

Beta is about the size of the univariate correlation when the
co-predictors balance out in their effects.  I usually want to
consider a different equation if any beta is greater than 1 or 
has the opposite sign from its  corresponding, initial r -- for 
instance, I might combine (X1, X2) in a rational way.
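
A minimal simulation of the suppressor situation, using numpy (all
numbers invented), shows standardized betas escaping (-1, +1) even
though each univariate r is modest:

  # Two highly correlated predictors with opposing weights: betas exceed 1.
  import numpy as np
  rng = np.random.default_rng(2)
  n = 1000
  x1 = rng.normal(size=n)
  x2 = x1 + 0.3 * rng.normal(size=n)                # r(x1, x2) around .96
  y = 10 * x1 - 9 * x2 + rng.normal(size=n)         # the (10x1 - 9x2) situation
  z = lambda v: (v - v.mean()) / v.std(ddof=1)      # standardize
  X = np.column_stack([z(x1), z(x2)])
  betas, *_ = np.linalg.lstsq(X, z(y), rcond=None)
  print(betas)                                      # both |beta| well above 1
  print(np.corrcoef(x1, y)[0, 1], np.corrcoef(x2, y)[0, 1])   # modest r's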

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: In realtion to t-tests

2001-04-09 Thread Rich Ulrich

On Mon, 09 Apr 2001 10:44:40 -0400, Paige Miller
[EMAIL PROTECTED] wrote:

 "Andrew L." wrote:
  
  I am trying to learn what a t-test will actually tell me, in simple terms.
  Dennis Roberts and Paige Miller, have helped alot, but i still dont quite
  understand the significance.
  
  Andy L
 
 A t-test compares a mean to a specific value...or two means to each
 other...
 [ ... ]

I remember my estimation classes, where the comparison was
always to ZERO for means.  To ONE, I guess, for ratios.
Technically  speaking, or writing.

For instance, if the difference in averages X1, X2  is expected to
be zero, then  {(X1 - X2) - 0} / SE(X1 - X2)  is distributed as t.   It might
look like a lot of equations with the 'minus zero'  seemingly tacked
on, but  I consider this to be good form.  It formalizes as
 (term  minus  Expectation of term), divided by its standard error
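
A minimal sketch of that form for the two-sample case, with invented
data, using numpy and scipy:

  # "(term minus its expectation) / standard error": the two-sample t.
  import numpy as np
  from scipy.stats import ttest_ind

  rng = np.random.default_rng(3)
  x1 = rng.normal(10.0, 2.0, size=30)
  x2 = rng.normal(10.0, 2.0, size=30)
  diff = x1.mean() - x2.mean()
  se = np.sqrt(x1.var(ddof=1) / len(x1) + x2.var(ddof=1) / len(x2))
  t_hand = (diff - 0.0) / se                        # the '- 0' is the expected difference
  t_lib, p = ttest_ind(x1, x2, equal_var=False)     # Welch form, same SE as above
  print(t_hand, t_lib, p)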

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: rotations and PCA

2001-04-08 Thread Rich Ulrich

 - Intelligence, figuring what it might be, and categorizing it, and
measuring it... I like the topics, so I have to post more.

On Thu, 05 Apr 2001 22:09:33 +0100, Colin Cooper
[EMAIL PROTECTED] wrote:

 In article [EMAIL PROTECTED],
  Rich Ulrich [EMAIL PROTECTED] wrote:
 
  I liked Gould's book.  I know that he offended people by pointing to
  gross evidence of racism and sexism in 'scientific reports.'  But he
  has (I think) offended Carroll in a more subtle way.  Gould is 
  certainly partial to ideas that Carroll is not receptive to; I think
  that is what underlies this critique.
  
  ===snip
 
 I've several problems with Gould's book.
 
 (1)  Sure - some of the original applications of intelligence testing 
 (screening immigrants who were ignorant of the language using tests 
 which were grossly unfair to them) were unfair, immoral and wrong.  But 
 why impugn the whole area as 'suspect' because of the 
 politically-dubious activities of some researchers a century ago?  It

I think Gould "impugned"  more than just one area.  The message, 
as I read it, was, "Be leery of social scientists who provide
self-congratulatory and self-serving, simplistic conclusions."

In recent decades, I imagine that economists have been bigger 
at that than psychologists.  Historians have quite a bit of 20th
century history-writing to live down, too.

 
 seems to me to be exceptionally surprising to find that ALL abilities - 
 musical, aesthetic, abstract-reasoning, spatial, verbal, memory etc. 
 correlate not just significantly but substantially.

Here is one URL  for references to Howard Gardner, who has
shown some facets of independence of abilities (and who you 
mention, below).
http://www.newhorizons.org/trm_gardner.html


 (2)  Gould's implication is that since Spearman found one factor 
 (general ability) whilst Thurstone fornd about 9 identifiable factors, 
 then factor analysis is a method of dubious use, since it seems to 
 generate contradictory models.  There are several crucial differences 

 - I read Gould as being more subtle than that.

 between the work of Spearman and Thurstone that may account for these 
 differences.  For example, (a)  Spearman (stupidly) designed tests 
 containing a broad spectrum of abilities: his 'numerical' test, for 
 example, comprised various sorts of problems - addition, fractions, etc.  
 Thurstone used separate tests for each: so Thurstone's factors 
 essentially corresponded to Spearman's tests. (b) Thurstone's work was 
 with students where the limited range of abilities would reduce the 
 magnitude of correlations between tests. (c)  More recent work (e.g., 
 Gustafsson, 1981; Carroll, 1993) using exploratory factoring and CFA 
 finds good evidence for a three-stratum model of abilities: 20+ 
 first-order factors, half a dozen second-order factors, or a single 
 3rd-order factor.
 
 (3)  Interestingly, Gardner's recent work has come to almost exactly the 
 same conclusions from a very different starting point.  Gardner 
 identified groups of abilities which, according to the literature, tended 
 to covary - for example, which tend to develop at the same age, all 
 change following drugs or brain injury, which interfere with each other 
 in 'dual-task' experiments and so on.  His list of abilities derived in 
 this way is very similar to the factors identified by Gustafsson, 
 Carroll and others.

 - but Gardner has "groups of abilities" that are, therefore, distinct
from each other.  And also, only a couple of abilities are usually
rewarded (or even measured) in our educational system.  When I read
his book, I thought Gardner was being overly  "scholastic" in his
leaning, and restrictive in his data, too.

 I have a feeling that we're going to get on to the issue of whether 
 factors are merely arbitrary representations of sets of data or whether 
 some solutions are more are more meaningful than others - the rotational 
 indeterminacy problem - but I'm off to bed! 

Well, how much data can you load into one factor analysis? 
How much virtue can you assign to one 'central ability'?
 - I see the problem as philosophical instead of numeric.
What you will  *identify*  as a single factor (by techniques 
of today) will be more trivial than you want.

Daniel Dennett, in "Consciousness Explained," does a clever
job of defining consciousness.  And trivializing it; what I was
interested in (I reflect to myself) was something much grander, 
something more meaningful.  But intelligence and self-awareness 
are separate topics, and big ones.  Julian Jaynes's book was
more useful on the bigger picture -- setting a framework, so to 
speak, and establishing the size of the problem.



-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=

Re: attachments

2001-04-06 Thread Rich Ulrich

On Fri, 06 Apr 2001 13:34:03 GMT, Jerry Dallal
[EMAIL PROTECTED] wrote:

 "Drake R. Bradley" wrote:
 
  While I agree with the sentiments expressed by others that attachments should
  not be sent to email lists, I take exception that this should apply to small
  (only a few KB or so) gif or jpeg images. Pictures *are* often worth a
  thousand words, and certainly it makes sense that the subscribers to a stat
 
 It's worth noting that some lists have gateways to Usenet groups. 
 Usenet does not support attachments, so they will be lost to Usenet
 readers.  [ break ]

 - my Usenet connection seems to give me all the attachments.
But if I depended on a modem and a 7-bit protocol, I would be 
pleased if my ISP  filtered out the occasional, 100 kilobyte 8-bit
attachment.  (Some folk still use 7-bit protocols, don't they?)

 Also, even in the anything-goes early 21-st Century climate
 of the Internet, one big no-no remains the posting of binaries to
 non-binary groups.

Right; that's partly because of size.  My vendor has the practice,
these days, of saving ordinary groups for a week, binary groups
(which are the BULK of their internet feed) for 24 hours.  Binary
strings may be treated as screen-commands, if your Reader doesn't 
know to package them as an 'attachment' or otherwise ignore them.

Some attachments are binary, some are not.  
Standard HTML files are ASCII, with the added 'risk' 
(I sometimes look at it that way) of invoking an
immediate internet connection.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Fw: statistics question

2001-04-06 Thread Rich Ulrich

I reformatted this.

Quoting a letter from Carmen Cummings to himself,
On 6 Apr 2001 08:48:38 -0700, [EMAIL PROTECTED] wrote:

 The below question was on my Doctorate Comprehensives in
 Education at the University of North Florida.
 
 Would one of you learned scholars pop me back with 
possible appropriate answers.

 the question
An educational researcher was interested in developing a
predictive scheme to forecast success in an elementary statistics
course at a local university. He developed an instrument with a
range of scores from 0 to 50. He administered this to 50 incoming
freshmen signed up for the elementary statistics course, before
the class started. At the end of the semester he obtained each of
the 50 students' final averages. 

Describe an appropriate design to collect data to test the
hypothesis. 
= end of cite.

I hope the time of the Comprehensives is past.  Anyway, this
might be better suited for facetious answers, than serious ones.

The "appropriate design" in the strong sense:  

Consult with a statistician  IN ORDER TO "develop an instrument".  
Who decided only a single dimension should be of interest?  
(How else does one interpret a score with a "range" from 0 to 50?)

Consult with a statistician BEFORE administering something to --
selected?  unselected? -- freshmen; and consult (perhaps) 
in order to develop particular hypotheses worth testing.  
I mean, the kids scoring over 700 on Math SATs will ace 
the course,  and the kids under 400 will have trouble.  

Generalizing, of course.  If "final average"  (as suggested) 
is the criterion, instead of "learning."
But you don't need a new study to tell you those results.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: 1 tail 2 tail mumbo jumbo

2001-03-25 Thread Rich Ulrich

On Mon, 19 Mar 2001 13:14:39 -0500, Bruce Weaver
[EMAIL PROTECTED] wrote:

 On Fri, 16 Mar 2001, Rich Ulrich wrote:
[ snip, including earlier post ] 
  That ANOVA is inherently a 2-sided test.  So is the traditional 2x2
  contingency table.   That is because,  sides  refer to  hypotheses.
 
 snip
 
 
 I agree with you Rich, except that I don't find "2-sided" all that
 appropriate for describing ANOVA.  For an ANOVA with more than 2 groups,
 there are MULTIPLE patterns of means that invalidate the null hypothesis,
 not just 2.  With only 3 groups, for example:
 
   A  B  C
   A  C  B
   B  A  C
 [ ... ]

 And then if you included all of the cases where 2 of the means are equal
 to each other, but not equal to the 3rd mean, there are several more
 possibilities.  And these ways of departing from 3 equal means do not
 correspond to tails in some distribution.
 
 There's my attempt to add to the confusion.  ;-)

If I convince people that they want only one *contrast*  for their
ANOVA, then it is just two-sided.  I've been talking people out
of blindly testing multiple-groups and multiple periods, for years.

Then I have to start over on the folks, to convince them about 
MANOVA.  If there are two groups and two variables, 
there are FOUR sides -- and that's if you just count what is
'significant' by the single variables.  Most of the possible results
are not useful ones; that is, they are not easily interpretable, when
no variable is 'significant' by itself, or when logical directions
seem to conflict.

We can interpret "group A is better than B."  And we analyze 
measures that have the scaled meaning, where one end is better.
So the sensible analysis uses a defined contrast, the 'composite 
score';  and then you don't have to use the MANOVA packages, 
and you have the improved power of testing just one or two sides.
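
A minimal sketch of the composite-score idea, with invented data and
group sizes, using numpy and scipy: standardize the measures so one
end is better, average them, and test the single contrast with an
ordinary two-sided t-test.

  # One defined contrast (a composite of standardized measures) instead of MANOVA.
  import numpy as np
  from scipy.stats import ttest_ind

  rng = np.random.default_rng(4)
  n = 40
  grp_a = rng.normal([0.3, 0.2], 1.0, size=(n, 2))  # two outcome measures, group A (invented)
  grp_b = rng.normal([0.0, 0.0], 1.0, size=(n, 2))  # group B
  both = np.vstack([grp_a, grp_b])
  zed = (both - both.mean(axis=0)) / both.std(axis=0, ddof=1)  # same 'better' direction assumed
  composite = zed.mean(axis=1)                      # the single contrast
  t, p = ttest_ind(composite[:n], composite[n:])
  print(t, p)                                       # one two-sided test, not four 'sides'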

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-15 Thread Rich Ulrich

On 14 Mar 2001 21:55:48 GMT, [EMAIL PROTECTED] (Radford Neal)
wrote:

 In article [EMAIL PROTECTED],
 Rich Ulrich  [EMAIL PROTECTED] wrote:
 
 (This guy is already posting irrelevant rants as if 
 I've driven him up the wall or something.  So this 
 is just another poke in the eye with a blunt stick, to see
 what he will swing at next)
 
 I think we may take this as an admission by Mr. Ulrich that he is
 incapable of advancing any sensible argument in favour of his
 position.  Certainly he's never made any sensible response to my
 criticism.  

 - In a new thread, I have now provided a response that is sensible, 
or, at least, somewhat numeric.

I notice that Jim C.  has taken up the cudgel, in trying to explain
the basics of t-tests to Jim S, and that  "furthers my position."

I figure that after I state my position in one post, explicate it in
another, and try that again while refining the language -- then
I may as well call it quits with JS, when he still doesn't get the
points from the first (or from the couple of other people who
were posting them before I was).

I may not be saying it all that well, but I wasn't inventing the
position.

You and I are in agreement, now, on one minor conclusion:  
"The t-test isn't good evidence about a difference in averages."
But for me, that's true because the numbers are crappy 
indicators of performance -- which was clued *first*  by the 
distribution.

Whereas, you seem to have much more respect for crude
averages, compared to the several of us who object.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-15 Thread Rich Ulrich

 - I hate having to explain jokes -

On 14 Mar 2001 15:34:45 -0800, [EMAIL PROTECTED] (dennis roberts) wrote:

 At 04:10 PM 3/14/01 -0500, Rich Ulrich wrote:
 
 Oh, I see.   You do the opposite.  Your own
 flabby rationalizations might be subtly valid,
 and, on close examination,
 *do*  have some relationship to the questions
 
 
 could we ALL please lower a notch or two ... the darts and arrows? i can't 
 keep track of who started what and who is tossing the latest flames but ... 
 somehow, i think we can do a little better than this ... 

Dennis,
Please, where is YOUR sense of humor?   

My post was a literary exercise -- I intentionally posted his lines
immediately before mine, so the reader could follow my re-write 
phrase by phrase. 
I'm still hoping "Irving" will lighten up.

You chopped out the original that I was paraphrasing, and you did
*not*  indicate those important [snip]s -- You would mislead the
casual reader to think someone other than JimS is originating lines
like that, or intend them as critique in this group.
 - I'm not always kind, but I think I am never that wild.  
 - It's probably been a dozen years since I purely flamed like that.

(Or maybe I never flamed, if you talk about the really empty ones.  
In the olden days of local Bulletin Boards, with political topics, I
discarded 1/3 of my compositions without ever posting, because of 
poor content or tone.  I still use some judgment in what I post.)


Compare his original line about  'little or no ... relationship'  with
my clever reversal,   "... on close examination, *do*  have some
relationship to the questions."

Well, I was trying for humor, anyway.  Sorry, if I missed.
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-08 Thread Rich Ulrich

On Thu, 08 Mar 2001 10:38:59 -0800, Irving Scheffe
[EMAIL PROTECTED] wrote:

 On Fri, 02 Mar 2001 16:28:53 -0500, Rich Ulrich [EMAIL PROTECTED]
 wrote:
 
 On Tue, 27 Feb 2001 07:49:23 GMT, [EMAIL PROTECTED] (Irving
 Scheffe) wrote:
 
 My comments are written as responses to the technical 
 comments to Jim Steiger's last post.  This is shorter than his post,
 since I omit redundancy and mostly ignore his 'venting.'
 I think I offer a little different perspective on my previous posts. 
 
 [ snip, intro. ]
 
 Mr. Ulrich's latest post is a thinly veiled ad hominem, and
 I'd urge him to rethink this strategy, as it does not
 present him in a favorable light. 

 - I have a different notion of ad-hominem, since I think it is
something directed towards 'the person'  rather than at the
presentation.  Or else, I don't follow what he means by 'thinly
veiled.'

When a belligerent and nasty and arrogant tone seems to be
an essential part of an argument, I don't consider myself to be
reacting 'ad-hominem' when I complain about it -- it's not that I
hate to be ad-hominem, but I don't like to be misconstrued.

I'm willing, at times, to plunk for the 'ad-hominem'.   
For instance, since my last post on the subject, I looked at those
reports. Also, I searched with google for the IWF -- who printed the
anti-MIT critiques.  I see the organization characterized as an
'anti-feminist' organization, with some large funding from Richard
Scaife.  'Anti-feminist'  could mean a reasoned-opposition, or a
reflex opposition.  Given these papers, it appears to me to qualify as
'reflex' or kneejerk opposition.  Oh, ho! I say,  this explains where
the arguments came from, and why Jim keeps on going --  
Now, THIS PARAGRAPH   is what I consider an ad-hominem argument.  
And I'll give you some more.

Scaife is a paranoid moneybags and publisher who infests this
Pittsburgh region (which is why I have noticed him more than a
westerner like Coors).  His cash was important in persecuting Clinton
for his terms in office.   For example, Scaife kept alive Vince
Foster's suicide for years.  He held out money for anyone willing to
chase down Clinton-scandals.  Oh, he funded the chair at Pepperdine
that Starr had intended to take.

Now:  My comment on the original reports:  I am happy to say that it
looks to me as if MIT is setting a good model for other universities
to follow.  The senior administrator listens to his faculty,
especially his senior faculty, and responds.  

MIT makes no point about numbers in their statements, and it 
does seem to be wise and proper that they don't do so.  

I see now, Jim is not really arguing with MIT.  They won't argue back.

Jim's purpose  is to create a hostile presence, a shadow to threaten 
other administrators.  He goes, like, "If you try to 'cut a break'
for women, we'll be watching and threatening and undermining,
threatening your job if we can."  

I suppose state universities are more vulnerable than the private
universities like MIT.  On the other hand, with the numbers that Jim
has put into the public eye, the next administrator can point to the
precedent of MIT and assert that, clearly, the simple numbers on
'quality' are substantially irrelevant to the issues, since they were
irrelevant at MIT.

Hope this helps.

-- 
Rich Ulrich, [EMAIL PROTECTED]

http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Trend analysis question: follow-up

2001-03-06 Thread Rich Ulrich

On 5 Mar 2001 16:41:22 -0800, [EMAIL PROTECTED] (Donald Burrill)
wrote:

 On Mon, 5 Mar 2001, Philip Cozzolino wrote in part:
 
  Yeah, I don't know why I didn't think to compute my eta-squared on the 
  significant trends. As I said, trend analysis is new to me (psych grad
  student) and I just got startled by the results.
  
  The "significant" 4th and 5th order trends only account for 1% of the
  variance each, so I guess that should tell me something. The linear 
  trend accounts for 44% and the quadratic accounts for 35% more, so 79% 
  of the original 82% omnibus F (this is all practice data).
  
  I guess, if I am now interpreting this correctly, the quadratic trend 
  is the best solution.
DB 
   Well, now, THAT depends in part on what the 
 spectrum of candidate solutions is, doesn't it?  For all that what you 
 have is "practice data", I cannot resist asking:  Are the linear  
 quadratic components both positive, and is the overall relationship 
 monotonically increasing?  Then, would the context have an interesting 
 interpretation if the relationship were exponential?  Does plotting 
 [ snip, rest ]

"Interesting interpretation" is important.  In this example, the
interest (probably) lies mainly with the variance-explained: 
in the linear and quadratic.

It's hard for me to be highly interested in an order-5 polynomial,
and sometimes a quadratic seems unnecessarily awkward.

What you want is the convenient, natural explanation.  
If "baseline" is far different from what follows, that will induce 
a bunch of high order terms if you insist on modeling all the 
periods in one repeated measures ANOVA.  A sensible
interpretation in that case might be to describe the "shock effect"
and separately describe what happened later.

Example.
The start of Psychotropic medications has a huge, immediate,
"normalizing"  effect on some aspects of sleep of depressed patients
(sleep latency, REM latency, REM time, etc.).  Various changes 
*after*  the initial jolt can be described as no-change;  continued
improvement;  or  return toward the initial baseline.  

In real life, linear trends worked fine for describing the on-meds
followup observation nights (with - not accidentally - increasing
intervals between them).
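
For the bookkeeping behind "the linear trend accounts for 44%," here is
a minimal sketch, using numpy, that splits the between-time variation
of a set of cell means into orthogonal polynomial trends (the means
below are invented):

  # Decompose between-time variation of cell means into polynomial trends.
  import numpy as np

  means = np.array([12.0, 10.0, 9.0, 8.5, 8.3])     # invented means at 5 time points
  t = np.arange(len(means), dtype=float)
  X = np.vander(t - t.mean(), N=len(means), increasing=True)  # 1, t, t^2, ...
  Q, _ = np.linalg.qr(X)                            # orthonormal polynomial contrasts
  proj = Q.T @ (means - means.mean())               # contrast values; constant term is ~0
  ss = proj ** 2
  print(ss[1:] / ss[1:].sum())   # shares of between-time SS: linear, quadratic, cubic, quartic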
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Census Bureau nixes sampling on 2000 count

2001-03-04 Thread Rich Ulrich

On Fri, 02 Mar 2001 12:16:42 GMT, [EMAIL PROTECTED] (J. Williams)
wrote:

 The Census Bureau urged Commerce Secretary Don Evans on Thursday not
 to use adjusted results from the 2000 population count.  Evans must
 now weigh the recommendation from the Census Bureau, and will make the
 decision next week.  If the data were adjusted statistically it  could
 be used to redistribute and remap political district lines. William
 Barron, the Bureau Director, said in a letter to Evans that he agreed
 with a Census Bureau committee recommendation "that unadjusted census
 data be released as the Census Bureau's official redistricting data."
 Some say about 3 million or so people make up a disenfranchising
 undercount.  Others disagree viewing sampling as a method to "invent"
 people who have not actually been counted.  Politically, the stakes
 are high on Evans' final decision.

People may wonder, 
"Why did the Census Bureau say this, and why is there little criticism
of them?"

According to the reports of a few weeks ago, the inner-city counts,
etc.,  of this census were quite a bit more accurate than they were 10
years ago.  That means that we couldn't be so sure that adjustment
would make a big improvement, or any improvement.

This frees Republicans of some blame, for this one instance, of
pushing specious technical arguments for short-term GOP gain.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-03-02 Thread Rich Ulrich
[ ... ]  should have pointed Gene and
Dennis politely to the details, instead of blundering around and 
making it appear that "this one is huge"  is your whole basis.
My commentary is devoted to your presentation, here.

[ snip, "importance of issue" and more redundancy.]

Hope that helps.
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Post-hoc comparisons

2001-03-02 Thread Rich Ulrich

On 2 Mar 2001 07:27:16 -0800, [EMAIL PROTECTED] (Esa M. Rantanen)
wrote:

[ snip, detail ]
 contingency table.  I have used a Chi-Sq. analysis to determine if there is
 a statistically significant difference between the (treatment) groups (all
 4!), and indeed there is.  I assume, however, that I cannot simply do
 pairwise comparisons between the groups using Chi-Sq. and 2 x 2 matrices
 without inflating the probability of Type 1 error, (1-alpha)^4 in this
 case.  As far as I know, there are no equivalents to Duncan's or Tukey's
 tests for the type of data (binary) I have to deal with.

Well, if you want to do the ANOVA on the dichotomous variable, 
I won't complain.  My reaction is, you are assuming that, somewhere,
great precision matters.  But being precise in your thinking will gain
you the most:  decide on and report just ONE important test, the one you 
figured out beforehand,  instead of trying to cope with 6 tests that
happen to fall into your lap.

I would probably 
  (a) Let the Overall test justify all my followup testing, where the
followup testing is descriptive, among categories of equal N and
equivalent importance; or  
  (b) Do a few specified tests with Bonferroni correction, and report
those tests (a rough sketch of that option follows below).
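
Here is that rough sketch, with invented counts, using scipy's ordinary
chi-squared machinery and a Bonferroni-adjusted alpha:

  # Overall test on the full table, then pre-specified 2x2 comparisons
  # at a Bonferroni-adjusted alpha.  In practice you would pick a few
  # comparisons beforehand rather than running all six.
  from itertools import combinations
  from scipy.stats import chi2_contingency

  counts = {"T1": (18, 12), "T2": (25, 5), "T3": (16, 14), "T4": (22, 8)}  # (success, failure)
  chi2, p_overall, dof, _ = chi2_contingency([list(v) for v in counts.values()])
  print("overall:", chi2, p_overall)

  pairs = list(combinations(counts, 2))
  alpha_adj = 0.05 / len(pairs)                     # Bonferroni over 6 pairwise tests
  for a, b in pairs:
      _, p, _, _ = chi2_contingency([counts[a], counts[b]])
      print(a, b, round(p, 4), "sig" if p < alpha_adj else "ns")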

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Cronbach's alpha and sample size

2001-02-28 Thread Rich Ulrich

On Wed, 28 Feb 2001 12:08:55 +0100, Nicolas Sander
[EMAIL PROTECTED] wrote:

 How is Cronbach's alpha affected by the sample size apart from questions
 related to generalizability issues?

 - apart from generalizability, "not at all."
 
 I find it hard to trace down the mathematics related to this question
 clearly, and whether there might be a trade-off between N of items and N
 of subjects (i.e. compensating for lack of subjects by a high number of
 items).

I don't know what you mean by 'trade-off.'   I have trouble trying to
imagine just what it is, that you are trying to trace down.
But, NO.  

Once you assume some variances are equal, Alpha can be seen 
as a fairly simple function of the number of items and the average
correlation -- more items, higher alpha.   The average correlation has
a tiny bias that depends on N, but that's typically (and safely) ignored.
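
Concretely, with equal item variances the standardized alpha is
k*rbar / (1 + (k-1)*rbar), where k is the number of items and rbar the
average inter-item correlation; the N of subjects does not enter.  A
minimal sketch (rbar here is invented):

  # Standardized Cronbach's alpha from the number of items and the mean inter-item r.
  def standardized_alpha(k, rbar):
      return k * rbar / (1 + (k - 1) * rbar)

  for k in (5, 10, 20):
      print(k, round(standardized_alpha(k, rbar=0.3), 3))
  # 5 items: 0.682;  10 items: 0.811;  20 items: 0.896 -- more items, higher alpha.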

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk

2001-02-26 Thread Rich Ulrich

 - I want to comment a little more thoroughly about the lines I cited:
what Garson said about inference, and his citation of Oakes.


On Thu, 22 Feb 2001 18:21:41 -0500, Rich Ulrich [EMAIL PROTECTED]
wrote:

[ snip, previous discussion ]

me 
 I think that Garson is wrong, and the last 40 years of epidemiological
 research have proven the worth of statistics provided on non-random,
 "observational"  samples.  When handled with care.
 
 From G. David Garson, "PA 765 Notes: An Online Textbook."
 
 On Sampling
 http://www2.chass.ncsu.edu/garson/pa765/sampling.htm
 
 Significance testing is only appropriate for random samples.
 
 Random sampling is assumed for inferential statistics
 (significance testing). "Inferential" refers to the fact
 that conclusions are drawn about relationships in the data
 based on inference from knowledge of the sampling
 distribution. Significance tests are based on a sampling
 theory which requires that every case have a chance of being
 selected known in advance of sample selection, usually an
 equal chance. Statistical inference assesses the
 significance of estimates made using random samples. For
 enumerations and censuses, such inference is not needed
 since estimates are exact. Sampling error is irrelevant and
 therefore inferential statistics dealing with sampling error
 are irrelevant. 

 - I agree with most of what he says, throughout; there will be a
matter of nuances on interpretation and actions.

For enumerations and censuses -- a limited sort of statistics, on 'finite
populations' -- he says sampling error is irrelevant.  Irrelevant is a
good and fitting word here.  This is not 'illegal  and banned,'  but
rather 'unwanted and totally beside the point.'

Garson 
  Significance tests are sometimes applied
 arbitrarily to non-random samples but there is no existing
 method of assessing the validity of such estimates, though
 analysis of non-response may shed some light. The following
 is typical of a disclaimer footnote in research based on a
 non random sample: 

Here is my perspective on testing, which does not match his.
 - For a randomized experimental design,  a small p-level on 
a "test of hypothesis" establishes that *something*  seemed 
to happen, owing to the treatment; the test might stand 
pretty-much by itself.
 - For a non-random sample, a similar test establishes that
*something*  seems to exist, owing to the factor in question 
*or*  to any of a dozen factors that someone might imagine.  
The test establishes, perhaps, the  _prima facie_  case  but the
investigator has the responsibility of trying to dispute it.  

That is, it is an investigator's responsibility (and not just an
option) to consider potential confounders and covariates.  
If the small p-level stands up robustly, that is good for the 
theory -- but not definitive.  If there are vital aspects or factors
that cannot be tested, then opponents can stay unsatisfied, 
no matter WHAT the available tests may say.


Garson  
 "Because some authors (ex., Oakes, 1986) note the use of
 inferential statistics is warranted for nonprobability
 samples if the sample seems to represent the population, and
 in deference to the widespread social science practice of
 reporting significance levels for nonprobability samples as
 a convenient if arbitrary assessment criterion, significance
 levels have been reported in the tables included in this
 article." See Michael Oakes (1986). Statistical inference: A
 commentary for social and behavioral sciences. NY: Wiley. 
 

Garson is telling his readers and would-be statisticians  a way to
present p-levels,  even when the sampling doesn't justify it.
And, I would say, when the analysis doesn't justify it.
I am not happy with the lines -- The disclaimer does not assume 
that a *good*  analysis has been done, nor does it point to what 
makes up a good analysis.  

 '... if the sample seems to represent the population'  
seems to be a weak reminder of the proper effort to overcome 
'confounding factors';  it is not an assurance that the effects 
have proven to be robust.  

So, the disclaimer should recognize that the non random sample 
is potentially open to various interpretations; the present analysis
has attempted to control for several possibilities;  certain effects
do seem robust statistically, in addition to being supported by 
outside chains of inference, and data collected independently.

I suggested earlier that this is the status of epidemiological,
observational studies.  For the most part, those studies have 
been quite fruitful.  But not always.  They have been especially
likely to mislead, I think, when the designs pretend that binomial
variability is the only source of error in a large survey, and attempt
to interpret small effects.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html



Re: Sample size question

2001-02-23 Thread Rich Ulrich

On 23 Feb 2001 12:08:45 -0800, [EMAIL PROTECTED] (Scheltema,
Karen) wrote:

 I tried the site but received errors trying to download it.  It couldn't
 find the FTP site.  Has anyone else been able to access it?

As of a few minutes ago, it downloaded fine for me, when I clicked on
it with  Internet Explorer.  The  .zip  file expanded okay.  I used
right-click (I just learned that last week) in order to download the
 .pdf  version of the help.

[ ... ]

 Earlier Q and Answer 
"Can anyone point me to software for estimating ANCOVA or regression
sample sizes based on effect size?"
  Look here:
  http://www.interchg.ubc.ca/steiger/r2.htm


Hmm.  Placing limits on R^2.  I haven't read the 
accompanying documentation.  

On the general principle that you can't compute power
if you don't know what power you are looking for, I suggest reading
the relevant chapters in Jacob Cohen's book (1988+ edition).
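
In the spirit of Cohen's chapters, here is a minimal sketch of the
sample-size arithmetic for an F test of u predictors (regression or
ANCOVA), using scipy's noncentral F; the effect size f^2 and the target
power are invented choices:

  # Smallest N giving the target power for an F test of u predictors,
  # using Cohen's f^2 and the noncentral F distribution.
  from scipy.stats import f as f_dist, ncf

  def power(n, u, f2, alpha=0.05):
      v = n - u - 1                          # denominator df
      lam = f2 * (u + v + 1)                 # noncentrality, as in Cohen (1988)
      crit = f_dist.ppf(1 - alpha, u, v)
      return ncf.sf(crit, u, v, lam)

  u, f2, target = 3, 0.15, 0.80              # 3 predictors, 'medium' f^2, 80% power
  n = u + 3
  while power(n, u, f2) < target:
      n += 1
  print(n, round(power(n, u, f2), 3))        # lands in the low-to-mid seventies here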

-- 
Rich Ulrich, [EMAIL PROTECTED]


http://www.pitt.edu/~wpilib/index.html


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=