edstat-l@jse.stat.ncsu.edu

2001-03-16 Thread EugeneGall

>
>Irving Scheffe wrote:
>
>> Original MIT Report on the Status of Women Faculty:
>>  http://web.mit.edu/fnl/
>
>
>It is frustrating to keep getting errors when I try to access a
>printable version of the report, whether by using IE or Netscape. Is
>there a known workaround?
>
Try:
http://web.mit.edu/fnl/women/women.html#The Study




=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



RE: On inappropriate hypothesis testing. Was: MIT Sexism & sta

2001-03-17 Thread EugeneGall

>Subject: RE: On inappropriate hypothesis testing.  Was: MIT Sexism &   sta
>From: [EMAIL PROTECTED]  (Simon, Steve, PhD)
>Date: 3/16/2001 7:23 PM Eastern Standard Time
>

>Also, has anyone looked at a log transformation of the data? The "huge"
>difference doesn't look so huge on a log scale.

I posted this link to log-transformed plots more than a month ago:

http://www.es.umb.edu/edg/ECOS611/iwflnfigs.pdf

I also posted the data from the IWF report at:
http://www.es.umb.edu/edg/ECOS611/mit-iwf.zip

These data were copied from the IWF report
and are provided in Excel and SPSS format.
The data in support of gender differences in
grant support weren't provided in the IWF report.

Gene Gallagher 





RE: Help on treating non-detects

2001-03-22 Thread EugeneGall

From: [EMAIL PROTECTED]  (Simon, Steve, PhD)
>For a lognormal distribution, the left tail can often be approximated by a
>triangular distribution. The median of a triangular distribution is the
>upper limit divided by the square root of 2.
>
>There are more sophisticated approaches, of course, to handle non-detects,
>but this is simple. It seems to work better, at least in some cases, than
>dividing the detection limit by 2.
>
>There are some references for this approach, but I cannot tell you what they
>are.
>
>Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
>STATS: STeve's Attempt to Teach Statistics. http://www.cmh.edu/stats

Thanks.  Reg Jordan sent me an email with a reference and explanation:

"(Ref: Hornung, Richard W., and Reed, Laurence D., Appl. Occup. Environ.
Hyg., 5(1), Jan 1990, pp. 46-51)

Simply stated, when the data are not highly skewed (GSD < 3.0), L/sqrt(2) is
a better approximation of the value of a non-detect than L/2 (L is the
LOD).  The sqrt(2) comes from the assumption that, in the censored region,
the lognormal distribution is well approximated by a right triangle.  This
paper explains the entire derivation."
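
A minimal sketch in plain Matlab (the detection-limit value is made up)
showing where the sqrt(2) comes from: for a right-triangular density
f(x) = 2x/L^2 on [0, L], the CDF is F(x) = x^2/L^2, so the median solves
x^2/L^2 = 1/2, i.e. x = L/sqrt(2).  Inverse-CDF sampling confirms it:

L = 0.2;            % hypothetical detection limit
u = rand(1e6,1);    % uniform draws on (0,1)
x = L*sqrt(u);      % inverse CDF of the right triangle: L*sqrt(u)
median(x)           % simulated median, close to 0.1414
L/sqrt(2)           % exact value: 0.14142...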







Help on treating non-detects

2001-03-22 Thread EugeneGall

This may be a simple question on how to handle non-detects in analysis:

The CDC just released an important survey of chemicals in humans:
http://www.cdc.gov/nceh/dls/report/default.htm
One part of the data analysis piqued my interest: how non-detects were handled.
The following page:
http://www.cdc.gov/nceh/dls/report/total%20report/DataSources.htm
states, under NHANES data analysis, how concentrations below the limit of
detection were handled:

"Analyte concentration levels less than the limit of detection
were assigned a value equal to the detection limit divided by the
square root of 2 for calculation of geometric mean values.
Geometric means are calculated by first taking the log of each
concentration, then calculating the mean of those log values, and
finally taking the antilog of that mean (the calculation can be done
using log base e or log base 10)."
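
For concreteness, here is a minimal Matlab sketch of the calculation
described above (the detection limit and concentrations are made up):

lod  = 0.2;                          % hypothetical detection limit
conc = [0.5 1.2 NaN 0.8 NaN 2.4];    % made-up data; NaN marks a non-detect
conc(isnan(conc)) = lod/sqrt(2);     % assign LOD/sqrt(2) to non-detects
gm = exp(mean(log(conc)))            % antilog of mean log = geometric mean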

There must be a simple reason for the sqrt(2), but I'm not seeing it.
Can someone help me out? 





Re: Inference by Bootstrapping

2001-05-05 Thread EugeneGall

>He also says "... we have to ensure that the residual errors are not
>correlated. If the errors exhibit some correlation, then a transformation
>of the residuals is in order."

This seems to be wrong.  Usually you analyze the residuals and, if there is
serial correlation, consider alternative models, including autocorrelation
models that transform both the response and the explanatory variables.

>S-PLUS has functions for this, but MATLAB does not.  Can anyone provide an
>m-file, snippet, reference or discussion?

The resampling toolbox for Matlab implements several different resampling
procedures.

http://www.resample.com/matlab/
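
Failing that, a residual bootstrap for a simple linear regression takes only
a few lines of plain Matlab.  This is a minimal sketch with made-up data,
not the toolbox's code:

n = 50;
x = (1:n)';
y = 2 + 0.5*x + randn(n,1);          % hypothetical data
X = [ones(n,1) x];
b = X\y;                             % least-squares fit
res = y - X*b;                       % residuals to resample
B = 2000;
bboot = zeros(B,2);
for k = 1:B
    idx = ceil(n*rand(n,1));         % sample residual indices with replacement
    ystar = X*b + res(idx);          % rebuild a bootstrap response
    bboot(k,:) = (X\ystar)';
end
bs = sort(bboot);                    % sort each coefficient's draws
ci = bs(ceil([0.025 0.975]*B),:)     % rows: lower and upper 95% limits

Note that this i.i.d. resampling of residuals is precisely what breaks down
when the errors are serially correlated, which is why the model should be
fixed first rather than the residuals transformed.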





Re: (none)

2001-05-10 Thread EugeneGall

>Subject: Re: (none)
>From: Rich Ulrich [EMAIL PROTECTED] 
>Date: 5/10/2001 5:15 PM Eastern 
CH:  " Why do articles appear in print when study methods, analyses,
>results, and conclusions are somewhat faulty?"
>
> - I suspect it might be a consequence of "Sturgeon's Law," 
>named after the science fiction author.  "Ninety percent of 
>everything is crap."  Why do they appear in print when they
>are GROSSLY faulty?  Yesterday's NY Times carried a 
>report on how the WORST schools have improved 
>more than the schools that were only BAD.  That was much-
>discussed, if not published.  - One critique was, the 
>absence of peer review.  There are comments from statisticians
>in the NY Times article; they criticize, but (I thought) they 
>don't "get it"  on the simplest point.
>
>The article, while expressing skepticism by numerous 
>people, never mentions "REGRESSION TOWARD the MEAN"
>which did seem (to me) to account for every single claim of the
>original authors whose writing caused the article.

>Rich Ulrich, [EMAIL PROTECTED]
>http://www.pitt.edu/~wpilib/index.html
>

The link to the NY Times story Rich cites is below.  The design of this study
certainly appears to be a candidate for the regression fallacy.  After vouchers
were introduced in Florida, the failing schools improved faster than the
almost-failing schools:  "On the eve of Congressional debate over President Bush's
plan to give students at low-performing schools federal money for private
school tuition vouchers, Dr. Greene announced that Mr. Bush's proposal would
work as well." ..."That's not a theory," Dr. Greene stated, "but proven
fact."..." [Dr. Greene] showed that after failing one time, higher-scoring F
schools posted greater gains than lower-scoring D schools. Because these
schools were otherwise alike, Dr. Greene stated that a threat of vouchers must
have made F schools improve more rapidly." 

Regression to the mean can be difficult to control, but in this case there was
an internal control.  In a reanalysis of the Florida school test data, Harris
found that the greater improvement of the worst schools between grading periods
was just as great during the pre-voucher period:  "Dr. Harris found that before
1999, higher-scoring schools in the failing group also gained more than
lower-scoring schools in the next group. The subsequent voucher policy
apparently had no added effect."
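
The regression fallacy is easy to demonstrate by simulation.  In this
minimal Matlab sketch (all numbers are made up), each school's observed
score is a stable underlying level plus independent year-to-year noise,
and the bottom group "improves" with no intervention at all:

nschool = 1000;
truth = 70 + 10*randn(nschool,1);    % stable school quality
year1 = truth + 8*randn(nschool,1);  % observed scores, year 1
year2 = truth + 8*randn(nschool,1);  % observed scores, year 2
[s,idx] = sort(year1);               % rank schools by year-1 score
worst = idx(1:100);                  % bottom 10%: the "F schools"
mean(year2(worst)-year1(worst))      % positive mean "gain" with no treatment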

http://www.nytimes.com/2001/05/09/national/09LESS.html?searchpv=site01

I wonder if "regression to the mean" will make it into the Congressional debate
of the education bill in the coming weeks.





Re: The False Placebo Effect

2001-05-27 Thread EugeneGall

Rich Ulrich wrote:
> - explanation:  whole experiment is conducted on patients
>who are at their *worst*  because the flare-up is what sent 
>them to a doctor. 

Gina Kolata mentions regression to the mean in her NYTimes Week in Review
article on the placebo effect today:

http://www.nytimes.com/2001/05/27/weekinreview/27KOLA.html?searchpv=nytToday





Re: About kendall

2001-06-13 Thread EugeneGall

>Subject: Re: About kendall
>From: Rich Ulrich [EMAIL PROTECTED] 
>Your program that does the Kendall tau must do some
>ranking, as part of the algorithm.  Why do you think you 
>might have to calculate ranks?
>Rich Ulrich, [EMAIL PROTECTED]
Actually, algorithms for calculating Kendall's tau don't require ranks.
For each pair of observations, you take the difference within variable A
and the difference within variable B, and count the pair as concordant if
the differences have the same sign and discordant if they have opposite
signs.  Here's my Matlab m-file for calculating Kendall's tau (I left
out the calculation of the probabilities):

function tau = kendall(A,B)
% KENDALL  Kendall's rank correlation (tau-b) between vectors A and B.
A = A(:);
B = B(:);
N = length(A);
S = 0;                          % concordant minus discordant pairs
tA = 0; tB = 0;                 % pairs tied within A, within B
for J = 1:N-1
    dA = A(J+1:N) - A(J);       % differences within A for pairs (J, J+1:N)
    dB = B(J+1:N) - B(J);       % matching differences within B
    S = S + sum(sign(dA.*dB));  % +1 concordant, -1 discordant, 0 if tied
    tA = tA + sum(dA == 0);
    tB = tB + sum(dB == 0);
end
N0 = N*(N-1)/2;                 % total number of pairs
N1 = N0 - tA;                   % pairs not tied in A
N2 = N0 - tB;                   % pairs not tied in B
tau = S/sqrt(N1*N2);            % tau-b; equals tau-a when there are no ties
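
A quick check with made-up values (counting by hand: 8 concordant pairs,
2 discordant, so tau = (8 - 2)/10 = 0.6):

A = [1 2 3 4 5];
B = [2 1 4 3 5];
kendall(A,B)       % returns 0.6000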





cigs & figs

2001-06-17 Thread EugeneGall

On Slate, there is quite a good discussion of the meaning and probabilistic
basis of the statement that 1 in 3 teen smokers will die of cancer.  It is
written by a math professor, and it is one of the most effective lay
discussions I've seen of the use of probabilities in describing health risks.

http://slate.msn.com/math/01-06-14/math.asp

