Re: Mean of Standard deviations

2001-05-17 Thread Paul Swank

That's the pooled sd. If all the data were in one column, you would
have to account for the mean differences between sets, or you will
underestimate the variance. To find the standard deviation for the combined
groups use

{[sum(n(i) * (M(i)**2 + var(i))) / sum(n(i))] - grand mean**2} and then take
the square root.

formula from McNemar, 1969
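The identity can be checked numerically. A minimal sketch (Python/NumPy assumed; the groups are made up for illustration, and population divisor-n variances are used, for which the identity is exact):

```python
import numpy as np

# Three groups with different means: pooling their within-group
# variances alone would understate the spread of the combined column.
groups = [np.array([4., 5, 6, 7]), np.array([10., 11, 12]), np.array([1., 2, 3])]

n = np.array([len(g) for g in groups], dtype=float)
m = np.array([g.mean() for g in groups])      # group means
v = np.array([g.var(ddof=0) for g in groups]) # population (divisor-n) variances

grand_mean = (n * m).sum() / n.sum()
# McNemar's identity: combined variance from sizes, means, and variances
combined_var = (n * (m**2 + v)).sum() / n.sum() - grand_mean**2
combined_sd = np.sqrt(combined_var)

# Direct computation on the concatenated column agrees exactly
direct_sd = np.std(np.concatenate(groups), ddof=0)
print(combined_sd, direct_sd)
```

Note that the combined SD exceeds the merely pooled (within-group) SD whenever the group means differ, which is exactly the underestimation Paul describes.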


At 11:33 PM 5/17/01 -0400, dennis roberts wrote:


sounds like you want the overall sd ... as though you had ALL the data in 
ONE column and were calculating the sd on THAT one column

the formula for TWO groups would be:

variance (weighted or pooled)=

[(n1-1)* var1] + [(n2-1)*var2] all divided by ... n1 + n2 -2

then take the square root to get the overall sd

if you have more than two groups ... just follow the same pattern
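The two-group pattern can be sketched numerically (Python/NumPy assumed; the data are invented for illustration). The pooled variance is equivalent to the residual variance after removing each group's own mean:

```python
import numpy as np

g1 = np.array([23., 25, 28, 30, 27])
g2 = np.array([31., 29, 35, 33, 34, 36])
n1, n2 = len(g1), len(g2)
var1, var2 = g1.var(ddof=1), g2.var(ddof=1)   # sample variances

# weighted (pooled) variance: [(n1-1)*var1 + (n2-1)*var2] / (n1 + n2 - 2)
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
pooled_sd = np.sqrt(pooled_var)

# equivalent route: deviations from each group's own mean, then one divisor
resid = np.concatenate([g1 - g1.mean(), g2 - g2.mean()])
resid_var = (resid**2).sum() / (n1 + n2 - 2)
print(pooled_sd, np.sqrt(resid_var))
```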



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=


Paul R. Swank, PhD.
Professor
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine





post-doc position

2001-05-07 Thread Paul Swank
My apologies if you receive this more than once. Please pass on to any soon-to-be or recently graduated students who may be interested. Many thanks.

The Division of Developmental Pediatrics is accepting applications for a post-doctoral fellowship in quantitative methods with emphasis or interest in psychometric issues. The successful candidate will work with a diverse team of psychologists, physicians, and other health care professionals on a wide variety of behavioral science projects related mainly to development in children. One such project is the creation of a Center to study psychometric issues in the assessment of children, particularly for longitudinal research.

Methodological areas of focus are psychometric models of growth (particularly Rasch models), longitudinal data analysis, mixed models, and structural equation modeling. Requirements include a doctoral degree in quantitative methods, psychometrics, or statistics, and experience with SAS. The successful candidate will collaborate with investigators on existing projects and may develop an independent research program in a related area.

Salary for the position will be $39,500 with excellent fringe benefits. Applications are being accepted immediately, with the position to begin no later than September 1.
Send vita and three letters of reference to:

Dr. Susan Landry
Department of Pediatrics 
University of Texas Houston Health Science Center
7000 Fannin, Suite 2401
Houston, Texas 77030 
Or e-mail to:
[EMAIL PROTECTED]




Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: ANCOVA vs. sequential regression

2001-04-23 Thread Paul Swank
It's also called a test of homogeneity of regression slopes, but it is really just an interaction. There is also a test of parallelism in profile analysis, which tends to confuse the issue. I sometimes wonder if it is worth it to try to give all these tests names. An interaction is always a test of parallel lines, whether it is factorial ANOVA, ANCOVA, regression, or profile analysis.
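The point that an interaction tests parallelism can be shown with a small sketch (Python/NumPy assumed; the slopes and data are invented, and noiseless so the recovery is exact):

```python
import numpy as np

# Two groups whose regression lines have different slopes (2.0 vs 3.5),
# so the true interaction (slope difference) is 1.5.
x = np.tile(np.arange(10.0), 2)
g = np.repeat([0.0, 1.0], 10)                # group dummy
y = 1.0 + 2.0 * x + 0.5 * g + 1.5 * g * x    # noiseless for exactness

# Design matrix: intercept, x, group, x-by-group interaction
X = np.column_stack([np.ones_like(x), x, g, g * x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# The interaction coefficient IS the slope difference between groups;
# a nonzero value means the two lines are not parallel.
print(b)
```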

At 01:40 AM 4/22/01 GMT, you wrote:
>Paul Swank <[EMAIL PROTECTED]> wrote:
>: I agree. ...
>
>It's usually called a test of parallelism.  The Ancova test is a test of
>separation only if the lines are parallel
>
>
>

Paul R. Swank, PhD.
Professor  Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: ANCOVA vs. sequential regression

2001-04-20 Thread Paul Swank
I agree. ANCOVA is often said to be used when the groups are not experimental. It is a linear model of the same nature as ANCOVA, but technically ANCOVA refers to the experimental situation. The general linear model would allow comparing the groups while controlling for X, but as William says, you should check for the interaction first to make sure the simpler model is really appropriate.
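The equivalence under discussion (the group effect in the covariate-adjusted model versus the increment from entering group after the covariate) can be verified directly. A sketch with simulated data (Python/NumPy assumed; coefficients and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
g = np.repeat([0.0, 1.0], n // 2)        # two preexisting groups
x = rng.normal(0, 1, n) + 0.8 * g        # groups differ on the covariate
y = 2.0 + 1.2 * x + 0.7 * g + rng.normal(0, 1, n)

def sse(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return ((y - X @ b) ** 2).sum()

ones = np.ones(n)
sse_x = sse(np.column_stack([ones, x]), y)    # step 1: covariate only
X_full = np.column_stack([ones, x, g])
sse_full = sse(X_full, y)                     # step 2: covariate + group

# F for the R-squared increment when group enters after the covariate
F_change = (sse_x - sse_full) / (sse_full / (n - 3))

# t for the group coefficient in the full model; t**2 equals F_change,
# which is also the adjusted group effect tested in ANCOVA
b, *_ = np.linalg.lstsq(X_full, y, rcond=None)
mse = sse_full / (n - 3)
cov_b = mse * np.linalg.inv(X_full.T @ X_full)
t_g = b[2] / np.sqrt(cov_b[2, 2])
print(F_change, t_g**2)
```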


At 04:16 PM 4/20/01 -0400, William B. Ware wrote:
>On Fri, 20 Apr 2001, William Levine wrote:
>
>> A study was conducted to assess whether there were age differences in memory
>> for order independent of memory for items. Two preexisting groups (younger
>> and older adults - let's call this variable A) were tested for memory for
>> order information (Y). These groups were also tested for item memory (X).
>> 
>> Two ways of analyzing these data came to mind. One was to perform an ANCOVA
>> treating X as a covariate. But the two groups differ with respect to X,
>> which would make interpretation of the ANCOVA difficult. Thus, an ANCOVA did
>> not seem like the correct analysis.
>
>Here's my take on it... The ANCOVA model can be implemented with
>sequential/hierarchical regression as you note below... however, ANCOVA
>has at least two assumptions that your situation does not meet.  First, it
>assumes that assignment to treatment condition is random.  Second, it
>assumes that the measurement on the covariate is independent of
>treatment.  That is, the covariate should be measured before the treatment
>is implemented.  Thus, I believe that you should implement the
>hierarchical regression... but I'm not certain what question you are
>really answering...
>
>I guess it is whether there is variabilty in memory for order that is
>related to age, that is independent of variability in memory for
>items... So, I would not call it an ANCOVA... You might also consider the
>possibiltiy of interaction... That is, is the relationship between memory
>for order and memory for items the same for younger and older
>participants...
>
>WBW
>
>__
>William B. Ware, Professor and Chair	   Educational Psychology,
>CB# 3500		   Measurement, and Evaluation
>University of North Carolina	  	 PHONE  (919)-962-7848
>Chapel Hill, NC  27599-3500		 FAX:   (919)-962-1533
>http://www.unc.edu/~wbware/  EMAIL: [EMAIL PROTECTED]
>__
>
>> 
>> A second analysis option (suggested by a friend) is to perform a sequential
>> regression, entering X first and A second to
>> test if there is significant leftover variance explained by A.
>> 
>> This second option sounds to me like the same thing as the first option. In
>> an ANCOVA, variance in Y that is predictable by X is removed from the total
>> variance, and then variance due to A (adjusted) is tested against variance
>> due to S/A (adjusted). In
>> the sequential regression, variance in the Y that is predictable by X is
>> removed from the total variance, and then the leftover variance in Y is
>> regressed on A. Aren't these two analyses identical? If not, what is it that
>> differs? Finally, does anyone have any suggestions?
>> 
>> Many thanks!
>> --
>> William Levine
>> Department of Psychology
>> University of North Carolina-Chapel Hill
>> Chapel Hill, NC 27599-3270
>> [EMAIL PROTECTED]
>> http://www.unc.edu/~whlevine
>> 
>> 
>> 
>> 
>
>
>
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: Student's t vs. z tests

2001-04-19 Thread Paul Swank
However, rather than do that, why not go right on to F? Why do t at all when you can do anything with F that t can do plus a whole lot more?

At 10:58 PM 4/19/01 -0400, you wrote:
>students have enough problems with all the stuff in stat as it is ... but, 
>when we start some discussion about sampling error of means ... for use in 
>building a confidence interval and/or testing some hypothesis ... the first 
>thing observant students will ask when you say to them ...
>
>assume SRS of n=50 and THAT WE KNOW THAT THE POPULATION SD = 4 ... is: if 
>we are trying to do some inferencing about the population mean ... how come 
>we know the population sd but NOT the mean too? most find this notion 
>highly illogical ... but we and books trudge on ...
>
>and they are correct of course in the NON logic of this scenario
>
>thus, it makes a ton more sense to me to introduce at this point a t 
>distribution ... this is NOT hard to do ... then get right on with the 
>reality case 
>
>asking something about the population mean when everything we have is an 
>estimate ... makes sense ... and is the way to go
>
>in the moore and mccabe book ... the way they go is to use z first ... 
>assume population is normal and we know sd ... spend alot of time on that 
>... CI and logic of hypothesis testing ... THEN get into applications of t 
>in the next chapter ...
>
>i think that the benefit of using z first ... then switching to reality ... 
>is a misguided order
>
>finally, if one picks up a SRS random journal and looks at some SRS random 
>article, the chance of finding a z interval or z test being done is close 
>to 0 ... rather, in these situations, t intervals or t tests are almost 
>always reported ...
>
>if that is the case ... why do we waste our time on z?
>
>
>
>At 08:52 PM 4/18/01 -0300, Robert J. MacG. Dawson wrote:
>>David J Firth wrote:
>> >
>> > : You're running into a historical artifact: in pre-computer days, 
>> using the
>> > : normal distribution rather than the t distribution reduced the size 
>> of the
>> > : tables you had to work with.  Nowadays, a computer can compute a t
>> > : probability just as easily as a z probability, so unless you're in the
>> > : rare situation Karl mentioned, there's no reason not to use a t test.
>> >
>> > Yet the old ways are still actively taught, even when classroom
>> > instruction assumes the use of computers.
>>
>> The z test and interval do have some value as a pedagogical
>>scaffold with the better students who are intended to actually
>>_understand_ the t test at a mathematical level by the end of the
>>course.
>>
>> For the rest, we - like construction crews - have to be careful
>>about leaving scaffolding unattended where youngsters might play on it
>>in a dangerous fashion.
>>
>> One can also justify teaching advanced students about the Z test so
>>that they can read papers that are 50 years out of date. The fact that
>>some of those papers may have been written last year - or next-  is,
>>however, unfortunate; and we should make it plain to *our* students that
>>this is a "deprecated feature included for reverse compatibility only".
>>
>> -Robert Dawson
>>
>>
>
>_
>dennis roberts, educational psychology, penn state university
>208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
>http://roberts.ed.psu.edu/users/droberts/drober~1.htm
>
>
>
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: Student's t vs. z tests

2001-04-19 Thread Paul Swank
I agree. I still teach the t test also because of this, but at the same time I realize that what goes around, comes around, so what we are doing is ensuring that we will continue to see t tests in the literature. However, I find linear models easier to teach (once I erase the old stuff from their memories) than the basic inference course. It is so much more logical.

At 12:41 AM 4/20/01 -0400, you wrote:
>At 10:39 AM 4/19/01 -0500, Paul Swank wrote:
>>However, rather than do that, why not go right on to F? Why do t at all when 
>>you can do anything with F that t can do plus a whole lot more?
>
>
>don't necessarily disagree with this but, i don't ever see in the 
>literature in two group situations comparing means ... F tests done ...
>
>so, part of this has to do with educating students about what they will see 
>in the journals, etc.
>
>
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: Student's t vs. z tests

2001-04-19 Thread Paul Swank
They are more than just related. One is a natural extension of the other just as chi-square is a natural extension of Z. With linear models, one can begin with a simple one sample model and build up to multiple factors and covariates using the same basic framework, which I find easier to make sense of logically and easier to teach.  
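The "natural extension" is easy to verify numerically: the square of a pooled two-sample t statistic is identical to the one-way ANOVA F on the same two groups. A sketch (Python/NumPy assumed; the data are invented):

```python
import numpy as np

a = np.array([5.1, 4.8, 6.0, 5.5, 5.9, 4.7])
b = np.array([6.2, 6.8, 5.9, 7.1, 6.5])
na, nb = len(a), len(b)

# pooled two-sample t statistic
sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
t = (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

# one-way ANOVA F on the same data
grand = np.concatenate([a, b]).mean()
ss_between = na * (a.mean() - grand) ** 2 + nb * (b.mean() - grand) ** 2
ss_within = ((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum()
F = (ss_between / 1) / (ss_within / (na + nb - 2))

print(t**2, F)   # identical: t(df) squared is F(1, df)
```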

At 01:58 AM 4/19/01 -0300, you wrote:
>
>
>Paul Swank wrote:
>> 
>> However, rather than do that, why not go right on to F? Why do t at all when you can do anything with F that t can do plus a whole lot more?
>
>	Because the mean, normalized using the hypothesized mean and the
>observed standard deviation, has a t distribution and not an F
>distribution. I am aware that the two are algebraically related,(and
>simply) but trying to get through statistics with only one table (or
>only one menu item on your stats software) seems pointless - like trying
>to do all your logic with NAND operations just because you can.
>
>	-Robert Dawson
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: Student's t vs. z tests

2001-04-19 Thread Paul Swank
I agree. I normally start inference by using the binomial and then the normal approximation to the binomial for large n. It might be best to begin all graduate students with nonparametric statistics followed by linear models. Then we could get them to where they can do something interesting without taking four courses.
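Exact binomial inference needs nothing beyond the binomial pmf, which is part of its pedagogical appeal. A sketch in pure Python (the 8-of-10 example is mine, not from the thread):

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Binomial(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exact two-sided p-value for 8 successes in 10 trials under H0: p = 0.5.
# The null distribution is symmetric, so double the one-tail probability.
n, k, p0 = 10, 8, 0.5
p_upper = sum(binom_pmf(j, n, p0) for j in range(k, n + 1))
p_two_sided = 2 * p_upper
print(p_upper, p_two_sided)   # 0.0546875  0.109375
```

No nuisance parameters, no tables: alpha, beta, and power at any alternative p come from the same pmf.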


At 01:28 PM 4/19/01 -0500, you wrote:
>Why not introduce hypothesis testing in a binomial setting where there are
>no nuisance parameters and p-values, power, alpha, beta,... may be obtained
>easily and exactly from the Binomial distribution?
>
>Jon Cryer
>
>At 01:48 AM 4/20/01 -0400, you wrote:
>>At 11:47 AM 4/19/01 -0500, Christopher J. Mecklin wrote:
>>>As a reply to Dennis' comments:
>>>
>>>If we deleted the z-test and went right to t-test, I believe that 
>>>students' understanding of p-value would be even worse...
>>
>>
>>i don't follow the logic here ... are you saying that instead of their 
>>understanding being "bad"  it will be worse? if so, not sure that this 
>>is a decrement other than trivial
>>
>>what makes using a normal model ... and say zs of +/- 1.96 ... any "more 
>>meaningful" to understand p values ... ? is it that they only learn ONE 
>>critical value? and that is simpler to keep neatly arranged in their mind?
>>
>>as i see it, until we talk to students about the normal distribution ... 
>>being some probability distribution where, you can find subpart areas at 
>>various baseline values and out (or inbetween) ... there is nothing 
>>inherently sensible about a normal distribution either ... and certainly i 
>>don't see anything that makes this discussion based on a normal 
>>distribution more inherently understandable than using a probability 
>>distribution based on t ... you still have to look for subpart areas ... 
>>beyond some baseline values ... or between baseline values ...
>>
>>since t distributions and unit normal distributions look very similar ... 
>>except when df is really small (and even there, they LOOK the same it is 
>>just that ts are somewhat wider) ... seems like whatever applies to one ... 
>>for good or for bad ... applies about the same for the other ...
>>
>>i would be appreciative of ANY good logical argument or empirical data that 
>>suggests that if we use unit normal distributions  and z values ... z 
>>intervals and z tests ... to INTRODUCE the notions of confidence intervals 
>>and/or simple hypothesis testing ... that students somehow UNDERSTAND these 
>>notions better ...
>>
>>i contend that we have no evidence of this ... it is just something that we 
>>think ... and thus we do it that way
>>
>>
>>
>>
>>
> ___
>--- |   \
>Jon Cryer, Professor [EMAIL PROTECTED]   ( )
>Dept. of Statistics  www.stat.uiowa.edu/~jcryer \\_University
> and Actuarial Science   office 319-335-0819 \ *   \of Iowa
>The University of Iowa   dept.  319-335-0706  \/Hawkeyes
>Iowa City, IA 52242  FAX319-335-3017   |__ )
>---   V
>
>"It ain't so much the things we don't know that get us into trouble. 
>It's the things we do know that just ain't so." --Artemus Ward 
>
>
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
soon to be moving to the Department of Pediatrics 
UT Houston School of Medicine


Re: calculating reliability

2001-03-22 Thread Paul Swank
As long as you consider relative error and not absolute error, that is, as long as the consistent time effect is not considered error, then the correlation between test 1 and test 2 is the same as the generalizability coefficient obtained from the ANOVA mean squares.
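Paul's equivalence can be checked from the persons-by-occasions ANOVA mean squares. A sketch (Python/NumPy assumed; the scores are invented, and each occasion is standardized so the equal-variance condition under which the relative G coefficient exactly matches the Pearson correlation holds):

```python
import numpy as np

t1 = np.array([10., 12, 8, 15, 11, 9, 14, 13])
t2 = np.array([11., 14, 9, 16, 10, 10, 15, 14])

# Standardize each occasion so the two occasions have equal variance
z1 = (t1 - t1.mean()) / t1.std(ddof=1)
z2 = (t2 - t2.mean()) / t2.std(ddof=1)
r = np.corrcoef(z1, z2)[0, 1]            # test-retest Pearson correlation

# Persons-by-occasions mean squares (k = 2 occasions). Removing the
# occasion means treats the consistent time effect as a facet, not error,
# so only relative (rank-order) disagreement counts against reliability.
ms_persons = np.var(z1 + z2, ddof=1) / 2     # equals 1 + r here
ms_resid = np.var(z1 - z2, ddof=1) / 2       # equals 1 - r here

# Relative G coefficient for a single occasion (same form as ICC(3,1))
g_coef = (ms_persons - ms_resid) / (ms_persons + ms_resid)
print(r, g_coef)
```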


At 08:06 AM 3/22/01 +0300, Awahab El-Naggar wrote:
>Dear Colleagues
>I have been using "test-retest" method for calculating reliability by
>applying the Pearson Product Moment (PPM) analysis. However, I have been
>told that this not the right way to calculate reliability, and I should use
>the ANOVA to calculate the reliability. Would you comment and advise me.
>Many Thanks.
>A'Wahab
>
>
>
>
>

Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033


statistical errors

2001-03-22 Thread Paul Swank
I couldn't help wanting to add my own 2 cents to the discussion about statistical errors, because I have always thought that people put too much faith in formal tests of assumptions. Such tests are most sensitive to violations precisely when violations matter least, that is, when the sample size is large. When the ramifications of violating assumptions are greatest, in small samples, the tests have little power to detect violations. There is no substitute for examining your data: if the data are badly skewed, you don't need a normality test to tell you that; a simple histogram will do it.
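The power point can be demonstrated with a moment-based normality statistic coded by hand (Python/NumPy assumed; the Jarque-Bera-type statistic, seed, and sample sizes are my choices for illustration):

```python
import numpy as np

def jarque_bera(x):
    # Moment-based normality statistic: n/6 * (skew^2 + (kurt - 3)^2 / 4);
    # under normality it is roughly chi-square with 2 df (5% cutoff ~5.99).
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=0)
    skew = (z**3).mean()
    kurt = (z**4).mean()
    return n / 6 * (skew**2 + (kurt - 3)**2 / 4)

rng = np.random.default_rng(42)
pop = rng.exponential(1.0, 2000)     # a badly skewed population

jb_small = jarque_bera(pop[:10])     # n = 10: the test has little power
jb_large = jarque_bera(pop)          # n = 2000: the skew is flagged decisively
print(jb_small, jb_large)
```

The same skewness that a histogram shows at a glance barely moves the statistic at n = 10, while at n = 2000 the test rejects overwhelmingly, where mild skew is least worrisome.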



Paul R. Swank, PhD.
Professor & Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033
