Re: adjusted r-square

2001-08-22 Thread Joe Ward

If the least-squares regression algorithm does not
REQUIRE THE NUMBER OF OBSERVATIONS TO EXCEED
THE NUMBER OF PREDICTORS, THEN THE REGRESSION
ALGORITHM COULD BE USED TO SOLVE A SYSTEM OF
SIMULTANEOUS EQUATIONS THAT WOULD HAVE
NO ERRORS.
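
A minimal sketch of this point, assuming a small hypothetical system of three equations in three unknowns. The matrix `X`, the vector `y`, and the helper names `solve_linear` and `least_squares` are made up for illustration and come from no particular package. When the number of observations equals the number of predictors and the equations are independent, the least-squares weights reproduce every observation exactly, so all residuals are zero.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination (exact)."""
    n = len(a)
    # Augmented matrix [a | b] in exact rational arithmetic.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def least_squares(x_rows, y):
    """Least-squares weights from the normal equations (X'X) b = X'y."""
    p = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Hypothetical data: 3 observations, 3 terms (the first column is the constant).
X = [[1, 2, 1], [1, 0, 3], [1, 5, 2]]
y = [7, 4, 12]

b = least_squares(X, y)
residuals = [yi - sum(w * v for w, v in zip(b, row)) for row, yi in zip(X, y)]
print("weights:", b)
print("residuals:", residuals)
```

Solving `X b = y` directly with `solve_linear(X, y)` gives the same weights, which is the sense in which the regression algorithm doubles as an equation solver here.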

Another interesting characteristic of Excel Regression is that it requires
the number of observations to exceed the number of predictors.

Fortunately, Colin Bell is working with the Excel folks at Microsoft to
improve the numerous interesting characteristics of  Statistics in Excel.

-- Joe



*** Joe H. Ward,  Jr.
*** 167 East Arrowhead Dr.
*** San Antonio, TX 78228-2402
*** Phone: 210-433-6575
*** Fax:   210-433-2828
*** Email: [EMAIL PROTECTED]
*** http://www.ijoa.org/resumes/ward.html
*** ---
*** Health Careers High School
*** 4646 Hamilton-Wolfe
*** San Antonio, TX 78229
*




- Original Message -
From: Graeme Byrne [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, August 22, 2001 4:42 AM
Subject: Re: adjusted r-square


 In short, you don't. If the number of terms in the model equals the number
 of observations, you have much bigger problems than not being able to compute
 adjusted R^2. The number of observations should always exceed the number of
 terms in the model; otherwise you cannot calculate any of the standard
 regression diagnostics (F-stats, t-stats, etc.). My advice is to get more
 data or remove terms from the model. If neither of these is an option, you
 are stuck.
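
A small sketch of why the calculation breaks down, assuming the usual definition adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p), where p counts every term in the model including the constant. The function name and the choice to return `None` when there are no error degrees of freedom are illustrative only.

```python
def adjusted_r_squared(r_squared, n_obs, n_terms):
    """Adjusted R^2 with n_terms counting the constant; None if undefined."""
    error_df = n_obs - n_terms
    if error_df <= 0:
        # Zero (or negative) error degrees of freedom: the ratio is undefined.
        return None
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / error_df


ordinary_case = adjusted_r_squared(0.90, n_obs=20, n_terms=4)
saturated_case = adjusted_r_squared(1.00, n_obs=4, n_terms=4)
print(ordinary_case, saturated_case)
```

In the saturated case the fit is perfect by construction, so there is nothing left over from which to estimate error variance.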


 Atul [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
  I have a doubt regarding adjusted r-square
 
  How do we calculate the adjusted r-square when the error degrees of
  freedom are zero ?
  (or in other words, number of samples is equal  to the number of
  regression terms including the constant)
 
  Such a situation leads to a zero in the denominator in the expression
  for calculating adjusted r-square.
 
  Your help is highly appreciated.
 
  Thanks
  Atul




 =
 Instructions for joining and leaving this list and remarks about
 the problem of INAPPROPRIATE MESSAGES are available at
   http://jse.stat.ncsu.edu/
 =







Re: Student's t vs. z tests

2001-04-17 Thread Joe Ward

Eric --
Good comment!

Also, it is helpful to keep in mind that:

t^2 (df2) = F(1,df2)
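
A quick numerical check of this identity, assuming two small hypothetical samples. The pooled two-sample t statistic and the one-way ANOVA F statistic for the same two groups are computed from their textbook definitions; the function names are illustrative.

```python
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (df = len(a) + len(b) - 2)."""
    ma, mb = mean(a), mean(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    df = len(a) + len(b) - 2
    se = (ss / df * (1 / len(a) + 1 / len(b))) ** 0.5
    return (ma - mb) / se


def one_way_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data, two groups of five.
group_a = [4.1, 5.0, 6.2, 5.5, 4.8]
group_b = [6.0, 7.1, 6.6, 7.9, 6.9]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print("t^2 =", t ** 2, " F =", f)
```

With two groups the numerator of F has 1 degree of freedom and the denominator has the same degrees of freedom as the t statistic, so the two values agree up to rounding.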

-- Joe

Joe Ward
167 East Arrowhead Dr.
San Antonio, TX 78228-2402
Home phone: 210-433-6575
Home fax: 210-433-2828
Email: [EMAIL PROTECTED]
http://www.ijoa.org/joeward/wardindex.html

Health Careers High School
4646 Hamilton Wolfe
San Antonio, TX 78229
Phone: 210-617-5400
Fax: 210-617-5423


- Original Message -
From: "Eric Bohlman" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, April 16, 2001 3:43 PM
Subject: Re: Student's t vs. z tests


 Mark W. Humphries [EMAIL PROTECTED] wrote:
  Hi,

  I am attempting to self-study basic multivariate statistics using
  Kachigan's "Statistical Analysis" (which I find excellent btw).

  Perhaps someone would be kind enough to clarify a point for me:

  If I understand correctly the t test, since it takes into account
  degrees of freedom, is applicable whatever the sample size might be,
  and has no drawbacks that I could find compared to the z test. Have I
  misunderstood something?

 You're running into a historical artifact: in pre-computer days, using
 the normal distribution rather than the t distribution reduced the size
 of the tables you had to work with.  Nowadays, a computer can compute a
 t probability just as easily as a z probability, so unless you're in
 the rare situation Karl mentioned, there's no reason not to use a t
 test.








Re: cite for using linear regression instead of logistic regression

2001-03-18 Thread Joe Ward



David --

Logistic Regression is more appealing to some folks since
it maps the predicted values into the range 0-1.

If you do a least-squares regression predicting a 0-1
dependent variable, the predicted values may not be
mapped into 0-1 (e.g. some predicted values may be < 0
and some may be > 1).

However, for "practical" decision-making such as "selection" or
"classification", the results will often be the same.
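
A toy illustration of both remarks, assuming a made-up data set of eight cases with one predictor. The data, the 0.5 cut-off, and the helper name `fit_line` are assumptions for the sketch. It shows least-squares predictions for a 0-1 outcome falling outside 0-1, while thresholding those predictions still reproduces the observed classes for this particular data set. It does not show that this holds in general.

```python
from statistics import mean


def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical 0-1 outcome that switches from 0 to 1 as x increases.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * xi for xi in x]

# Classify with an assumed cut-off of 0.5 on the predicted value.
classified = [1 if p >= 0.5 else 0 for p in predicted]

print("smallest prediction:", min(predicted))
print("largest prediction:", max(predicted))
print("classification matches y:", classified == y)
```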

Since you brought up the question, I'm sure that the "logistic regression"
folks can enlighten us on the "practical" advantages of "logistic regression".

-- Joe



- Original Message - 
From: "David Duffy" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, March 18, 2001 8:41 PM
Subject: Re: cite for using linear regression instead of logistic regression

 Scheltema, Karen [EMAIL PROTECTED] wrote:
  I've read several times on this listserve comments from people that when
  p(y) is not extreme, a logistic regression model can be estimated by a
  linear regression model.

 Some references cited by Harvey (1982): also BFH

 Harvey WR (1982). Least squares analysis of discrete data.
   J Anim Sci 54: 1067-1071.
 Cochran WG (1940). The analysis of variance when experimental errors
   follow the Poisson or binomial laws. Ann Math Statis 11: 335.
 Cochran WG (1943). Analysis of variance for percentages based on unequal
   numbers. JASA 38:287.
 Li JCR (1964). Introduction to statistical inference I. Ann Arbor: Edwards.

 --
 | David Duffy.
 | email: [EMAIL PROTECTED] ph: INT+61+7+3362-0217 fax: -0101
 | Epidemiology Unit, The Queensland Institute of Medical Research
 | 300 Herston Rd, Brisbane, Queensland 4029, Australia



Re: Re: topic?

2001-01-02 Thread Joe Ward

Happy New Year --

Perhaps Laurie Snell will make a good start through the future
CHANCE issues.

-- Joe





- Original Message - 
From: "Bokhorst, Frank" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, January 02, 2001 4:41 AM
Subject: Re: topic?


 Bob Hayden asked:
  
  Anybody have anything to say about statistical education???
 
 I would like to turn the question round, and ask if it might be
 possible to summarize relevant material from the recent discussion 
 on the forum about the US election saga into a form suitable for 
 teaching purposes? 
 
 In particular, to sift through the EDSTAT archive and edit a 
 resource text.
 
 There was much off-topic discussion, but there was also a huge 
 volume of generally polite and reasonable talk with many good 
 points illustrating key issues relevant to education.  The topic 
 itself was extremely pertinent and interesting to a wide audience.  
 For example, someone recently asked for examples of the misuse of 
 statistics - surely many examples could be found in the US election 
 saga?   What we need is a good summary. 
 
 As another example, I note that Herman Rubin frequently argues
 the need for proper understanding of statistics:  Could he, or
 anybody else on the EDSTAT forum, perhaps help educators
 by compiling some examples that arose in the recent discussion?
 What kind of understanding of statistics might be required of 
 lawyers, politicians, voters, media editors?
 
 Maybe someone could list key points that came out of these EDSTAT 
 discussions?
 
 
 Frank Bokhorst
 http://www.uct.ac.za/depts/psychology/bok
   _O
 tel: 021 650-3708   -\,
 fax: 021 689-7572   One car less  (.)/(.)
 Psychology Dept.,   The owner of this bicycle
 University of   takes responsibility for 
 Cape Town,  the shape of his drawing 
 Rondebosch 7701,only if you use a fixed
 South Africa.   size font such as Courier.
 
 



Re: Statistical penalties for sequential analyses

2000-12-08 Thread Joe Ward

Rich -
You might want to consider doing some Resampling (Cross-Validation, Bootstrap)
as you continue through your analyses.

-- Joe


- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, December 08, 2000 3:30 PM
Subject: Statistical penalties for sequential analyses


 Need some advice.  We are doing a series of tests looking for correlations
 among age-sensitive variables in a population of mice. We will have about
 600 mice in all, and it will take 3 years to test them all, at about 200
 mice tested each year.

 We are considering three strategies:

 A) Wait 3 years until all the data are in; then do the analyses.

 B) Analyze the data on the first 300 mice, and publish anything that looks
 exciting and meets conventional significance criteria. When the second set
 of mice is finished, we can use these second 300 animals as a replicate
 sample to (try to) confirm the significant findings we reported on the
 first set.  And we can also pool all 600 mice to obtain higher statistical
 power than we had for the initial analysis with N = 300.

 Of course this represents testing some hypotheses twice, and thus
 increases the Type I error rate. I suspect that there are theoretically
 justified methods for adjusting significance criteria to "adjust" for
 taking two looks at the data, but I don't know how to do this.  Anyone have
 a recipe, or a reference to get me started?
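
A rough simulation of the inflation described above, under simplifying assumptions that are not in the original question: a single two-sided z test stands in for the correlation tests, the null hypothesis is true, the first look uses half the data, and the second look uses the pooled data. The function name, the seed, and the number of simulated studies are arbitrary choices for the sketch.

```python
import random


def two_look_type1_rate(n_sims=20000, z_crit=1.96, seed=1):
    """Share of null studies declared significant at either of two looks."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Standardized test statistics for the first and second halves of
        # the data, independent and N(0, 1) when the null hypothesis holds.
        z_first = rng.gauss(0.0, 1.0)
        z_second = rng.gauss(0.0, 1.0)
        # Pooling two equal-sized halves gives this combined z statistic.
        z_pooled = (z_first + z_second) / 2 ** 0.5
        if abs(z_first) > z_crit or abs(z_pooled) > z_crit:
            rejections += 1
    return rejections / n_sims


rate = two_look_type1_rate()
print("estimated Type I error rate with two looks:", rate)
```

The estimated rate should come out above the nominal 0.05, which is the reason group-sequential methods use stricter criteria at each look.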

 Thanks.

 Rich Miller
 University of Michigan

 Reply to: [EMAIL PROTECTED]





Re: [ap-stat] Textbook for regular statistics vs. AP Statistics

2000-12-01 Thread Joe Ward


- Original Message -
From: "Carole Black" [EMAIL PROTECTED]
To: "AP Statistics" [EMAIL PROTECTED]
Sent: Wednesday, November 29, 2000 12:58 PM
Subject: [ap-stat] Textbook for "regular" statistics vs. AP Statistics


 I have taught a "regular" statistics class at my high school for the
 last 3 years using Elementary Statistics by Mario Triola. (This was
 the book I inherited.)  This is textbook adoption year for Georgia and
 I have the privilege of picking out Statistics books for both the
 "regular" stat class as well as a new AP class that will be offered
 for the first time next year.  (I will be teaching both classes).  My
 first question is, should I go with 2 different textbooks or the same
 textbook?

 My second question is much the same as many others posted on this
 site, which book? I am seriously considering the Yates, Moore and
 McCabe "The Practice of Statistics" for the AP class.  I am
 considering either Moore's "Basic Practice of Statistics" or the
 "Elementary Statistics" book published by McGraw Hill for the regular
 statistics class.

 Any comments would be greatly appreciated.
 Carole Black

 ---

=  Joe Ward Comments  ==

Hi, Carole --

Your opportunity of having an AP-Statistics class and a "regular" Statistics
class can allow you the freedom of using the "regular" class to give students
the capability to use the combined power of Regression/Linear Models and
Computers to investigate some interesting and practical research questions.
You might recruit some of your science students to give them useful techniques
to support their research projects.  You can give your students the power to
create models to answer their research questions.

It is certainly reasonable that you must give your AP-Statistics students
the objectives that tend to match the corresponding college course.  For the
"regular" Statistics course you can make the course both interesting and
practical without the constraints of AP-Statistics.

There probably are many AP teachers who can accomplish the AP-Statistics
objectives AND have extra time to give their students some more powerful
capabilities.

Try to make your "regular" statistics course available for ALL students.
Frequently, the "regular" course is designed for the less talented.  You CAN
make the regular course the more popular since your students might be able to
do some powerful research.  Students who are involved with Science Fairs,
Jr. Academy of Science and the ASA Project/Poster competitions should be your
target population for the "regular" course.

Be sure to have access to books that contain ideas of how to use
Regression/Linear models to create models to answer the students' research
questions of interest.

-- Joe













[ap-stat] RE: election proposal

2000-11-14 Thread Joe Ward

Does anyone know WHY so many states DON'T DO IT THIS WAY?
Perhaps the Political Science/History folks can comment.

-- Joe




- Original Message -
From: "Lee Creighton" [EMAIL PROTECTED]
To: "AP Statistics" [EMAIL PROTECTED]
Sent: Monday, November 13, 2000 8:11 AM
Subject: [ap-stat] RE: election proposal


 People are listening! This is exactly how Nebraska and Maine vote, as we
 speak.

 It was decided after the disastrous 1824 election that the states would
 have the power to manage how they pick electors, and *not* the federal
 government.

  -Original Message-
  From: Jon Graetz [mailto:[EMAIL PROTECTED]]
  Sent: Sunday, November 12, 2000 11:30 PM
  To: AP Statistics
  Subject: [ap-stat] RE: election proposal
 
 
  I like it!  Now, to get anyone else to listen...
 
  Jon Graetz
  The Miami Valley School
  5151 Denise Drive
  Dayton, OH  45429
  (937)434-
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
 
  -Original Message-
  From: Reba Taylor [mailto:[EMAIL PROTECTED]]
  Sent: Sunday, November 12, 2000 11:00 PM
  To: AP Statistics
  Subject: [ap-stat] election proposal
 
 
  I've been toying with this idea:
 
  Each state has the same number of electors as their congressional
  delegation:  e.g.  in VA, we have 11 congressional districts + 2 senators =
  13 electors.
 
  Let's keep the electors, but have the ones representing the congressional
  districts vote the way their district votes.  Then the 2 at-large electors
  will vote the way the state as a whole votes.
 
  I think this is more equitable than winner-take-all.  I also think it would
  be a more representative sample of the popular vote, but still giving the
  smaller states as much clout as the larger ones.
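
The allocation rule in this proposal can be sketched as follows. The district results and the candidate labels "A" and "B" are invented for illustration, and the function name is hypothetical; only the rule itself (one elector per district winner, two at-large electors for the statewide winner) comes from the message above.

```python
def allocate_electors(district_winners, statewide_winner, at_large=2):
    """One elector per district winner, plus at-large electors statewide."""
    tally = {}
    for winner in district_winners:
        tally[winner] = tally.get(winner, 0) + 1
    tally[statewide_winner] = tally.get(statewide_winner, 0) + at_large
    return tally


# Hypothetical outcome in a state with 11 districts (13 electors in total).
districts = ["A"] * 6 + ["B"] * 5
result = allocate_electors(districts, statewide_winner="A")
print(result)  # {'A': 8, 'B': 5}
```

Under winner-take-all the same hypothetical state would give all 13 electors to candidate A.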
 
  Reba Taylor
 
 
  *
  *   Reba Taylor [EMAIL PROTECTED] *
  * *
  *   Home: School: *
  * Blacksburg High School *
  *   2418 Ridge Road 520 Patrick Henry Drive *
  *   Blacksburg, VA 24060 Blacksburg, VA 24060 *
  *   540-953-2421 540-951-5706 *
  * *
  *  AP Computer Science, AP Statistics, Math *
  * *
  *  Black holes are where God divided by zero. *
  * *
  * "Can't never could, till it tried!"  -- S.C. Taylor
  *
  * *
  *

 ---
 You are currently subscribed to ap-stat as: [EMAIL PROTECTED]
 To unsubscribe send a blank email to
 [EMAIL PROTECTED]
 Frequently Asked Questions(FAQ) Site is at
 http://www.ncssm.edu/statsteachers
 AP Statistics Archives are at http://forum.swarthmore.edu/epigone/apstat-l








Re: [ap-stat] RE: election proposal

2000-11-13 Thread Joe Ward

Does anyone know WHY so many states DON'T DO IT THIS WAY?
Perhaps the Political Science/History folks can comment.

-- Joe





Re: Help needed ... :-(

2000-11-13 Thread Joe Ward

Well said, Bob --

-- Joe


- Original Message -
From: "Bob Hayden" [EMAIL PROTECTED]
To: "EdStat-L" [EMAIL PROTECTED]
Sent: Monday, November 13, 2000 9:46 PM
Subject: Re: Help needed ...



 - Forwarded message from David Heiser -


 - Original Message -
 From: Dennis [EMAIL PROTECTED]

  Hello Newsgroup, I'm searching for real good books on stats. I'm a
  student of psychology and we've been taught very much stats. But I
  read all the time your postings and wonder why I've never heard
  about that what I read.
 ...
  Hopefully and with much regards
  yours Dennis
 
 ---

 What you need is a good class in written English
 DAH

 - End of forwarded message from David Heiser -

 From the email address, it appears that Dennis lives in a European
 country where English is not the predominant language.  The written
 English here far surpasses my written French, German or Latin, to
 mention only languages I have studied.  I note that, unlike most
 Americans, Dennis uses the word "hopefully" correctly.  Of course, if
 Americans were as good with other people's languages as Europeans are,
 Dennis could have sent us a native-language posting, and then
 criticized us when we tried to respond in that language.

 I think this list can benefit greatly from being an INTERNATIONAL
 list.  Let's make folks from other countries feel welcome.


   _
  | | Robert W. Hayden
  | |  Work: Department of Mathematics
 /  | Plymouth State College MSC#29
|   | Plymouth, New Hampshire 03264  USA
| * | fax (603) 535-2943
   /|   Home: 82 River Street (use this in the summer)
  | ) Ashland, NH 03217
  L_/ (603) 968-9914 (use this year-round)
 Map of New[EMAIL PROTECTED] (works year-round)
 Hampshire http://mathpc04.plymouth.edu (works year-round)

 The State of New Hampshire takes no responsibility for what this map
 looks like if you are not using a fixed-width font such as Courier.

 "Opportunity is missed by most people because it is dressed in
 overalls and looks like work." --Thomas Edison






Re: [ap-stat] revote and Accuracy and Design of Voting Forms

2000-11-10 Thread Joe Ward


  Bob Hayden wrote to the AP list:   ==
  - Original Message -
  From: "Bob Hayden" [EMAIL PROTECTED]
  To: "AP Statistics" [EMAIL PROTECTED]
  Sent: Friday, November 10, 2000 10:01 AM
  Subject: [ap-stat] revote
 
 
   After considering all the issues raised on the lists regarding the
   election, I think the best solution would be a revote in every state
   of the union -- but with NEW CANDIDATES!-)

===  Joe Ward replied to Bob Hayden ===

  Hey, Bob --
 
  THAT really brought some hearty chuckles to
  Bettie and me.
 
  -- Joe
 

== Then Bob Hayden wrote:   =
- Original Message -
From: "Bob Hayden" [EMAIL PROTECTED]
To: "Joe Ward" [EMAIL PROTECTED]
Sent: Friday, November 10, 2000 10:41 AM
Subject: Re: [ap-stat] revote

 Their post-election bickering did not endear them to me.  I think they
 should both go home, return to their jobs, and SHUT UP.


  Joe Ward Comments about Accuracy of Voting Responses ==
  Is there research on the Design of Voting Forms?  =
Hi, Bob --

In ANY election, the format for obtaining voting responses should be designed
to minimize the chances for inaccurate responses.  It is surprising that
the "format-approval folks" in Palm Beach did not redesign the form.  It looks
like the form was designed for convenience of the computer folks or
the print shop or others--but not for the accuracy of responses.

No matter who is the winner in any election, there probably are some local
voting systems that need "fine tuning".

In San Antonio, we have gone through numerous varieties of
voting formats.  Some seem better than others. I'm not sure how
the final forms are "approved".

In this recent election we used felt-tip markers!!!  The ink soaked through
to the back side of the paper but when my wife mentioned it, the "judges"
said that it had been checked and "did not interfere with the markings on
the other side".
But do we know what happens if there is a SMEAR of the wet ink?  Does THAT
BALLOT COUNT, or is it rejected?  If I were running for election in our
county and the voting was close, then I certainly would ask for a "hand"
recount to find out how many votes were rejected by the scan machine because
of "smear" or because the wet ink soaked through the paper (probably cheap
paper) and was "sensed" on the back.

Perhaps there should be a research project designed by a TASK FORCE of some
ASA members to evaluate the many different forms to find out which form(s)
MINIMIZE INACCURACY OF RESPONSE.  It is likely that such research has been
done since it is such an important activity.  The studies should consider age,
education, language and other variables.

-- Joe









Re: 2 factor ANOVA with empty cells

2000-11-01 Thread Joe Ward

Right you are, Elliot.

However, when one finds "no-interaction" among all of those cells
that are present, then one can feel "better" about estimating
the "missing" cell values.  Of course, there could be a surprising
explosion!! The more interaction that is detected, the more dangerous
it can be.

When there is little or no interaction it is possible to design the study
to save money and time.  There is no need to fill in  all the cells all the
time -- particularly when the cost is great.

The real experimental design "experts" can get lots of information from
a small study that might have missing cells "strategically located".
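
One way to see what "estimating the missing cell" means under no interaction: a sketch using the classical missing-value formula for an unreplicated two-way layout with a single empty cell, x = (r*R + c*C - G) / ((r-1)(c-1)), where R and C are the observed totals of the row and column containing the empty cell and G is the total of all observed cells. The table below is hypothetical and was built to be exactly additive, so the estimate recovers the value that was left out. The function name is illustrative.

```python
def estimate_missing_cell(table, miss_row, miss_col):
    """Additive-model estimate of one missing cell in an r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    # Totals over observed cells only; the missing cell is skipped.
    row_total = sum(v for j, v in enumerate(table[miss_row]) if j != miss_col)
    col_total = sum(table[i][miss_col] for i in range(n_rows) if i != miss_row)
    grand_total = sum(
        v
        for i, row in enumerate(table)
        for j, v in enumerate(row)
        if (i, j) != (miss_row, miss_col)
    )
    numerator = n_rows * row_total + n_cols * col_total - grand_total
    return numerator / ((n_rows - 1) * (n_cols - 1))


# Hypothetical additive table: cell = row effect (0, 2, 5) + column effect
# (10, 11, 13). The bottom-right cell (true value 18) is treated as missing.
table = [
    [10, 11, 13],
    [12, 13, 15],
    [15, 16, None],
]

estimate = estimate_missing_cell(table, miss_row=2, miss_col=2)
print("estimated missing cell:", estimate)
```

If interaction is present, the additive estimate can be far from the true cell value, which is the "surprising explosion" mentioned above.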

- Joe



- Original Message -
From: "Elliot Cramer" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, November 01, 2000 8:43 PM
Subject: Re: 2 factor ANOVA with empty cells


 Jeff E. Houlahan [EMAIL PROTECTED] wrote:
 : Is it ever appropriate to do a 2-factor unreplicated ANOVA with
 : empty cells if you aren't sure there is no interaction between the
 ^
 you can test the part of the interaction that is testable, but of course
 you can never know about the rest.






Independent-Dependent Variable Discussion--Inverse Estimation

2000-10-20 Thread Joe Ward

Hi Dan and all --

I had intended to comment about the independent-dependent variable discussion
earlier but I got side-tracked.  Since Dan reminded us with his comment:

" This problem statement also brings back the independent-dependent variable
 discussion.  In the real context, the activity level of the crickets depends
 upon the temperature, so temperature is the independent variable and number
 of chirps the dependent variable.  However, if you want to predict the
 temperature using the number of chirps, you must consider the number of
 chirps as the "independent" variable and temperature as the "dependent"
 variable."


I have inserted some comments below:

===  Joe Ward writes ==
In the ancient past (1950s), for calibration studies --

Let
Y be a reading from a measuring instrument, SUBJECT TO "ERRORS OF
MEASUREMENT",
and
X be a KNOWN STANDARD, ASSUMED TO BE "WITHOUT ERROR" (FIXED).

Then the least-squares regression model used to PREDICT THE "STANDARD" (X)
from the measurement Y WAS computed as:

Y = b0  + b1*X + Error

Then, to estimate (predict) the KNOWN STANDARD (X) from the measurement (Y),
the past procedure was to solve for X in the above equation
(leaving off the Error):

Y = b0  + b1*X

or

X = (Y-b0)/b1

which is used to PREDICT X from Y.
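
The procedure just described can be sketched in a few lines. The standards and readings below are invented numbers, and `fit_line` and `predict_standard` are hypothetical helper names: Y is regressed on X by least squares, and the fitted equation is then solved for X.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


# Hypothetical calibration data: X are known standards (treated as fixed),
# Y are instrument readings subject to measurement error.
standards = [0.0, 1.0, 2.0, 3.0, 4.0]
readings = [0.52, 2.48, 4.55, 6.49, 8.46]

b0, b1 = fit_line(standards, readings)


def predict_standard(y_new):
    """Inverse estimation: solve Y = b0 + b1 * X for X."""
    return (y_new - b0) / b1


x_hat = predict_standard(5.0)
print("b0 =", b0, " b1 =", b1, " estimated X for Y = 5.0:", x_hat)
```

The alternative "regular" approach would regress X on Y directly; the two generally give different estimates unless the fit is perfect.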

Dan,  you probably are better acquainted with the most recent approach from
the Bureau of Standards since I have not kept up with any changes in the
Standards calibration policy.

Furthermore, in the distant past, it is interesting to note that
simultaneous regression equations were solved to estimate unknown amounts of
chemical compositions in a solution.

An interesting study by Fisher, Hans, R.G. Hansen, and H.W. Norton (1955),
"Quantitative determination of glucose and galactose," Anal. Chem. 27,
857-859, is discussed in E.J. Williams' book Regression Analysis, Wiley, 1959,
page 163.  Williams refers to this topic as INVERSE ESTIMATION.

Even though the goal is to ESTIMATE (PREDICT) the values of X,  the
dependent variables (Y's) are the MEASURES SUBJECT TO ERROR.  After the
least-squares solutions are computed then the simultaneous regression
equations are solved, INVERSELY, for unknown X values from
measured(observed) values of Y (which are subject to ERRORS).

It would be interesting to know if this approach is still used.  Is the
INVERSE method BETTER? Have there been recent studies comparing the REGULAR
approach with the INVERSE approach?

Comments from experienced "experts" in this area are welcome.

-- Joe

===  End of Joe Ward's message  =


- Original Message -
From: "Teague, Dan" [EMAIL PROTECTED]
To: "AP Statistics" [EMAIL PROTECTED]
Sent: Friday, October 20, 2000 10:42 AM
Subject: [ap-stat] RE: effect on LSRL


 Rebecca,

 If your student chose values of the independent variable that were very
 large (250-450) and found the y-values that correspond to these x-values
 using y = 56.212 + 0.1356x, then he could increase the slope.  For these
 data, the point (249, 55) is below that portion of the regression line on
 the left.  The regression line would be pulled towards the point, just as
 you said, but in this situation, it would cause the slope to increase.

 The student's argument is flawed to the extent that these values of the
 independent variable do not match the summary statistics (xbar = 167 and
 s = 31).  We expect to find the number of chirps between 70 and 290 and the
 temperature roughly between 50 and 100.  For these values of x, the slope
 will be pulled down by the addition of this point.
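
 The two situations can be checked numerically. The sketch below is only an
 illustration: the original data are not available here, so it places
 made-up x-values exactly on the line y = 56.212 + 0.1356x and then adds
 the point (249, 55). The function name and the x grids are assumptions.

```python
import numpy as np

def slopes_before_after(x, point):
    """Return (slope before, slope after) adding one point to data on the line."""
    y = 56.212 + 0.1356 * x              # points exactly on the stated line
    before = np.polyfit(x, y, 1)[0]
    after = np.polyfit(np.append(x, point[0]), np.append(y, point[1]), 1)[0]
    return before, after

far_right = np.linspace(250, 450, 21)    # the student's very large x-values
typical = np.linspace(70, 290, 23)       # made-up x-values in the expected range

# (249, 55) sits below the LEFT end of the far-right data: slope goes up.
print(slopes_before_after(far_right, (249, 55)))
# (249, 55) sits below the RIGHT part of the typical data: slope goes down.
print(slopes_before_after(typical, (249, 55)))
```

 In both cases the line is pulled towards the new point; which way the slope
 moves depends on whether the point lies left or right of the mean of x.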

 This problem statement also brings back the independent-dependent variable
 discussion.  In the real context, the activity level of the crickets
 depends upon the temperature, so temperature is the independent variable
 and number of chirps the dependent variable.  However, if you want to
 predict the temperature using the number of chirps, you must consider the
 number of chirps as the "independent" variable and temperature as the
 "dependent" variable.


 Daniel J. Teague
 NC School of Science and Mathematics
 1219 Broad Street
 Durham, NC  27705
 [EMAIL PROTECTED]


 -Original Message-
 From: Rebecca Brewer [mailto:[EMAIL PROTECTED]]
 Sent: Friday, October 20, 2000 11:02 AM
 To: AP Statistics
 Subject: [ap-stat] effect on LSRL


 Help! 

Re: How to Pool Slopes

2000-10-08 Thread Joe Ward

Hi, Stan --
I've inserted a reply at the end of your message. Let me know
how things turn out.
-- Joe



- Original Message -
From: "Stanley110" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, October 08, 2000 1:59 PM
Subject: Q: How to Pool Slopes


 Assume I have three sets of x,y data. I fit each by least-squares to a
 straight line. I determine that the three fitted lines are homogeneous and
 indistinguishable at a certain significance level. I want to express the
 slope (of the three) as a single point estimate and as a confidence
 interval. What is the formula for doing this?

 Please reply to this newsgroup and to the writer at [EMAIL PROTECTED].

 Thank you for your help.

 stan alekman


 =
 Instructions for joining and leaving this list and remarks about
 the problem of INAPPROPRIATE MESSAGES are available at
   http://jse.stat.ncsu.edu/
 =


==  JOE WARD REPLIES  ===
Hi, Stan --

Your title says
(1) "How to Pool Slopes," and you indicate later that
(2) "I determine that the three fitted lines are homogeneous and
indistinguishable."
For (1) it sounds like you will want THREE DIFFERENT INTERCEPTS, but
for case (2) it sounds like you may want only ONE INTERCEPT.

This is a good example of the use of the regression "NOINT" option in SAS
(or "Y-intercept = zero" in other packages). The reason that this appears to
be a difficult problem is that most statistics packages, by DEFAULT, add an
intercept term to the model automatically.
The approach used below for your THREE GROUP DATA is shown for TWO groups
of data in the Prentice-Hall book (1973) "Introduction to Linear Models" by
Ward and Jennings, Chapter 8, page 143.

I don't know which regression software you are using, but you should be sure
to FORCE THE FITTED MODEL THROUGH THE ORIGIN (i.e., suppress the automatic
intercept), because the models below carry their own intercept terms.

First, it is important to put ALL THREE SETS OF DATA in the same model.

Let
Y  = dependent variable (containing ALL THREE SETS OF DATA)
D1 = 1 if the corresponding element of Y is from DATA SET #1; 0 otherwise
D2 = 1 if the corresponding element of Y is from DATA SET #2; 0 otherwise
D3 = 1 if the corresponding element of Y is from DATA SET #3; 0 otherwise
X1 = value of x if the corresponding element of Y is from DATA SET #1; 0 otherwise
X2 = value of x if the corresponding element of Y is from DATA SET #2; 0 otherwise
X3 = value of x if the corresponding element of Y is from DATA SET #3; 0 otherwise
X  = value of x for ALL corresponding elements of Y
U  = 1 for every element

Then your ASSUMED MODEL is shown below (this should give you the same
regression coefficients that you already have computed, which is a check
that your new model is correct):

Y = a1*D1 + b1*X1 + a2*D2 + b2*X2 + a3*D3 + b3*X3 + E1 (Model #1)

After you have computed this ASSUMED MODEL you may want to TEST THE
HYPOTHESIS that you imply in CASE (1) above, that the
THREE SLOPES ARE EQUAL, i.e., b1 = b2 = b3 = bc (THE COMMON SLOPE).

Then substituting these restrictions into Model #1 produces the RESTRICTED
MODEL FOR CASE (1):

Y = a1*D1 + bc*X1 + a2*D2 + bc*X2 + a3*D3 + bc*X3 + E2 (Model #2)

Factoring (or collecting terms, since X1 + X2 + X3 = X) produces:

Y = a1*D1 + a2*D2 + a3*D3 + bc*X + E2 (Model #2)
(Note that the values of a1, a2, and a3 in Model #2 are NOT numerically
equal to the values in Model #1.)

From Model #2, bc is the least-squares SINGLE POINT estimate of the COMMON
SLOPE.

Your favorite regression procedure should give what you need to compute a
confidence interval (such as the standard error of bc).
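
As a rough sketch of Models #1 and #2 in Python (not tied to any particular
statistics package): the data below are made up, the helper name fit is
hypothetical, and the fits are done WITHOUT an automatic intercept, since
D1, D2, D3 already play that role.

```python
import numpy as np

# Made-up data: three groups with different intercepts and a true slope of 2.
rng = np.random.default_rng(0)
group = np.repeat([0, 1, 2], 10)
x = np.tile(np.arange(10.0), 3)
y = np.array([1.0, 4.0, 7.0])[group] + 2.0 * x + rng.normal(0.0, 0.5, x.size)

D = np.eye(3)[group]                  # columns D1, D2, D3 (group indicators)
XS = D * x[:, None]                   # columns X1, X2, X3 (x within own group)

def fit(design, y):
    """No-intercept least squares: coefficients, error sum of squares, (X'X)^-1."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, resid @ resid, np.linalg.inv(design.T @ design)

# Model #1: separate intercepts (a1, a2, a3) and separate slopes (b1, b2, b3).
coef1, sse1, _ = fit(np.column_stack([D, XS]), y)
# Model #2: separate intercepts and ONE common slope bc (last coefficient).
coef2, sse2, xtx_inv2 = fit(np.column_stack([D, x]), y)

bc = coef2[-1]
df2 = y.size - 4                      # N minus the 4 parameters of Model #2
se_bc = np.sqrt(sse2 / df2 * xtx_inv2[-1, -1])

# F statistic for the hypothesis b1 = b2 = b3 (2 restrictions, N - 6 error df).
f_slopes = ((sse2 - sse1) / 2) / (sse1 / (y.size - 6))
print(bc, se_bc, f_slopes)
```

A confidence interval for the common slope is then bc plus or minus the
appropriate t quantile (with df2 degrees of freedom) times se_bc.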

Now for CASE (2) above you may want to test that:
THREE SLOPES ARE EQUAL, i.e., b1=b2= b3=bc ( THE COMMON SLOPE)
 and
THREE INTERCEPTS ARE EQUAL, i.e., a1=a2=a3=ac (THE COMMON INTERCEPT)

In which case, the RESTRICTED MODEL becomes:
Y = ac*D1 + bc*X1 + ac*D2 + bc*X2 + ac*D3 + bc*X3 + E3 (Model #3)

Factoring (or collecting terms) produces:

Y = ac*U + bc*X + E3 (Model #3)
(Note that the value of bc in Model #3 is NOT numerically equal to the value
in Model #2.)

And, as before, your favorite regression procedure should give what you need
to compute a confidence interval (such as the standard error of bc). Let me
know how this works out.  If you have any problems with this approach you
are welcome

Re: How many Olympic Medals should Great Britain have won?

2000-10-03 Thread Joe Ward



Hi, Graham --

It's been a long time since I've heard any discussion about
UNDERACHIEVERS and OVERACHIEVERS. I've never been able to understand
the discussions.

NO MATTER WHAT VALUE THE CORRELATION (SLOPE OF THE REGRESSION LINE) HAS we
know that the ALGEBRAIC SUM OF THE ERRORS IS ZERO. Now that says that
the SUM OF THE ABSOLUTE VALUES OF THE POSITIVE ERRORS IS EQUAL TO THE
SUM OF THE ABSOLUTE VALUES OF THE NEGATIVE ERRORS. THEN WE WOULD EXPECT
TO OBSERVE ABOUT ONE-HALF OF THE OBSERVATIONS TO HAVE POSITIVE ERRORS AND
ONE-HALF TO HAVE NEGATIVE VALUES.

THEREFORE, FOR ALL CORRELATIONS (ZERO INCLUDED) WE SHOULD EXPECT TO
CONCLUDE THAT ABOUT ONE-HALF OF ALL CASES WOULD BE CALLED "OVER-ACHIEVERS"
AND ABOUT ONE-HALF WOULD BE CALLED "UNDER-ACHIEVERS". DOES THAT
DESIGNATION HAVE ANY OPERATIONALLY USEFUL MEANING?
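
A small Python check of the zero-sum property, using made-up data (the
intercept, slope, and error distribution are all assumptions chosen only
for illustration):

```python
import numpy as np

# Made-up data with right-skewed errors (exponential, centered at zero).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 200)
y = 3.0 + 0.5 * x + (rng.exponential(2.0, 200) - 2.0)

# Least-squares line with an intercept, then the errors (residuals).
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)

total = resid.sum()                       # zero, up to rounding error
pos_sum = resid[resid > 0].sum()          # sum of the positive errors
neg_sum = -resid[resid < 0].sum()         # sum of |negative errors|
share_positive = np.mean(resid > 0)       # fraction of "over-achievers"
print(total, pos_sum, neg_sum, share_positive)
```

The two sums always balance when the model includes an intercept. The share
of positive errors is left to the data: it is about one-half for symmetric
errors and can drift away from one-half when the errors are skewed.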

--Joe

  - Original Message -
  From: Dr Graham D Smith
  To: Edstat
  Sent: Monday, October 02, 2000 11:40 AM
  Subject: How many Olympic Medals should Great Britain have won?


  How many Olympic Medals should Great Britain have won?

  British Olympians won a grand total of 28 medals at the Sydney 2000 Games,
  our best medal haul for 80 years. Many commentators have suggested that the
  big improvement in British fortunes compared to the Atlanta 1996 Games is
  due to the use of Lottery funding to help our top sportsmen and
  sportswomen. But how many medals should Britain expect to win? Did we
  fulfil our potential or fall short of it?

  One important determinant of a country's Olympic success is the size of its
  population. USA, China and Russia head the Sydney 2000 medal table; they
  also have large populations. However, population size does not fully
  account for the number of medals won. Both India and China have much larger
  populations than USA but won fewer medals. Another important predictor of a
  nation's Olympic performance is economic prosperity. Richer nations often
  outperform poorer nations of the same size. Gross domestic product (GDP) is
  an economic index that reflects both economic success and population size.

  A scatterplot of the number of medals won and GDP of the 80 medal winning
  countries at the 2000 Olympics shows a positive correlation; r = 0.595,
  p < 0.01 (see attached). GDP accounts for 35.4% of the variance of
  medals won. A regression analysis was performed on the data to estimate the
  number of medals Team GB should expect. Given that the UK GDP is equivalent
  to US$ 1.29 trillion the expected number of medals is 15. It seems that our
  Olympians did far better than we could have expected. Well done team GB!

  And well done too to Team USA, their expected medal count is 26.5. However,
  the top overachiever was Russia (followed by USA and Australia). The top
  underachiever was India.
  
  Dr Graham D. Smith
  Psychology Division
  Park Campus
  University College Northampton
  Boughton Green Rd.
  Northampton
  NN2 7AL
  Tel: +44 (0) 1604 735500 Ext 2393
  E-mail: [EMAIL PROTECTED]


Re: How many Olympic Medals should Great Britain have won?

2000-10-03 Thread Joe Ward



Hi, Paige --

Good comments about "There are so many different factors..."

"To say that half the observations should have positive errors and half
should have negative errors is to confuse median with mean."

I used the word ABOUT intentionally to distinguish from EXACTLY.

--Joe

- Original Message -
From: "Paige Miller" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, October 03, 2000 10:19 AM
Subject: Re: How many Olympic Medals should Great Britain have won?

  Hi, Graham --

  It's been a long time since I've heard any discussion about
  UNDERACHIEVERS and OVERACHIEVERS. I've never been able to understand
  the discussions.

  NO MATTER WHAT VALUE THE CORRELATION (SLOPE OF THE REGRESSION LINE) HAS we
  know that the ALGEBRAIC SUM OF THE ERRORS IS ZERO. Now that says that
  the SUM OF THE ABSOLUTE VALUES OF THE POSITIVE ERRORS IS EQUAL TO THE
  SUM OF THE ABSOLUTE VALUES OF THE NEGATIVE ERRORS. THEN WE WOULD EXPECT
  TO OBSERVE ABOUT ONE-HALF OF THE OBSERVATIONS TO HAVE POSITIVE ERRORS AND
  ONE-HALF TO HAVE NEGATIVE VALUES.

  THEREFORE, FOR ALL CORRELATIONS (ZERO INCLUDED) WE SHOULD EXPECT TO
  CONCLUDE THAT ABOUT ONE-HALF OF ALL CASES WOULD BE CALLED "OVER-ACHIEVERS"
  AND ABOUT ONE-HALF WOULD BE CALLED "UNDER-ACHIEVERS". DOES THAT
  DESIGNATION HAVE ANY OPERATIONALLY USEFUL MEANING?



Paige writes

 There are so many different factors that go into the amount of medals won
 that it seems silly to perform a regression based upon population and GDP
 to use as predictors. Organization of Olympic Committees, training facility
 quality, programs for youths, weather, etc. all can affect the number of
 medals won, and then there is the factor of injuries, which to me seems
 like it cannot be modelled except as random noise.

 To say that half the observations should have positive errors and half
 should have negative errors is to confuse median with mean.

 --
 Paige Miller
 Eastman Kodak Company
 [EMAIL PROTECTED]

 "It's nothing until I call it!" -- Bill Klem, NL Umpire
 "Those black-eyed peas tasted all right to me" -- Dixie Chicks
 



Re: What is today's Hogg & Craig?

2000-09-23 Thread Joe Ward

Hi, Gary, Jerry et al --
Here is a message from Bob Hogg.

-- Joe
- Original Message -
From: "Robert V. Hogg" [EMAIL PROTECTED]
To: "Joe Ward" [EMAIL PROTECTED]
Sent: Friday, September 22, 2000 9:19 AM
Subject: Re: Fw: What is today's Hogg & Craig?


 joe,  HOGG AND TANIS is used more for undergrads.  CASELLA AND BERGER for
 first year grad students in stat.  HOGG AND CRAIG for good seniors and
 first year grad students in other areas [like actuarial sci].   bob



 At 11:24 PM 9/21/00 -0500, Joe Ward wrote:
 Bob --
 
 Any suggestions for Jerry?
 
 -- Joe

 - Original Message -
 From: "Jerry Dallal" [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Thursday, September 21, 2000 9:32 PM
 Subject: What is today's Hogg & Craig?


  Back in the "old days", the standard text for an undergraduate math stat
  course was Hogg & Craig.  I had some fondness for Lindgren.  I haven't
  taught this course in nearly 20 years.  Which texts occupy their position
  today?

  Thanks.
 

- Original Message -
From: "Gary McClelland" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, September 22, 2000 11:49 AM
Subject: Re: What is today's Hogg & Craig?


 in article [EMAIL PROTECTED], Jerry Dallal at [EMAIL PROTECTED]
 wrote on 9/21/00 8:32 PM:

  Back in the "old days", the standard text for an undergraduate math stat
  course was Hogg & Craig.  I had some fondness for Lindgren.  I haven't
  taught this course in nearly 20 years.  Which texts occupy their position
  today?

  Thanks.

 According to amazon.com, the 1994 5th edition is still in print.
 I keep my much earlier edition closely guarded.  But I too would be
 interested in hearing what the kids learn with today.

 gary
 --
 [EMAIL PROTECTED]






Re: cluster

2000-09-22 Thread Joe Ward

Hi, Thomas --

If you have a SAS Manual the McQuitty method is described briefly in the
CLUSTER Chapter.

Also,  I think the original article is:

McQuitty, L.L. (1966) "Similarity Analysis by Reciprocal Pairs for Discrete
and Continuous Data"
Ed and Psy Meas, 17, 207-229.

Look at:

Anderberg, M.R. (1973) "Cluster Analysis for Applications"  New York,
Academic Press.

--- Joe

- Original Message -
From: "Thomas Pesl" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, September 22, 2000 4:19 AM
Subject: cluster


 Does anyone know the formula of the McQuitty clustering method?

 Thanks,
 Thomas







Re: How can I analyze split-design by SPSS v9.0?

2000-09-07 Thread Joe Ward

Anuvat --

Here comes my "standard" comment!

1.  State your research question(s) in "natural language".
2.  Create a model that enables you to answer the "natural language"
    questions that YOU WANT.
3.  Impose restrictions on YOUR MODEL that answer YOUR questions of
    interest.
4.  Use the computer to get YOUR DESIRED RESULTS.

Then, AFTER YOU HAVE VERIFIED THAT THERE EXISTS A "PACKAGED" ALGORITHM THAT
ANSWERS YOUR QUESTIONS OF INTEREST, USE THE "PACKAGED" ALGORITHM.

Since many "interesting" research questions involve creating models for
unique problems, it can be more efficient to create your OWN MODELS rather
than searching for "packaged" algorithms that MAY fit YOUR research
questions of interest.  IMHO  it seems best to take time to develop
"model-creation" skills so that you can have the POWER that is available.

If you have time to take a look at the URL below, Slides 7 and 8 of the
PowerPoint presentation on "Using Calculators and Computers in Statistics"
(Laura Niland & Joe Ward, CAMT98 45th Annual Conference, San Antonio, July
23, 1998) give a pictorial view of "Forcing" vs. "Creating" Models.

Good luck--

Joe


- Original Message -
From: "Anuvat Jangchud" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, September 06, 2000 10:32 PM
Subject: How can I analyze split-design by SPSS v9.0?


 I would like to use SPSS v.9.0 for SPLIT Design analysis.  Could you help
 me out?






Re: Math Education of Mathematics Teachers

2000-08-01 Thread Joe Ward

Dick --

I'm staying 'til Friday to attend THAT SESSION.

The discussions should be of interest to secondary teachers in the
Indianapolis area.  It would be great if arrangements could be made for
teachers to attend THAT session without needing to register for the JSM.

I think it is Session 281, Thursday, Aug. 17, 10:30 a.m. - 12:30 p.m.

--  Joe





- Original Message -
From: "Richard L. Scheaffer" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, August 01, 2000 1:22 PM
Subject: Math Education of Mathematics Teachers


 I would like to call your attention to a session at the Joint Statistics
 Meetings that those of you interested in statistics education might have
 overlooked.  Session 279, The Importance of Statistics in the Education of
 Future Teachers, reports on a project of the Conference Board of the
 Mathematical Sciences, funded by NSF and DoEd, that will attempt to get
 departments of mathematical sciences more involved in the education of
 future teachers.  Teachers coming out of colleges of education are ill
 equipped to teach in the modern math curriculum - a curriculum that
 includes much statistics.  This project makes a series of recommendations
 on how to solve this problem.  Among the recommendations are strong
 statements about the importance of statistics.

 The panel consists of Alan Tucker, mathematician and lead writer of the
 CBMS report, Judy Sowder, math educator responsible for the middle school
 section of the report, Gail Burrill, former president of NCTM and now head
 of the Math Sciences Education Board at the NAS, and Jerry Moreno, a
 well-known statistics educator.

 Unfortunately, this session is in the last time slot of the meeting, 10:30
 Thursday morning.  So, I hope some of you will have the time and interest
 to stop by.  It should be a lively discussion of a very important topic.

 Hope to see you there!

 Dick Scheaffer



 ps  A draft of the report is on the web.

 CBMS Math Education of Teachers Project Draft Report on the Web
 www.maa.org/cbms


 --
 Richard L. Scheaffer   [EMAIL PROTECTED]
 Department of Statistics phone 352-392-1941 (#224)
 Box 118545 fax 352-392-5175
 University of Florida
 Gainesville, FL 32611

 907 NW 21 Terrace 352-378-1996
 Gainesville, FL  32603





Re: regression books?

2000-07-22 Thread Joe Ward

If you are near a university library you may want to take a look
at INTRODUCTION TO LINEAR MODELS by Ward and Jennings.

The Purdue library might have a copy.

Also, the Fountain-Ward JSE article shown at the URL below is related to
your interest.

http://www.ijoa.org/joeward/wardindex.html

http://www.amstat.org/publications/jse/v4n3/ward.html


-- Joe




- Original Message -
From: "Christopher Tong" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, July 22, 2000 2:12 PM
Subject: regression books?



 Does anyone have recommendations for introductory
 books on regression analysis?  I posted this question on
 sci.stat.math and got only one reply so far.

 I am currently using Neter, Kutner, Nachtsheim, and
 Wasserman, which I find unwieldy and not very concise.
 I have my eye on Montgomery & Peck, but am wondering what anyone
 else would recommend.  My one reply so far suggested Cohen & Cohen.







Re: Novice questions about regression analysis.

2000-06-28 Thread Joe Ward

Good comment, Paige--

" A well-designed experiment will yield regression estimates with more
 desirable properties than a poorly-designed experiment will.
 Specifically, the parameter estimates may have smaller variance in a
 well-designed experiment, and the parameters will be less correlated (or
 uncorrelated) with each other. The predicted values of the responses
 likewise will have smaller variance in a well-designed experiment."

However, it is safest to be sure that the "packaged" analyses do what
the researcher wants.  Do many "packaged COVARIANCE algorithms" still
assume NO INTERACTION?  Does SAS (or other stat packages) warn us
when there is a "missing cell" in an ANOVA-LIKE GLM computation?

-- Joe





- Original Message -
From: "Paige Miller" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, June 28, 2000 11:08 AM
Subject: Re: Novice questions about regression analysis.


 Wen-Feng Hsiao wrote:
 
  Dear listers,
 
  I am stuck with the experiment design of my dissertation. My experiment
  would like to investigate the influences of different factors of stimuli
  on the subject's response (each factor is a continuous variable), and
  further build a regression model for these relations. My questions are:
 
  1. It seems that no experiment-design issues related to Regression
  Analysis are discussed in the usual statistics textbook. Why? Does it
  mean one needn't consider the experiment design if he uses Regression
  Analysis to analyze his data?

 A well-designed experiment will yield regression estimates with more
 desirable properties than a poorly-designed experiment will.
 Specifically, the parameter estimates may have smaller variance in a
 well-designed experiment, and the parameters will be less correlated (or
 uncorrelated) with each other. The predicted values of the responses
 likewise will have smaller variance in a well-designed experiment.
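
 To make the variance point concrete, here is a small sketch comparing two
 made-up designs for the model Y = b0 + b1*x1 + b2*x2. The design points and
 the function name are assumptions chosen only for illustration; the
 quantity compared is the diagonal of (X'X)^-1, which, multiplied by the
 error variance, gives the variances of the coefficient estimates.

```python
import numpy as np

def coef_variances(x1, x2):
    """Diagonal of (X'X)^-1 for Y = b0 + b1*x1 + b2*x2 (times sigma^2 = variances)."""
    X = np.column_stack([np.ones_like(x1), x1, x2])
    return np.diag(np.linalg.inv(X.T @ X))

# Designed: 2x2 factorial at coded levels -1/+1, replicated twice (orthogonal).
x1_good = np.array([-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0])
x2_good = np.array([-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0])

# Haphazard: same range, but x2 nearly tracks x1 (highly correlated factors).
x1_poor = np.array([-1.0, -0.8, -0.5, -0.2, 0.2, 0.5, 0.8, 1.0])
x2_poor = np.array([-0.9, -1.0, -0.4, -0.3, 0.3, 0.4, 1.0, 0.9])

good = coef_variances(x1_good, x2_good)
poor = coef_variances(x1_poor, x2_poor)
print(good)   # equal, small multipliers for all three coefficients
print(poor)   # much larger multipliers for b1 and b2
```

 With the same number of runs, the haphazard design inflates the variance of
 both slope estimates because the two factors move together.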

  2. Because the dependent variable is measured by the participants'
  subjective responses, I am considering a within-subject design to
  remove unrelated subject-specific variables. But there seem to be no
  statistical packages ready for dealing with a within-subject design in
  Regression Analysis?

 SAS and JMP will perform these analyses, although the manual may not
 specifically call them 'within-subject' analyses. Other packages
 probably will handle them as well, but I cannot advise you of specifics.

  Suppose a design in which each of the n subjects gives rise to a Y
  observation under each of c different conditions, so that a total of
  N = nc Y observations is obtained. How can I use Regression Analysis to
  analyze these observations?

 The model will predict the response Y as a function of the subject and
 each of the design variables, plus any desired interactions between
 design variables, interactions between subject and design variables, and
 polynomial terms (if desired) involving design variables.


 --
 Paige Miller
 Eastman Kodak Company
 [EMAIL PROTECTED]

 "It's nothing until I call it!" -- Bill Klem, NL Umpire
 "Those black-eyed peas tasted all right to me" -- Dixie Chicks



===
 This list is open to everyone.  Occasionally, less thoughtful
 people send inappropriate messages.  Please DO NOT COMPLAIN TO
 THE POSTMASTER about these messages because the postmaster has no
 way of controlling them, and excessive complaints will result in
 termination of the list.

 For information about this list, including information about the
 problem of inappropriate messages and information about how to
 unsubscribe, please see the web page at
 http://jse.stat.ncsu.edu/

===








Re: Stupid question on relationship of r and t

2000-06-24 Thread Joe Ward

Jason --

t^2 = r^2*(n-2) / (1 - r^2)

is a special case of the more general case of using R^2 to compute
the F statistic in a Prediction/Regression/Linear Models approach to
research studies.


Letting

R^2(Assumed)    = R^2 for the ASSUMED MODEL
R^2(Restricted) = R^2 for the RESTRICTED MODEL
NA  = number of linearly independent predictor vectors (i.e., the number of
      parameters) in the ASSUMED MODEL
NR  = number of linearly independent predictor vectors (i.e., the number of
      parameters) in the RESTRICTED MODEL
N   = total number of observations (cases)
df1 = NA - NR = numerator degrees of freedom
df2 = N - NA  = denominator degrees of freedom

F(df1,df2) = [(R^2(Assumed) - R^2(Restricted)) / df1] / [(1 - R^2(Assumed)) / df2]
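
The F formula above can be sketched in a few lines of Python. This is only an illustration: the function name, argument names and the numbers in the example are made up here and are not from any package. It also refuses to compute anything when either degrees-of-freedom value is zero or negative, which is the situation where the ratio is undefined.

```python
def f_from_r2(r2_assumed, r2_restricted, n_assumed, n_restricted, n_obs):
    """Return (F, df1, df2) for comparing an assumed model with a restricted one."""
    df1 = n_assumed - n_restricted   # numerator degrees of freedom, NA - NR
    df2 = n_obs - n_assumed          # denominator degrees of freedom, N - NA
    if df1 <= 0 or df2 <= 0:
        raise ValueError("need NA > NR and N > NA")
    f = ((r2_assumed - r2_restricted) / df1) / ((1.0 - r2_assumed) / df2)
    return f, df1, df2


# Hypothetical numbers: simple regression (NA = 2, NR = 1) on N = 32 cases.
f, df1, df2 = f_from_r2(0.25, 0.0, n_assumed=2, n_restricted=1, n_obs=32)
print(f, df1, df2)
```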

Now consider your special case when:

The ASSUMED MODEL CONTAINS ONLY TWO PREDICTORS:

Y = b0*U + b1*X + Ea

and the Hypothesis is "b1 = 0".  Then the RESTRICTED MODEL is:

Y = b0*U + Er

In this special case,

R^2(Restricted) = 0

and then

F(df1,df2) = [R^2(Assumed) / df1] / [(1 - R^2(Assumed)) / df2]

and you can easily solve for R^2 if desired.

R^2(Assumed) = F*df1 / (df2 + F*df1)


and in your special case of only ONE predictor (in addition to U),
sometimes called "simple regression":

df1 = 2 - 1 = 1
 and
df2 = N - 2

R^2(Assumed) = r^2 = F / (N - 2 + F)


but since

t^2(df2) = F(1,df2)

then we have

r^2 = t^2 / (N - 2 + t^2)

which is what you obtain from Bob's
suggestion --

  t = r * sqrt(n-2) / sqrt(1-r^2)
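
For anyone who wants to do the conversion in both directions, here is a minimal Python sketch of the two formulas. The function names and the example values are made up for illustration. The square root loses the sign of r, so the sketch takes the sign from t.

```python
import math


def r_from_t(t, n):
    """Correlation implied by a t statistic on n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    r_squared = t * t / (n - 2 + t * t)
    # The square root loses the sign, so restore it from t.
    return math.copysign(math.sqrt(r_squared), t)


def t_from_r(r, n):
    """The forward formula quoted in the thread."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Round trip with made-up values: r = -0.5 on n = 30 cases.
t = t_from_r(-0.5, 30)
print(t, r_from_t(t, 30))
```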
 
  I want to be able to calculate r from t.  I tried algebraically
  manipulating the formula, but never quite got it to where I could do
  this.  Any advice?
 
 Try squaring both sides and re-arranging.  (Joe Ward's comment: "GOOD SUGGESTION BY BOB")

 Bob

 --
 Bob O'Hara
 Metapopulation Research Group
 Division of Population Biology
 Department of Ecology and Systematics
 PO Box 17 (Arkadiankatu 7)
 FIN-00014 University of Helsinki
 Finland

 tel: +358 9 191 7382  fax: +358 9 191 7301
 email: [EMAIL PROTECTED]
 To induce catatonia, visit:
 http://www.helsinki.fi/science/metapop/



- Original Message -
From: "Anon." [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, June 24, 2000 7:20 AM
Subject: Re: Stupid question on relationship of r and t


 "Jason Osborne, Ph.D." wrote:
 
  I am working on a power analysis project- we are reviewing old journal
  articles to calculate observed effect sizes and power.  Some of these
  articles, for example reporting t-test results, only give means and
  t-test, no standard deviation.  thus, no effect size calculation is
  possible.  I was hoping to estimate an effect size by converting a t to
  an r.  I seem to remember a formula that relates the two, but am having
  a dickens of a time tracking one down.  The one I did track down, for
  calculating t from r, is not that helpful:
 
  t = r * sqrt(n-2) / sqrt(1-r^2)
 
  I want to be able to calculate r from t.  I tried algebraically
  manipulating the formula, but never quite got it to where I could do
  this.  Any advice?
 
 Try squaring both sides and re-arranging.

 Bob

 --
 Bob O'Hara
 Metapopulation Research Group
 Division of Population Biology
 Department of Ecology and Systematics
 PO Box 17 (Arkadiankatu 7)
 FIN-00014 University of Helsinki
 Finland

 tel: +358 9 191 7382  fax: +358 9 191 7301
 email: [EMAIL PROTECTED]
 To induce catatonia, visit:
 http://www.helsinki.fi/science/metapop/

 I have yet to see any problem, however complicated, which, when you
 looked at it in the right way, did not become still more complicated.  -
 Poul Anderson









Re: Beginner requests for help on ANOVA and T-tests (n SYSTAT97 --CAUTION)

2000-06-15 Thread Joe Ward

Edmund--

You may want to use the REGRESSION program in Excel (WITH CAUTION).
That way you can create your own models to do what YOU WANT TO DO.
You might want to contact a statistician to help you use REGRESSION
models.  You don't need to use some of the Pre-Computer algorithms if
you know how to create your models to answer YOUR QUESTIONS.

The URL below has a few articles related to this message:
 http://www.ijoa.org/joeward/wardindex.html

If the "packaged" algorithms answer the questions of interest,
then you can use them.

I am using Excel 97 with three high school students this summer.
2 Sophomores and 1 Senior in preparation for their Science Fair
Research Projects.  I usually use SYSTAT. However, these students
already have Excel, so we are "testing" the use of
REGRESSION in Excel.

Incidentally, when you use REGRESSION models that are forced to pass
through the origin (that is, fitted with NO Y-INTERCEPT TERM),

THE REGRESSION SUM OF SQUARES IS NOT CORRECT.

So be careful when you use REGRESSION in Excel 97.

The Excel 97 error is due to the fact that the REGRESSION SUM OF SQUARES
IS CALCULATED FROM THE "TOTAL SUM OF SQUARES" MINUS THE "RESIDUAL
SUM OF SQUARES".  THE "TOTAL SUM OF SQUARES" IS NOT CORRECT
WHEN YOU INDICATE THAT YOU WANT THE FITTED LINE TO PASS THROUGH
THE ORIGIN.

THE EXCEL PROGRAM USES THE "ADJUSTED SUM OF SQUARES"
(REMOVING THE SUM OF SQUARES ACCOUNTED FOR BY THE
UNIT VECTOR, the "MEAN").  The REAL TOTAL SUM OF SQUARES IN THIS
CASE SHOULD BE THE SUM OF SQUARES FOR THE DEPENDENT VARIABLE.

Apparently the programmer of the REGRESSION  procedure did not know how to
compute the REAL TOTAL SUM OF SQUARES.
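
To see why the choice of TOTAL SUM OF SQUARES matters for a model with no intercept term, here is a small Python sketch. It does not reproduce Excel's code; the function name and the data are made up for illustration. It fits Y = b*X + E and forms the regression sum of squares two ways: from the sum of squares of Y itself, and from the sum of squares adjusted for the mean.

```python
def origin_fit_sums_of_squares(x, y):
    """Sums of squares for the no-intercept model Y = b*X + E."""
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    ss_residual = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    mean_y = sum(y) / len(y)
    ss_total_raw = sum(yi * yi for yi in y)                    # about zero
    ss_total_adjusted = sum((yi - mean_y) ** 2 for yi in y)    # about the mean
    return {
        "b": b,
        "residual": ss_residual,
        "regression_from_raw_total": ss_total_raw - ss_residual,
        "regression_from_adjusted_total": ss_total_adjusted - ss_residual,
    }


# Made-up data chosen so the true intercept is far from zero.
result = origin_fit_sums_of_squares([1.0, 2.0, 3.0], [11.0, 12.0, 13.0])
print(result)
```

With these numbers the version based on the adjusted total comes out negative, which cannot be a sum of squares, while the version based on the sum of squares of Y agrees with b^2 times the sum of squares of X.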

As some of the users and creators of Statistical Software Packages
frequently mention:

"Using the statistical routines in Excel can be risky."

 Of course, ALL statistical packages should be used with caution.

We have not had time to check Excel 2000 to find out if it still has the
same problem.

Keep in touch.

--  JHW

*  Joe Ward
*  167 East Arrowhead Dr.
*  San Antonio, TX 78228-2402
*  Phone: 210-433-6575
*  Fax:  210-433-2828
*  Email: [EMAIL PROTECTED]
*  http://www.ijoa.org/joeward/wardindex.html
*  
*  Health Careers High School
*  4646 Hamilton Wolfe
*  San Antonio, TX 78229
*  Phone: 210-617-5400
*  Fax:   210-617-5423
**

-Original Message-
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Date: Thursday, June 15, 2000 9:37 AM
Subject: Beginner requests for help on ANOVA and T-tests


Hello, I am a 16 year old student and a beginner to statistics.
I'm lost.
Currently I only have Microsoft Excel 97. And I would like to know the
differences between the following ANOVA tests (in Excel):

ANOVA Single Factor
ANOVA Two-Factors with replication
ANOVA Two-Factors without replication

What do all these mean? Where and when should they be applied? And can
anyone please use simple english terms to explain? I am only a beginner.
What is one-way or two-way ANOVA?

How about for T-Test?
T-Test: Paired two samples for means
T-Test: Two-sample assuming equal variances
T-Test: Two-sample assuming unequal variances

Also, can I use ANOVA instead of T-test when testing null hypothesis?
Between 2 groups?

Thanks for your help,
Edmund













Re: MANOVA

2000-06-14 Thread Joe Ward

If the 'ZERO' or 'DOT' means that you have some missing cells then
that is a good time to "CREATE YOUR OWN MODEL".

-- Joe
********
*  Joe Ward
*  167 East Arrowhead Dr.
*  San Antonio, TX 78228-2402
*  Phone: 210-433-6575
*  Fax:  210-433-2828
*  Email: [EMAIL PROTECTED]
*  http://www.ijoa.org/joeward/wardindex.html
*  
*  Health Careers High School
*  4646 Hamilton Wolfe
*  San Antonio, TX 78229
*  Phone: 210-617-5400
*  Fax:   210-617-5423
**
-Original Message-
From: HAideren [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Date: Wednesday, June 14, 2000 8:12 PM
Subject: MANOVA


Hi,

I have run a MANOVA and in the 'Parameter Estimates' section of the results,
some of the cells are filled with a zero or a dot (.). Is there a way to
overcome this problem? If not, should I run a different multivariate test, and
what would be the appropriate substitute test?

Cheers.













Re: Inequalities constrains on the coefficients

2000-06-08 Thread Joe Ward

I asked  Lee Wilkinson how this is done in SYSTAT.
Here is his reply.

-- Joe


*  Joe Ward
*  167 East Arrowhead Dr.
*  San Antonio, TX 78228-2402
*  Phone: 210-433-6575
*  Fax:  210-433-2828
*  Email: [EMAIL PROTECTED]
*  http://www.ijoa.org/joeward/wardindex
*  
*  Health Careers High School
*  4646 Hamilton Wolfe
*  San Antonio, TX 78229
*  Phone: 210-617-5400
*  Fax:   210-617-5423
**
-Original Message-
From: Wilkinson, Leland [EMAIL PROTECTED]
To: 'Joe Ward' [EMAIL PROTECTED]
Date: Thursday, June 08, 2000 9:34 AM
Subject: RE: Inequalities constrains on the coefficients


The SYSTAT procedure NONLIN does the same with the LOSS option and FUNPAR.
Could you perhaps post this to Ed-Stat in the same thread?
Thanks,
Lee

-Original Message-
From: Joe Ward [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 06, 2000 11:56 AM
To: Wilkinson, Leland (SYSTAT
Subject: Fw: Inequalities constrains on the coefficients


Lee --

Is this available in any version of SYSTAT?
What about SYSTAT8-Student Version?

-- Joe

=
-Original Message-
From: Jonathan Fry [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Date: Tuesday, June 06, 2000 11:05 AM
Subject: Re: Inequalities constrains on the coefficients


Arie Beresteanu wrote:

 Hi,

  Estimation of linear (multivariate) regression with equality constraints
 on the coefficients is a well known problem (at least for me). What
 about if the constraints are inequalities? More specifically:

 Y=Xb+e
 s.t.
 Qb=q

 where Q is a matrix and q is a vector. (for example Y=b0+b1*X1+b2*X2+e
 s.t. b1+2*b2=0 )

 How do I solve that? How do I test the constraint? Is there something in
 MatLab/STATA/SAS for that?

 Thank you,
 Arie.

The SPSS procedure CNLR (constrained non-linear regression) handles this
kind of problem directly, using a quadratic programming solver.

Jonathan Fry
SPSS Inc.










Re: R sq vs r sq

2000-05-05 Thread Joe Ward

Hi Paul, William et al.--

This may be ANOTHER GOOD TIME TO COMMENT ON 
THE COMMUNICATION PROBLEMS OF STATISTICS (AND OTHER AREAS, TOO).

I suggest that when we use the terms LINEAR and NONLINEAR that we
tell the reader what the SENDER means by those terms.

When I write:

Y = b1*X1 + b2*X2 + ... + bp*Xp + E

where bi (i = 1,2,...p) are least-squares regression coefficients, I
will refer to this as a LINEAR MODEL.

The Xs can be any numbers that I choose -- log(z), ln(z), z^3, cos(z), 1/z,
binary (1 or 0), ...

If a person writes the form:

Y = a0 + a1*X + a2*X^2 + a3*X^3 + E

then they might say that this is a NONLINEAR model.

As long as the reader knows exactly what the model is-- then we are communicating.

In these days of fancy 3D graphic displays, it is interesting to picture the function:

Y = a0 + a1*X + a2*X^2 

in the 2D space of Y and X -- which appears as a CURVE.

and then picture the function in the 3D space of Y, X and X^2 or
re-designating X^2 as Z 

Y = a0 + a1*X + a2*Z

We notice that the 3D function lies in a PLANE -- reminding us that
we have a "LINEAR MODEL".
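
As a rough illustration of the point about the PLANE, the following Python sketch treats U = 1, X and Z = X^2 as three ordinary predictor columns and fits them by least squares through the normal equations. The helper names and the data are made up, and the normal equations are used only because they are short to write; this is not how a statistical package would necessarily do it.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1.0 + 2.0 * xi + 0.5 * xi * xi for xi in x]   # made-up, error-free data
rows = [[1.0, xi, xi * xi] for xi in x]             # predictors U, X and Z = X^2
a0, a1, a2 = least_squares(rows, y)
print(a0, a1, a2)
```

Nothing in `least_squares` knows that the third column is the square of the second. The curve in X is fitted by exactly the same arithmetic as any other three-predictor linear model.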

If we hurriedly say to someone that "this function is NONLINEAR in the 2D
space of Y and X, but LINEAR in the 3D space of Y, X and Z", then we might
even cause more frustration. :-(

"COMMUNICATION" IS A PROBLEM EVERYWHERE!

DO WILLIAM AND PAUL HAVE THE SAME MEANING FOR "NONLINEAR"?
:-)

--- Joe
**** 
* Joe Ward                     Health Careers High School
* 167 East Arrowhead Dr        4646 Hamilton Wolfe
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575          Phone: 210-617-5400
* Fax: 210-433-2828            Fax: 210-617-5423
* [EMAIL PROTECTED]
* http://www.ijoa.org/joeward/wardindex.html





- Original Message - 
From: Paul Velleman [EMAIL PROTECTED]
To: William J. Larson [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, May 05, 2000 6:43 AM
Subject: Re: R sq vs r sq


| At 11:18 AM +0200 05/05/2000, William J. Larson wrote:
| 
| It appears that R sq is some sort of generalization of r sq
| for nonlinear cases. True?
| 
| Not really. Common convention is to capitalize the R for multiple 
| correlation. The R sqr reported in regressions allows for the 
| generalization of simple regression to a multiple regression (2 or 
| more predictors). In both cases R sqr is the squared correlation 
| between y and y-hat. Y-hat represents the best (in the least squares 
| sense) fit to y among all linear combinations of the x's. All of 
| these are statistics for linear models. It is dangerous to apply them 
| to nonlinear models.
| 
| -- Paul
| -- 
| Paul F. Velleman
| Cornell University  Data Description, Inc.
| 358 Ives Hall  Box 4555
| Ithaca, NY 14853   Ithaca, NY 14852-4555
| (607) 255-4411  (607) 257-1000
| (607) 255-8484 fax(607) 257-4146 fax
| ===
| The Advanced Placement Statistics List
| To UNSUBSCRIBE send a message to [EMAIL PROTECTED] containing:
| unsubscribe apstat-l email address used to subscribe
| Discussion archives are at
| http://forum.swarthmore.edu/epigone/apstat-l
| Problems with the list or your subscription? mailto:[EMAIL PROTECTED]
| ===
| 






Re: R sq vs r sq

2000-05-05 Thread Joe Ward

Bill --

You are so right!!  The term NONLINEAR is very confusing.

As I indicated in the earlier message, most folks in the statistics world
use the term LINEAR MODEL for the form:

Y = b1*X1 + b2*X2 + ... + bp*Xp + E

and some folks will, UNFORTUNATELY, write --

Y = b0 + b1*X1 + b2 * X2 + ... + bp*Xp + E

that leads to more confusion!!

The main point is that the functions are LINEAR IN THE UNKNOWN COEFFICIENTS.

This is why we sometimes take the logs of the function so that the new function is
LINEAR IN THE UNKNOWN COEFFICIENTS -- AND THE SOLUTIONS ARE EASIER.
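
One common example of taking logs, sketched here in Python under the assumption that the model is Y = a*exp(b*X) with every Y positive. The function name and the data are made up for illustration. Taking logs gives ln(Y) = ln(a) + b*X, which is linear in the unknowns ln(a) and b, so ordinary simple regression applies. Note that the least-squares fit is then made on the log scale, not on the original scale of Y.

```python
import math


def fit_exponential(x, y):
    """Fit Y = a * exp(b * X) by least squares on ln(Y) = ln(a) + b * X."""
    log_y = [math.log(yi) for yi in y]       # requires every Y > 0
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    return a, b


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [3.0 * math.exp(0.7 * xi) for xi in x]   # made-up, error-free data
a, b = fit_exponential(x, y)
print(a, b)
```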

A  "REAL" NONLINEAR MODEL NEEDS SOME SPECIAL ALGORITHMS FOR SOLUTION.

---
Someday -- long after I'm out of this world -- the AP-Statistics objectives
WILL ALLOW OUR STUDENTS TO HAVE --

"The power they deserve to use REGRESSION/LINEAR MODELS and COMPUTERS/CALCULATORS
to their fullest".

Perhaps the secondary teachers can speed up improvements through the NCTM
"Principles and Standards for School Mathematics". 

Perhaps there should be an Applied Research Statistics course that has few 
restrictions on the content -- focusing on those topics that help students do what
they NEED to accomplish practical results -- leading to more enthusiasm for
statistics and data analysis.

Change is slow!!

:-)

-- Joe
******** 
* Joe Ward                     Health Careers High School
* 167 East Arrowhead Dr        4646 Hamilton Wolfe
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575          Phone: 210-617-5400
* Fax: 210-433-2828            Fax: 210-617-5423
* [EMAIL PROTECTED]
* http://www.ijoa.org/joeward/wardindex.html




- Original Message - 
From: William J. Larson [EMAIL PROTECTED]
To: Joe Ward [EMAIL PROTECTED]; Paul Velleman [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, May 05, 2000 10:46 AM
Subject: Re: R sq vs r sq


| Joe,
| 
| Well by linear *I* meant what we mean in algebra 2 class y = mx + b,
| but I do not object to calling y = a0 + a1 x1 + a2 x2 + a3 x3 + ... linear.
| I certainly DO object to your definition of linear; although I suppose
| it *is* used by some people, I find it very confusing.
| 
| Cheers,
| Bill Larson
| Geneva, Switzerland
| 
| - Original Message -
| From: Joe Ward [EMAIL PROTECTED]
| To: William J. Larson [EMAIL PROTECTED]; Paul Velleman
| [EMAIL PROTECTED]
| Cc: [EMAIL PROTECTED]
| Sent: 2000 May 05 9:07 PM
| Subject: Re: R sq vs r sq
| 
| 
| 
| Hi Paul, William et al.--
| 
| This may be ANOTHER GOOD TIME TO COMMENT ON
| THE COMMUNICATION PROBLEMS OF STATISTICS (AND OTHER AREAS, TOO).
| 
| I suggest that when we use the terms LINEAR and NONLINEAR that we
| tell the reader what the SENDER means by those terms.
| 
| When I write:
| 
| Y = b1*X1 + b2*X2 + ... + bp*Xp + E
| 
| where bi (i = 1,2,...p) are least-squares regression coefficients, I
| will refer to this as a LINEAR MODEL.
| 
| The Xs can be any numbers that I choose -- log(z), ln(z), z^3, cos(z), 1/z,
| binary (1 or 0), ...
| 
| If a person writes the form:
| 
| Y = a0 + a1*X + a2*X^2 + a3*X^3 + E
| 
| then they might say that this is a NONLINEAR model.
| 
| As long as the reader knows exactly what the model is-- then we are
| communicating.
| 
| In these days of fancy 3D graphic displays, it is interesting to picture the
| function:
| 
| Y = a0 + a1*X + a2*X^2
| 
| in the 2D space of Y and X -- which appears as a CURVE.
| 
| and then picture the function in the 3D space of Y, X and X^2 or
| re-designating X^2 as Z
| 
| Y = a0 + a1*X + a2*Z
| 
| We notice that the 3D function lies in a PLANE -- reminding us that
| we have a "LINEAR MODEL".
| 
| If we hurriedly say to someone that "this function is NONLINEAR in the 2D
| space of Y and X, but LINEAR in the 3D space of Y, X and Z", then we might
| even cause more frustration. :-(
| 
| "COMMUNICATION" IS A PROBLEM EVERYWHERE!
| 
| DO WILLIAM AND PAUL HAVE THE SAME MEANING FOR "NONLINEAR"?
| :-)
| 
| --- Joe
| ********
| * Joe Ward                     Health Careers High School
| * 167 East Arrowhead Dr        4646 Hamilton Wolfe
| * San Antonio, TX 78228-2402   San Antonio, TX 78229
| * Phone: 210-433-6575          Phone: 210-617-5400
| * Fax: 210-433-2828            Fax: 210-617-5423
| * [EMAIL PROTECTED]
| * http://www.ijoa.org/joeward/wardindex.html
| 
|

STATISTICS AT ISEF2000- International Science Engineering Fair -- Detroit May 7-13 --Summer Workshop in San Antonio

2000-05-04 Thread Joe Ward

Topic #1 --The directory of finalists for ISEF2000 is now available at:

http://www.sciserv.org/isef/finaldir.pdf

There are finalists from all U.S. states and over 40 nations.

I did a brief search for MICHIGAN and a few schools represented are:

Renaissance HS
Saginaw Arts & Science Academy
Western High School
Redford HS

It is easy to find finalists near your location.

If you know any finalists, teachers, parents or others who might
be interested, I will present the annual Shop Talk titled:

"Combining the Power of Statistics and Computers to Enhance Science Fair Projects"

at 9:00-10:00 a.m. on Monday, May 8, 2000 in Cobo Hall Room O2-41.

The purpose of this session is to provide guidance to Science Fair students,
teachers and others to help them acquire statistics advice and suggest kinds
of questions they might ask their statistical advisors.  As you can guess, I
will encourage the participants to get assistance from those who can teach
them to create the models needed to answer their, possibly unique, questions
of interest.

You can tell your friends that there will be some valuable drawing prizes for
those who get there early and stay 'til the end.

This year, none of the students whom I advised on their data analysis made it
to ISEF2000.
---sigh :-(

==

Topic #2 --

We have decided to open our Summer Workshop, emphasizing the Power of
Statistics and Computers in Science Research, to a select few folks who may
want to attend FROM OUTSIDE THE SAN ANTONIO REGION.  The application form
with detailed information can be seen at the web site shown below.  This may
be of interest to those who work with student research projects and those
AP-Statistics teachers who have some extra school time AFTER THE MAY AP-EXAM
to introduce their students to some additional data-analysis ideas.
http://www.ijoa.org/joeward/wardindex.html

The dates are May 29 - June 9. If a participant can stay for only the first
week, that's OK.  Those who may be interested can call me to discuss details.

--- Joe
******** 
* Joe Ward                     Health Careers High School
* 167 East Arrowhead Dr        4646 Hamilton Wolfe
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575          Phone: 210-617-5400
* Fax: 210-433-2828            Fax: 210-617-5423
* [EMAIL PROTECTED]
* http://www.ijoa.org/joeward/wardindex.html


















Re: hyp testing -Reply

2000-04-17 Thread Joe Ward

Hi, Robert and all --

Yes, there occasionally were discussions in our Air Force research
whether or not we were working with the POPULATION or a SAMPLE.

As Dennis comments:
| 
|  the flaw here is that ... she has population data i presume ... or about as
|  close as one can come to it ... within the institution ... via the budget
|  or comptroller's office ... THE salary data are known ... so, whatever
|  differences are found ... DEMS are it!
| 

One of my Professors used to use the Invertebrate Paleontologists as his
example of a POPULATION.  I think at that time there were fewer than 20
people who were Invertebrate Paleontologists.

-- Joe
 
* Joe Ward                     Health Careers High School
* 167 East Arrowhead Dr        4646 Hamilton Wolfe
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575          Phone: 210-617-5400
* Fax: 210-433-2828            Fax: 210-617-5423
* [EMAIL PROTECTED]
* http://www.ijoa.org/joeward/wardindex.html




- Original Message - 
From: Robert Dawson [EMAIL PROTECTED]
To: dennis roberts [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Monday, April 17, 2000 9:54 AM
Subject: Re: hyp testing -Reply


| 
| - Original Message -
| From: dennis roberts
|  At 10:32 AM 4/17/00 -0300, Robert Dawson wrote:
| 
|   There's a chapter in J. Utts' mostly wonderful but flawed low-math intro
|  text "Seeing Through Statistics", in which she does much the same. She
|  presents a case study based on some of her own work in which she looked at
|  the question of gender discrimination in pay at her own university, and
|  fails to reject the null hypothesis [no systemic difference in pay between
|  male and female faculty]. She heads the example "Important, but not
|  significant, differences in salaries"; comments (_perhaps_ technically
|  correctly but misleadingly) that "a statistically naive reader could
|  conclude that there is no problem" and in closing states:
| 
| and Dennis Roberts replied:
| 
|  the flaw here is that ... she has population data i presume ... or about as
|  close as one can come to it ... within the institution ... via the budget
|  or comptroller's office ... THE salary data are known ... so, whatever
|  differences are found ... DEMS are it!
| 
|  the notion of statistical significance in this case seems IRRELEVANT ...
|  the real issue is ... given that there are a variety of factors that might
|  account for such differences (numbers in ranks, time in ranks, etc. etc.)
|   is the remaining difference (if there is one) IMPORTANT TO DEAL WITH ...
| 
| 
| If one can totally explain all contributing factors, so that a model
| with significantly fewer parameters than there are faculty fits everybody to
| within a practically significant margin of error, then yes, either the model
| continues to work with gender removed or it doesn't.
| 
| If, on the other hand, there are unknown sources of variation (a
| reasonable assumption in any situation involving people), or more sources of
| variation than there are data (another good bet if one thought hard enough),
| one cannot automatically go from the observation
| 
| (*)  "The average pay of female faculty members here is less than that of
| male faculty members"
| 
| to the apparently desired conclusion
| 
| (**)  "There is a gender-based _pattern_ of discrimination in faculty
| salaries"
| 
| without considering the study as a pseudo-experiment, and analyzing it as
| such.  One would be trying to decide: is the difference between mean male
| and female faculty salaries greater than one would expect if one took N1
| males and N2 females and assigned factors such as experience, rank,
| skill/luck at negotiating a first contract, demand for specialties,  merit
| pay actually deserved [as opposed to given on a gender basis], etc. at
| random?
| 
| This is what Utts and her coauthors were, it seems, trying to do.
| However, when the tests were not significant at the chosen level they seem
| to have fallen back on inferring (**) directly from (*).
| 
| -Robert Dawson
| 
| 
| 

Re: linear model or interactive model?

2000-04-15 Thread Joe Ward

Wen-Feng-

The term LINEAR is a difficult term.

As I mentioned to you in an earlier message (included for
reference at the end of this message),
a LINEAR STATISTICAL MODEL is "LINEAR" in the unknown
coefficients, a1, a2,... ap in the model:

Y = a1*X1 + a2*X2 + ... + ap*Xp + E

The X predictors can be ANY NUMBERS THAT WE LIKE.

If we write --

Y = a1*U + a2*X + a3*X^2 + E

where 
U = 1
X  = a continuous predictor
X^2   = X*X 
E = error or residual

we might say that the function is NON-LINEAR in the two-dimensional Y-X plane,
but it is LINEAR in the three-dimensional space of Y-X-X^2.  With 3-D displays
that we can rotate as we would like, it is enlightening to observe that the
CURVE seen in the two-dimensional space lies in a PLANE in the
three-dimensional space of Y-X-X^2.

-- Joe  
******** 
* Joe Ward                     Health Careers High School
* 167 East Arrowhead Dr        4646 Hamilton Wolfe
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575          Phone: 210-617-5400
* Fax: 210-433-2828            Fax: 210-617-5423
* [EMAIL PROTECTED]
* http://www.ijoa.org/joeward/wardindex.html



- Original Message - 
From: Wen-Feng Hsiao [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, April 15, 2000 5:14 AM
Subject: Re: linear model or interactive model?


| Dear Hartig,
| 
| Thanks for your reply. I am sorry for my poor knowledge in statistics.
| But I wonder why the definition of 'linearity' of statistics is different 
| from that of engineering mathematics, which defines 'linear' as:
| 
|  Each unknown xj appears to the first power only, and that there are no 
| cross product terms xi*xj with i!=j.
| 
| Wen-Feng
| 
| In article [EMAIL PROTECTED], 
| [EMAIL PROTECTED] says...
|  Generally, you can include an interaction (or moderator) term in a linear
|  model, like
|  y = b0 + b1 * x1 + b2 * x2 + b3 * x1*x2,
|  and the model still is linear. If you decide not to include x1 and x2, like
|  y = b0 + b1 * x1*x2,
|  you still have a linear model.
| 


- Original Message ----- 
From: Joe Ward [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Wen-Feng Hsiao [EMAIL PROTECTED]
Sent: Thursday, April 13, 2000 10:30 AM
Subject: Re: linear model or interactive model?

- Original Message - 
From: Wen-Feng Hsiao [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, April 13, 2000 3:06 AM
Subject: linear model or interactive model?

| Dear all,
| 
| Suppose I have an aggregation model which is in the following form:
|   Y = X11 * X12 + X21 * X22.

| 
| This model could be thought of as an aggregation of two pieces of knowledge,
| namely X1. and X2.. Each piece of knowledge contains two pieces of information
| (attributes). For example, X1. contains X11 and X12. Now suppose X.1 is the
| height and X.2 is the weight of a person. Then the aggregation of any 
| two persons, say, Student1(height=170cm, weight=60kg), 
| Student2(height=180cm, weight=68kg) can be represented by
| 
| Y = 170*60+180*68=22440.
| 
| My question: is a model of the above form linear or interactive? I suspect
| it is not a linear model, since it is not in this form: Y = c1 X1 + c2 X2,
| where c1 and c2 are constants. I also suspect it is not a pure interactive
| form, since X.1 and X.2 are dependent. Sorry for this stupid question.
| 
| Wen-Feng
| 
Joe Ward writes:
===

Wen-Feng---

Your model --

Y = X11 * X12 + X21 * X22.

does not have any unknowns.

Did you mean to write:

Y =  c1*(X11 * X12) + c2*(X21 * X22)?

All models of the form:

Y = c1*X1 + c2*X2 + ... + cp*Xp + E

are LINEAR MODELS.

It does not matter what NUMBERS are included in the Xs.

Y = c1*X1 + c2*X2 + c3*(X1*X2) + c4*(X1^2) + c5*(lnX1) + E

is LINEAR in the unknown coefficients c1, c2, ...

The most useful Xs are the BINARY( 1 or 0) predictors.
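
A minimal sketch of the point above, assuming made-up numbers: the columns may hold products and squares of X1 and X2, yet the fit is still an ordinary linear least-squares problem because the unknowns c1, c2, ... enter only to the first power. The helper `least_squares` and the data are illustrative assumptions, not part of the original message.

```python
def least_squares(columns, y):
    """Solve the normal equations (X'X) c = X'y by Gauss-Jordan elimination."""
    p = len(columns)
    # Augmented matrix [X'X | X'y]
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                f = a[r][k] / a[k][k]
                a[r] = [x - f * z for x, z in zip(a[r], a[k])]
    return [a[i][p] / a[i][i] for i in range(p)]


# Hypothetical data: any NUMBERS can go into the Xs.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
predictors = [
    x1,                              # X1
    x2,                              # X2
    [a * b for a, b in zip(x1, x2)],  # X1*X2 (interaction)
    [a * a for a in x1],             # X1^2
]

# Build Y with known coefficients and no error, then recover them.
true_c = [2.0, -1.0, 0.5, 0.25]
y = [sum(c * col[i] for c, col in zip(true_c, predictors)) for i in range(len(x1))]

fitted_c = least_squares(predictors, y)
print(fitted_c)
```

The same routine would handle a BINARY (1 or 0) column or a column of ln(X1) values without any change, since only the numbers in the columns differ.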


--- Joe
******** 
* Joe Ward  Health Careers High School *
* 167 East Arrowhead Dr 4646 Hamilton Wolfe*
* San Antonio, TX 78228-2402San Antonio, TX 78229  *
* Phone: 210-433-6575   Phone: 210-617-5400*
* Fax: 210-433-2828 Fax: 210-617-5423  *
* [EMAIL PROTECTED]*
* http://www.ijoa.org/joeward/wardindex.html   *






Re: hyp testing

2000-04-07 Thread Joe Ward

Hi, Dennis--

Yes, a "LOT of years!" ago (the 1950s), when I first started in the real applied world,
our main job was to PREDICT, PREDICT, PREDICT outcomes.  We had
some real cost figures to evaluate our predictions.  Before the term Bootstrap
arrived on the scene, we were Cross-Validating like mad.  We would divide those
punched cards into "random?" groups, shuffle them over and over again, and "re-group".
Then we would apply the predictions developed from one data set to the others to see
how well we were doing.

Hypothesis testing -- in the classical sense -- was not involved.

I still believe that TWO important ideas in life are:

- PREDICTION
and 
- OPTIMIZATION (choosing among alternative PREDICTIONS to MAXIMIZE or MINIMIZE one
or more OBJECTIVE FUNCTIONS).

If "Hypothesis testing" helps improve PREDICTION and OPTIMIZATION then that's great.

One of the difficulties in academia may be due to the lack of practical,
decision-making opportunities.

What PRACTICAL ACTIONS do we take as a result of analyzing a two-way table with
a Chi-Square "test" if we find a "statistically significant" outcome?  I imagine we
will get some suggestions from our readers!
:-)
-- Joe





- Original Message - 
From: dennis roberts [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, April 07, 2000 6:41 AM
Subject: hyp testing


| let's say that today ... we as the statistical community decided, by 
| democratic vote, that the concept of 'hypothesis testing' ... which has 
| essentially dominated statistical work for as long as i can remember 
| (which,  er um ... is a LOT of years!) ... is relegated to the 'we USED 
| to do this stuff' category
| 
| just THINK about this 
| 
| what would the vast majority of folks who either do inferential work and/or 
| teach it ... DO
| what analyses would they be doing? what would they be teaching?
| 
| 
| 
| ===
| This list is open to everyone.  Occasionally, less thoughtful
| people send inappropriate messages.  Please DO NOT COMPLAIN TO
| THE POSTMASTER about these messages because the postmaster has no
| way of controlling them, and excessive complaints will result in
| termination of the list.
| 
| For information about this list, including information about the
| problem of inappropriate messages and information about how to
| unsubscribe, please see the web page at
| http://jse.stat.ncsu.edu/
| ===
| 






Reference for regression discontinuity

2000-03-22 Thread Joe Ward



Hi, Carl ---

If you still have your copy of
Introduction to Linear Models (Ward & Jennings)
you will find many examples in Chapters 10 and 11.

An interesting example is on page 217,

11.9 Discontinuity Between Two Second-Degree Polynomials.

With the facility to create linear models appropriate to the
research questions of interest, many seemingly-unique problems
can be handled easily, e.g. Cubic Splines.

-- Joe
 





- Original Message - 
From: Carl J Huberty [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, March 22, 2000 8:30 AM

| Will someone give me a (readable) reference for "regression
| discontinuity"? Thanks in advance.
| 
| Carl Huberty
| 



Re: Matrix multiplication

2000-03-18 Thread Joe Ward



David --

Great message!!

One of the most "revealing" numerical analysis problems arises when there is
interest in "POWERING" a transition matrix in a Markov model.

PRE-MULTIPLYING to "POWER" the matrix compared to POST-MULTIPLYING can give
quite different results.

This is due to the different order of accumulation of the sum of products of
numbers between 0 and 1.

Numerical analysts can have lots of challenging problems.
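
A rough sketch of the effect described above, under stated assumptions: the 3x3 transition matrix is made up, and single precision is simulated by rounding every product and partial sum through the `struct` module. In exact arithmetic P*(P^(n-1)) and (P^(n-1))*P are the same matrix, so any gap between the two results below is rounding only. How large the gap is depends on the matrix and the number of steps.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matmul32(a, b):
    """Multiply square matrices, rounding each product and partial sum to single precision."""
    n = len(a)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc = f32(acc + f32(a[i][k] * b[k][j]))
            row.append(acc)
        out.append(row)
    return out


def power(p, steps, pre):
    """Raise p to the given power by repeated pre- or post-multiplication."""
    result = p
    for _ in range(steps - 1):
        result = matmul32(p, result) if pre else matmul32(result, p)
    return result


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical Markov transition matrix (each row sums to 1 before rounding).
P = [[f32(v) for v in row] for row in
     [[0.90, 0.07, 0.03],
      [0.20, 0.50, 0.30],
      [0.10, 0.30, 0.60]]]

pre = power(P, 50, pre=True)
post = power(P, 50, pre=False)
print("largest element-wise gap:", max_abs_diff(pre, post))
print("row sums (pre): ", [sum(r) for r in pre])
print("row sums (post):", [sum(r) for r in post])
```

Watching how far the row sums drift from 1 is one simple check on accumulated rounding error in either ordering.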

-- Joe




- Original Message - 
From: David A. Heiser [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Anthony Pleticos [EMAIL PROTECTED]
Sent: Friday, March 17, 2000 2:27 PM
Subject: Re: Matrix multiplication

| - Original Message -
| From: Anthony Pleticos [EMAIL PROTECTED]
| To: [EMAIL PROTECTED]
| Sent: Wednesday, March 15, 2000 4:24 PM
| Subject: Matrix multiplication
| 
|  I don't know if I hit the correct site but would be grateful for an
|  answer - it is a fundamental one. We all know that linear regression can be
|  accomplished by matrix multiplication and that there are packages which
|  will do it for you. I am teaching myself C++ and for the purposes of the
|  exercise I would like to know how to create a matrix or obtain ready-made
|  code (i.e. a "numerical recipe") class so I could declare in a program:
| 
|  #include <iostream.h>
|  #include <math.h>
|  #include <matrix.h> /* if there is such a file */
| 
| The basic problem is that there are enormous differences between real-world
| matrices. There is no one method for numerical matrix reductions. For
| example, note the very large number of Fortran subroutines that focus on
| peculiar aspects (banded, complex, sparse, near singular, positive definite,
| not positive definite, triangular, rank deficient, etc.). Note the large
| number of free Fortran subroutines devoted to matrices in "NETLIB". There
| are other free Fortran libraries available from the web.
| 
| Matrix multiplication is not numerically straightforward given a finite
| computer environment. One can get very misleading results doing the standard
| multiply and add method using standard single precision.
| 
| I would suggest you get familiar with numerical analysis methods. I
| personally prefer the works of G. W. Stewart as a source.
| 
| DAHeiser
| 



Re: Why do we use and teach z?

2000-03-17 Thread Joe Ward



Josh, Bill, et al --

I can't resist!!

Yes, those who have invested much of their life in acquiring certain
knowledge tend to want future generations to have those "exciting"
historical experiences.  It is rather unfortunate that we have a hard
time making changes to give future generations some of the power they deserve.

I experienced some difficulty in the 1950s with those folks who had become
"masters" of the various analysis of variance algorithms that were developed
before computers became available.  My first major job in the 1950s was to
"get us off of Friden, Marchant and Monroe desk calculators onto the
IBM 602A followed by IBM 607, then IBM 650 etc."  The biggest difficulty was
to get researchers to take advantage of the computer power that allowed them the
freedom to create their own models to answer their questions of interest.

It was very difficult for persons with Ph.D. degrees to give up that which
they had invested so much time to learn.  It was a little "traumatic" in the
1950s when a Ph.D. was told that "you don't need to have equal or proportional
Ns in a two-way ANOVA".  And it was really interesting to see the reaction when
they were told that "you don't need a response in every cell".  As a matter of
fact, the managers of our Air Force research organization assembled a panel of
experts to come in to find out what Bob Bottenberg and I were up to when we were
promoting the use of a more general approach to creating models to answer
research questions of interest.

It is indeed amazing that, 40 years later, many first-course statistics students
are told that "IT IS BEYOND THE SCOPE OF THIS TEXT TO DEAL WITH SITUATIONS IN WHICH
SAMPLE SIZES ARE UNEQUAL IN THE CELLS OF TWO-WAY ANOVA".
It is little wonder that these students can do very little data analysis in support
of practical research.

A few of you have heard this "sermon" before!!

By the way, those of you who have six weeks of school after the exam might
want to give your students some power to use Prediction/Regression/Linear Models
and Computers.  They might be able to do some useful data analysis and appreciate
your efforts!!

Well, that's enough from a "NON-INFLUENTIAL OUTLIER".

-- Joe

 

  - Original Message - 
  From: Joshua Tabor
  To: William J. Larson; AP Stats. list
  Sent: Friday, March 17, 2000 9:11 AM
  Subject: RE: Why do we use and teach z?

  I agree with you completely.  The only explanation I received for why it is
  still in most books is that it is a nice stepping stone to a full-fledged
  t-test (of course, it is very likely I am misinformed).  Anyway, this year I
  have decided to teach inference for proportions first (as the stepping stone)
  and then go straight into t-tests, eliminating z-tests for means.  It helps
  make the course more realistic, and it saves me precious time (we start the
  second week of September and have 6 weeks of school after the AP!).

  I am curious to hear what the college folks (and textbook authors) have to say.

  josh

  Josh Tabor
  Wilson HS
  Hacienda Heights, CA
  [EMAIL PROTECTED]

  William J. Larson wrote:

  Why do we use and teach z?

  As I continually tell my students, normally (no pun intended) we do not
  know sigma, so we should use t not z.  Indeed, can we ever know sigma?  If
  not, why do we even bother to mention z?  Is it historical reasons?  Or
  because in the real world lots of people ignore the above fact and
  use z anyway, so we are conscientiously preparing our students for the
  real world?  Or (more likely) am I missing something?

  Dr. William J. Larson
  [EMAIL PROTECTED]
  Institut Monte Rosa
  Montreux, Switzerland

  ===
  The Advanced Placement Statistics List
  To UNSUBSCRIBE send a message to [EMAIL PROTECTED] containing:
  unsubscribe apstat-l email address used to subscribe
  Discussion archives are at
  http://forum.swarthmore.edu/epigone/apstat-l
  Problems with the list or your subscription? mailto:[EMAIL PROTECTED]
  ===

Re: Looking for text on resampling...

2000-03-17 Thread Joe Ward



Scott --
Peter Bruce should be able to give us the latest "word".
-- Joe



- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, February 06, 2000 12:08 PM
Subject: Looking for text on resampling...

|  Our small college library has a collection of basic biostats texts but
| nothing that specifically covers the area of resampling. I am currently
| looking over a 1991 text by Bryan Manly (Randomization and Monte Carlo
| Methods in Biology) - the first two chapters seem quite accessible (to
| someone unfamiliar with the field!)
| 
|  Could anyone suggest other texts that might cover bootstrapping and
| jackknife techniques - I would favour texts that have a biology bent and
| are written so non-specialists can follow...
| 
|  Many thanks!
| 
| Scott
| 
| Sent via Deja.com http://www.deja.com/
| Before you buy.
| 
| ===
|  This list is open to everyone. Occasionally, people lacking respect
|  for other members of the list send messages that are inappropriate
|  or unrelated to the list's discussion topics. Please just delete the
|  offensive email.
| 
|  For information concerning the list, please see the following web page:
|  http://jse.stat.ncsu.edu/
| ===



Re: When *must* use weighted LS?

2000-03-15 Thread Joe Ward



John--

If you are interested in PREDICTION, then the way YOU use your information is up to
YOU.  By Cross-validation, Resampling, etc. you can determine which prediction method
seems to be "best" for your situation.
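
One way to act on this, sketched under assumptions: the data, the sample standard deviations, and the choice of weights 1/sd^2 are all hypothetical, and leave-one-out cross-validation with an unweighted squared prediction error is only one of several reasonable scoring rules. The function names `fit_line` and `loo_cv` are illustrative.

```python
def fit_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; equal weights give ordinary LS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def loo_cv(x, y, w):
    """Leave-one-out cross-validation: mean squared error of held-out predictions."""
    total = 0.0
    for i in range(len(x)):
        keep = [j for j in range(len(x)) if j != i]
        a, b = fit_line([x[j] for j in keep], [y[j] for j in keep], [w[j] for j in keep])
        total += (y[i] - (a + b * x[i])) ** 2
    return total / len(x)


# Hypothetical observations with a sample standard deviation at each x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.2, 3.8, 6.3, 7.7, 10.9, 11.2, 15.1, 14.6]
sd = [0.2, 0.2, 0.4, 0.4, 0.9, 0.9, 1.5, 1.5]

ols_w = [1.0] * len(x)               # ordinary least squares
wls_w = [1.0 / s ** 2 for s in sd]   # weights from the sample standard deviations

print("OLS leave-one-out MSE:", loo_cv(x, y, ols_w))
print("WLS leave-one-out MSE:", loo_cv(x, y, wls_w))
```

Whichever weighting gives the smaller held-out error on your own data is the one that seems "best" for prediction in that situation.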

-- Joe





- Original Message - 
From: John Hendrickx [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, March 15, 2000 1:22 AM
Subject: Re: When *must* use weighted LS?

| In article 8am7d1$hqj$[EMAIL PROTECTED], 
| [EMAIL PROTECTED] says...
|  
|  I think I made the formulation too wordy in previous
|  post.
|  
|  Let me try this simple question:
|  
|  When one wishes to do a (multi)linear regression on a set of
|  observed data, and one is in the (unusual) position of possessing
|  a set of sample standard deviations (of varying degrees of f.)
|  at each value of the "explanatory" variable, how does one
|  determine whether one ought or ought not to solve the weighted
|  least squares problem using those sample standard deviations?
|  
|  What is the usual decision test for "heteroscedasticity" *before* one
|  solves the regression system? What do people do in practise?
|  
| Most social scientists don't worry very much about the assumptions of OLS
| regression, noting that OLS estimates are fairly robust and can give
| unbiased estimates even if those assumptions aren't fulfilled. Exceptions
| are multilevel models and time series data, data for which the assumption
| of uncorrelated error terms is violated. But these require special
| programs, not weighted least squares.
| 
| There is also some debate on using weights for stratified sampling and/or
| to correct for sampling bias. Weighting leads to correct estimates but
| incorrect standard errors. One solution is to include the design
| variables in the model instead of weighting. Stata and Wesvar are two
| programs that can take weighting into account when calculating standard
| errors of estimates. But a quite common approach is to use weights for
| descriptive statistics, but not in multivariate models.
| 
| Weights can also be used for certain dependent variables that will
| violate the assumption of homoscedasticity, e.g. a dichotomous
| dependent. I recently did a weighted least squares analysis for a
| co-worker to replicate an analysis in another paper. The weight was
| groupn*pct*(1-pct), where groupn was the number of cases per group and
| pct was the proportion with a positive response within each group. But
| this basically amounts to a poor approximation of a logit model. Programs
| like GLIM that use iteratively reweighted least squares use pct*(1-pct)
| as the weight when estimating the model, but now pct is the predicted
| probability from the previous iteration.
| 
| As for a test for heteroscedasticity, Stata has a "hettest", which
| performs a Cook-Weisberg test and produces a chi-square statistic. They
| wrote a book in 1982, "Residuals and influence in regression". I've never
| used it though.
| 
| Hope this helps,
| John Hendrickx
| 



Re: Repeated measures

2000-03-09 Thread Joe Ward



Hi, Kasper--

The CORRECT model is the one that allows YOU to answer YOUR OWN
questions of interest.  If the "packaged" PROCs have been
verified to do what YOU want, then that's good.
It is sometimes difficult to know what question a "packaged" PROC
is attempting to answer.

Be careful -- especially if there may be "missing cells".

:-)
--Joe




- Original Message - 
From: Kasper Hornbæk [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, March 09, 2000 1:41 AM
Subject: Q: Repeated measures

| Hi everybody.
| I have a question concerning repeated measures analysis. I am not sure
| whether a linear model with a factor that varies as repeated measures are
| taken (e.g., order or session) is identical to a repeated measures analysis.
| I'll detail the question below.
| 
| I have a within-subject study in which subjects used three methods to solve
| six different tasks. The experiment is run in three sessions, each
| consisting of two tasks. Three of the tasks are very different from the
| other three tasks.
| 
| For analysing this experiment, I plan to use a model like
| Y[ijkl] := u + subject[i] + task[j] + session[k] + method[l] + e[ijkl],
| possibly adding interactions between task, method and session. Is this a
| repeated measures analysis or equivalent to a repeated measures analysis?
| 
| If not, how should I analyse these data using SAS's repeated measures
| option?
| 
| Kind regards,
|  Kasper Hornbæk
|  kash(at)diku.dk
| 



Fw: other uses for Minitab

2000-03-06 Thread Joe Ward



Hi, Tim --

It's good to hear that some folks think it is useful to fit a least-squares
line through the origin.  Of course it is even better to be able to "force"
a least-squares model to have a wide range of properties (restrictions).

Without any connection to statistics, students should be given the opportunity
to use their algebra "savvy" to impose restrictions on math models.

For example, given a model of the form:

Y = a0 + a1*X + a2*X^2 + E

it might be of interest to "restrict" the model to:

-- Pass through the origin
or
-- Pass through X=1 and Y = 2
or
-- Slope = 0 at X=5 (For the calculus crowd)
or
Many others!

---

Using Algebra, Geometry and Trig. the "least-squares story" can be presented
to students WITHOUT CALCULUS.

Minimizing "distance" from a point to a line, or plane, or hyper-plane seems to
be more appealing than taking partial derivatives.  Connecting "perpendicularity"
to "orthogonality" seems to work well.

-- Joe

- Original Message -
From: Tim Erickson [EMAIL PROTECTED]
To: Joe Ward [EMAIL PROTECTED]
Sent: Sunday, March 05, 2000 3:28 PM
Subject: Re: other uses for Minitab

| on 00.03.03 10:51 PM, Joe Ward at [EMAIL PROTECTED] wrote:
| 
|  A Bob, you remembered.
|  
|  I've been "bugging" the calculator makers for many years about including
|  the least-squares model of the form:
|  
|  LinReg(bx), Letting the function pass through the origin.
| 
| just a note -- Fathom has a "lock Intercept at Zero" command for its least
| squares regression, which amounts to the same thing.
| 
| I think it's also an interesting exercise for a (calculus?) student to
| derive a formula for "b" given an arbitrary set of data and the constraint
| that b must minimize the sum of squares of the residuals. At least it was
| interesting to me!
| 
| Tim
| 
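
The three restrictions Joe lists for Y = a0 + a1*X + a2*X^2 + E can each be imposed with algebra alone: substitute the restriction into the model, collect terms, and fit the reduced model. The sketch below assumes made-up data and illustrative function names; it is one way to carry out the substitution, not a method taken from the message.

```python
def least_squares(columns, y):
    """Solve the normal equations (X'X) c = X'y by Gauss-Jordan elimination."""
    p = len(columns)
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                f = a[r][k] / a[k][k]
                a[r] = [x - f * z for x, z in zip(a[r], a[k])]
    return [a[i][p] / a[i][i] for i in range(p)]


def through_origin(x, y):
    """Restriction a0 = 0:  Y = a1*X + a2*X^2 + E."""
    a1, a2 = least_squares([x, [v * v for v in x]], y)
    return 0.0, a1, a2


def through_point(x, y, x0, y0):
    """Restriction a0 + a1*x0 + a2*x0^2 = y0:  (Y - y0) = a1*(X - x0) + a2*(X^2 - x0^2) + E."""
    cols = [[v - x0 for v in x], [v * v - x0 * x0 for v in x]]
    a1, a2 = least_squares(cols, [v - y0 for v in y])
    return y0 - a1 * x0 - a2 * x0 * x0, a1, a2


def flat_at(x, y, x0):
    """Restriction a1 + 2*a2*x0 = 0:  Y = a0*U + a2*(X^2 - 2*x0*X) + E."""
    cols = [[1.0] * len(x), [v * v - 2.0 * x0 * v for v in x]]
    a0, a2 = least_squares(cols, y)
    return a0, -2.0 * x0 * a2, a2


# Hypothetical data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.4, 2.3, 3.9, 5.2, 5.8, 6.1, 5.7]

print("through origin:   ", through_origin(x, y))
print("through (1, 2):   ", through_point(x, y, 1.0, 2.0))
print("slope 0 at X = 5: ", flat_at(x, y, 5.0))
```

Each function returns (a0, a1, a2) for the full quadratic, so the restricted fits can be compared directly with one another.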

Earl Jennings
6917 Thorncliffe Dr.
Austin, TX 78731-2955
Phone: (512) 345-0628
e-mail address: [EMAIL PROTECTED]



Re: Howto interpret interactions in an ANOVA

2000-02-29 Thread Joe Ward



Hi all --

Again -- I'm jumping on the band wagon in support of these messages that
advocate -- what I call -- a PREDICTION/REGRESSION/LINEAR MODELS approach.

I was attracted to Lee Wilkinson and SYSTAT many years ago when Lee
had a sign at one of his SYSTAT BOOTHS that said:

"Ask me about Cell Means Analysis" (May not be Lee's exact words)

I was so excited to see a software package that required the user to
insert the word CONSTANT in the regression model when the user
wanted it -- NOT AS THE DEFAULT.  When using SAS at
Clemson in 1985-86, I had to tell students that they must use the NOINT
OPTION until I explained why.  A most misunderstood and troublesome idea
is the predictor, U, a vector of 1's.  If students
would -- in the beginning -- insert THEIR OWN U, when needed, then they might
have a better understanding of the "efficiency" of having the CONSTANT or INTERCEPT
as the DEFAULT.  This lack of understanding about the CONSTANT or INTERCEPT is
revealed by the many Email messages we see related to "What is RSQ WHEN there is NO
CONSTANT or INTERCEPT".

It is interesting that the more "modern" versions of SYSTAT require the user to
REMOVE THE CONSTANT when appropriate.

It would be really great if the statistics education folks would advocate the
introduction of PREDICTION/REGRESSION/LINEAR MODELS early so that the students
would have something useful in their experience and perhaps continue their study
of statistics.  I'm afraid that many FIRST STATISTICS COURSES have little
"selling/marketing" effect on students.

The "Cell-Means Approach" is easy to introduce to high school students, since
these students have experiences with AVERAGES, MEANS, GPAs.  And the
"Missing Cells Problem?" is really not a problem until the students are
told that some folks don't know what to do about "Missing Cells".
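
A small sketch of the cell-means idea, assuming a made-up two-way layout with unequal Ns and one empty cell. Each observed cell gets its own BINARY (1 or 0) predictor and no CONSTANT is included; because those columns never overlap, each least-squares coefficient is just (X'Y)/(X'X) for its own column, which is the cell average. The labels and numbers are hypothetical.

```python
# Hypothetical observations: (level of A, level of B, response).
# Cell (a2, b2) has no data at all, and the other cells have unequal Ns.
obs = [
    ("a1", "b1", 10.0), ("a1", "b1", 12.0), ("a1", "b1", 11.0),
    ("a1", "b2", 20.0), ("a1", "b2", 22.0),
    ("a2", "b1", 30.0),
]

y = [value for _, _, value in obs]
cells = sorted({(a, b) for a, b, _ in obs})

# One binary predictor per observed cell; no column of 1's (no U).
columns = {
    cell: [1.0 if (a, b) == cell else 0.0 for a, b, _ in obs]
    for cell in cells
}


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# The binary columns are mutually orthogonal, so the normal equations decouple.
coef = {cell: dot(col, y) / dot(col, col) for cell, col in columns.items()}

for cell in cells:
    print(cell, "N =", int(dot(columns[cell], columns[cell])), "coefficient =", coef[cell])
```

The empty cell simply has no predictor and no coefficient; questions of interest are then posed as comparisons among the cell means that do exist.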

Enough "preaching to the choir"!!

--Joe


 
- Original Message - 
From: Gregory C. Mayer [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, February 29, 2000 6:46 AM
Subject: Re: Howto interpret interactions in an ANOVA

| R.R. Sokal & F.J. Rohlf in Biometry (1995, Freeman) emphasize the unity of
| anova, ancova and regression (and in their shorter Introduction to
| Biostatistics, anova and regression).  They introduce them in turn,
| however; I agree that a text that began with glm and then took up anova,
| ancova and regression as instances of the general approach would be
| preferable.  This is especially so when using Systat, as the model
| statements closely parallel the models, allowing more complex
| models to be grasped and implemented immediately, instead of being treated
| as some new technique.
| 
| Gregory C. Mayer
| [EMAIL PROTECTED]
| 
| On Mon, 28 Feb 2000, Bob Madden wrote:
| 
|  I agree. In fact, I have sought in vain for an introductory level statistics
|  text that does not treat ANOVA and regression as two totally separate,
|  disconnected techniques.
|  With disconcerting monotony, they all monkey each other in this respect. I
|  think students would be better served by being shown early on that
|  regression, ANOVA, and for that matter, ANCOVA, are all special cases of
|  the glm.
|  
|  --Bob Madden
|  
|  James Friedrich wrote:
|  
|   Let me add to the speculation regarding why interaction effects are often
|   omitted from multiple regression.  I think the reality is that people are
|   generally trained in one "mode" or the other (ANOVA or Regression) without
|   a sense of their connectedness (a point already alluded to in previous
|   posts).  In an in-press national survey of undergraduate statistical
|   instruction for psychology majors, I found that ANOVA dominates, with
|   little attention to regression (except "simple").  The specialties of
|   those teaching the stats / methods courses tend to be in laboratory -
|   experimental areas where ANOVAs are the norm.  The bottom line is that I
|   don't think budding psychologists, at least, get much training - or good
|   training - in MR or GLM perspectives.  I also see this in advising /
|   consulting I do with biology students.  Sadly, I think the heavy ANOVA
|   emphasis and minimal attention to regression approaches has the side
|   effect of leaving people poorly schooled in measurement issues.  My
|   experience has been that professionals well-versed in MR / GLM are much
|   more in tune with these concerns.
|  

Re: Linear Regression with known intercept (Long Message)

2000-02-14 Thread Joe Ward




  Mark writes -
  
  - Original Message - From: [EMAIL PROTECTED]To: [EMAIL PROTECTED]Sent: 
  Saturday, February 12, 2000 4:51 PMSubject: Linear Regression with known 
  intercept| Hi,| | If I want to find the least squares 
  estimator of the slope of a simple| linear regression model where my 
  intercept is known, will this| estimator will be the same as if I did not 
  know my intercept(=Sxy/sxx)?| How about the variance and the confidence 
  interval of my estimator?| will they be bigger or smaller than the 
  estimator for the case where| both my intercept and slope unknown?| 
  | Thank you for your help.| | Mark| | | Sent via 
  Deja.com http://www.deja.com/---Hi, Mark --Glad 
  you sent this Email. It is a nice and simple example of the useof 
  Prediction/Regression/Linear Models -- which should be one of theimportant 
  objectives of a FIRST NON-CALCULUS-BASED STATISTICS COURSE.Consider, 
  first, the Simple Regression Model:Y = a1*U + a2*X + 
  E1where Y = a vector containing 
  observations on a dependent or response variable.U = a predictor (vector) containing all 1's.(THE MOST 
  NEGLECTED AND NON-UNDERSTOOD PREDICTOR OF ALL)X = another predictor 
  with any elements -- could be BINARY (0,1).E1= the Error or Residual 
  vector.a1 = least-squares regression coefficient 
  of U (this is frequently 
  referred to as the "Y-intercept").a2 = least-squares regression 
  coefficient of X (this is 
  frequently referred to as the "Slope".A powerful capability to give 
  students who are comfortable withAlgebra is to be able to IMPOSE ANY 
  DESIRED LINEAR RESTRICTIONSON A 
  LINEAR MODEL OF THE FORM:Y = a1*X1 + a2*X2 + ... + ap *Xp + 
  EThis capability is useful in many applications 
  BESIDES STATISTICS.Now, to your neat example:"If I want to 
  find the least squares estimator of the slope of a simple linear regression
  model where my intercept is known, ... "

  You wish to impose the restriction that --

  a1 = k (a known value)

  Imposing that restriction on Model 1 above gives:

  Y = k*U + a2*X + E2

  The only unknown regression coefficient is a2, which I will rename:
  Let b2 = a2 to remind us that the numerical value of the coefficient of X
  in Model 1 is most likely different from the value in Model 2.

  Then,

  Y = k*U + b2*X + E2

  Since k*U is known, the least-squares value for b2 is obtained from:

  Y - k*U = b2*X + E2

  or, letting Y - k*U be designated by a single symbol, W:

  W = b2*X + E2

  and the least-squares value of b2 for Model 2 (and for any
  ONE-PREDICTOR model) is:

  b2 = (W'X)/(X'X) = Sum(wi*xi)/Sum(xi*xi)

  b2 is the slope of the line that is "forced by the restriction" a1 = k.

  Most software now allows one to find the value of b2 by selecting
  an option that requires that the vector U be omitted as a predictor.

  If you have good software available, the software will produce the
  standard errors of a1 and a2 by solving equation 1 and the standard
  error of b2 by solving equation 2.

  ---

  Now, if it is "interesting" to TEST AN HYPOTHESIS THAT --

  a1 = k

  Then a statistics student may want to compute:

       (SSQE2 - SSQE1)/(2-1)
  F = -----------------------
           (SSQE1)/(n-2)

       (SSQE2 - SSQE1)/1
  F = -------------------
         (SSQE1)/(n-2)

  and since F(1,df2) = t^2(df2)

  t(df2) = sqrt(F(1,df2))

  This IS a "t-test".

  And, perhaps, from this value of "t" another statistics student
  might want to compute the Standard Error of a1, and then compute
  a Confidence Interval.

  The astute student can compute the Standard Error from:

  t = Statistic/Standard Error

  but since the numerical values of t and the "Statistic" are known we have:

  Standard Error = Statistic/t

  In this particular case the "Statistic" is the distance of the estimate
  a1 from the hypothesized value k, so

  Standard Error = |a1 - k|/t

  (which is |a1|/t when k = 0).

  This procedure allows for easy computation of the "Standard Error" of any
  of the 'weights' (intercept or slope) in a regression model and, in the
  more general case, any linear combination of the weights in a multiple
  linear regression model.
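
  The procedure above can be sketched in a few lines of Python. This is
  only an illustration under stated assumptions: the data and the value
  k = 1.0 are made up, and the function names (fit_line,
  fit_slope_known_intercept) are hypothetical helpers, not part of any
  package. It fits Model 1 and Model 2, forms F from the two error sums
  of squares, and recovers the Standard Error of a1 as |a1 - k|/t. The
  textbook formula for that standard error is computed as a cross-check.

```python
import math

def fit_line(x, y):
    """Model 1: Y = a1*U + a2*X + E1 (intercept a1 and slope a2 both free)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    a2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a1 = my - a2 * mx
    ssqe1 = sum((yi - a1 - a2 * xi) ** 2 for xi, yi in zip(x, y))
    return a1, a2, ssqe1

def fit_slope_known_intercept(x, y, k):
    """Model 2: W = Y - k*U = b2*X + E2, so b2 = Sum(wi*xi)/Sum(xi*xi)."""
    w = [yi - k for yi in y]
    b2 = sum(wi * xi for wi, xi in zip(w, x)) / sum(xi * xi for xi in x)
    ssqe2 = sum((wi - b2 * xi) ** 2 for wi, xi in zip(w, x))
    return b2, ssqe2

# Made-up data and a made-up known intercept, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
k = 1.0
n = len(x)

a1, a2, ssqe1 = fit_line(x, y)
b2, ssqe2 = fit_slope_known_intercept(x, y, k)

# F compares the restricted model (a1 = k) with the unrestricted one.
F = ((ssqe2 - ssqe1) / (2 - 1)) / (ssqe1 / (n - 2))
t = math.sqrt(F)
se_from_F = abs(a1 - k) / t

# Cross-check against the usual closed form for the intercept's standard error.
mx = sum(x) / n
sxx = sum((xi - mx) ** 2 for xi in x)
se_textbook = math.sqrt(ssqe1 / (n - 2) * (1.0 / n + mx ** 2 / sxx))

print("a1 =", a1, " a2 =", a2, " b2 =", b2)
print("F =", F, " t =", t)
print("SE(a1) from F:", se_from_F, " textbook:", se_textbook)
```

  The two standard errors agree up to rounding, which is the point of
  the "Standard Error = Statistic/t" shortcut.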
  
  Sorry for the length of this message, but I couldn't resist 
  promoting theuse of Prediction/Regression/Linear Models for ALL 
  STUDENTS.--- Joe
  
  
  
  
  
  
  
  
  
  


Re: ANN vs. nonlinear regression: forecasting

2000-02-11 Thread Joe Ward



John --

Sounds very interesting--

If you mean the "classical" least-squares model, there are no assumptions
involved in fitting least-squares. It's only the "statistics" (tests,
intervals) that bring in the extra "assumptions".

PREDICTION is the important thing.

Compare the PREDICTIVE accuracy/costs/etc. of various approaches.

You may wish to include RESAMPLING/BOOTSTRAP/CROSS-VALIDATION in your
research.

The proof of the "best" is how well it PREDICTS.

I will be interested in what you learn.
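
One way to compare predictive accuracy is k-fold cross-validation. The
sketch below is only an illustration: the two candidate "models" (a
constant mean and a straight line), the made-up series, and the helper
names (fit_mean, fit_line, cv_mse) are all assumptions, standing in for
whatever neural network and classical estimator are actually being
compared. Each candidate is fit on the training folds and scored on the
held-out fold.

```python
def fit_mean(xs, ys):
    """Candidate 1: predict the training mean for every x."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate 2: least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cv_mse(fit, xs, ys, k=5):
    """Mean squared prediction error over k held-out folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(xs[i])) ** 2 for i in held_out)
    return total / n

# Made-up series: a linear trend plus a small alternating disturbance.
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + (0.3 if i % 2 == 0 else -0.3) for i, x in enumerate(xs)]

mse_mean = cv_mse(fit_mean, xs, ys)
mse_line = cv_mse(fit_line, xs, ys)
print("held-out MSE, mean model:", mse_mean)
print("held-out MSE, line model:", mse_line)
```

For a real time series the folds would normally respect time order
(fit on the past, predict the future) rather than interleave
observations as this toy split does.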

-- Joe
********
* Joe Ward                      Health Careers High School *
* 167 East Arrowhead Dr         4646 Hamilton Wolfe        *
* San Antonio, TX 78228-2402    San Antonio, TX 78229      *
* Phone: 210-433-6575           Phone: 210-617-5400        *
* Fax: 210-433-2828             Fax: 210-617-5423          *
* [EMAIL PROTECTED]                                        *
* http://www.ijoa.org/joeward/wardindex.html               *
********


- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, February 11, 2000 7:01 AM
Subject: ANN vs. nonlinear regression: forecasting

| I'm working on a study that compares neural networks to classical
| nonlinear statistical estimators in forecasting time series. My thesis is
| that the NN would be robust under conditions where the assumptions of
| the classical model are not met, and the nn would be inferior where the
| classical assumptions are satisfied.
| 
| What would be a good classical model to compare a neural network to?
| Does anyone know of any papers/sources on this subject?
| 
| I sincerely appreciate any help/suggestions.
| 
| John Carrier
| [EMAIL PROTECTED]
| 
| 
| Sent via Deja.com http://www.deja.com/
| Before you buy.
| 
| 
| ===
|   This list is open to everyone. Occasionally, people lacking respect
|   for other members of the list send messages that are inappropriate
|   or unrelated to the list's discussion topics. Please just delete the
|   offensive email.
| 
|   For information concerning the list, please see the following web page:
|   http://jse.stat.ncsu.edu/
| ===
| 



Re: adjusting marks; W. Edwards Deming

2000-02-09 Thread Joe Ward



--- Robert Knodt writes in response to the message at
http://www.remarq.com The Internet's Discussion Network (SEE BELOW) ---

Re: adjusting marks; W. Edwards Deming

It would be nice if those sending to the mailing list would clearly
identify themselves. It would also be nice if they used an e-mail address
so individuals might send them e-mail directly.
Thanks,

Dr. Robert C. Knodt
4949 Samish Way, #31
Bellingham, WA 98226
[EMAIL PROTECTED]

--- End of Robert Knodt's message ---

--- Beginning of Joe Ward's comment ---

Good comment, Robert --

Perhaps the unidentified writer is a frustrated product of "Non-mastery"
Spelling Education and is intentionally (or unintentionally) showing
the results.

See BOLD items below.

-- 
Joe
********
* Joe Ward                      Health Careers High School *
* 167 East Arrowhead Dr         4646 Hamilton Wolfe        *
* San Antonio, TX 78228-2402    San Antonio, TX 78229      *
* Phone: 210-433-6575           Phone: 210-617-5400        *
* Fax: 210-433-2828             Fax: 210-617-5423          *
* [EMAIL PROTECTED]                                        *
* http://www.ijoa.org/joeward/wardindex.html               *
********

- End of Joe Ward's comment 
--

- Original Message - 
From: Consultantssuck [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, February 07, 2000 5:12 PM
Subject: Re: adjusting marks; W. Edwards Deming

| Dr. Deming Naive? You, sir, are misguided and unfortunately,
| misinformed of the genius of the master Dr. Shewhart, and his
| disiple and messenger to the latter half of the 20th century,
| Dr. Deming.
| 
| Humans want to do a good job. Dr. Deming was pellucid on this
| point. People and school fit nicely into this axiom.
| 
| what you fail to understand is the profound knowledge of
| thinking preparing, and continual improvement. Grading is nice,
| succinct, and above all, usually useless in its existing
| design. Does grading permit our student to readdress problem or
| slow areas? In many cases grading only shows how well you did,
| based on varying factors-The next test, completely different.
| 
| we have all seen studies where the pretty girl is awarded better
| grades for the same caliber of work as others. we have all
| seen reports where teachers are wrong in their suppositions,
| then corrected or challenged by students, ultimately leading
| these educators to hold a grudge for "attitude and behavior"
| when report card time recurs.
| 
| Do you want to know why the AFT and the NEA are against teaching
| LOGIC in elementary schools (Logic being the foundation for all
| higher math applications)?
| 
| Could it be because some protege will learn to ask the harder
| questions? Possibly Some "smart alec" will not accept our
| educator's "Because I told you it did."
| 
| A recent report found Elementary educators, when pressed for
| answers they did not know, simply "winged it." This sophristry
| unfortunately happens when our educators are not versed in the
| sciences, history or math, and they wish to appear (to
| themselves and) to their students, smart.
| 
| People want to do a good job. Grading allows teachers to make
| decisions in our children's early years based on mostly the
| faliable educator's emotions toward that one particular budding
| mind. Grading should be benchmarks for ever improvement based
| on practice, practice practice of the fundementals. Then of
| course moving foward with a keen sence of where the student is
| going. Any good music teacher will tell you the ones who
| practice the fundemental scales, dilegently, go on to master the
| difficult pieces.
| 
| Read the book OUT OF CRISES again, and again. I assure you, you
| will soon "get it."
| 
| 
| 
| * Sent from RemarQ http://www.remarq.com The Internet's Discussion Network *
| The fastest and easiest way to search and participate in Usenet - Free!
| 



Re: Looking for text on resampling...

2000-02-06 Thread Joe Ward

Scott --

Peter Bruce is the contact!!!

-- Joe

- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, February 06, 2000 12:08 PM
Subject: Looking for text on resampling...


|   Our small college library has a collection of basic biostats texts but
| nothing that specifically covers the area of resampling. I am currently
| looking over a 1991 text by Bryan Manly (Randomization and Monte Carlo
| Methods in Biology) - the first two chapters seem quite accessible (to
| someone unfamiliar with the field!)
| 
|   Could anyone suggest other texts that might cover bootstrapping and
| jackknife techniques - I would favour texts that have a biology bent and
| are written so non-specialists can follow...
| 
|   Many thanks!
| 
| Scott
| 
| 
| Sent via Deja.com http://www.deja.com/
| Before you buy.
| 
| 
| 






Re: teaching statistical methods by rules?

1999-12-20 Thread Joe Ward

Yep!!

As you say:
"Why are people so obsessed with T and Z? "

Perhaps it would be even better (easier?) to focus on F since

F(1,df2) = t^2(df2)

(Reminder: when using a t-table, the p-values usually involve ONE TAIL, and
the upper-tail p-value from the F-table corresponds to the TWO-TAILED t-test.)

Example:  The critical value of t for a one-tailed probability of p = .05 at t(18) = 1.734
The critical value of F for probability of p = .10 at F(1,18) =
(1.734)^2 = 3.01
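
The identity F(1,df2) = t^2(df2) can be checked numerically. The sketch
below uses two made-up groups of scores and hypothetical helper names
(pooled_t, two_group_F); it computes the pooled two-sample t and the F
that compares a "separate means" model with a "common mean" model, and
then repeats the table arithmetic from the example above.

```python
import math

def pooled_t(g1, g2):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((v - m1) ** 2 for v in g1)
    ss2 = sum((v - m2) ** 2 for v in g2)
    df2 = n1 + n2 - 2
    s2 = (ss1 + ss2) / df2
    t = (m1 - m2) / math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))
    return t, df2

def two_group_F(g1, g2):
    """F comparing separate group means (2 weights) with one common mean (1 weight)."""
    both = g1 + g2
    n = len(both)
    grand = sum(both) / n
    m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
    ssqe_assumed = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    ssqe_restricted = sum((v - grand) ** 2 for v in both)
    return ((ssqe_restricted - ssqe_assumed) / (2 - 1)) / (ssqe_assumed / (n - 2))

# Made-up scores, for illustration only.
g1 = [12.0, 15.0, 11.0, 14.0, 13.0]
g2 = [16.0, 18.0, 15.0, 17.0, 19.0]

t, df2 = pooled_t(g1, g2)
F = two_group_F(g1, g2)
print("t =", t, " df2 =", df2, " t^2 =", t * t, " F =", F)

# The table arithmetic from the example: square the t critical value.
f_crit = round(1.734 ** 2, 2)
print("1.734^2 rounded to two places:", f_crit)
```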

:-)
-- Joe
******** 
* Joe Ward  Health Careers High School *
* 167 East Arrowhead Dr 4646 Hamilton Wolfe*
* San Antonio, TX 78228-2402San Antonio, TX 78229  *
* Phone: 210-433-6575   Phone: 210-617-5400*
* Fax: 210-433-2828 Fax: 210-617-5423  *
* [EMAIL PROTECTED]*
* http://www.ijoa.org/joeward/wardindex.html   *


 





- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 19, 1999 4:44 PM
Subject: Re: teaching statistical methods by rules?


| In article [EMAIL PROTECTED], 
| [EMAIL PROTECTED] says...
| 
|  snip
| 
| On the other hand, a body of knowledge can be thought of as a set of
| 'rules'. The important thing is that this set is constructed by the
| individual, so our aim should not be to teach statistics as a set of
| rules, but in such a way that each student can develop his or her own
| set of rules. They won't be the same for all, and they will be different
| from the teacher's, but they hopefully will work. (If you like, this is
| a definition of a 'good student' - one who manages to construct a
| successful set of rules for each subject.)
| 
| 
| Either undergraduate students in Australia are much smarter than those 
| living in the United States or you live on a different planet. The last time I 
| taught an undergraduate introductory statistics class, some students couldn't 
| even do fractions and simple algebra. Can you expect them to develop their own 
| rules?
| 
| Why are people so obsessed with T and Z? When the degrees of freedom exceeds 
| say 30, the difference between T and Z is practically negligible. You can use T 
| or Z in such a case. However, the P-value from Z is easier to compute.
| 
| -- 
| Tjen-Sien Lim
| [EMAIL PROTECTED]
| www.Recursive-Partitioning.com
| 
| Get your free Web-based email! http://recursive-partitioning.zzn.com
| 
| 



Re: Prediction Model Question

1999-12-16 Thread Joe Ward

- Original Message - 
From: Burke Johnson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Thursday, December 16, 1999 9:13 AM
Subject: Prediction Model Question

| Hi,
| 
| A student of mine is getting ready to develop a GLM prediction model that will 
|include a mixture of categorical and quantitative predictor variables. We will 
|probably not include interaction terms in the model (i.e., it will be a main effects 
|only model).
| 
| Here's my question: Do you suggest using dummy coding (0,1) or effects coding 
|(1,0,-1) for the categorical variables included in the model? 
| 
| The reason I'm asking is because dummy coding does not always give the same result 
|for a factorial design as does ANOVA and effects coding, and, hence, Pedhazur 
|recommends using effects coding rather than dummy coding in the factorial case. Do 
|you know if the choice of dummy or effects coding matters for a main effects only 
|model with multiple categorical and quantitatively scaled predictor variables?
| 
| Thanks in advance,
| Burke Johnson 
| 
--
Hi, Burke --

First, I use the words BINARY (or INDICATOR) predictors -- and NOT "DUMMY" predictors.
In the beginning ALL PREDICTOR INFORMATION IS BINARY!

It is unfortunate that the word DUMMY has become popular.  Students might get the idea
that there is something wrong with using DUMMIES!!  I think that the BINARIES are really
the most BRILLIANT!!

Now to your concern --

Your last paragraph

"The reason I'm asking is because dummy coding does not always give the same result 
for a factorial design as does ANOVA and effects coding, and, hence, Pedhazur 
recommends using effects coding rather than dummy coding in the factorial case. Do you 
know if the choice of dummy or effects coding matters for a main effects only model 
with multiple categorical and quantitatively scaled predictor variables?"

is a very good example of the situation that arises in the use of "packaged"
algorithms.  The user of the "package" may have no idea what questions are being
answered by the "package".

I always suggest that researchers create their own models!  That is the only SAFE WAY!
If a "packaged" procedure is verified to produce the results desired by the researcher
then it certainly should be used.

The researcher should:

1. State their research questions in "natural language" -- avoid terms such as "MAIN
   EFFECTS" and "EFFECTS CODING" since those expressions may mean different things to
   different people.  In some instances the user of those terms may not know what is
   meant when they utter the statement.  Ask someone what they mean if they utter
   something about MAIN EFFECTS in a 3-factor ANOVA with unequal numbers of
   observations in the cells.

2. Create an ASSUMED MODEL that allows the researcher to investigate their research
   questions of interest.

3. Impose restrictions on the parameters of the ASSUMED MODEL that are implied by the
   research questions of interest.  This results in a RESTRICTED MODEL.

4. Compare the Error Sum of Squares between the ASSUMED and RESTRICTED MODELS using an
   F-test and obtain confidence intervals if appropriate.
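
Steps 2 through 4 can be sketched for the simplest case, one categorical
predictor coded as BINARY vectors. Everything here is an assumption made
for illustration: the three categories, the scores, and the research
question ("do the categories have the same expected value?"). The sketch
relies on the fact that, with mutually exclusive binary predictors and no
unit vector, the least-squares weights are the category means.

```python
# Made-up scores for three categories (hypothetical data, illustration only).
groups = {"A": [4.0, 5.0, 6.0], "B": [7.0, 8.0, 9.0, 8.0], "C": [5.0, 6.0, 7.0]}

y = [v for vals in groups.values() for v in vals]
labels = [g for g, vals in groups.items() for _ in vals]
n = len(y)

# One BINARY predictor per category: 1 if the observation belongs to it, else 0.
binary = {g: [1 if lab == g else 0 for lab in labels] for g in groups}

# ASSUMED MODEL: Y = a_A*A + a_B*B + a_C*C + E1.
# The least-squares weights are the category means.
weights = {g: sum(vals) / len(vals) for g, vals in groups.items()}
pred1 = [sum(weights[g] * binary[g][i] for g in groups) for i in range(n)]
ssqe1 = sum((yi - pi) ** 2 for yi, pi in zip(y, pred1))

# RESTRICTED MODEL: impose a_A = a_B = a_C = a0, which gives Y = a0*U + E2.
a0 = sum(y) / n
ssqe2 = sum((yi - a0) ** 2 for yi in y)

# F-test comparing the two error sums of squares.
p1, p2 = len(groups), 1          # number of weights in each model
F = ((ssqe2 - ssqe1) / (p1 - p2)) / (ssqe1 / (n - p1))
print("SSQE1 =", ssqe1, " SSQE2 =", ssqe2, " F =", F, " df =", (p1 - p2, n - p1))
```

With more predictors the same comparison applies; only the fitting step
changes, since the weights are then found by a general least-squares
solver rather than by taking category means.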

I assume there must be a reason for assuming that there is NO INTERACTION among the
predictors.  Many researchers would test for NO INTERACTION first.  Then, if
appropriate, switch to the NO INTERACTION MODEL.

I would be interested in seeing the models that your student develops to investigate
his/her OWN QUESTIONS OF INTEREST!!

:-)

-- Joe
** 
* Joe Ward  Health Careers High School 
* 167 East Arrowhead Dr  4646 Hamilton Wolfe   
* San Antonio, TX 78228-2402   San Antonio, TX 78229
* Phone: 210-433-6575 Phone: 210-617-5400   
* Fax: 210-433-2828 Fax: 210-617-5423
* [EMAIL PROTECTED] 
* http://www.ijoa.org/joeward/wardindex.html   




Re: Need to evaluate difference between two R's

1999-11-24 Thread Joe Ward

- Original Message - 
From: Herman Rubin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, November 24, 1999 10:07 AM
Subject: Re: Need to evaluate difference between two R's


| In article [EMAIL PROTECTED],
| Rich Ulrich  [EMAIL PROTECTED] wrote:
| On Tue, 23 Nov 1999 04:39:28 GMT, [EMAIL PROTECTED] wrote:
| 
|  Does any one know how one might test for significant differences
|  between two multiple R's (or R squares) generated from two sets of data?
|  I need to determine if two R's generated on two separate occasions
|  using the same DV and IV's differ significantly from one another.
| 
| Correlations are not very good candidates for comparisons, since it is
| so easy to do tests that are more precise.
|  - to test whether the predictive relations are different, you would
| test the regressions -- do a Chow test or the equivalent, to see if a
| different set of regressors are needed for a different sampling.
|  - to test whether the variances are different (which is something
| that would change the correlations), you might test variances
| directly.
| 
| This is correct.  In fact, it is generally the case that
| correlations, except as measures of how well the model
| fits, do not have any real meaning.
| 
| Even the amount of the variance explained can change
| drastically with a change in design, but the parameters of
| the model do not change, if normalizations are not done.
| For example, if one has a "normal" model with correlation
| coefficient .5, 25% of the variance is explained.  Now 
| suppose that the predictor variable is selected to be
| 2 standard deviations away from the mean, equally likely
| to be in either direction.  Then the correlation becomes
| .756, and the proportion of the variance explained goes
| up to 57%.  But the prediction model is still the same.
| -- 
| This address is for information only.  I do not claim that these views
| are those of the Statistics Department or of Purdue University.
| Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
| [EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558
| 
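
The arithmetic in Herman's example can be reproduced directly. This is a
minimal sketch assuming the standardized setup he describes (predictor
and response both with unit variance under the original "normal" design,
correlation .5); the function name r_for_design is hypothetical. The
slope and residual variance are held fixed while the variance of the
predictor is changed by the design.

```python
import math

rho = 0.5
slope = rho                  # slope when x and y both have unit variance
resid_var = 1.0 - rho ** 2   # residual variance, 0.75, unchanged by the design

def r_for_design(var_x):
    """Correlation and proportion of variance explained for a given Var(x)."""
    explained = slope ** 2 * var_x
    r_squared = explained / (explained + resid_var)
    return math.sqrt(r_squared), r_squared

# Original design: Var(x) = 1.
r_normal, r2_normal = r_for_design(1.0)

# Predictor fixed at +2 or -2 standard deviations, equally likely: Var(x) = 4.
r_extreme, r2_extreme = r_for_design(4.0)

print("normal design:   r =", r_normal, " R^2 =", r2_normal)
print("extreme design:  r =", r_extreme, " R^2 =", r2_extreme)
```

The slope is the same in both calls; only the spread of the predictor
differs, which is why the correlation moves while the prediction model
does not.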
-- 
Herman --

Great comment!

Discussions about correlation coefficients arise
periodically on various lists. So when the time seems 
appropriate I resend an old message (see below and the WORD 
attachment) that might be of interest.

IMHO there is too much time spent on the correlation coefficient
since it is of limited and sometimes misleading value
for practical decision-making in the real world.  However,
there are still some folks who are adjusting correlation
coefficients for "restriction of range" in hopes that it
might be useful.

-- Joe
*****  
Joe Ward   Health Careers High School 
167 East Arrowhead Dr  4646 Hamilton Wolfe   
San Antonio, TX 78228-2402 San Antonio, TX 78229  
Phone:  210-433-6575   Phone: 210-617-5400
Fax: 210-433-2828  Fax: 210-617-5423 
[EMAIL PROTECTED]
http://www.ijoa.org/joeward/wardindex.html   
* 



-- Forwarded message --
Date: Fri, 23 May 1997 09:30:20 -0400 (EDT)
From: Mike Palij [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Testing basic statistical concepts

I'd like to thank Joe Ward for reminding us of this situation
(his posting is appended below), as well as jogging my own
memory for a previous posting I had made.  A while back I
had posted the Anscombe dataset (in the context of an SPSS
program) which also clearly shows the benefit of plotting
the data:  the four situations produce almost identical
Pearson r values but only one actually shows the classic
scatterplot, the others show a nonlinear pattern and the
influence that a single point has on the calculation of r.
What does the value of r tell us here?  Aren't the basic 
statistical concepts to be learned in this situation far 
more important and most clearly seen through a coordination
of the graphical and numerical information?
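
For readers without the earlier posting, the point can be checked with a
short sketch. The values below are the published Anscombe (1973) quartet
as commonly tabulated; the pearson_r helper is a hypothetical name. The
four Pearson r values come out nearly identical even though the
scatterplots look nothing alike.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's quartet: sets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

r_values = [pearson_r(x123, y1), pearson_r(x123, y2),
            pearson_r(x123, y3), pearson_r(x4, y4)]
for i, r in enumerate(r_values, start=1):
    print("set", i, "r =", round(r, 3))
```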

-Mike Palij/Psychology Dept/New York University

Joe H Ward [EMAIL PROTECTED] writes:
 To Mike et al --
 
 There have been several messages related to the Simple Correlation
 Coefficient.  IMHO, when out in the "real world" involving practical
 decision-making the correlation coefficient has very limited value and
 sometimes dangerous consequences.  The correlation coefficient may be
 an important topic in the history of statistics, for learning the problems
 associated with its use.
 
 Attached below is an item that I submitted a long time ago, and it may be 
 of interest to those following the discussion of "r".
 
 -- Joe
 *******