Re: Levels of measurement.

2001-02-05 Thread John Uebersax

In his post, [EMAIL PROTECTED] (Paul W. Jeffries) wrote:

> I have a question that must have a simple response...but I don't see
> it right now.  The textbooks say that a ratio scale has the
> properties of an interval scale plus a true zero point.

Okay.

> This implies that any scale that has a true zero point should have
> the cardinal property of an interval scale; namely, equal intervals
> represent equal amounts of the property being measured.

I do not see how the second statement follows from the first.

Statement 1:  (A implies B) and (A implies C)

   where A = is a ratio scale
         B = is an interval scale
         C = has a true zero


Statement 2:  C implies B

But does Statement 1 imply Statement 2?  Consider another example:

 A = is an elephant
 B = is large
 C = is a mammal

Then [(A-->B) and (A-->C)] is true, but C-->B is not true (a mammal need not be large).
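
If it helps, the point can be checked mechanically.  Below is a small
truth-table search (a sketch in Python, purely illustrative; the variable
names are my own) for an assignment of A, B, C where Statement 1 holds
but Statement 2 fails:

    from itertools import product

    def implies(p, q):
        # Material implication: "p implies q" is false only when p is true and q is false.
        return (not p) or q

    for a, b, c in product([True, False], repeat=3):
        statement1 = implies(a, b) and implies(a, c)   # (A --> B) and (A --> C)
        statement2 = implies(c, b)                     # C --> B
        if statement1 and not statement2:
            print("Counterexample: A=%s, B=%s, C=%s" % (a, b, c))

The single counterexample it prints (A false, B false, C true) is just
the elephant example in symbols:  something that is a mammal but neither
an elephant nor large.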

--
John Uebersax
[EMAIL PROTECTED]





Re: IRT/Rasch Modeling with SAS?

2001-03-13 Thread John Uebersax

Hi Lee,
 
If you go to my web page for Latent Trait and Item Response Theory (IRT)
Models,
 
http://ourworld.compuserve.com/homepages/jsuebersax/lta.htm
 
(please let me know if this link doesn't work)
 
you will find pointers to several other pages that might help.
 
> Then the IRT curve that I am looking for (something they call a 3-parameter
> logistic, which I think is not a 100% correct name) is described by the
> following function (best viewed in a fixed-width font):
 
A well-kept secret is that it is just as easy to estimate a probit
(cumulative Gaussian) latent trait model.  The probit model is
theoretically more appropriate in many applications.
 
Of course, you will need to decide, if you haven't already, whether to
pursue a 1-, 2-, or 3-parameter model.
 
> find a reference that tells me exactly the recipe for finding it, but the
> best I can tell is that the algorithm would start with an initial guess for
> T, fit the curve parameters a, b, and c, then use this curve to re-estimate
> T. The process repeats until some convergence criterion is reached.
 
That's one approach.  Another is "brute force" optimization, where one
uses a general-purpose optimization routine to (simultaneously) find the
set of parameter values that maximizes a given criterion--usually the
log-likelihood.
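
As a rough illustration of the "brute force" route (not SAS, and not the
exact alternating algorithm quoted above), the sketch below, in Python
with NumPy/SciPy, fits the slope and threshold of a single probit item by
handing the negative log-likelihood to a general-purpose optimizer.  The
data are simulated, and for simplicity the latent trait scores are
treated as known rather than re-estimated:

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    # Simulated data: known trait scores and one binary item
    rng = np.random.default_rng(0)
    theta = rng.normal(size=500)
    true_a, true_b = 1.2, -0.3
    y = (rng.random(500) < norm.cdf(true_a * (theta - true_b))).astype(int)

    def neg_log_lik(params):
        a, b = params
        p = norm.cdf(a * (theta - b))        # probit item response function
        p = np.clip(p, 1e-10, 1 - 1e-10)     # guard against log(0)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    result = minimize(neg_log_lik, x0=[1.0, 0.0], method="Nelder-Mead")
    print(result.x)                          # estimated slope and threshold

In a real application the trait scores are of course unknown, which is
where marginal maximum likelihood (Bock & Aitkin, below) or alternating
schemes come in.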
 
Here's a good book that covers the material without making things more
complicated than necessary:
 
Hulin, C. L., Drasgow, F., and Parsons, C. K. (1983).  Item Response
Theory.  Homewood, Illinois:  Dow Jones-Irwin.
 
I'd also recommend looking at some of Bock's work, such as:
 
Bock, R. D., and Aitkin, M. (1981). "Marginal Maximum Likelihood
Estimation of Item Parameters:  Application of an EM Algorithm,"
Psychometrika, 46, 443-459.
 
Of course, the "bibles" are still:
 
Lazarsfeld, P. F., and Henry, N. W. (1968).  Latent Structure
Analysis.  Boston:  Houghton Mifflin.
 
Lord, F. M., and Novick, M. R. (1968).  Statistical Theories of Mental
Test Scores.  Reading, Massachusetts:  Addison-Wesley.
 
> Does anyone know if SAS will do this?
 
One of my pages describes how to estimate a 2-parameter latent trait
model by factor-analyzing a matrix of tetrachoric correlations.  SAS
(via a macro available on the SAS site) can produce a matrix of
tetrachoric correlations.  And the matrix can be supplied to and
factored by PROC FACTOR.
 
This works pretty well for estimating the item parameters (slopes and
thresholds).  However, if you also want to score respondents (i.e.,
estimate their latent trait levels), that takes a little more work (a
separate page on my site talks about this).
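
In case it is useful, here is the usual back-of-the-envelope conversion
from that one-factor solution to normal-ogive item parameters, sketched
in Python.  It assumes the standard setup with unit-variance latent
responses, and the loadings and endorsement proportions below are made
up purely for illustration:

    import numpy as np
    from scipy.stats import norm

    loadings = np.array([0.55, 0.70, 0.40])     # one-factor loadings from the tetrachoric matrix
    p_endorse = np.array([0.62, 0.45, 0.80])    # observed proportion endorsing each item

    tau = norm.ppf(1 - p_endorse)               # thresholds on the latent response scale
    a = loadings / np.sqrt(1 - loadings**2)     # item slopes (discriminations)
    b = tau / loadings                          # item difficulties

    # Implied model:  P(y=1 | theta) = PHI( a * (theta - b) )
    print(np.column_stack([a, b]))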
 
A 1-parameter Rasch model can be formulated as a loglinear model.
Therefore it might be possible to use, say, PROC CATMOD or something
like that to estimate a Rasch model.
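
I have not tried the PROC CATMOD route myself, but a closely related
trick is to fit the Rasch model by ordinary logistic regression with
person and item dummy variables.  This is joint (unconditional) maximum
likelihood, not the conditional/loglinear version, and it is known to be
somewhat biased when the number of items is small.  A rough sketch in
Python with statsmodels, on simulated data:

    import numpy as np
    import statsmodels.api as sm

    # Simulate 0/1 responses from a Rasch model: 200 persons x 5 items
    rng = np.random.default_rng(1)
    theta = rng.normal(size=200)
    beta = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])              # item difficulties
    prob = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
    X = (rng.random(prob.shape) < prob).astype(int)

    X = X[(X.sum(axis=1) > 0) & (X.sum(axis=1) < X.shape[1])]  # drop perfect/zero scores
    n_persons, n_items = X.shape

    y = X.ravel()
    person = np.repeat(np.arange(n_persons), n_items)
    item = np.tile(np.arange(n_items), n_persons)

    # Design: one dummy per person (ability), minus-one-coded dummies for
    # items 2..K (item 1 fixed at difficulty 0 for identification)
    D_person = (person[:, None] == np.arange(n_persons)).astype(float)
    D_item = -(item[:, None] == np.arange(1, n_items)).astype(float)
    design = np.hstack([D_person, D_item])

    fit = sm.GLM(y, design, family=sm.families.Binomial()).fit()
    print(fit.params[-(n_items - 1):])        # item difficulties relative to item 1

With only five items the joint-ML difficulties will be noticeably spread
out relative to the generating values; that is the bias mentioned above.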
 
> I have found a piece of software
> that claims to fit "Rasch models", but the classical Rasch model is a
> one-parameter version of what I'm looking for (set b and c to zero, and
> you have a Rasch model).
 
Correct.  I prefer 2-parameter models, unless there is some theoretical
reason to expect a 1-parameter model (i.e., that all items have the same
correlation with the latent trait).
 
I maintain that the choice among logistic IRT, probit IRT, and Rasch
models should be based on each model's theoretical assumptions and on
what you are willing to assume about your data.  For example, Rasch has
a very nice theory about how people answer test items that justifies
use of Rasch modeling.  (I don't necessarily agree with the model, but
it is interesting.)  On the other hand, if you have the familiar:
 
manifest trait = latent trait + error
 
model, where the error is (a) normally distributed and (b) homoscedastic
(error variance not correlated with the latent trait level), and where
one assumes discretizing thresholds that convert the latent continuous
responses to observed binary responses, then a probit latent trait
model is appropriate.
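
A tiny simulation (Python, with made-up numbers) of exactly that setup,
just to make the connection concrete:  a continuous latent response
equal to trait plus homoscedastic normal error, dichotomized at a
threshold, yields a probit item response function.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    theta = rng.normal(size=100_000)                             # latent trait
    latent = 0.8 * theta + rng.normal(scale=0.6, size=100_000)   # trait + homoscedastic error
    y = (latent > 0.25).astype(int)                              # discretizing threshold

    # Implied item response function: P(y=1 | theta) = PHI((0.8*theta - 0.25) / 0.6)
    predicted = norm.cdf((0.8 * theta - 0.25) / 0.6)
    print(y.mean(), predicted.mean())             # the two proportions should agree closely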
 
> Plus, the software costs about $1000, and I don't have that to spare.
> The software (one called "BIGSTEPS" is the only one I can find that will
> deal with the 89,000 students I have to deal with) is not exactly
> "Microsoft Bob" in its ease of use.
 
Check my web site.  One page talks about software for estimating IRT and
Rasch models.  Personally, for Rasch models, I use MIRA or WINMIRA; for
IRT models I use my own programs for "discrete latent trait" modeling:
 
Heinen, T. (1996).  Latent Class and Discrete Latent Trait Models:
Similarities and Differences.  Thousand Oaks, California:  Sage.
 
I also have a FAQ on the Rasch model on the site, including information
specifically on Rasch software.
 
Hope this helps.
 
John Uebersax
[EMAIL PROTECTED]
http://ourworld.compuserve.com/homepages/jsuebersax
 
P.S.  The limiting factor for IRT software is usually the number of
items, rather than the number of respondents.

Re: calculating reliability

2001-03-22 Thread John Uebersax

Paul's comment is very apt.  It is very important to consider whether
a consistent error should or should not count against reliability.
In some cases, a constant positive or negative bias should not matter.
For example, one might be willing to standardize each measure before
using it in statistical analysis.  The standardization would then
remove differences due to a constant bias (as well as differences
associated with a different variance for each measure/rating).
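
A quick numerical illustration (Python, simulated ratings; the bias and
scale factors are arbitrary) of how standardization wipes out a constant
bias and a variance difference between two measures, while leaving their
correlation untouched:

    import numpy as np

    rng = np.random.default_rng(3)
    truth = rng.normal(size=100)
    rater1 = truth + rng.normal(scale=0.3, size=100)
    rater2 = 2.0 * truth + 5.0 + rng.normal(scale=0.6, size=100)   # constant bias, larger variance

    z1 = (rater1 - rater1.mean()) / rater1.std()
    z2 = (rater2 - rater2.mean()) / rater2.std()

    # Raw disagreement is large; after standardization it shrinks,
    # but the Pearson correlation is identical in both cases.
    print(np.mean((rater1 - rater2) ** 2), np.mean((z1 - z2) ** 2))
    print(np.corrcoef(rater1, rater2)[0, 1], np.corrcoef(z1, z2)[0, 1])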

John Uebersax
[EMAIL PROTECTED]





Re: factor analysis of dichotomous variables

2001-05-01 Thread John Uebersax

A list of such programs and discussion can be found at:

http://ourworld.compuserve.com/homepages/jsuebersax/binary.htm

The results of Knol & Berger (1991) and Parry & McArdle (1991)
(see the above web page for citations) suggest that there is not much
difference in results between the Muthén method and the simpler
method of factoring tetrachoric correlations.  For additional 
information (including examples using PRELIS/LISREL and SAS) on 
factoring tetrachorics, see

http://ourworld.compuserve.com/homepages/jsuebersax/irt.htm 

Hope this helps.

John Uebersax





Re: How calculate 95%=1.96 stdv

2001-07-05 Thread John Uebersax

Hi Stefan,

"s.petersson" <[EMAIL PROTECTED]> wrote in message 
news:<XBE07.7641$[EMAIL PROTECTED]>...

> Let's say I want to calculate this constant with a security level of
> 93.4563, how do I do that? Basically I want to "unfold" a function like
> this:
> 
> f(95)=1.96
> 
> Where I can replace "95" with any number ranging from 0-100.

To Eric's reply I'd just add that use of a table is unnecessary. 
Especially in a computer program, it is easier to use a numerical
function to calculate the confidence interval.

The tables you've seen give cumulative probabilities of the
standard normal curve--otherwise known as the standard normal
cumulative distribution function (cdf).  The standard normal cdf is the
function:

                       z
    p = PHI(z) =   INTEGRAL  phi(t) dt
                  -infinity

where:

    z      = a standard normal deviate
    PHI(z) = the probability (p) of observing a score at or below z
    phi(z) = the standard normal curve itself:

             phi(z) = 1/sqrt(2*pi) * exp(-z^2/2)

Note that PHI() and phi() (these stand for the Greek letter phi,
upper-case and lower-case, respectively) are different functions:  PHI()
is the cumulative distribution corresponding to the density phi().

With the function above, one supplies a value for z, and is given a
cumulative probability.

You seek the inverse function for PHI(), sometimes called the "probit
function."  With the probit function, one supplies a value for p and
is returned the value of z such that the area under the standard
normal curve from -inf to z equals p.  (As Eric noted, you may need to
adjust p to handle issues of 1- vs 2-tailed intervals.)
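
In case a concrete example helps, here is what that looks like with an
off-the-shelf inverse-normal routine (a sketch in Python/SciPy; the name
z_for_confidence is just my own wrapper):

    from scipy.stats import norm

    def z_for_confidence(level_percent, two_tailed=True):
        # Convert a confidence level (e.g. 95) into the corresponding z value.
        p = level_percent / 100.0
        tail = (1 - p) / 2 if two_tailed else (1 - p)
        return norm.ppf(1 - tail)          # norm.ppf is the probit (inverse of PHI)

    print(z_for_confidence(95))            # about 1.96
    print(z_for_confidence(93.4563))       # the level asked about above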

For simple applications (such as calculating confidence intervals),
both the PHI() and probit() functions are well approximated by
polynomial or rational formulas of a few terms.  Some of these take as
few as 2 or 3 lines of code.  A good reference for such approximations
is:

Abramowitz, M., and Stegun, I. A. (1972).  Handbook of Mathematical
Functions.  New York:  Dover.
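
For instance, formula 26.2.23 in that handbook gives the upper-tail
probit with a short rational expression.  A sketch follows (coefficients
quoted from memory, so double-check them against the book before relying
on this):

    import math

    def probit_upper(p):
        # Approximate z such that the upper-tail area beyond z equals p (0 < p <= 0.5).
        # Abramowitz & Stegun 26.2.23; stated absolute error < 4.5e-4.
        t = math.sqrt(-2.0 * math.log(p))
        num = 2.515517 + 0.802853 * t + 0.010328 * t * t
        den = 1.0 + 1.432788 * t + 0.189269 * t * t + 0.001308 * t ** 3
        return t - num / den

    print(probit_upper(0.025))   # about 1.96, the familiar 95% two-tailed value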

Hope this helps.

John Uebersax

