date:20000502

Re: Number of factors to be extracted

2000-05-02 Thread Donald F. Burrill


On Wed, 3 May 2000, Paul Gardner wrote, inter alia:

> I can reduce all this to a single maxim:  
> Factor analysis is an art as well as a science.
^^ 
I would have written ...  "rather than" ...
Cheers!
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  



===
This list is open to everyone.  Occasionally, less thoughtful
people send inappropriate messages.  Please DO NOT COMPLAIN TO
THE POSTMASTER about these messages because the postmaster has no
way of controlling them, and excessive complaints will result in
termination of the list.

For information about this list, including information about the
problem of inappropriate messages and information about how to
unsubscribe, please see the web page at
http://jse.stat.ncsu.edu/
===

Re: Number of factors to be extracted

2000-05-02 Thread Paul Gardner

I would add another criterion, which is qualitative, and therefore not
reducible to a quantitative rule:

3. Use your professional judgement.  Does the pattern of factor loadings
make sense?  For example, if the variables are item scores on a
multi-dimensional instrument, can you see a meaningful connection among
the items which load highly on a particular factor?

The "eigen-value greater than 1" criterion is very arbitrary, and in
interpreting a factor analysis matrix of item scores, I often discard
numerous factors which meet the eigen-value criterion but fail to make
any sense when I apply my judgement to the pattern of loadings.

I can reduce all this to a single maxim: Factor analysis is an art as
well as a science.

Paul Gardner

Alex Yu wrote:
> 
> There are several rules. The most popular two are:
> 
> 1. Kaiser criterion: retain the factor when eigenvalue is larger than 1
> 2. Scree plot: Basically, it is eyeballing. Plot the number of factors
> and the eigenvalue and see where the sharp turn is.
> 
> Hope it helps.

> Chong-ho (Alex) Yu, Ph.D., CNE, MCSE

> 
> On Tue, 2 May 2000 [EMAIL PROTECTED] wrote:
> 
> > Would any of you know a rule of thumb for selecting the proper (or
> > optimal) number of factors to be extracted from a factor analysis?
> > Also, how many variables can there be in such a factor (are two
> > variables in one factor not enough)?

begin:vcard 
n:Gardner;Dr Paul
tel;cell:0412 275 623
tel;fax:Int + 61 3 9905 2779 (Faculty office)
tel;home:Int + 61 3 9578 4724
tel;work:Int + 61 3 9905 2854
x-mozilla-html:FALSE
adr:;;
version:2.1
email;internet:[EMAIL PROTECTED]
x-mozilla-cpt:;-29488
fn:Dr Paul Gardner, Reader in Education and Director, Research Degrees, Faculty of Education, Monash University,  Vic. Australia 3800
end:vcard

Re: no correlation assumption among X's in MLR

2000-05-02 Thread Alan McLean

Hi Don,

There are times when I realise the rust that has accumulated, and this is one
of them.

Changing the order of things a little, you (and D&S) are of course quite
correct that X variables are typically correlated, and that if they are not
the coefficients are the same as if a set of simple regressions are carried
out. Coincidentally, I was pointing this out to a class a couple of days ago
- but the class is 'not mathematically able', like most these days, so the
explanation was not of course at all technical. Rust..

With regard to correlation and collinearity - I have become used to
'explaining' collinearity to my classes in terms only of pairs of explanatory
variables, forgetting that the collinearity could involve a set of three or
more variables, and this 'pair-wise no collinearity' is, as I understand it,
equivalent to 'no linear correlation'. This suggests, incidentally, that 'not
collinear' is stronger than 'uncorrelated' (not *linearly* correlated) which
doesn't agree with your statement - is this so? It also suggests that
'collinearity' means more than just 'correlated'.

A useful way of picturing the situation is that each variable corresponds to
an axis, the angles between the axes determined by the correlation
coefficient. (I think, very uncertainly, that the correlation coefficient is
the cosine of the angle.) If variables are uncorrelated, the axes are
orthogonal; if they are perfectly correlated, the axes are identical. If
there is a linear combination between variables, the corresponding dimensions
collapse to a 'plane'. (This is all happening in k dimensions.) This
corresponds to the matrix X'X having rank less than k (for k variables) so
leads (as I understand it) to the collinearity problem.
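The cosine picture above can be checked numerically. This is only a minimal sketch with made-up data: once each variable is centered on its mean, the cosine of the angle between the two data vectors equals the Pearson correlation coefficient, and an exact linear relation gives a cosine of 1 (the axes coincide). The function names and data values are illustrative assumptions.

```python
import math

def centered(v):
    """Subtract the mean so the vector describes deviations only."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(x, y):
    """Pearson correlation from the covariance and standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Made-up data, for illustration only.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]

r = pearson(x, y)
c = cosine(centered(x), centered(y))
angle = math.degrees(math.acos(c))
print(f"r = {r:.6f}, cosine = {c:.6f}, angle = {angle:.1f} degrees")

# An exact linear function of x lies along the same axis as x.
z = [2.0 * a + 3.0 for a in x]
c_linear = cosine(centered(x), centered(z))
print(f"cosine for an exact linear relation = {c_linear:.6f}")
```
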

In terms of the data, there is unlikely to be total collapse (just as a
sample correlation of exactly zero is highly unlikely) but you might get near
collapse. For only two variables highly correlated, the axes are nearly
indistinguishable; for three variables you will get a very low hill (this is
difficult to describe!). The problem then is to decide whether or not to
exclude variables - is the hill high enough to count as three variables, or
so low that one variable should be excluded?

I think I stand by my original observation, that *in the data* there is
always some evidence of collinearity/correlation; if this evidence is strong
enough you have to reduce it by reselecting the variables.

In your third paragraph you seem to be identifying collinearity with
correlation - more precisely, that the problems with collinearity are those
of correlation - and to a large extent identifying 'the trouble' that I spoke
of.

Thanks for helping to chip off some of the rust. I  know there is a lot
more.

Regards,
Alan

"Donald F. Burrill" wrote:

> On Tue, 2 May 2000, Alan McLean wrote:
>
> > 'No collinearity' *means* the X variables are uncorrelated!
>
> This is not my understanding.  "Uncorrelated" means that the correlation
> between two variables is zero, or that the intercorrelations among
> several variables are all zero.   "Not collinear" means that there is not
> a linear dependency lurking among the variables (or some subset of them).
> "Uncorrelated" is a much stronger condition than "not collinear".
>
> > The basic OLS method assumes the variables are uncorrelated
> > (as you say).
>
> Not as presented in, e.g., Draper & Smith;  who go to some trouble to
> show how one can produce from a set of correlated variables a set of
> orthogonal (= mutually uncorrelated) variables, and remark on the
> advantages that accrue if the X-matrix is orthogonal.  But it is clear
> that they expect predictors to be correlated as a general rule.
>
> > In practice there is usually some correlation, but the estimates are
> > reasonably robust to this.  If there is *substantial* collinearity you
> > are in trouble.
>
> If there is collinearity _at_all_ you are in trouble;  further, if the
> correlations among some of the predictors are high enough (= close enough
> to unity), a computing system with finite precision may be unable to
> detect the difference between a set of variables that are technically not
> collinear but are highly correlated, and a set of variables that _are_
> collinear.  (E.g., X and X^4 are not collinear;  but if the range of X
> in the data is, say, 101 to 110, a plot of X^4 vs X will look very much
> like a straight line.)  For this reason various safety features are
> usually built in to regression programs:  variables whose tolerance value
> with respect to the other predictors is lower than a certain threshold
> (or whose variance inflation factor -- the reciprocal of tolerance -- is
> above a corresponding threshold) are usually excluded from an analysis;
> although it is often possible to override the system defaults if one
> thinks it necessary.  The existence of such defaults is clear evidence
> that at least the persons responsible for system packages expected that
> va
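The X versus X^4 example in the quoted message can be tried out directly. The sketch below is an illustration under stated assumptions, not anyone's regression program: it takes X = 101, ..., 110, computes the correlation between X and X^4, and derives the tolerance (1 - R^2) and the variance inflation factor (its reciprocal) for the two-predictor case, where R^2 is simply the squared correlation.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

x = [float(v) for v in range(101, 111)]  # X = 101, ..., 110
x4 = [v ** 4 for v in x]                 # X^4, not a linear function of X

r = pearson(x, x4)
tolerance = 1.0 - r * r   # share of X^4's variance not explained by X
vif = 1.0 / tolerance     # variance inflation factor

print(f"correlation(X, X^4) = {r:.6f}")
print(f"tolerance = {tolerance:.6f}, VIF = {vif:.1f}")
```

The two variables are not collinear, yet the correlation is so close to 1 that the tolerance is tiny. That is the situation the tolerance and VIF thresholds are meant to catch.
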

Question: Comparing two groups...

2000-05-02 Thread Jean-Pierre Guay


I would like to compare two different groups of prisoners on a
psychopathy test (the PCL-R, for those who would like to know). The first
group was evaluated on the basis of an interview as well as their
personal and correctional files. The second group was evaluated on the
basis of their correctional files only.  I would like to compare the
number of times, or the proportion of times, that a certain rating
occurs (presence of a psychopathic characteristic). That is, I want to
know whether, the two groups being similar, we tend to rate subjects
differently with these two methods.

For example, one of the characteristics is "lying".  I would like to
know whether there are significant differences between the ratings of
these two groups.  What are the proportions of subjects rated as "lying"?
If it is 50% in the first group and 30% in the second, what test should I
use to find out whether this difference is significant?  How can I compare
these two independent groups in such a "parallel" design? I've heard
that Joseph L. Fleiss discusses that question in his book "The Design and
Analysis of Clinical Experiments", but I just can't figure out
where...
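One standard way to compare two independent proportions is the two-sample z-test, whose squared statistic equals the chi-square statistic for the 2x2 table without continuity correction. The sketch below is only an illustration: the post gives the percentages (50% and 30%) but not the group sizes, so the sizes of 100 per group are a hypothetical assumption, and the function name is made up.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for the difference of two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value

# Hypothetical group sizes: 100 subjects per group (not given in the post).
z, p_value = two_proportion_z(50, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

With smaller groups the same 50% versus 30% difference would give a smaller z, so the answer depends heavily on the actual sample sizes.
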

Sorry again for my poor English...
Have a nice day

Jean-Pierre



Sent via Deja.com http://www.deja.com/
Before you buy.



Re: Roy's Largest...What?

2000-05-02 Thread Rich Ulrich

On Tue, 02 May 2000 03:42:52 GMT, Mike and Michele Hewitt
<[EMAIL PROTECTED]> wrote:

> Can anyone tell me the conditions for using Roy's Largest Root for
> multivariate repeated measures rather than the Pillai's, Wilks, or
> Hotelling's which may be "more conservative and perhaps less powerful".

I know about multivariate; I am less sure about your exact context of
"multivariate repeated measures" but I think this applies.

The different tests have been written as weighted combinations of the
eigenvalues.  Wilks' test spreads the test across all of the roots of
the eigen problem.

If the "largest root" is what is interesting, then you think the
important effects are in the first eigenvector.  
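To make the "combinations of the eigenvalues" concrete, here is a minimal sketch using the textbook definitions of the four statistics in terms of the eigenvalues of E^-1 H (H being the hypothesis and E the error sums-of-squares-and-cross-products matrix). The eigenvalues passed in are made up, and some texts report Roy's criterion as lambda_1 / (1 + lambda_1) rather than the largest eigenvalue itself.

```python
import math

def manova_statistics(eigs):
    """Four classical MANOVA criteria from the eigenvalues of E^-1 H."""
    return {
        # Wilks' lambda: product over all roots, so every root contributes.
        "wilks_lambda": math.prod(1.0 / (1.0 + lam) for lam in eigs),
        # Pillai's trace: sum of lambda / (1 + lambda).
        "pillai_trace": sum(lam / (1.0 + lam) for lam in eigs),
        # Hotelling-Lawley trace: plain sum of the roots.
        "hotelling_lawley_trace": sum(eigs),
        # Roy's largest root: only the first (largest) root matters.
        "roy_largest_root": max(eigs),
    }

# Hypothetical eigenvalues, for illustration only.
stats = manova_statistics([3.0, 1.0])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Roy's criterion ignores everything except the first root, which is why it suits the case where the effect is concentrated in a single dimension.
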

I usually think the effect will be an obvious, first-eigenvector
effect; but I also think that I should be able to define the contrast
in advance:  So I can do a t-test (say) on an obvious "Summary score"
instead of doing an obscure test on a newly defined vector.  Since it
does not have to reckon on capitalizing on chance, the test on the
summary will be more powerful -- unless I have been badly mistaken in
defining it.

You use Roy's if you think there is a simple effect and (for some
reason) you can't describe that in advance.

-- 
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html


Re: Number of factors to be extracted

2000-05-02 Thread Alex Yu

There are several rules. The most popular two are:

1. Kaiser criterion: retain a factor when its eigenvalue is larger than 1
2. Scree plot: Basically, it is eyeballing. Plot the eigenvalues against
the factor number and see where the sharp turn (the "elbow") is.
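The two rules above can be sketched in a few lines. This is an illustration only: the eigenvalues are made up (chosen to sum to 5, as the eigenvalues of a 5-variable correlation matrix would), and the function names are hypothetical. The second function just lists the drop between successive eigenvalues, which is what the eye judges on a scree plot; it does not pick the elbow for you.

```python
# Hypothetical eigenvalues of a 5-variable correlation matrix,
# sorted in descending order.
eigenvalues = [2.6, 1.2, 0.6, 0.4, 0.2]

def kaiser_count(eigs):
    """Number of factors kept by the eigenvalue-greater-than-1 rule."""
    return sum(1 for e in eigs if e > 1.0)

def scree_drops(eigs):
    """Drop from each eigenvalue to the next (the slopes of a scree plot)."""
    return [a - b for a, b in zip(eigs, eigs[1:])]

print("Kaiser criterion retains", kaiser_count(eigenvalues), "factors")
for k, drop in enumerate(scree_drops(eigenvalues), start=1):
    print(f"drop from factor {k} to factor {k + 1}: {drop:.2f}")
```
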

Hope it helps.

Chong-ho (Alex) Yu, Ph.D., CNE, MCSE
Instruction and Research Support
Information Technology
Arizona State University
Tempe AZ 85287-0101
Voice: (602)965-7402
Fax: (602)965-6317
Email: [EMAIL PROTECTED]
URL:http://seamonkey.ed.asu.edu/~alex/

On Tue, 2 May 2000 [EMAIL PROTECTED] wrote:

> Would any of you know a rule of thumb for selecting the proper (or
> optimal) number of factors to be extracted from a factor analysis?
> Also, how many variables can there be in such a factor (are two
> variables in one factor not enough)?
> 
> Sorry for my English...
> 
> 
> Sent via Deja.com http://www.deja.com/
> Before you buy.
> 
> 
> 


Re: Statistical Software

2000-05-02 Thread Richard Goldstein

As a statistician who works on large class-action lawsuits for various
attorneys, I respond by saying that I do all work for these cases in
Stata (http://www.stata.com) and I use both DBMS/COPY and Stat/Transfer
for import and export issues.  The speed, flexibility and power of Stata
are, for these purposes, unrivalled in my opinion -- and, in fact, I
know of at least one opposing law firm that bought Stata just so they
could easily check my work.

Rich Goldstein

[EMAIL PROTECTED] wrote:
> 
> I'm the original poster, and this is why I asked the question. I'm a
> computer analyst for a large law firm. Most of my work is involved in
> large class action lawsuits, where I need to gather, organize and store
> mounds of data. From time to time I will need to perform some
> statistical work on this data. Usually the law firm will contract out
> this work, since as an employee I am not qualified to be an expert
> witness when it comes to statistical evidence. However my job is a new
> position for this law firm and they would like to perform some in house
> statistical work not only for comparison to the outside consultants but
> for internal questions as well. What type of statistical work I will be
> performing is unknown at this time. Hopefully from the type of data I'm
> collecting, someone can determine what statistical package is best
> suited for my needs. I would ask the consultants we work with, but was
> instructed not to.
> 
> Thanks
> 
> Sent via Deja.com http://www.deja.com/
> Before you buy.


Number of factors to be extracted

2000-05-02 Thread jeanpierre_guay


Would any of you know a rule of thumb for selecting the proper (or
optimal) number of factors to be extracted from a factor analysis?
Also, how many variables can there be in such a factor (are two
variables in one factor not enough)?

Sorry for my English...


Sent via Deja.com http://www.deja.com/
Before you buy.



Re: What is the logarithmic distribution? (many questions)

2000-05-02 Thread Graeme Byrne




Vincent Vinh-Hung wrote:

> General question,
> I've seen two descriptions of "logarithmic distribution".
> One is related to the frequency of digits called Benford's law (digit 1
> occurs more frequently than 2, 2 than 3, etc) whose explanation is that
> it is the result of a mixture of distributions.
> The other description is a 2-page paragraph The logarithmic distribution
> in Kendall and Stuart (1977, The Advanced theory of statistics, Vol 1,
> 4th edition, pp 139-140), attributing the derivation to Fisher (1943).
> Are these concepts of logarithmic distribution the same or not?
>
> Second question I would like to ask: Kendall and Stuart give an
> example of a distribution of the logarithmic type from Fisher (1943),
> "distribution of butterflies in Malaya, with theoretical frequencies
> given by the logarithmic distribution"
> No. of species   Theoretical frequency   Observed frequency
> 1                135.05                  118
> 2                 67.33                   74
> 3                 44.75                   44
> 4                 33.46                   24
> 5                 26.69                   29
> 6                 22.17                   22
> 7                 18.95                   20
> etc ...
> From what I've understood, the theoretical frequency was generated
> by
>   - ( q^r ) / ( r * ln(1-q) )
> in which r is the No. of species, q is the probability of the presence
> of an attribute.
> How was, how can the fit be realized?

You will need a value of q first. This will either be estimated from the raw
data or assumed by some hypothesis. Once you have this, just plug in the
value of r you want and multiply the resulting probability by the sum of
the observed frequencies.

You might also be able to use the theoretical mean q/((q - 1)*Log[1 - q])
to estimate q by equating it to the sample mean and solving for q.
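The recipe above can be sketched as follows. This is a minimal illustration, not the method Fisher used: the function names are hypothetical, q is found by simple bisection on the mean formula, and the sample mean and total in the usage lines are made-up numbers rather than the butterfly data.

```python
import math

def logseries_pmf(r, q):
    """P(R = r) = -(q^r) / (r * ln(1 - q)) for r = 1, 2, 3, ..."""
    return -(q ** r) / (r * math.log(1.0 - q))

def logseries_mean(q):
    """Theoretical mean q / ((q - 1) * ln(1 - q)); it increases with q."""
    return q / ((q - 1.0) * math.log(1.0 - q))

def solve_q(sample_mean, tol=1e-12):
    """Find q in (0, 1) whose theoretical mean equals sample_mean (> 1)."""
    lo, hi = 1e-9, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if logseries_mean(mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def expected_frequencies(q, total, r_max):
    """Probability for each r times the sum of the observed frequencies."""
    return [total * logseries_pmf(r, q) for r in range(1, r_max + 1)]

# Hypothetical inputs: a sample mean of 3.9 and 100 observations in total.
q_hat = solve_q(3.9)
fitted = expected_frequencies(q_hat, total=100, r_max=5)
print(f"estimated q = {q_hat:.5f}")
print("expected frequencies:", [round(f, 2) for f in fitted])
```
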

>
>
> With thanks in advance,
> Vincent Vinh-Hung

--
Dr Graeme Byrne
La Trobe University, Bendigo
PO Box 199, Bendigo, 3552
Phone: 61 3 5444 7263
Fax:   61 3 5444 7998
[EMAIL PROTECTED]





Re: Statistical Software

2000-05-02 Thread dennis roberts

but, another alternative is to think about not ONE package ... but perhaps 
2 ... sure, to become comfortable with both, it takes more time BUT, many 
packages allow for pretty good interchangeability of worksheets AND ... 
there are some student editions that would keep the cost down ...

i would suspect that for some things you might want to do ... package A 
might be best ... whereas for other things ... maybe package B would be 
better ... in fact, there actually are many many ONLINE routines that might 
be satisfactory for your purposes ... ONCE you discover what these are ...

have a look at ...

http://members.aol.com/johnp71/javastat.html

At 01:34 PM 5/2/00 +, [EMAIL PROTECTED] wrote:
>  Hopefully from the type of data I'm
>collecting, someone can determine what statistical package is best
>suited for my needs. I would ask the consultants we work with, but was
>instructed not to.
>
>Thanks


Re: Statistical Software

2000-05-02 Thread Gordon Sande


On Tue, 02 May 2000 13:34:49 GMT, [EMAIL PROTECTED] wrote:

>In article <[EMAIL PROTECTED]>,
>  [EMAIL PROTECTED] (SAlbert) wrote:
>> Cheryl makes a good point:  the "right" package depends on what the
>> user wants to do.  MINITAB might be a good choice -- or SPSS, or any of
>> dozens of others.  Is the application area psychology?  Biology?
>> Economics?  Meteorology?  Demography?  Chemistry?  Do we need
>> regression?  Cross-tabs?  Time series?  Design of Experiments?
>> The original question can't have a general answer that's correct for
>> everyone.  If the original poster could provide a little more
>> information about needs, we could be a lot more helpful.
>>
>> Steve Albert
>>
>
>I'm the original poster, and this is why I asked the question. I'm a
>computer analyst for a large law firm. Most of my work is involved in
>large class action lawsuits, where I need to gather, organize and store
>mounds of data. From time to time I will need to perform some
>statistical work on this data. Usually the law firm will contract out
>this work, since as an employee I am not qualified to be an expert
>witness when it comes to statistical evidence. However my job is a new
>position for this law firm and they would like to perform some in house
>statistical work not only for comparison to the outside consultants but
>for internal questions as well. What type of statistical work I will be
>performing is unknown at this time. Hopefully from the type of data I'm
>collecting, someone can determine what statistical package is best
>suited for my needs. I would ask the consultants we work with, but was
>instructed not to.
>
>Thanks
>

You have two related problems. One is to acquire data from various
sources. The other is to do some amount of analysis.

There are a variety of data translators available. DBMS/Copy etc.

There are a variety of statistical packages available. S-Plus,
DataDesk, JMP, MiniTab, etc.

You have made it sound like the packages directed at exploratory data
analysis will be more likely to meet your needs for a variety of ad
hoc first looks in the presence of many data problems.


>
>Sent via Deja.com http://www.deja.com/
>Before you buy.




Question: References for run charts.

2000-05-02 Thread Eric Scharin


All -

I'm looking for references on the analysis of run charts - that is, plots of
data arranged in time sequence.  They are similar to Shewhart (Control)
Charts, but are not as powerful and are typically used when the number of
time points is too small for control chart analysis.  I have a list of rules
which I came across which are used to determine "significant" trends such as
runs up or down and runs to the median, but I'm looking for a reference for
these rules.
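For readers unfamiliar with the quantities such rules are built on, here is a minimal sketch of two of them: the number of runs about the median and the longest run up or down. It deliberately leaves out any significance thresholds, since the post does not list the rules themselves. Skipping points that fall exactly on the median is one common convention and is an assumption here, as are the function names.

```python
from statistics import median

def runs_about_median(values):
    """Count runs of consecutive points on the same side of the median."""
    med = median(values)
    # Assumption: points exactly on the median are skipped.
    sides = [v > med for v in values if v != med]
    if not sides:
        return 0
    return 1 + sum(1 for a, b in zip(sides, sides[1:]) if a != b)

def longest_monotone_run(values):
    """Length in points of the longest strictly rising or falling stretch."""
    best = current = 1
    direction = 0  # +1 rising, -1 falling, 0 no trend yet
    for a, b in zip(values, values[1:]):
        step = (b > a) - (b < a)
        if step != 0 and step == direction:
            current += 1
        elif step != 0:
            direction, current = step, 2
        else:
            direction, current = 0, 1
        best = max(best, current)
    return best

# Made-up series, for illustration only.
series = [1, 2, 3, 7, 8, 9]
print("runs about the median:", runs_about_median(series))
print("longest run up or down:", longest_monotone_run(series))
```
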

If a list member could help, I'd greatly appreciate it.

Thanks.

Eric Scharin






Re: Statistical Software

2000-05-02 Thread mattcfenn

In article <[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] (SAlbert) wrote:
> Cheryl makes a good point:  the "right" package depends on what the
> user wants to do.  MINITAB might be a good choice -- or SPSS, or any of
> dozens of others.  Is the application area psychology?  Biology?
> Economics?  Meteorology?  Demography?  Chemistry?  Do we need
> regression?  Cross-tabs?  Time series?  Design of Experiments?
> The original question can't have a general answer that's correct for
> everyone.  If the original poster could provide a little more
> information about needs, we could be a lot more helpful.
>
> Steve Albert
>

I'm the original poster, and this is why I asked the question. I'm a
computer analyst for a large law firm. Most of my work is involved in
large class action lawsuits, where I need to gather, organize and store
mounds of data. From time to time I will need to perform some
statistical work on this data. Usually the law firm will contract out
this work, since as an employee I am not qualified to be an expert
witness when it comes to statistical evidence. However my job is a new
position for this law firm and they would like to perform some in house
statistical work not only for comparison to the outside consultants but
for internal questions as well. What type of statistical work I will be
performing is unknown at this time. Hopefully from the type of data I'm
collecting, someone can determine what statistical package is best
suited for my needs. I would ask the consultants we work with, but was
instructed not to.

Thanks

Sent via Deja.com http://www.deja.com/
Before you buy.


Re: Roy's Largest...What?

2000-05-02 Thread Richard M. Barton


Tatsuoka, M. (1988), Multivariate Analysis, has a few pages that discuss some of the 
different situations where one criterion might be preferred over another.

rb

--- Mike and Michele Hewitt wrote:

Hope that got your attention:)

Can anyone tell me the conditions for using Roy's Largest Root for
multivariate repeated measures rather than the Pillai's, Wilks, or
Hotelling's which may be "more conservative and perhaps less powerful".
TIA, Mike



--- end of quote ---



Re: Statistical Software

2000-05-02 Thread John Hendrickx


In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] says...
> It depends.
> What kinds of stat will  you do?
> How much value do you put on your time?
> What disciplines do you work with?
> Who can you get help from?
> Who will go over your syntax and outputs to check your work?
> 
> If you need to do a great deal of data transformation (e.g., recoding)
> and will be dealing with many kinds of data from different sources,
> then I would choose SPSS.  It has the best human factors in GUI,
> consistency of syntax across procedures, vocabulary choice, clarity of
> documentation, and clarity of syntax code.
> 
I don't agree with this description of SPSS at all. I would say its 
syntax is the worst I've seen (compared to SAS, Stata, GLIM, BMDP). SPSS 
syntax is unnecessarily verbose and certainly not consistent across 
procedures. SPSS is good at elementary operations such as recode but poor 
at advanced applications such as arrays, macros. It does have a good GUI 
and its documentation is excellent. An important advantage is that SPSS 
is well known, a lot of data is available in SPSS format and text books 
often contain sample SPSS code.

SAS is another of the big players, sample code is common although data in 
SAS format less so. It's got a much better command syntax than SPSS, very 
extensive. Unfortunately, SAS is weak at some of the elementary 
operations such as a recode, assigning value labels. In the 90s, SAS 
focussed on business solutions and its statistical capabilities 
stagnated. However, a new version was released this year, maybe they're 
picking it up again.

I've started using Stata recently and it's quite good. It has a very 
consistent syntax and a wide array of statistical procedures. It's very 
fast, but less suited to very large datasets. Its macro capabilities are 
excellent. It's also evolving at a faster rate than SAS or SPSS. An 
interesting feature is the ability to take the survey design into 
account, i.e. specify strata or clustering variables. It can do most of 
what SAS can do with a much smaller footprint and for a lower price.

However, enough stat software advocacy. The original poster wanted to 
extend the statistical capabilities of Excel. There have been posts to 
this group about a commercial add-in for Excel that will do this, I don't 
think it's been mentioned in this thread so far. Try a deja-news search. 
I've seen that NAG also sells statistical add-ins for Excel, see 
http://www.nag.co.uk/statistical_software.asp. There's also a freeware 
statistical package "R" which apparently can interface with Excel. See 
http://www.ci.tuwien.ac.at/R/contents.html. (R is an open source 
implementation of the S language, on which S-plus, yet another statistical 
package, is also based.) I haven't tried any of these solutions, but I'd be 
interested in hearing other people's experiences.

John Hendrickx



Self-Studying Statistics Book

2000-05-02 Thread brianv


I have already finished two courses in statistics.  I am looking for a good
book that is easy to study on my own during the summer.  The level would be
just beyond the first two intro-stat classes, maybe something like regression
analysis...  If anyone has any suggestions, I would really appreciate them.


Sincerely,

Brian V





Re: What is the logarithmic distribution? (many questions)

2000-05-02 Thread Vincent Vinh-Hung

Lognormal, I believe, is most often used to describe a variable that is
normally distributed after a logarithmic transform, while the logarithmic
distribution in the sense of Kendall and Stuart is something else (I
didn't really grasp K&S's formalism).

BTW, I asked how the fit was done because I can't reproduce the
figures in the Fisher 1943 example: assigning q=0.97293, I get
135.05 (ok), 65.7 (instead of the published 67.33),
42.6 (instead of 44.75), 31.1 (instead of 33.46), etc.
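One way to look into the mismatch is to work backwards from the published column. Under the formula quoted earlier in the thread, successive theoretical frequencies satisfy f(r+1)/f(r) = q*r/(r+1), so each adjacent pair of published values implies a value of q. The sketch below only does that arithmetic on the seven figures quoted in the thread; the function name is hypothetical.

```python
# Theoretical frequencies for r = 1..7 as quoted in the thread.
published = [135.05, 67.33, 44.75, 33.46, 26.69, 22.17, 18.95]

def implied_q(freqs):
    """q implied by each adjacent pair, using f(r+1)/f(r) = q * r / (r + 1)."""
    return [
        (freqs[r] / freqs[r - 1]) * (r + 1) / r  # freqs[r - 1] is the term for r
        for r in range(1, len(freqs))
    ]

qs = implied_q(published)
for r, q in enumerate(qs, start=1):
    print(f"terms {r} and {r + 1} imply q = {q:.4f}")
```

Every pair gives a value close to 0.997, which suggests the published column was generated with q near 0.997 rather than 0.97293. Whether 0.97293 is a different quantity or a transcription slip cannot be settled from the table alone.
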

Thanks for your suggestion,
Vincent

Edzo Wisman wrote:
> 
> isn't the lognormal distribution the same as logarithmic?  Just guessing.
> Else maybe you could look in the direction of exponential distributions.
> I am just guessing though... :)
> good luck!
> Edzo
> 
> "Vincent Vinh-Hung" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]...
> > General question,
> > I've seen two descriptions of "logarithmic distribution".
> > One is related to the frequency of digits called Benford's law (digit 1

