Re: Sample Size for an Audit

2000-01-19 Thread chendrix8685

In any situation of this sort, the amount of data you
need is related to the amount of variation you expect
to find in the data... and is also related to "how close
do you want your answer to be to the truth" and also
with what probability do you want to be that close to the
truth.  By truth I mean "true average response" or the
population average.  Statistical sampling theory is usually
founded on taking a small sample from a large (infinite)
population so that the sample does not "disturb" the
population. If the sample size is a significant fraction
of the entire population (as it will be in this instance,
with only 20 people in the population), then a correction
is needed to the usual formula for determining the correct
sample size.  I don't have that formula/correction at
hand, but if you want it (or want a little paper I wrote
on this) let me know at  [EMAIL PROTECTED]  If you
want the paper, I'll need a fax number to send it to...
it is not digitized.   If the response you are measuring
is a "pass/fail" response, that makes life easier because
we can estimate the standard deviation quickly and
painlessly.  When all is said and done, with a population
of only 20, the sample will need to be a large fraction of
the population.  Perhaps as many as 10 or 12.
Charlie H.
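
A minimal sketch of the kind of correction described above, for a pass/fail response. It assumes the textbook sample-size formula for a proportion, n0 = z^2 * p * (1 - p) / e^2, followed by the standard finite population correction n = n0 / (1 + (n0 - 1) / N). This is not necessarily the formula in the paper mentioned above; the function name and the choices z = 1.96, p = 0.5 and the margins tried are illustrative assumptions.

```python
import math

def audit_sample_size(population, margin, p=0.5, z=1.96):
    """Sample size for estimating a proportion in a small finite population.

    population: number of people in the group (N)
    margin:     desired half-width of the confidence interval (e)
    p:          assumed pass rate; 0.5 is the most conservative choice
    z:          normal quantile for the confidence level (1.96 is about 95%)
    """
    # Sample size that would be needed for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction shrinks n0 when N is small.
    n = n0 / (1 + (n0 - 1) / population)
    # Round up, and never ask for more people than exist.
    return min(population, math.ceil(n))

if __name__ == "__main__":
    for e in (0.25, 0.20, 0.15, 0.10):
        print(f"margin +/-{e:.2f}: audit {audit_sample_size(20, e)} of 20")
```

Under these assumptions a margin of +/-0.20 calls for 12 of the 20 supervisors, which is in line with the "10 or 12" estimate above; tighter margins quickly approach a full census.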

In article <86536f$j77$[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] wrote:
> We are going to do a quality system audit (like ISO 9000).  How do I
> choose the sample size for a particular group of people?  Let's say
> that there are 20 supervisors and I will audit their knowledge of SPC,
> how many should I choose for the audit?
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
>





Re: Sample Size for an Audit

2000-01-19 Thread Richard A. Beldin

You have several dimensions here.

First you have the possibility of variability among the supervisors. With
only 20, you may as well give them all the test because by chance you
could leave out those supervisors who will cause you trouble later on.

Next, and more difficult is the variation in knowledge. What are the most
important aspects for the audit? Is conceptual knowledge more important
than ability to express themselves before their subordinates? What about
their level of commitment to the program? Is lip-service good enough?

Is a paper-and-pencil test the most appropriate? What about on-the-spot
inconspicuous observation of each work group for a period of time? Perhaps
the expert judgement of someone in the management chain is as valid as any
test?

If you take the task too seriously, you will run out of time. Perhaps
questions based on the instructional materials used are sufficient.
Perhaps not.

[EMAIL PROTECTED] wrote:

> We are going to do a quality system audit (like ISO 9000).  How do I
> choose the sample size for a particular group of people?  Let's say
> that there are 20 supervisors and I will audit their knowledge of SPC,
> how many should I choose for the audit?



Re: statistical event definition?

2000-01-19 Thread Richard A. Beldin

Well, we wouldn't try to analyze apples and autos in the same data set. :-)
On the other hand, the similarity is sort of like what one requires for an
efficient tabular data base. Whatever the sample space of events is, it
should consist of events that we can think about together comfortably. If it
takes a paragraph to describe each event, statistical methods aren't likely
to be useful. We need to be able to code the data into compact data sets
like measurements on a sample of objects, or a series of measurements on a
single object. Sorry, I don't know of any formal criteria. Usually this is
only discussed when somebody violates some statistician's common sense. Then
it may be discussed in a rather combative atmosphere but privately, rather
than publicly.

Muriel Strand wrote:

> i gather that a collection of events which is analyzed with statistics
> must have sufficient similarity (between each event) for the analysis to
> be accurate/precise.  how similar is sufficient?  can anyone recommend
> refs (preferably books) that discuss this issue, and provide guidelines
> for assuring sufficient similarity?  does this consideration affect the
> appropriate choice of model?
>
> thanks in advance for sharing your wisdom & experience.
>
> --
> Any resemblance of any of the above opinions to anybody's official
> position is completely coincidental.
>
> Muriel Strand, P.E.
> Air Resources Engineer
> CA Air Resources Board
> 2020 L Street
> Sacramento, CA  59814
> 916-324-9661
> 916-327-8524 (fax)
> www.arb.ca.gov



Re: Skewed Data Problem

2000-01-19 Thread Donald F. Burrill

Have you plotted the data?  Impossible to tell much from a simple 
regression analysis;  especially without any definition of the two 
variables.  If I were compelled to guess, I'd suppose that BEHPROBS 
(your dependent variable) was the number of behavioral problems 
reported, probably over some defined time span (perhaps the week 
mentioned with respect to "how often the parent has spanked the 
child", which I presume to be the predictor variable SPANK9235?). 
But if you haven't even _looked_at_ the bivariate relationship, you 
can't tell whether a _linear_ functional relation makes any sense. 

On Wed, 19 Jan 2000, steinberg wrote:

> I am asking whether corporal punishment of children is associated
> with behavior problems.  

Controlling for what other variables?  The analysis you report 
below shows none;  but surely there are many that need to be controlled 
(such as propensity for administering punishment at all, propensity for 
corporal versus other kinds of punishment, whether corporal punishment 
is administered by only one parent or by both, the severity of the 
(alleged?) behavior problems, ...).

> I am using data from the National
> Longitudinal Survey of Youth.  I am interested in the results of a 
> question that asks how often the parent has spanked the child in
> the last week.  This data is extremely right skewed with some
> extreme outliers.  Most of the responses are zeros and ones.
> Square root and log transforms have very little effect on the
> right skew.  (I added 1 to each score and took the log to avoid
> zeros.)

But the important question is, what effect (if any) do these 
transformations have on the bivariate relationship?  Does it look 
more (or less) linear in one form than in the others?

> The regression (output below) shows such a small R-squared that
> there would appear to be no meaningful association, although the
> slope is significantly different from zero. 

Again:  If you haven't examined the scatterplot, you cannot tell whether 
there is an association or not.  It is not at all clear that a simple 
linear association is to be expected;  especially if your respondents 
include parents who refuse to use corporal punishment at all, however 
great the behavioral provocation, as well as parents who believe firmly 
in the dictum "Spare the rod and spoil the child".  
With 1100 degrees of freedom, quite small effects can be found 
formally significant;  but your analysis reports  r = .226.
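
To make the point about 1100 degrees of freedom concrete, here is a small sketch using the standard identity t = r * sqrt(df / (1 - r^2)) for a simple regression. The figures r = 0.226 and df = 1105 come from the output quoted at the end of this message; the cutoff 1.96 is a large-sample normal approximation to the two-tailed 5% critical value, which is an assumption of this sketch, as are the function names.

```python
import math

def t_from_r(r, df):
    """t statistic for a correlation r with df residual degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))

def smallest_significant_r(df, t_crit=1.96):
    """Smallest |r| that reaches t_crit, obtained by inverting t_from_r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

if __name__ == "__main__":
    r, df = 0.226, 1105
    print(f"t for r = {r}: {t_from_r(r, df):.2f}")
    print(f"smallest significant |r| at df = {df}: {smallest_significant_r(df):.3f}")
    print(f"share of variance explained: {r * r:.3f}")
```

The computed t of about 7.7 agrees with the reported 7.719 up to rounding of r, while a correlation of roughly 0.06 would already count as "significant" with this many cases.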

> ... However, on general principle:  Is there some way to properly 
> transform such skewed data? 

Sounds as though you've reasonably well addressed that, at least at the 
simple level of bivariate regression, insofar as one can without looking 
at the data.

> If not, can it still be used in a regression? 

Certainly.

> Of what errors must I be aware if I were to use it?

Mainly, oversimplified models, I should think.  You might profitably 
spend some time thinking about how the data you have might have arisen, 
and what other variables will affect the relationship you wish to 
consider.  AND you might also think about whether you've got the 
relationship the right way round.  You're using number of spankings in a 
week to predict (number of?) behavior problems;  it would not be 
unreasonable, from one point of view, to predict the number of spankings 
from the number (or intensity?) of the problems.
An assumption embedded in your analysis is that it makes sense to 
think of spanking as inducing (or causing) behavior problems.  Parents 
who spank, if asked, will ordinarily claim that they are trying to reduce 
or prevent behavior problems, and that spanking is a response to overt 
behavior problems, not a cause of them. 

> 
> 
> Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   Squared multiple R: 0.051
>
> Adjusted squared multiple R: 0.050   Standard error of estimate: 14.780
>
> Effect     Coefficient  Std Error  Std Coef  Tolerance        t      P
>
> CONSTANT       102.839      0.538     0.000          .  191.289  0.000
> SPANK9235        1.381      0.179     0.226      1.000    7.719  0.000
>
>  Analysis of Variance
>
> Source     Sum-of-Squares    df  Mean-Square  F-ratio      P
>
> Regression      13015.793     1    13015.793   59.583  0.000
> Residual       241384.857  1105      218.448


 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  



Re: Skewed Data Problem

2000-01-19 Thread Rich Ulrich

On 19 Jan 2000 11:18:34 -0800, [EMAIL PROTECTED] (steinberg)
wrote:

>  < snip ... >. Most of the responses are zeros and ones.
> Square root and log transforms have very little effect on the
> right skew. (I added 1 to each score and took the log to avoid
> zeros.)
> 
> The regression (output below) shows such a small R-squared that
> there would appear to be no meaningful association, although the

 - see the topics about ZERO in my stats-FAQ.

"zero vs other"  is one likely comparison.

If most of the 'other' is 1, then "one vs other"  might be useful, or
the three-way comparison, "zero vs one vs other" -- Does this give an
ordered set of values/scores  for any of the other variables?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
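
A minimal sketch of the recoding suggested above, using made-up numbers purely for illustration (neither the counts nor the scores come from the NLSY data). The function names are hypothetical. It collapses the spanking count into "zero", "one" and "other", then reports the mean of a second variable within each group so that one can see whether the groups fall in order.

```python
from collections import defaultdict
from statistics import mean

def recode_count(count):
    """Collapse a spanking count into three categories."""
    if count == 0:
        return "zero"
    if count == 1:
        return "one"
    return "other"

def group_means(counts, scores):
    """Mean score within each recoded category."""
    groups = defaultdict(list)
    for count, score in zip(counts, scores):
        groups[recode_count(count)].append(score)
    return {label: mean(values) for label, values in groups.items()}

if __name__ == "__main__":
    # Hypothetical data: weekly spanking counts and behavior-problem scores.
    counts = [0, 0, 0, 1, 1, 2, 0, 5, 1, 0, 12, 3]
    scores = [98, 101, 95, 104, 100, 108, 99, 112, 103, 97, 110, 109]
    means = group_means(counts, scores)
    for label in ("zero", "one", "other"):
        print(label, round(means[label], 1))
```

Treating the three categories as groups sidesteps the skew entirely, at the cost of ignoring how large the "other" counts are.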



RE: statistical event definition?

2000-01-19 Thread Simon, Steve, PhD

Muriel Strand writes:

>i gather that a collection of events which is analyzed with statistics
>must have sufficient similarity (between each event) for the analysis to
>be accurate/precise.  how similar is sufficient?  can anyone recommend
>refs (preferably books) that discuss this issue, and provide guidelines
>for assuring sufficient similarity?  does this consideration affect the
>appropriate choice of model?

I'm not sure what you mean by "accurate/precise", but you will often see
excellent analyses done on very diverse populations. For example, a random
sample of people in California will have quite a mix of people. You can get
very precise estimates of things like income level and unemployment
percentages for all Californians, in spite of the huge difference between
residents of Los Angeles County compared to residents of Orange County.
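
The point about diverse populations can be illustrated with a little arithmetic. The sketch below assumes a hypothetical population made of two equally sized groups with very different mean incomes (the dollar figures are invented, not Californian data) and computes the standard error of the overall sample mean under simple random sampling. The between-group difference inflates the population variance, but the standard error still shrinks with the square root of the sample size.

```python
import math

def mixture_variance(weights, means, sds):
    """Variance of a population made up of several subgroups.

    Total variance = average within-group variance
                   + variance of the group means around the overall mean.
    """
    overall = sum(w * m for w, m in zip(weights, means))
    within = sum(w * s ** 2 for w, s in zip(weights, sds))
    between = sum(w * (m - overall) ** 2 for w, m in zip(weights, means))
    return within + between

def standard_error(variance, n):
    """Standard error of a sample mean from a simple random sample of size n."""
    return math.sqrt(variance / n)

if __name__ == "__main__":
    # Hypothetical groups: mean incomes 40,000 and 90,000, each with SD 15,000.
    var = mixture_variance([0.5, 0.5], [40_000, 90_000], [15_000, 15_000])
    for n in (100, 1_000, 10_000):
        print(f"n = {n:>6}: standard error ~ {standard_error(var, n):,.0f}")
```

The precision of the estimate is governed by the sample size and the overall variance, not by whether the population "looks" homogeneous.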

In clinical trials, there is often a tension between defining the study
population narrowly and defining it broadly. A narrow population (e.g.,
excluding elderly patients or patients with co-morbid conditions) can reduce
variation and make it easy to discover trends and patterns. But such a
narrow population is often difficult to generalize from. Most doctors don't
have the luxury of excluding old patients or patients who are sick from
several conditions simultaneously.

If you want a good guideline, you need to consult subject matter experts and
not statisticians. For example, only a doctor could tell you the trade-offs
between defining the population of asthmatic children broadly or narrowly.

Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
STATS - Steve's Attempt to Teach Statistics: http://www.cmh.edu/stats



Sample Size for an Audit

2000-01-19 Thread lenin_vizcaino

We are going to do a quality system audit (like ISO 9000).  How do I
choose the sample size for a particular group of people?  Let's say
that there are 20 supervisors and I will audit their knowledge of SPC,
how many should I choose for the audit?





Factor Analysis 3

2000-01-19 Thread haytham siala

Hi,

I am sorry that I am sending a lot of questions related to this subject and
here is another question:

If some dissimilar items load on a common factor (the factor does not seem to
make sense since it consists of some related and some completely unrelated
items), should I ignore that factor or should I delete the unrelated items
from the factor analysis?

Thanks in advance.





Skewed Data Problem

2000-01-19 Thread steinberg

I am asking whether corporal punishment of children is associated
with behavior problems. I am using data from the National
Longitudinal Survey of Youth. I am interested in the results of a
question that asks how often the parent has spanked the child in
the last week. This data is extremely right skewed with some
extreme outliers. Most of the responses are zeros and ones.
Square root and log transforms have very little effect on the
right skew. (I added 1 to each score and took the log to avoid
zeros.)

The regression (output below) shows such a small R-squared that
there would appear to be no meaningful association, although the
slope is significantly different from zero. However, on general
principle: Is there some way to properly transform such skewed
data? If not, can it still be used in a regression? Of what
errors must I be aware if I were to use it?

Milton Steinberg



Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   Squared multiple R: 0.051

Adjusted squared multiple R: 0.050   Standard error of estimate: 14.780

Effect     Coefficient  Std Error  Std Coef  Tolerance        t  P(2 Tail)

CONSTANT       102.839      0.538     0.000          .  191.289      0.000
SPANK9235        1.381      0.179     0.226      1.000    7.719      0.000

 Analysis of Variance

Source     Sum-of-Squares    df  Mean-Square  F-ratio      P

Regression      13015.793     1    13015.793   59.583  0.000
Residual       241384.857  1105      218.448



RE: Interrater reliability

2000-01-19 Thread Peter . Chen

Allen,
You might refer to this paper.

Burry-Stock, J. A., Shaw, D. G., Laurie, C., & Chissom, B. S.
(1996).  Rater agreement indexes for performance assessment.  Educational &
Psychological Measurement, 56, 251-262.
Peter Chen


-Original Message-
From:   Allen E Cornelius [SMTP:[EMAIL PROTECTED]]
Sent:   Wednesday, January 19, 2000 11:22 AM
To: [EMAIL PROTECTED]
Subject:Interrater reliability





Stat folks,

 I have an interrater reliability dilemma.  We are examining a 3-item
scale (each item scored 1 to 5) used to rate compliance behavior of
patients.  Two separate raters have used the scale to rate patients'
behavior, and we now want to calculate the interrater agreement for the
scale.  Two problems:
   1) The majority of patients are compliant, and receive either a 4 or
5 for each of the three items from both of the raters.  While this is high
agreement, values for ICC are very low due to the limited range of scores.
Are there any indexes that would reflect the high agreement of the raters
under these conditions?  Perhaps something that accounts for the full range
of the scale (1 to 5)?
 2)  The dataset contains a total of about 100 observations, but there
are multiple observations on the same patients at different times, probably
about 5 to 6 observations per patient.  Does this repeated assessment need
to be accounted for in the interrater agreement, or can each observation be
treated as independent for the purpose of interrater agreement?

 Any suggestions or references addressing this problem would be
appreciated.  Thanks.

Allen Cornelius



FACTOR ANALYSIS

2000-01-19 Thread haytham siala

When I perform a factor analysis on the items of a questionnaire should I
include items that make up the Dependent Variables (DVs) as well as the
Independent Variables (IVs) in the analysis or should I perform two separate
factor analysis, one on the items making up the Dependent Variables and
another on the items making up the Independent Variables.




Interrater reliability

2000-01-19 Thread Allen E Cornelius





Stat folks,

 I have an interrater reliability dilemma.  We are examining a 3-item
scale (each item scored 1 to 5) used to rate compliance behavior of
patients.  Two separate raters have used the scale to rate patients'
behavior, and we now want to calculate the interrater agreement for the
scale.  Two problems:
   1) The majority of patients are compliant, and receive either a 4 or
5 for each of the three items from both of the raters.  While this is high
agreement, values for ICC are very low due to the limited range of scores.
Are there any indexes that would reflect the high agreement of the raters
under these conditions?  Perhaps something that accounts for the full range
of the scale (1 to 5)?
 2)  The dataset contains a total of about 100 observations, but there
are multiple observations on the same patients at different times, probably
about 5 to 6 observations per patient.  Does this repeated assessment need
to be accounted for in the interrater agreement, or can each observation be
treated as independent for the purpose of interrater agreement?

 Any suggestions or references addressing this problem would be
appreciated.  Thanks.

Allen Cornelius
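
A small illustration of the first problem, with invented ratings (not the study's data). When nearly every score is a 4 or a 5, the raters can agree on most patients while a correlation-type index stays modest, because there is little between-patient variance to explain. The sketch uses the Pearson correlation as a simple stand-in for the ICC, plus raw exact and within-one-point agreement; the function names are hypothetical, and the sketch does not address the repeated observations per patient.

```python
import math

def exact_agreement(a, b):
    """Proportion of cases where both raters gave the identical score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def within_one_agreement(a, b):
    """Proportion of cases where the two scores differ by at most one point."""
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)

def pearson_r(a, b):
    """Pearson correlation, used here only as a rough stand-in for the ICC."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)

if __name__ == "__main__":
    # Hypothetical ratings on one 1-to-5 item for ten patients.
    rater_a = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]
    rater_b = [5, 4, 4, 5, 5, 5, 5, 4, 5, 5]
    print(f"exact agreement:      {exact_agreement(rater_a, rater_b):.2f}")
    print(f"within-one agreement: {within_one_agreement(rater_a, rater_b):.2f}")
    print(f"Pearson r:            {pearson_r(rater_a, rater_b):.2f}")
```

With these invented scores the raters match exactly on 8 of 10 patients and never differ by more than a point, yet r is only about 0.52.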




Re: FACTOR ANALYSIS

2000-01-19 Thread lthayer

If these factors were length measured in feet and in yards, would it
make sense to have both in the same model?  No.

If these factors were measures of ability like IQ, IQ test 1 and IQ test
2, then the answer depends on how the two tests are related.  If they
are highly correlated, drop one. If they measure different things then
they should be included, if significant.  If they overlap, look at your
hypothesis and make a judgment based on the results.


In article <864hr0$805$[EMAIL PROTECTED]>,
  "haytham siala" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a question related to factor analysis.
>
> If a questionnaire item was found to load significantly on more than one
> factor and let us assume that each factor represents a potential measurement
> scale for a particular construct, should I retain the same item for both
> factors (scales) i.e should that same item be included in the two
> measurement scales? Or should I take the highest loading of the item as the
> decisive solution to which factor it should belong?
>
> Cheers.
>
>





statistical event definition?

2000-01-19 Thread Muriel Strand

i gather that a collection of events which is analyzed with statistics
must have sufficient similarity (between each event) for the analysis to
be accurate/precise.  how similar is sufficient?  can anyone recommend
refs (preferably books) that discuss this issue, and provide guidelines
for assuring sufficient similarity?  does this consideration affect the
appropriate choice of model?

thanks in advance for sharing your wisdom & experience.

--
Any resemblance of any of the above opinions to anybody's official
position is completely coincidental.

Muriel Strand, P.E.
Air Resources Engineer
CA Air Resources Board
2020 L Street
Sacramento, CA  59814
916-324-9661
916-327-8524 (fax)
www.arb.ca.gov




FACTOR ANALYSIS

2000-01-19 Thread haytham siala

Hi,

I have a question related to factor analysis.

If a questionnaire item was found to load significantly on more than one
factor and let us assume that each factor represents a potential measurement
scale for a particular construct, should I retain the same item for both
factors (scales) i.e should that same item be included in the two
measurement scales? Or should I take the highest loading of the item as the
decisive solution to which factor it should belong?

Cheers.
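
One way to see how many items are affected before deciding is to list the cross-loading items explicitly. The sketch below assumes a hypothetical loading matrix and a cutoff of 0.40 for a "significant" loading; that cutoff is a common rule of thumb, not something taken from this post, and the item names are made up. For each cross-loading item it reports the factors it loads on and the factor with the highest absolute loading.

```python
def salient_factors(loadings, cutoff=0.40):
    """Indices of factors on which an item's absolute loading meets the cutoff."""
    return [j for j, value in enumerate(loadings) if abs(value) >= cutoff]

def strongest_factor(loadings):
    """Index of the factor with the largest absolute loading."""
    return max(range(len(loadings)), key=lambda j: abs(loadings[j]))

def cross_loading_items(loading_matrix, cutoff=0.40):
    """Map item name -> salient factor indices, for items salient on 2+ factors."""
    result = {}
    for item, loadings in loading_matrix.items():
        factors = salient_factors(loadings, cutoff)
        if len(factors) > 1:
            result[item] = factors
    return result

if __name__ == "__main__":
    # Hypothetical rotated loadings on two factors.
    loading_matrix = {
        "item1": [0.72, 0.10],
        "item2": [0.55, 0.48],   # loads on both factors
        "item3": [0.05, 0.81],
        "item4": [-0.45, 0.62],  # loads on both factors (one negative)
    }
    for item, factors in cross_loading_items(loading_matrix).items():
        best = strongest_factor(loading_matrix[item])
        print(f"{item}: salient on factors {factors}, strongest on factor {best}")
```

Whether a flagged item is kept on both scales, assigned to its strongest factor, or dropped remains a substantive judgment; the listing only shows where the question arises.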







Student Awards for STATISTICS AND HEALTH CONFERENCE

2000-01-19 Thread Biostat Research Group


The Biostatistics Research Group of the University of Alberta
is pleased to announce that there will be several travel supplements
awarded to students presenting at the Statistics and Health Conference in
Edmonton, June 11-13, 2000.  These student awards are funded through

  the Institute of Health Economics, Alberta,
  the Biostatistics Section of the Statistical Society of Canada,
  the Biometrics Section of the American Statistical Association

All students who present contributed papers are eligible to apply.
Please note the deadline of February 1, 2000 for submission of abstracts
to Statistics and Health conference http://www.stat.ualberta.ca/~brg/conf.html 
Some limitations apply.  The award amount will be determined on an
individual basis to a maximum of CD$500 per student. Interested students
are asked to apply before April 15, 2000.  The details of the terms
of the awards will be posted in the above web site shortly.

===
K.C. Carriere
Associate Professor of Statistics
Department of Mathematical Sciences
University of Alberta   (tel)780-492-4230
Edmonton, AB T6G 2G1(fax)780-492-6826

 Home: http://www.math.ualberta.ca/~kcarrier/kcarrier.html
 Visit: http://www.stat.ualberta.ca/~brg
===



Squared Multiple correlation

2000-01-19 Thread haytham siala

Can someone please tell me how to calculate the SMC (Squared Multiple
Correlation) in a factor analysis (SPSS)? I am not sure but could it be the
diagonal of a factor transformation matrix ?

Thanks.
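
For what it is worth, the squared multiple correlation of each variable with all the others is conventionally obtained from the inverse of the correlation matrix R as SMC_i = 1 - 1 / (R^-1)_ii, rather than from the factor transformation matrix. The sketch below applies that identity to a made-up 3 x 3 correlation matrix using a plain Gauss-Jordan inverse; the matrix values and function names are assumptions for illustration. I believe SPSS reports these values as the initial communalities under principal axis factoring, but that is worth confirming against the SPSS documentation.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]

def squared_multiple_correlations(corr):
    """SMC of each variable with all the others: 1 - 1 / diag(inverse(corr))."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[i][i] for i in range(len(corr))]

if __name__ == "__main__":
    # Hypothetical correlation matrix for three items.
    corr = [[1.0, 0.5, 0.3],
            [0.5, 1.0, 0.4],
            [0.3, 0.4, 1.0]]
    for i, smc in enumerate(squared_multiple_correlations(corr), start=1):
        print(f"item {i}: SMC = {smc:.3f}")
```

Each value is the R-squared from regressing that item on all the remaining items, which is why it serves as a lower-bound estimate of the item's communality.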





Fellowship

2000-01-19 Thread srmillis



FELLOWSHIPS IN REHABILITATION OUTCOMES RESEARCH.  

The UMDNJ/New Jersey Medical School, Department of Physical
Medicine and Rehabilitation, and the Kessler Medical Rehabilitation
Research & Education Corp., announce a 1-2 year advanced research training
program for professionals interested in objective and subjective outcomes
experienced by persons with physical or neurological disabilities and
factors that affect these outcomes.

Medical rehabilitation outcomes research encompasses research on
prognosis, measurement of function and health, treatment guidelines,
outcomes management strategies, disability economics, and issues of health
policy.  Controlled research on the effectiveness and costs of
interventions is stressed.  Statistical and methodological skills are
stressed in our program (without excluding qualitative investigations,
which are necessary to explore certain topics).

The training program emphasizes the actual conduct of research,
including writing fundable research proposals and publications.  Fellows
improve their research skills and knowledge of clinical rehabilitation
during the program.  The training program is based on an individualized
Research and Training Plan written by each Fellow in collaboration with a
primary mentor and secondary mentors. Mentors may be chosen from among many
researchers throughout New Jersey, with special strength in
neuropsychology, physiatry, PT, general outcomes methodology, traumatic
brain injury, spinal cord injury, and other neurological disabilities.

Both pre-doctoral (dissertation level) and post-doctoral positions
are currently available.   For substantive information, contact:

Mark Johnston, PhD, (973) 243-6810, Project Director, 
[EMAIL PROTECTED]

For application forms and instructions, contact:

Heidi Castillo at (973) 243-6971, [EMAIL PROTECTED], or 


-- 
Scott R. Millis, PhD, ABPP
Kessler Medical Rehabilitation Research & Education Corp
1199 Pleasant Valley Way
West Orange, New Jersey 07052

Tel: 973.243.6976
Fax: 973.243.6990
Emails: [EMAIL PROTECTED]
[EMAIL PROTECTED]



[Q : Test bivariate normal distribution?]

2000-01-19 Thread D.W. Ryu

Dear Members of the News Group,

I always appreciate the help I have received from you.

As I understand it, the Kolmogorov-Smirnov goodness-of-fit test can be
applied to a univariate sample. But I don't know which method can be applied
to multivariate samples, especially when the samples are assumed to come
from a bivariate normal distribution.

Please advise.


Thanks in advance.

With my best regards,

D.W. Ryu
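
One informal check, offered as a sketch rather than a formal test: if the data are bivariate normal, the squared Mahalanobis distances of the points from the sample mean are approximately chi-square distributed with 2 degrees of freedom, whose distribution function is 1 - exp(-d/2). The univariate Kolmogorov-Smirnov statistic can then be computed on those distances. Because the mean and covariance are estimated from the same data, the usual K-S critical values are only approximate here; dedicated procedures such as Mardia's skewness and kurtosis tests exist for a formal answer. The function names and the simulated data are assumptions for illustration.

```python
import math
import random

def squared_mahalanobis(xs, ys):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Quadratic form with the inverse of the 2 x 2 covariance matrix.
    return [(syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my)
             + sxx * (y - my) ** 2) / det
            for x, y in zip(xs, ys)]

def ks_statistic_chi2_2df(distances):
    """K-S distance between the distances and the chi-square(2) distribution."""
    ordered = sorted(distances)
    n = len(ordered)
    d_max = 0.0
    for i, d in enumerate(ordered):
        cdf = 1.0 - math.exp(-d / 2.0)  # chi-square CDF with 2 df
        # Compare the CDF with the empirical step just before and just after d.
        d_max = max(d_max, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d_max

if __name__ == "__main__":
    random.seed(1)
    # Simulated correlated normal data, standing in for a real sample.
    xs = [random.gauss(0, 1) for _ in range(500)]
    ys = [0.6 * x + random.gauss(0, 0.8) for x in xs]
    d_stat = ks_statistic_chi2_2df(squared_mahalanobis(xs, ys))
    cutoff = 1.36 / math.sqrt(len(xs))  # rough large-sample 5% K-S cutoff
    print(f"K-S statistic: {d_stat:.3f}  (rough 5% cutoff: {cutoff:.3f})")
```

A statistic well above the rough cutoff suggests departure from bivariate normality; a small one is only weak reassurance, since this check looks at the distances alone.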



