Re: Sample Size for an Audit
In any situation of this sort, the amount of data you need is related to the amount of variation you expect to find in the data, to how close you want your answer to be to the truth, and to the probability with which you want to be that close. By "truth" I mean the true average response, i.e. the population average.

Statistical sampling theory is usually founded on taking a small sample from a large (effectively infinite) population, so that the sample does not "disturb" the population. If the sample size is a significant fraction of the entire population (as it will be in this instance, with only 20 people in the population), then a correction is needed to the usual formula for determining the sample size. I don't have that formula/correction at hand, but if you want it (or want a little paper I wrote on this) let me know at [EMAIL PROTECTED] If you want the paper, I'll need a fax number to send it to; it is not digitized.

If the response you are measuring is a pass/fail response, that makes life easier, because we can estimate the standard deviation quickly and painlessly. When all is said and done, with a population of only 20, the sample will need to be a large fraction of the population, perhaps as many as 10 or 12.

Charlie H.

In article <86536f$j77$[EMAIL PROTECTED]>, [EMAIL PROTECTED] wrote:
> We are going to do a quality system audit (like ISO 9000). How do I
> choose the sample size for a particular group of people? Let's say
> that there are 20 supervisors and I will audit their knowledge of SPC,
> how many should I choose for the audit?
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
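The finite-population correction the post refers to is standard, and can be sketched as follows. The confidence level, margin of error, and anticipated proportion below are illustrative assumptions, not values from the post.

```python
# Sample size for estimating a proportion, with the finite population
# correction (FPC) for sampling a large fraction of a small population.
import math

def sample_size_fpc(N, margin, p=0.5, z=1.96):
    """Required sample size for a proportion, with FPC.

    N      -- population size
    margin -- desired margin of error (half-width of the interval)
    p      -- anticipated proportion (0.5 is the conservative choice)
    z      -- normal quantile for the confidence level (1.96 ~ 95%)
    """
    # Infinite-population sample size first...
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # ...then shrink it because the population is small.
    n = n0 / (1 + (n0 - 1) / N)
    return math.ceil(n)

# With only 20 supervisors, even a generous +/-20% margin still
# requires auditing roughly half of them, as the post anticipates.
print(sample_size_fpc(N=20, margin=0.20))  # -> 12
```

With N = 20 this lands at 12, consistent with the "perhaps as many as 10 or 12" estimate above.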
Re: Sample Size for an Audit
You have several dimensions here. First, there is the possibility of variability among the supervisors. With only 20, you may as well test them all, because by chance you could leave out exactly those supervisors who will cause you trouble later on.

Next, and more difficult, is the variation in knowledge. What are the most important aspects for the audit? Is conceptual knowledge more important than the ability to express themselves before their subordinates? What about their level of commitment to the program? Is lip service good enough? Is a paper-and-pencil test the most appropriate instrument? What about on-the-spot, inconspicuous observation of each work group for a period of time? Perhaps the expert judgment of someone in the management chain is as valid as any test?

If you take the task too seriously, you will run out of time. Perhaps questions based on the instructional materials used are sufficient. Perhaps not.

[EMAIL PROTECTED] wrote:
> We are going to do a quality system audit (like ISO 9000). How do I
> choose the sample size for a particular group of people? Let's say
> that there are 20 supervisors and I will audit their knowledge of SPC,
> how many should I choose for the audit?
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
Re: statistical event definition?
Well, we wouldn't try to analyze apples and autos in the same data set. :-) On the other hand, the similarity is sort of like what one requires for an efficient tabular data base. Whatever the sample space of events is, it should consist of events that we can think about together comfortably. If it takes a paragraph to describe each event, statistical methods aren't likely to be useful. We need to be able to code the data into compact data sets like measurements on a sample of objects, or a series of measurements on a single object. Sorry, I don't know of any formal criteria. Usually this is only discussed when somebody violates some statistician's common sense. Then it may be discussed in a rather combative atmosphere but privately, rather than publicly. Muriel Strand wrote: > i gather that a collection of events which is analyzed with statistics > must have sufficient similarity (between each event) for the analysis to > be accurate/precise. how similar is sufficient? can anyone recommend > refs (preferably books) that discuss this issue, and provide guidelines > for assuring sufficient similarity? does this consideration affect the > appropriate choice of model? > > thanks in advance for sharing your wisdom & experience. > > -- > Any resemblance of any of the above opinions to anybody's official > position is completely coincidental. > > Muriel Strand, P.E. > Air Resources Engineer > CA Air Resources Board > 2020 L Street > Sacramento, CA 59814 > 916-324-9661 > 916-327-8524 (fax) > www.arb.ca.gov
Re: Skewed Data Problem
Have you plotted the data? It is impossible to tell much from a simple regression analysis, especially without any definition of the two variables. If I were compelled to guess, I'd suppose that BEHPROBS (your dependent variable) was the number of behavioral problems reported, probably over some defined time span (perhaps the week mentioned with respect to "how often the parent has spanked the child", which I presume to be the predictor SPANK9235?). But if you haven't even _looked_at_ the bivariate relationship, you can't tell whether a _linear_ functional relation makes any sense.

On Wed, 19 Jan 2000, steinberg wrote:

> I am asking whether corporal punishment of children is associated
> with behavior problems.

Controlling for what other variables? The analysis you report below shows none; but surely there are many that need to be controlled (such as propensity for administering punishment at all, propensity for corporal versus other kinds of punishment, whether corporal punishment is administered by only one parent or by both, the severity of the (alleged?) behavior problems, ...).

> I am using data from the National
> Longitudinal Survey of Youth. I am interested in the results of a
> question that asks how often the parent has spanked the child in
> the last week. This data is extremely right skewed with some
> extreme outliers. Most of the responses are zeros and ones.
> Square root and log transforms have very little effect on the
> right skew. (I added 1 to each score and took the log to avoid
> zeros.)

But the important question is: what effect (if any) do these transformations have on the bivariate relationship? Does it look more (or less) linear in one form than in the others?

> The regression (output below) shows such a small R-squared that
> there would appear to be no meaningful association, although the
> slope is significantly different from zero.

Again: if you haven't examined the scatterplot, you cannot tell whether there is an association or not.
It is not at all clear that a simple linear association is to be expected; especially if your respondents include parents who refuse to use corporal punishment at all, however great the behavioral provocation, as well as parents who believe firmly in the dictum "Spare the rod and spoil the child". With 1100 degrees of freedom, quite small effects can be found formally significant; but your analysis reports r = .226.

> ... However, on general principle: Is there some way to properly
> transform such skewed data?

Sounds as though you've reasonably well addressed that, at least at the simple level of bivariate regression, insofar as one can without looking at the data.

> If not, can it still be used in a regression?

Certainly.

> Of what errors must I be aware if I were to use it?

Mainly, oversimplified models, I should think. You might profitably spend some time thinking about how the data you have might have arisen, and what other variables will affect the relationship you wish to consider. AND you might also think about whether you've got the relationship the right way round. You're using number of spankings in a week to predict (number of?) behavior problems; it would not be unreasonable, from one point of view, to predict the number of spankings from the number (or intensity?) of the problems. An assumption embedded in your analysis is that it makes sense to think of spanking as inducing (or causing) behavior problems. Parents who spank, if asked, will ordinarily claim that they are trying to reduce or prevent behavior problems, and that spanking is a response to overt behavior problems, not a cause of them.

> Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   Squared multiple R: 0.051
> Adjusted squared multiple R: 0.050   Standard error of estimate: 14.780
>
> Effect     Coefficient  Std Error  Std Coef  Tolerance        t      P
> CONSTANT       102.839      0.538     0.000          .  191.289  0.000
> SPANK9235        1.381      0.179     0.226      1.000    7.719  0.000
>
> Analysis of Variance
> Source       Sum-of-Squares    df  Mean-Square  F-ratio      P
> Regression        13015.793     1    13015.793   59.583  0.000
> Residual         241384.857  1105      218.448

Donald F. Burrill                       [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264             603-535-2597
184 Nashua Road, Bedford, NH 03110      603-471-7128
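The advice above boils down to: examine the bivariate relation under each candidate transform before trusting any one regression. A sketch on simulated data (the NLSY variables are not available here; the variable names and the simulated effect size are illustrative only, loosely matched to the R-squared reported in the thread):

```python
# Compare the linear fit for the raw and log(x+1) versions of a
# right-skewed count predictor. All data here are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 1107

# A skewed count predictor (mostly zeros and ones), loosely like the
# spanking item described in the post; slope and noise are made up.
spank = rng.poisson(0.7, size=n)
behprobs = 100 + 4.0 * spank + rng.normal(0, 15, size=n)

def r_squared(x, y):
    """Squared Pearson correlation: the R^2 of a simple regression."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

raw_r2 = r_squared(spank, behprobs)
log_r2 = r_squared(np.log1p(spank), behprobs)
# With data like these the two R^2 values are usually very close,
# which is why the transform "has very little effect" on the fit.
print(round(raw_r2, 3), round(log_r2, 3))
```

A scatterplot of each version against the outcome (e.g. with matplotlib) is the step the reply insists on; the numeric comparison alone will not reveal a nonlinear shape.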
Re: Skewed Data Problem
On 19 Jan 2000 11:18:34 -0800, [EMAIL PROTECTED] (steinberg) wrote: > < snip ... >. Most of the responses are zeros and ones. > Square root and log transforms have very little effect on the > right skew. (I added 1 to each score and took the log to avoid > zeros.) > > The regression (output below) shows such a small R-squared that > there would appear to be no meaningful association, although the - see the topics about ZERO in my stats-FAQ. "zero vs other" is one likely comparison. If most of the 'other' is 1, then "one vs other" might be useful, or the three-way comparison, "zero vs one vs other" -- Does this give an ordered set of values/scores for any of the other variables? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
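The "zero vs one vs other" suggestion can be sketched as a simple recoding of the skewed count into an ordered three-level score; the data here are made up for illustration.

```python
# Recode a count into an ordered factor: 0 = "zero", 1 = "one",
# 2 = "two or more". Group comparisons on the outcome can then be run
# across the three ordered levels.
import numpy as np

def recode_zero_one_other(x):
    """Map each count to 0, 1, or 2 (two-or-more)."""
    return np.minimum(np.asarray(x), 2)

counts = np.array([0, 0, 1, 0, 3, 1, 7, 0, 2, 1])
groups = recode_zero_one_other(counts)
print(groups.tolist())  # -> [0, 0, 1, 0, 2, 1, 2, 0, 2, 1]
```

The means of the outcome variable within the three groups then answer the question posed above: whether the recoding gives an ordered set of scores on the other variables.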
RE: statistical event definition?
Muriel Strand writes: >i gather that a collection of events which is analyzed with statistics >must have sufficient similarity (between each event) for the analysis to >be accurate/precise. how similar is sufficient? can anyone recommend >refs (preferably books) that discuss this issue, and provide guidelines >for assuring sufficient similarity? does this consideration affect the >appropriate choice of model? I'm not sure what you mean by "accurate/precise", but you will often see excellent analyses done on very diverse populations. For example, a random sample of people in California will have quite a mix of people. You can get very precise estimates of things like income level and unemployment percentages for all Californians, in spite of the huge difference between residents of Los Angeles County compared to residents of Orange County. In clinical trials, there is often a tension between defining the study population narrowly and defining it broadly. A narrow population (e.g., excluding elderly patients or patients with co-morbid conditions) can reduce variation and make it easy to discover trends and patterns. But such a narrow population is often difficult to generalize from. Most doctors don't have the luxury of excluding old patients or patients who are sick from several conditions simultaneously. If you want a good guideline, you need to consult subject matter experts and not statisticians. For example, only a doctor could tell you the trade-offs between defining the population of asthmatic children broadly or narrowly. Steve Simon, [EMAIL PROTECTED], Standard Disclaimer. STATS - Steve's Attempt to Teach Statistics: http://www.cmh.edu/stats
Sample Size for an Audit
We are going to do a quality system audit (like ISO 9000). How do I choose the sample size for a particular group of people? Let's say that there are 20 supervisors and I will audit their knowledge of SPC, how many should I choose for the audit?
Factor Analysis 3
Hi, I am sorry that I am sending a lot of questions related to this subject, but here is another one: if some dissimilar items load on a common factor (the factor does not seem to make sense, since it consists of some related and some completely unrelated items), should I ignore that factor, or should I delete the unrelated items from the factor analysis? Thanks in advance.
Skewed Data Problem
I am asking whether corporal punishment of children is associated with behavior problems. I am using data from the National Longitudinal Survey of Youth. I am interested in the results of a question that asks how often the parent has spanked the child in the last week. This data is extremely right skewed with some extreme outliers. Most of the responses are zeros and ones. Square root and log transforms have very little effect on the right skew. (I added 1 to each score and took the log to avoid zeros.)

The regression (output below) shows such a small R-squared that there would appear to be no meaningful association, although the slope is significantly different from zero. However, on general principle: Is there some way to properly transform such skewed data? If not, can it still be used in a regression? Of what errors must I be aware if I were to use it?

Milton Steinberg

Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   Squared multiple R: 0.051
Adjusted squared multiple R: 0.050   Standard error of estimate: 14.780

Effect     Coefficient  Std Error  Std Coef  Tolerance        t  P(2 Tail)
CONSTANT       102.839      0.538     0.000          .  191.289      0.000
SPANK9235        1.381      0.179     0.226      1.000    7.719      0.000

Analysis of Variance
Source       Sum-of-Squares    df  Mean-Square  F-ratio      P
Regression        13015.793     1    13015.793   59.583  0.000
Residual         241384.857  1105      218.448
RE: Interrater reliability
Allen,

You might refer to this paper:

Burry-Stock, J. A., Shaw, D. G., Laurie, C., & Chissom, B. S. (1996). Rater agreement indexes for performance assessment. Educational & Psychological Measurement, 56, 251-262.

Peter Chen

-----Original Message-----
From: Allen E Cornelius [SMTP:[EMAIL PROTECTED]]
Sent: Wednesday, January 19, 2000 11:22 AM
To: [EMAIL PROTECTED]
Subject: Interrater reliability

Stat folks, I have an interrater reliability dilemma. We are examining a 3-item scale (each item scored 1 to 5) used to rate compliance behavior of patients. Two separate raters have used the scale to rate patients' behavior, and we now want to calculate the interrater agreement for the scale. Two problems: 1) The majority of patients are compliant, and receive either a 4 or 5 for each of the three items from both of the raters. While this is high agreement, values for ICC are very low due to the limited range of scores. Are there any indexes that would reflect the high agreement of the raters under these conditions? Perhaps something that accounts for the full range of the scale (1 to 5)? 2) The dataset contains a total of about 100 observations, but there are multiple observations on the same patients at different times, probably about 5 to 6 observations per patient. Does this repeated assessment need to be accounted for in the interrater agreement, or can each observation be treated as independent for the purpose of interrater agreement? Any suggestions or references addressing this problem would be appreciated. Thanks. Allen Cornelius
FACTOR ANALYSIS
When I perform a factor analysis on the items of a questionnaire, should I include items that make up the Dependent Variables (DVs) as well as the Independent Variables (IVs) in the analysis, or should I perform two separate factor analyses: one on the items making up the Dependent Variables and another on the items making up the Independent Variables?
FACTOR ANALYSIS 2
When I perform a factor analysis on the items of a questionnaire, should I include items that make up the Dependent Variables (DVs) as well as the Independent Variables (IVs) in the analysis, or should I perform two separate factor analyses: one on the items making up the Dependent Variables and another on the items making up the Independent Variables?
Interrater reliability
Stat folks, I have an interrater reliability dilemma. We are examining a 3-item scale (each item scored 1 to 5) used to rate compliance behavior of patients. Two separate raters have used the scale to rate patients' behavior, and we now want to calculate the interrater agreement for the scale. Two problems: 1) The majority of patients are compliant, and receive either a 4 or 5 for each of the three items from both of the raters. While this is high agreement, values for ICC are very low due to the limited range of scores. Are there any indexes that would reflect the high agreement of the raters under these conditions? Perhaps something that accounts for the full range of the scale (1 to 5)? 2) The dataset contains a total of about 100 observations, but there are multiple observations on the same patients at different times, probably about 5 to 6 observations per patient. Does this repeated assessment need to be accounted for in the interrater agreement, or can each observation be treated as independent for the purpose of interrater agreement? Any suggestions or references addressing this problem would be appreciated. Thanks. Allen Cornelius
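For problem 1, one index worth examining is quadratic-weighted kappa, which is computed over the full 1-5 category grid (so the scale range enters the weights) and credits near-agreement on an ordered scale. This is only a sketch in plain numpy with made-up ratings, not a claim that it fully resolves the restricted-range issue:

```python
# Quadratic-weighted Cohen's kappa for two raters on an ordered scale.
import numpy as np

def weighted_kappa(r1, r2, n_cat=5, low=1):
    """Chance-corrected agreement with quadratic disagreement weights."""
    r1 = np.asarray(r1) - low
    r2 = np.asarray(r2) - low
    # Observed joint distribution over the n_cat x n_cat grid.
    obs = np.zeros((n_cat, n_cat))
    for a, b in zip(r1, r2):
        obs[a, b] += 1
    obs /= obs.sum()
    # Expected distribution if the raters were independent.
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))
    # Quadratic weights: distant disagreements count more.
    i, j = np.indices((n_cat, n_cat))
    w = (i - j) ** 2 / (n_cat - 1) ** 2
    return 1 - (w * obs).sum() / (w * exp).sum()

# Hypothetical ratings clustered at 4-5, like the compliant patients.
r1 = [4, 5, 4, 5, 5, 4, 3, 5, 4, 5]
r2 = [4, 5, 5, 5, 4, 4, 3, 5, 4, 5]
print(round(weighted_kappa(r1, r2), 3))
```

For problem 2 (repeated observations per patient), treating observations as independent will generally understate the uncertainty of the agreement estimate; a per-patient summary or a variance-components approach is the safer route.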
Re: FACTOR ANALYSIS
If these factors were length measured in feet and in yards, would it make sense to have both in the same model? No.

If these factors were measures of ability, like IQ test 1 and IQ test 2, then the answer depends on how the two tests are related. If they are highly correlated, drop one. If they measure different things, then both should be included, if significant. If they overlap, look at your hypothesis and make a judgment based on the results.

In article <864hr0$805$[EMAIL PROTECTED]>, "haytham siala" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a question related to factor analysis.
>
> If a questionnaire item was found to load significantly on more than one
> factor and let us assume that each factor represents a potential measurement
> scale for a particular construct, should I retain the same item for both
> factors (scales) i.e should that same item be included in the two
> measurement scales? Or should I take the highest loading of the item as the
> decisive solution to which factor it should belong?
>
> Cheers.
statistical event definition?
i gather that a collection of events which is analyzed with statistics must have sufficient similarity (between each event) for the analysis to be accurate/precise. how similar is sufficient? can anyone recommend refs (preferably books) that discuss this issue, and provide guidelines for assuring sufficient similarity? does this consideration affect the appropriate choice of model? thanks in advance for sharing your wisdom & experience. -- Any resemblance of any of the above opinions to anybody's official position is completely coincidental. Muriel Strand, P.E. Air Resources Engineer CA Air Resources Board 2020 L Street Sacramento, CA 59814 916-324-9661 916-327-8524 (fax) www.arb.ca.gov
FACTOR ANALYSIS
Hi,

I have a question related to factor analysis. If a questionnaire item was found to load significantly on more than one factor, and we assume that each factor represents a potential measurement scale for a particular construct, should I retain the same item for both factors (scales), i.e. should that same item be included in both measurement scales? Or should I take the item's highest loading as deciding which factor it belongs to?

Cheers.
Student Awards for STATISTICS AND HEALTH CONFERENCE
The Biostatistics Research Group of the University of Alberta is pleased to announce that there will be several travel supplements awarded to students presenting at the Statistics in Health Conference in Edmonton, June 11-13, 2000.

These student awards are funded through the Institute of Health Economics (Alberta), the Biostatistics Section of the Statistical Society of Canada, and the Biometrics Section of the American Statistical Association. All students who present contributed papers are eligible to apply. Please note the deadline of February 1, 2000 for submission of abstracts to the Statistics and Health conference: http://www.stat.ualberta.ca/~brg/conf.html

Some limitations apply. The award amount will be determined on an individual basis, to a maximum of CD$500 per student. Interested students are asked to apply before April 15, 2000. The details of the terms of the awards will be posted on the above web site shortly.

===
K.C. Carriere
Associate Professor of Statistics
Department of Mathematical Sciences
University of Alberta       (tel) 780-492-4230
Edmonton, AB T6G 2G1        (fax) 780-492-6826
Home: http://www.math.ualberta.ca/~kcarrier/kcarrier.html
Visit: http://www.stat.ualberta.ca/~brg
===
Squared Multiple correlation
Can someone please tell me how to calculate the SMC (Squared Multiple Correlation) in a factor analysis (SPSS)? I am not sure, but could it be the diagonal of the factor transformation matrix? Thanks.
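For reference, the conventional definition of the SMC of each item on all the others takes it from the inverse of the item correlation matrix R, not from the factor transformation matrix: SMC_i = 1 - 1/(R^-1)_ii. A sketch with a small made-up correlation matrix:

```python
# Squared multiple correlations from the inverse correlation matrix.
import numpy as np

def smc(R):
    """SMC of each variable regressed on all the others."""
    return 1 - 1 / np.diag(np.linalg.inv(np.asarray(R)))

# Hypothetical 3-item correlation matrix (positive definite).
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
print(smc(R).round(3))
```

These are the values often used as initial communality estimates in a principal-axis factoring run.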
Fellowship
FELLOWSHIPS IN REHABILITATION OUTCOMES RESEARCH.

The UMDNJ/New Jersey Medical School, Department of Physical Medicine and Rehabilitation, and the Kessler Medical Rehabilitation Research & Education Corp. announce a 1-2 year advanced research training program for professionals interested in the objective and subjective outcomes experienced by persons with physical or neurological disabilities, and in the factors that affect these outcomes.

Medical rehabilitation outcomes research encompasses research on prognosis, measurement of function and health, treatment guidelines, outcomes management strategies, disability economics, and issues of health policy. Controlled research on the effectiveness and costs of interventions is stressed. Statistical and methodological skills are stressed in our program (without excluding qualitative investigations, which are necessary to explore certain topics). The training program emphasizes the actual conduct of research, including writing fundable research proposals and publications. Fellows improve their research skills and knowledge of clinical rehabilitation during the program.

The training program is based on an individualized Research and Training Plan written by each Fellow in collaboration with a primary mentor and secondary mentors. Mentors may be chosen from among many researchers throughout New Jersey, with special strength in neuropsychology, physiatry, PT, general outcomes methodology, traumatic brain injury, spinal cord injury, and other neurological disabilities. Both pre-doctoral (dissertation level) and post-doctoral positions are currently available.

For substantive information, contact: Mark Johnston, PhD, (973) 243-6810, Project Director, [EMAIL PROTECTED]
For application forms and instructions, contact: Heidi Castillo at (973) 243-6971, [EMAIL PROTECTED], or

--
Scott R. Millis, PhD, ABPP
Kessler Medical Rehabilitation Research & Education Corp
1199 Pleasant Valley Way
West Orange, New Jersey 07052
Tel: 973.243.6976
Fax: 973.243.6990
Emails: [EMAIL PROTECTED] [EMAIL PROTECTED]
[Q : Test bivariate normal distribution?]
Dear Members fo News Group, I always appreciate that I could have received your help. As I know, I can apply Kolmogorov-Smirnov goodness-of-fit test to univariate sample. But, I don't know which method can be applied to multivariate samples, especially, when I got the samples assumed to be bivariate normal distributions. Please answer me . Thanks in advances. With my best regards, D.W. Ryu * Sent from RemarQ http://www.remarq.com The Internet's Discussion Network * The fastest and easiest way to search and participate in Usenet - Free!