[R] Removing empty (or very underpopulated) sub-populations

Kamil Sijko Fri, 16 Apr 2010 07:28:45 -0700

Hi,

I'm trying to develop a function that will simplify the most common analyses
in my area of interest (social sciences) by computing all required
statistics in one run (for example, in the case of a factor and a numeric
variable: 1) a normality test, then, if the variable is normal in each group,
2) ANOVA, 3) with effect-size estimation and an appropriate graph).
I test normality in each group with this code:


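# Test normality separately within each level of the grouping factor
group <- as.factor(group)
group.levels <- levels(factor(group))  # factor() drops unused levels
are.normal <- logical(length(group.levels))
for (i in seq_along(group.levels)) {
  are.normal[i] <- normality(response[group == group.levels[i]])
}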

where: 1) response is the response (numeric variable), 2) group is the
grouping variable (factor), 3) normality is a function which takes one
variable as its argument and then tries to figure out whether it's normal
(TRUE) or not (FALSE).

My problem is that sometimes some response~group combinations produce
empty or very underpopulated groups (e.g. when you examine the relation
between country of origin and age of respondents, and it turns out that
you have only one guy from some country). This causes my function to
fail.
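
The failure described above can be reproduced with a small made-up data set.
This is only a sketch: the real normality() is not shown in this post, so the
version below is a hypothetical stand-in that assumes it wraps shapiro.test(),
and the country codes, ages and alpha level are invented for illustration.

```r
# Hypothetical stand-in for normality(); the real function is not shown.
# Assumption: it runs shapiro.test() and compares the p-value to alpha.
normality <- function(x, alpha = 0.05) {
  shapiro.test(x)$p.value > alpha
}

# Made-up data: ages of respondents from three countries,
# with only a single respondent from "FR"
set.seed(1)
response <- c(rnorm(10, mean = 30, sd = 5), rnorm(10, mean = 40, sd = 5), 52)
group <- factor(c(rep("PL", 10), rep("DE", 10), "FR"))

# Number of observations per group
group_sizes <- table(group)

# shapiro.test() needs between 3 and 5000 non-missing values, so the
# per-group test stops with an error when it reaches the one-person group.
# tryCatch() captures the error message instead of aborting the script.
result <- tryCatch(
  tapply(response, group, normality),
  error = function(e) conditionMessage(e)
)

print(group_sizes)
print(result)
```

With this stand-in, a single tiny group is enough to stop the whole run, which
matches the behaviour described above.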

I've been wondering whether there is some way to exclude those underpopulated
groups from the analysis?
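
For illustration, one possible way to do this is to count the observations per
group with table() and keep only the groups that reach a minimum size. This is
a rough sketch under my own assumptions, not a tested solution: the helper
name drop_small_groups and the threshold min_n are hypothetical, and the data
are made up.

```r
# Hypothetical helper: keep only observations whose group has at least
# min_n members. min_n = 3 is an arbitrary illustrative threshold.
drop_small_groups <- function(response, group, min_n = 3) {
  group <- factor(group)
  sizes <- table(group)                        # observations per level
  keep_levels <- names(sizes)[sizes >= min_n]  # levels that are large enough
  keep <- group %in% keep_levels               # FALSE for small groups and NA
  list(
    response = response[keep],
    group = droplevels(group[keep])            # remove the now-empty levels
  )
}

# Made-up example: five respondents from PL, three from DE, one from FR
response <- c(21, 25, 30, 34, 41, 28, 33, 37, 52)
group <- c("PL", "PL", "PL", "PL", "PL", "DE", "DE", "DE", "FR")

cleaned <- drop_small_groups(response, group)
print(table(cleaned$group))
```

The droplevels() call matters here: without it the removed groups would still
be listed as (empty) levels of the factor, and a loop over levels(group)
would still visit them.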

Best regards,
Kamil Sijko


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
