I tried
> lapply(split(Clean,list(Clean$TERM,Clean$INST_NUM)),function(x)
shapiro.test(x$GRADE))
and I got
>Error in shapiro.test(x$GRADE.) : sample size must be between 3 and 5000
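
A possible explanation for this error, offered as a sketch rather than a confirmed diagnosis: when split() is given a list of factors it builds every TERM x INST_NUM combination, including combinations with no rows (by default drop = FALSE), and shapiro.test() then fails on the empty groups. This is an assumption about Clean, namely that not every instructor taught in every term. The data frame `demo` below is made up for illustration and only borrows the column names from Clean.

```r
# Hypothetical stand-in for Clean: instructor 2 never taught in term 9,
# so the TERM x INST_NUM combination "9.2" has no rows.
set.seed(1)
demo <- data.frame(
  GRADE    = round(rnorm(30, mean = 2.5, sd = 0.8), 2),
  TERM     = factor(rep(c(8, 9, 8), each = 10)),
  INST_NUM = factor(rep(c(1, 1, 2), each = 10))
)

# The default (drop = FALSE) keeps the empty combination ...
groups_all <- split(demo, list(demo$TERM, demo$INST_NUM))
sapply(groups_all, nrow)

# ... while drop = TRUE keeps only combinations that occur in the data.
groups <- split(demo, list(demo$TERM, demo$INST_NUM), drop = TRUE)
tests  <- lapply(groups, function(x) shapiro.test(x$GRADE))

# Collect W and the p-value for each group into one small table.
shapiro_summary <- data.frame(
  group   = names(tests),
  W       = sapply(tests, function(tst) unname(tst$statistic)),
  p.value = sapply(tests, function(tst) tst$p.value),
  row.names = NULL
)
shapiro_summary
```

If the same call on Clean still fails with drop = TRUE, some group really does have fewer than 3 grades and the cause lies elsewhere.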

I also tried
with(Clean, aggregate(GRADE,list(TERM,INST_NUM),FUN=shapiro.test))

and got
  Group.1 Group.2         x
1   201001  689809 0.9546164
2   201201  689809 0.9521624
3   201301  689809 0.9106206
4   200701  994474 0.8862705
5   200710  994474 0.9176743
6   201203 1105752 0.9382688
.
.
.
72  201001 1759272 0.9291295
73  201101 1759272 0.9347072
74  201110 1897809 0.9395375
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

I am not sure how to interpret the output of the second.

Thanks!
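
On reading the aggregate() output: shapiro.test() returns a list of class "htest" with the components statistic, p.value, method and data.name, whereas aggregate() works best when FUN returns a plain vector. A plausible reading, stated as an assumption, is that the x column printed holds the W statistic for each group and that the warning comes from the leftover list structure. The sketch below avoids the problem by returning a numeric vector from FUN. The data frame `demo` and its values are invented for illustration.

```r
# Hypothetical data: two terms and two instructors, 10 grades per combination.
set.seed(2)
demo <- data.frame(
  GRADE    = rnorm(40, mean = 2.5, sd = 0.8),
  TERM     = rep(c(201001, 201101), each = 20),
  INST_NUM = rep(c(689809, 994474), times = 20)
)

# Return only the numeric parts of the htest object, so that aggregate()
# can store them in an ordinary (matrix) column.
res <- aggregate(
  GRADE ~ TERM + INST_NUM,
  data = demo,
  FUN  = function(g) {
    st <- shapiro.test(g)
    c(W = unname(st$statistic), p.value = st$p.value)
  }
)

# The GRADE column is a two-column matrix; flatten it into separate columns.
res <- do.call(data.frame, res)
res
```

Each row then carries W and the p-value for one TERM and INST_NUM combination, which is easier to scan than a list of printed test objects.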


On Tue, Aug 13, 2013 at 11:01 AM, arun <smartpink...@yahoo.com> wrote:

> Hi,
> You could try:
>  lapply(split(Clean,list(Clean$TERM,Clean$INST_NUM)),function(x)
> shapiro.test(x$GRADE))
> A.K.
>
>
>
>
> ----- Original Message -----
> From: Robert Lynch <robert.b.ly...@gmail.com>
> To: r-help@r-project.org
> Cc:
> Sent: Tuesday, August 13, 2013 1:46 PM
> Subject: [R] ave function
>
> I've written the following function
> CoursePrep <- function (Source, SaveName) {
>
>
>   Clean$TERM <- as.factor(Clean$TERM)
>
>   Clean$INST_NUM <- as.factor(Clean$INST_NUM)
>   Clean$zGrade <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN =
> scale))
>   write.csv(Clean,paste(SaveName, "csv", sep ="."), row.names = FALSE)
>   return(Clean)
> }
>
> which is all well and good, but I want to throw a shapiro.test in before I
> normalize. I had help writing the line
> Clean$zGrade <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN = scale))
> and don't fully understand how it does what I wanted. As I understand it,
> for the whole of Clean it finds every set of GRADE. values that share the
> same INST_NUM and TERM, computes the mean, subtracts the mean, and divides
> by the standard deviation. For each of those sets of grades I would like
> to call shapiro.test() to see whether the set is normal *before* I
> assume it is.
>
> I know the naive
> with(Clean, shapiro.test( list(TERM, INST_NUM)))
> doesn't work. I also tried
> with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN =
> function(x)shapiro.test(x)))
>
> which returns
> Error in shapiro.test(x) : sample size must be between 3 and 5000
> even though I have checked that every set selected has a length between
> 3 and 5000, using the following on my full data:
>
> ClassSize <- with(Clean, ave(GRADE., list(TERM, INST_NUM), FUN =
> function(x)length(x)))
> > summary(ClassSize)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    22.0   198.0   241.0   244.4   279.0   466.0
>
> here is some sample data
> GRADE,  TERM,  INST_NUM
> 1,      9,     1
> 2,      9,     1
> 3,      9,     1
> 1.5,    8,     2
> 1.75,   8,     2
> 2,      8,     2
> 0.5,    9,     2
> 2,      9,     2
> 3.5,    9,     2
> 3.5,    8,     1
> 3.75,   8,     1
> 4,      8,     1
>
> and hopefully the code would test the following sets of grades:
> (1,2,3)(1.5,1.75,2)(0.5,2,3.5)(3.5,3.75,4)
>
> Thanks Robert
>
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
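
A note on the ave() attempts in the quoted message, as a sketch under stated assumptions: ave() writes the result of FUN back into a vector as long as its input, so it suits scale() or length() but cannot hold a whole shapiro.test() object. Checking group sizes through ave() can also miss empty TERM x INST_NUM combinations, because they have no rows to write into, while table() shows them. The names `demo`, `safe_p` and `shapiro_p` are hypothetical, and the guard on length is there in case FUN is called on an empty group.

```r
# Hypothetical data with one empty combination (term 9, instructor 2).
set.seed(3)
demo <- data.frame(
  GRADE.   = rnorm(30, mean = 2.5, sd = 0.8),
  TERM     = factor(rep(c(8, 9, 8), each = 10)),
  INST_NUM = factor(rep(c(1, 1, 2), each = 10))
)

# Sizes seen through ave(): one value per existing row, so no zeros appear.
size_per_row <- with(demo, ave(GRADE., list(TERM, INST_NUM), FUN = length))
summary(size_per_row)

# table() counts every combination and does show the empty cell.
cells <- with(demo, table(TERM, INST_NUM))
cells

# Return a single number per group and skip groups that are too small.
safe_p <- function(x) {
  if (length(x) >= 3) shapiro.test(x)$p.value else NA_real_
}

# Each row receives the Shapiro-Wilk p-value of its own group.
demo$shapiro_p <- with(demo, ave(GRADE., list(TERM, INST_NUM), FUN = safe_p))
head(demo)
```

Storing the p-value on each row makes it possible to filter or flag groups before computing zGrade, at the cost of repeating the same number within a group.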
