I am analysing a large data frame and would like to repeat the same analysis on each subset defined by a grouping variable. How can I do that?
Here is some example code for clarification:

require("flexmix")  # for Kullback-Leibler divergence

n <- 23
groups <- c(1,2,3)
mydata <- data.frame(
  sequence=c(1:n),
  data1=c(rnorm(n)),
  data2=c(rnorm(n)),
  group=rep(sample(groups, n, replace=TRUE))
)

# Part 1: full stats (works fine)
dataOnly <- cbind(mydata$data1, mydata$data2, mydata$group)
KLdiv(dataOnly)

#
# Part 2: again - but once for each group (error)
#
by(dataOnly, groups, KLdiv(dataOnly))

The error I am getting is:

Error in tapply(1L:23L, list(INDICES = c(1, 2, 3)), function (x) :
  arguments must have same length

Are there better ways than 'by'? I would like to use different statistics and functions, so I am looking for a splitter whose output I can hand to any statistical function I want.

Any ideas?

Ralf
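
For reference, a minimal sketch of one possible splitter, under stated assumptions. The call in Part 2 fails because groups has 3 entries while the data has 23 rows, and because KLdiv(dataOnly) is the result of a call rather than a function. Passing the per-row column mydata$group and a function avoids both problems. The sketch uses only base R: cor() is a stand-in chosen for illustration (any function that accepts a matrix, such as KLdiv, could be substituted), and set.seed(1) plus the names pieces, perGroup and perGroupBy are hypothetical additions that do not appear in the original post.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

n <- 23
groups <- c(1, 2, 3)
mydata <- data.frame(
  sequence = 1:n,
  data1    = rnorm(n),
  data2    = rnorm(n),
  group    = sample(groups, n, replace = TRUE)
)

# split() returns a named list with one data frame per group value.
# The grouping vector must have one entry per row (length n, not 3).
pieces <- split(mydata[, c("data1", "data2")], mydata$group)

# Hand each piece to any function; cor() stands in for KLdiv() here.
perGroup <- lapply(pieces, function(d) cor(as.matrix(d)))

# by() works the same way once INDICES has one entry per row
# and FUN is a function rather than an already evaluated call.
perGroupBy <- by(mydata[, c("data1", "data2")], mydata$group,
                 function(d) cor(as.matrix(d)))
```

Because split() only partitions the rows, the resulting list can be passed to lapply() or sapply() with whatever statistic is needed, which keeps the splitting step independent of the analysis step.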