Hi Dennis and R-users,

Thank you for more help. I am pretty close, but the remaining challenge is forcing output of differing lengths into a single output data frame.

> x <- data.frame(apply(datafr1, 2, fout))
Error in data.frame(var1 = c(-0.70777998321315, 0.418602152926712, 2.08356737154810, :
  arguments imply differing number of rows: 28, 12, 20, 19

As I need to work with >2000 variables, my intention here is to save this output in such a way that it can be manipulated further. The main goal is to save the extreme values for each variable in a data frame; at the very least, I would like to automate saving the output printed on the screen to a text file.

Thank you for help once again.

Ram

On Fri, Mar 18, 2011 at 3:16 AM, Dennis Murphy <djmu...@gmail.com> wrote:

> Hi:
>
> Is this what you're after?
>
> fout <- function(x) {
>     lim <- median(x) + c(-2, 2) * mad(x)
>     x[x < lim[1] | x > lim[2]]
> }
>
> apply(datafr1, 2, fout)
> $var1
>  [1] 17.5462078 18.4548214  0.7083442  1.9207578 -1.2296787 17.4948240
>  [7] 19.5702558  1.6181150 20.9791652 -1.3542099  1.8215087 -1.0296303
> [13] 20.5237930 17.5366497 18.5657566  0.9335419 19.7519983 17.8607968
> [19] 19.1307524 19.6145711 21.8037136 19.1532175 -2.6688409 19.6949309
> [25]  1.9712347
>
> $var2
>  [1]  37.3822087  35.6490641  35.6000785  38.5981086  -1.6504275  37.1419290
>  [7]  37.7605230  40.3508689   0.6639900   2.4695841  38.8209491  39.9087921
> [13]  38.9907585  35.8279437   2.7870799  37.0941113   0.6308583  36.4556638
> [19] -10.2384849   2.8480199  -7.7680457  35.7076539  -0.5467739   3.4702765
> [25]  40.4818580   3.2864273   1.4917174
>
> $var3
>  [1]  74.252563  68.396391  68.845461  -5.006545  66.083402  76.036577
>  [7]  75.112586  -6.374241  63.883549  64.041216 -19.764360 -15.051017
> [13]  -9.782767  64.696013  70.970648  -4.562031 -22.135003  70.549310
> [19]  69.495915  -4.095587  86.612375  87.029526  70.072126  -6.421695
> [25]  65.737536
>
> $var4
>  [1]  81.476483  87.098767 -10.451616  91.927329  86.588952  85.080950
>  [7]  84.958645  -9.456368  86.270876 -22.936779  83.314032
>
> Double checks:
>
> apply(datafr1, 2, function(x) median(x) + c(-2, 2) * mad(x))
>          var1      var2      var3      var4
> [1,]  2.12167  3.779415 -3.736066 -3.471752
> [2,] 17.37176 34.929800 62.969733 80.224799
>
> apply(datafr1, 2, range)
>           var1      var2      var3      var4
> [1,] -2.668841 -10.23848 -22.13500 -22.93678
> [2,] 21.803714  40.48186  87.02953  91.92733
>
> Assuming you wanted to do this columnwise (by variable), it appears to be
> doing the right thing.
>
> HTH,
> Dennis
>
>
> On Thu, Mar 17, 2011 at 7:04 PM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:
>
>> Dear R community members
>>
>> I have been struggling with this simple question, but have never found an
>> appropriate solution. So please help.
>>
>> # my data, though I have a large number of variables
>> var1 <- rnorm(500, 10,4)
>> var2 <- rnorm(500, 20, 8)
>> var3 <- rnorm(500, 30, 18)
>> var4 <- rnorm(500, 40, 20)
>> datafr1 <- data.frame(var1, var2, var3, var4)
>>
>> # my unsuccessful codes
>> nvar <- ncol(datafr1)
>> for (i in 1:nvar) {
>>     out1 <- NULL
>>     out2 <- NULL
>>     medianx <- median(getdata[,i], na.rm = TRUE)
>>     show(madx <- mad(getdata[,i], na.rm = TRUE))
>>     MD1 <- c(medianx + 2*madx)
>>     MD2 <- c(medianx - 2*madx)
>>     out1[i] <- which(getdata[,i] > MD1) # store data that are greater than median + 2 mad
>>     out2[i] <- which (getdata[,1] < MD2) # store data that are greater than median - 2 mad
>>     resultdf <- data.frame(out1, out2)
>>     write.table (resultdf, "out.csv", sep=",")
>> }
>>
>>
>> My idea here is to store those values which are either greater than
>> median + 2*MAD or less than median - 2*MAD. Each variable has a different
>> length of output.
>>
>> The following last error message:
>> Error in data.frame(out1, out2) :
>>   arguments imply differing number of rows: 2, 0
>> In addition: Warning messages:
>> 1: In out1[i] <- which(getdata[, i] > MD1) :
>>   number of items to replace is not a multiple of replacement length
>> 2: In out2[i] <- which(getdata[, 1] < MD2) :
>>   number of items to replace is not a multiple of replacement length
>> 3: In out1[i] <- which(getdata[, i] > MD1) :
>>   number of items to replace is not a multiple of replacement length
>>
>> Thank you in advance for helping me.
>>
>> Best regards;
>> RHS
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
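
On the "differing number of rows" error at the top of this message: one possible way around it, sketched here under assumptions rather than offered as the definitive answer, is to keep the per-variable extreme values as a list and reshape that list into a long data frame, with one row per extreme value and a column naming the variable it came from. The sketch reuses the simulated data from the original post and Dennis's fout(). The seed, the object names outl and long, and the two output file names are illustrative only and do not come from the thread.

```r
# Simulated data as in the original post (seed added only for reproducibility)
set.seed(1)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4),
                      var2 = rnorm(500, 20, 8),
                      var3 = rnorm(500, 30, 18),
                      var4 = rnorm(500, 40, 20))

# Dennis's function: values outside median +/- 2 * MAD
fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}

# lapply() always returns a list, so unequal lengths per variable are fine
outl <- lapply(datafr1, fout)

# Long format: one row per extreme value, plus the name of its variable
long <- stack(outl)
names(long) <- c("value", "variable")

# Save the data frame for further manipulation (hypothetical file name)
write.csv(long, "outliers_long.csv", row.names = FALSE)

# Save the list as it would print on screen (hypothetical file name)
capture.output(print(outl), file = "outliers.txt")
```

The long layout avoids padding shorter columns with NA, and split(long$value, long$variable) recovers the per-variable vectors when needed. With more than 2000 variables the same calls apply unchanged, since nothing in the sketch depends on the number of columns.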