The easiest thing is to use 'save', which writes the object out in R's binary format. If you don't need a text file, then save/load is the way to store the data and read it back later.
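A minimal sketch of that round trip, reusing the 'fout' function and the simulated 'datafr1' quoted further down in this thread. The seed, the file names, and the use of tempdir() are assumptions made only for illustration. lapply() is used in place of apply() because it always returns a list, whereas apply() returns a matrix whenever every column happens to yield the same number of values. The stack() step at the end is one possible route if a text file turns out to be needed after all.

```r
# Simulated data as in the original question (the seed is an assumption)
set.seed(1)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4),
                      var2 = rnorm(500, 20, 8),
                      var3 = rnorm(500, 30, 18),
                      var4 = rnorm(500, 40, 20))

# Outlier function posted by Dennis earlier in the thread
fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}

# One list element per column; elements may have different lengths
outx <- lapply(datafr1, fout)

# Binary round trip: save() writes the object, load() restores it by name
rdata_file <- file.path(tempdir(), "outliers.RData")  # hypothetical file name
save(outx, file = rdata_file)
rm(outx)
load(rdata_file)  # 'outx' is available again

# Optional text output: stack() turns the named list into a long
# two-column data frame (values, ind) that write.csv() can handle
out_long <- stack(outx)
csv_file <- file.path(tempdir(), "outlier.csv")       # hypothetical file name
write.csv(out_long, csv_file, row.names = FALSE)
```

With more than 2000 variables the long format stays at two columns, so it avoids the "differing number of rows" error that data.frame() raises on the raw list.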
On Fri, Mar 18, 2011 at 10:53 AM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:
> Thanks, Jim for the idea.
>
> I tried with save as list. I can not write to a table with "write.table", I
> could not find a function that is write.list or equivalent. Even if it is
> list I think it would be difficult to post-processing than as table.
>
> outx<- as.list(apply(datafr1, 2, fout))
> write.table (outx, "outlier.csv", sep=",")
>
> Ram
>
>
> On Fri, Mar 18, 2011 at 10:04 AM, jim holtman <jholt...@gmail.com> wrote:
>>
>> I think it was suggested that you save your output to a 'list' and
>> then you will have it in a format that can accept variable numbers of
>> items in each element and it is also in a form that you can easily
>> process it to create whatever other output you might need.
>>
>> On Fri, Mar 18, 2011 at 7:24 AM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:
>> > Hi Dennis and R-users
>> >
>> > Thank you for more help. I am pretty close, but challenge still remain is
>> > forcing the output with different length to output dataframe.
>> >
>> > > x <- data.frame(apply(datafr1, 2, fout))
>> > Error in data.frame(var1 = c(-0.70777998321315, 0.418602152926712, 2.08356737154810, :
>> >   arguments imply differing number of rows: 28, 12, 20, 19
>> >
>> > As I need to work with >2000 variables, my intension here is to save this
>> > output to such way that it would be further manipulated. Topline is to save
>> > in dataframe that have extreme values for the variable concerned and
>> > bottomline is automate to save the output printed in the screen to a
>> > textfile.
>> >
>> > Thank you for help once again.
>> >
>> > Ram
>> >
>> >
>> > On Fri, Mar 18, 2011 at 3:16 AM, Dennis Murphy <djmu...@gmail.com> wrote:
>> >
>> >> Hi:
>> >>
>> >> Is this what you're after?
>> >>
>> >> fout <- function(x) {
>> >>     lim <- median(x) + c(-2, 2) * mad(x)
>> >>     x[x < lim[1] | x > lim[2]]
>> >> }
>> >> > apply(datafr1, 2, fout)
>> >> $var1
>> >>  [1] 17.5462078 18.4548214  0.7083442  1.9207578 -1.2296787 17.4948240
>> >>  [7] 19.5702558  1.6181150 20.9791652 -1.3542099  1.8215087 -1.0296303
>> >> [13] 20.5237930 17.5366497 18.5657566  0.9335419 19.7519983 17.8607968
>> >> [19] 19.1307524 19.6145711 21.8037136 19.1532175 -2.6688409 19.6949309
>> >> [25]  1.9712347
>> >>
>> >> $var2
>> >>  [1]  37.3822087  35.6490641  35.6000785  38.5981086  -1.6504275  37.1419290
>> >>  [7]  37.7605230  40.3508689   0.6639900   2.4695841  38.8209491  39.9087921
>> >> [13]  38.9907585  35.8279437   2.7870799  37.0941113   0.6308583  36.4556638
>> >> [19] -10.2384849   2.8480199  -7.7680457  35.7076539  -0.5467739   3.4702765
>> >> [25]  40.4818580   3.2864273   1.4917174
>> >>
>> >> $var3
>> >>  [1]  74.252563  68.396391  68.845461  -5.006545  66.083402  76.036577
>> >>  [7]  75.112586  -6.374241  63.883549  64.041216 -19.764360 -15.051017
>> >> [13]  -9.782767  64.696013  70.970648  -4.562031 -22.135003  70.549310
>> >> [19]  69.495915  -4.095587  86.612375  87.029526  70.072126  -6.421695
>> >> [25]  65.737536
>> >>
>> >> $var4
>> >> [1]  81.476483  87.098767 -10.451616  91.927329  86.588952  85.080950
>> >> [7]  84.958645  -9.456368  86.270876 -22.936779  83.314032
>> >>
>> >> Double checks:
>> >> > apply(datafr1, 2, function(x) median(x) + c(-2, 2) * mad(x))
>> >>          var1      var2      var3      var4
>> >> [1,]  2.12167  3.779415 -3.736066 -3.471752
>> >> [2,] 17.37176 34.929800 62.969733 80.224799
>> >> > apply(datafr1, 2, range)
>> >>           var1      var2      var3      var4
>> >> [1,] -2.668841 -10.23848 -22.13500 -22.93678
>> >> [2,] 21.803714  40.48186  87.02953  91.92733
>> >>
>> >> Assuming you wanted to do this columnwise (by variable), it appears to be
>> >> doing the right thing.
>> >>
>> >> HTH,
>> >> Dennis
>> >>
>> >>
>> >> On Thu, Mar 17, 2011 at 7:04 PM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:
>> >>
>> >>> Dear R community members
>> >>>
>> >>> I have been struggling on this simple question, but never get appropriate
>> >>> solution. So please help.
>> >>>
>> >>> # my data, though I have a large number of variables
>> >>> var1 <- rnorm(500, 10,4)
>> >>> var2 <- rnorm(500, 20, 8)
>> >>> var3 <- rnorm(500, 30, 18)
>> >>> var4 <- rnorm(500, 40, 20)
>> >>> datafr1 <- data.frame(var1, var2, var3, var4)
>> >>>
>> >>> # my unsuccessful codes
>> >>> nvar <- ncol(datafr1)
>> >>> for (i in 1:nvar) {
>> >>>     out1 <- NULL
>> >>>     out2 <- NULL
>> >>>     medianx <- median(getdata[,i], na.rm = TRUE)
>> >>>     show(madx <- mad(getdata[,i], na.rm = TRUE))
>> >>>     MD1 <- c(medianx + 2*madx)
>> >>>     MD2 <- c(medianx - 2*madx)
>> >>>     out1[i] <- which(getdata[,i] > MD1) # store data that are greater than median + 2 mad
>> >>>     out2[i] <- which (getdata[,1] < MD2) # store data that are greater than median - 2 mad
>> >>>     resultdf <- data.frame(out1, out2)
>> >>>     write.table (resultdf, "out.csv", sep=",")
>> >>> }
>> >>>
>> >>>
>> >>> My idea here is to store those value which are either greater than median +
>> >>> 2 *MAD or less than median - 2*MAD. Each variable have different length of
>> >>> output.
>> >>>
>> >>> The following last error message:
>> >>> Error in data.frame(out1, out2) :
>> >>>   arguments imply differing number of rows: 2, 0
>> >>> In addition: Warning messages:
>> >>> 1: In out1[i] <- which(getdata[, i] > MD1) :
>> >>>   number of items to replace is not a multiple of replacement length
>> >>> 2: In out2[i] <- which(getdata[, 1] < MD2) :
>> >>>   number of items to replace is not a multiple of replacement length
>> >>> 3: In out1[i] <- which(getdata[, i] > MD1) :
>> >>>   number of items to replace is not a multiple of replacement length
>> >>>
>> >>> Thank you in advance for helping me.
>> >>>
>> >>> Best regards;
>> >>> RHS
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-help@r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?