Re: [R] help please: put output into dataframe

David Winsemius Fri, 18 Mar 2011 08:50:29 -0700


On Mar 18, 2011, at 10:53 AM, Ram H. Sharma wrote:

Thanks, Jim for the idea.
I tried saving the output as a list. I cannot write it to a table with "write.table", and I could not find a function such as write.list or an equivalent. Even if it is a
list, I think it would be more difficult to post-process than a table.

outx<- as.list(apply(datafr1, 2, fout))
write.table (outx, "outlier.csv", sep=",")

Use `dump` to save it as an R object that can later be `source()`-ed, which is what I think you want, or `capture.output` to save the text representation you would see at the console, which would be difficult to restore as an R object.
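
A minimal sketch of both routes, for illustration only. The two-column data, the seed, the temporary file names and the `restored` environment are assumptions made for the example; `fout` is Dennis's function from further down the thread.

```r
# Illustrative data (assumed; the thread's real data has >2000 variables).
set.seed(1)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4), var2 = rnorm(500, 20, 8))

# Dennis's outlier function from later in this thread.
fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}
outx <- lapply(datafr1, fout)   # a named list with elements of differing length

# Route 1: dump() writes R code that recreates the object when source()-ed.
dump_file <- tempfile(fileext = ".R")
dump("outx", file = dump_file)
restored <- new.env()             # hypothetical holder for the restored copy
source(dump_file, local = restored)
all.equal(restored$outx, outx)    # compare the restored list with the original

# Route 2: capture.output() saves what print() would show at the console.
txt_file <- tempfile(fileext = ".txt")
capture.output(print(outx), file = txt_file)
head(readLines(txt_file), 3)      # plain text only, not an R object
```

The text file from the second route is easy to read by eye but has to be parsed by hand if the values are needed again, which is the difficulty mentioned above.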


--
David.

Ram

On Fri, Mar 18, 2011 at 10:04 AM, jim holtman <jholt...@gmail.com> wrote:

I think it was suggested that you save your output to a 'list' and
then you will have it in a format that can accept variable numbers of
items in each element and it is also in a form that you can easily
process it to create whatever other output you might need.
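
A small sketch of what working with such a list might look like. The seed and the names `outlist` and `n_extreme` are assumptions for the example; `lapply` is used here instead of `apply` because it always returns a list, whereas `apply` returns a matrix if every column happens to give the same number of values.

```r
# Data as in the original question (the seed is an arbitrary assumption).
set.seed(42)
datafr1 <- data.frame(
  var1 = rnorm(500, 10, 4),
  var2 = rnorm(500, 20, 8),
  var3 = rnorm(500, 30, 18),
  var4 = rnorm(500, 40, 20)
)

# Dennis's function: values outside median +/- 2 * MAD.
fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}

# One list element per variable; the elements may differ in length.
outlist <- lapply(datafr1, fout)

# Number of extreme values found for each variable.
n_extreme <- lengths(outlist)
print(n_extreme)

# Individual variables are reachable by name for further processing.
summary(outlist[["var3"]])
```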

On Fri, Mar 18, 2011 at 7:24 AM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:

Hi Dennis and R-users
Thank you for more help. I am pretty close, but the remaining challenge is
forcing output of differing lengths into an output data frame.
x <- data.frame(apply(datafr1, 2, fout))
Error in data.frame(var1 = c(-0.70777998321315, 0.418602152926712,
2.08356737154810,  :
arguments imply differing number of rows: 28, 12, 20, 19
As I need to work with >2000 variables, my intention here is to save this output in such a way that it can be manipulated further. The top line is to save, in a data frame, the extreme values for the variable concerned, and the bottom line is to automate saving the output printed on the screen to a text file.
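
One possible way to get such a data frame, sketched under assumptions: `stack()` turns a named list into a "long" data frame with one row per value, so differing lengths are no longer a problem. The seed, the column names `value` and `variable`, and the temporary file name are made up for the example.

```r
# Illustrative data (assumed seed; two variables keep the example short).
set.seed(42)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4), var2 = rnorm(500, 20, 8))

fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}
outlist <- lapply(datafr1, fout)

# stack() gives two columns: the values and the name of the list element.
long <- stack(outlist)
names(long) <- c("value", "variable")   # hypothetical, more descriptive names

# A long data frame can be written with the usual table functions.
csv_file <- tempfile(fileext = ".csv")
write.csv(long, csv_file, row.names = FALSE)

# Reading it back gives one row per extreme value.
back <- read.csv(csv_file)
table(back$variable)
```

The long layout keeps every extreme value next to the variable it came from, so later filtering or summarising by variable needs no padding with NA.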

Thank you for help once again.

Ram


On Fri, Mar 18, 2011 at 3:16 AM, Dennis Murphy <djmu...@gmail.com> wrote:

Hi:

Is this what you're after?

fout <- function(x) {
    lim <- median(x) + c(-2, 2) * mad(x)
    x[x < lim[1] | x > lim[2]]
  }
apply(datafr1, 2, fout)
$var1
 [1] 17.5462078 18.4548214  0.7083442  1.9207578 -1.2296787 17.4948240
 [7] 19.5702558  1.6181150 20.9791652 -1.3542099  1.8215087 -1.0296303
[13] 20.5237930 17.5366497 18.5657566  0.9335419 19.7519983 17.8607968
[19] 19.1307524 19.6145711 21.8037136 19.1532175 -2.6688409 19.6949309
[25]  1.9712347

$var2
 [1]  37.3822087  35.6490641  35.6000785  38.5981086  -1.6504275  37.1419290
 [7]  37.7605230  40.3508689   0.6639900   2.4695841  38.8209491  39.9087921
[13]  38.9907585  35.8279437   2.7870799  37.0941113   0.6308583  36.4556638
[19] -10.2384849   2.8480199  -7.7680457  35.7076539  -0.5467739   3.4702765
[25]  40.4818580   3.2864273   1.4917174

$var3
 [1]  74.252563  68.396391  68.845461  -5.006545  66.083402  76.036577
 [7]  75.112586  -6.374241  63.883549  64.041216 -19.764360 -15.051017
[13]  -9.782767  64.696013  70.970648  -4.562031 -22.135003  70.549310
[19]  69.495915  -4.095587  86.612375  87.029526  70.072126  -6.421695
[25]  65.737536

$var4
 [1]  81.476483  87.098767 -10.451616  91.927329  86.588952  85.080950
 [7]  84.958645  -9.456368  86.270876 -22.936779  83.314032

Double checks:
apply(datafr1, 2, function(x) median(x) + c(-2, 2) * mad(x))
        var1      var2      var3      var4
[1,]  2.12167  3.779415 -3.736066 -3.471752
[2,] 17.37176 34.929800 62.969733 80.224799
apply(datafr1, 2, range)
         var1      var2      var3      var4
[1,] -2.668841 -10.23848 -22.13500 -22.93678
[2,] 21.803714  40.48186  87.02953  91.92733
Assuming you wanted to do this columnwise (by variable), it appears to be
doing the right thing.

HTH,
Dennis


On Thu, Mar 17, 2011 at 7:04 PM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:

Dear R community members

I have been struggling with this simple question but have never found an
appropriate solution, so please help.

# my data, though I have a large number of variables
var1 <- rnorm(500, 10,4)
var2 <- rnorm(500, 20, 8)
var3 <- rnorm(500, 30, 18)
var4 <- rnorm(500, 40, 20)
datafr1 <- data.frame(var1, var2, var3, var4)

# my unsuccessful codes
nvar <- ncol(datafr1)
for (i in 1:nvar) {
            out1 <- NULL
            out2 <- NULL
            medianx <- median(getdata[,i], na.rm = TRUE)
            show(madx <- mad(getdata[,i], na.rm = TRUE))
            MD1 <- c(medianx + 2*madx)
            MD2 <- c(medianx - 2*madx)

            out1[i] <- which(getdata[,i] > MD1) # store data that are greater than median + 2 mad
            out2[i] <- which (getdata[,1] < MD2) # store data that are less than median - 2 mad
           resultdf <- data.frame(out1, out2)
           write.table (resultdf, "out.csv", sep=",")
            }


My idea here is to store those values which are either greater than
median + 2*MAD or less than median - 2*MAD. Each variable has a different
length of output.

The following is the last error message:
Error in data.frame(out1, out2) :
arguments imply differing number of rows: 2, 0
In addition: Warning messages:
1: In out1[i] <- which(getdata[, i] > MD1) :
number of items to replace is not a multiple of replacement length
2: In out2[i] <- which(getdata[, 1] < MD2) :
number of items to replace is not a multiple of replacement length
3: In out1[i] <- which(getdata[, i] > MD1) :
number of items to replace is not a multiple of replacement length
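
These warnings arise because `out1[i]` refers to a single slot while `which()` returns several indices at once. A hedged sketch of how the loop could be restructured, offered as one possibility and not the poster's final solution: the storage becomes a list created once before the loop, `[[i]]` is used for assignment, and the data frame from the question (`datafr1`) is used throughout instead of `getdata`. The seed and the two-column data are assumptions for the example.

```r
# Illustrative data (assumed seed; only two variables for brevity).
set.seed(42)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4), var2 = rnorm(500, 20, 8))

nvar <- ncol(datafr1)

# Create the containers once, outside the loop, so that results from
# earlier iterations are not overwritten.
out1 <- vector("list", nvar)
out2 <- vector("list", nvar)
names(out1) <- names(out2) <- names(datafr1)

for (i in seq_len(nvar)) {
  x <- datafr1[, i]
  medianx <- median(x, na.rm = TRUE)
  madx <- mad(x, na.rm = TRUE)
  # [[i]] stores a whole vector of row indices in list element i.
  out1[[i]] <- which(x > medianx + 2 * madx)  # rows above median + 2 MAD
  out2[[i]] <- which(x < medianx - 2 * madx)  # rows below median - 2 MAD
}

# Number of rows flagged per variable on each side.
lengths(out1)
lengths(out2)
```

Note that `which()` stores row positions, as in the original loop; `x[out1[[i]]]` recovers the values themselves.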

Thank you in advance for helping me.

Best regards;
RHS

      [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?




David Winsemius, MD
West Hartford, CT

