Re: [R] Writing to a file
Hi,

Try this:

    lst1 <- lapply(1:5, function(i) {
      pdf(paste0(i, ".pdf"))
      hist(rnorm(100), main = paste0("Histogram_", i))
      dev.off()
    })  # you can change the numbers

A.K.

> I'm trying to generate PDFs called 1.pdf, 2.pdf, 3.pdf, etc., and it
> isn't working. My code is:
>
>     x <- 0
>     for(i in 1:1000){
>       x <- x + 1
>       pdf(as.character(x),".pdf") # writes out to pdf
>       for(i in 1:100){
>         hist(rnorm(1)) # graphs histogram, written to the file
>       }
>       dev.off()
>     }
>
> Also, I tried to just do
>
>     a <- 1
>     a
>
> between the pdf() and dev.off() lines, and it wouldn't add it to the
> file, even with a name such as foo.pdf.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
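
For readers who prefer an explicit loop, here is a minimal sketch of the same one-file-per-plot pattern. It assumes a temporary output directory; `out_dir` and `pdf_files` are illustrative names that do not appear in the original posts. The point is that the file name has to be assembled into a single string (here with paste0()) before it is passed to pdf().

```r
# Write five small histograms to 1.pdf ... 5.pdf.
# Assumption: a temporary directory is used; any writable directory would do.
out_dir   <- tempdir()
pdf_files <- file.path(out_dir, paste0(1:5, ".pdf"))

for (i in seq_along(pdf_files)) {
  pdf(pdf_files[i])                                  # open a PDF device for this file
  hist(rnorm(100), main = paste0("Histogram_", i))   # draw onto that device
  dev.off()                                          # close the device so the file is completed
}
```

Each pdf() call is paired with its own dev.off(), so every file is closed before the next one is opened.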
Re: [R] writing data to file
Look up dump().

On Thu, Mar 22, 2012 at 11:35 AM, mail me mailme...@googlemail.com wrote:
> Hi:
>
> I created a data frame
>
>     df <- data.frame( person = c('John','Bob','Mary'),
>                       team = c('a','b','c'),
>                       stringsAsFactors = F);
>
> and obtained the expected output
>
>     df
>       person team
>     1   John    a
>     2    Bob    b
>     3   Mary    c
>
> Now I want to save the whole content of df, preserving its row and
> column order, to a file on disk with the following command:
>
>     write(df, file = "testfile", append=FALSE, sep=" ");
>
> and I get the error message
>
>     Error in cat(list(...), file, sep, fill, labels, append) :
>       argument 1 (type 'list') cannot be handled by 'cat'
>
> Can you help to solve the problem?
>
> Thanks in advance.
> deb
Re: [R] writing data to file
On 2012-03-22 08:35, mail me wrote:
> Hi:
>
> I created a data frame
>
>     df <- data.frame( person = c('John','Bob','Mary'),
>                       team = c('a','b','c'),
>                       stringsAsFactors = F);
>
> and obtained the expected output
>
>     df
>       person team
>     1   John    a
>     2    Bob    b
>     3   Mary    c
>
> Now I want to save the whole content of df, preserving its row and
> column order, to a file on disk with the following command:
>
>     write(df, file = "testfile", append=FALSE, sep=" ");
>
> and I get the error message
>
>     Error in cat(list(...), file, sep, fill, labels, append) :
>       argument 1 (type 'list') cannot be handled by 'cat'
>
> Can you help to solve the problem?
>
> Thanks in advance.
> deb

You're using the wrong function; use write.table() instead. You may want
to set either or both of the arguments 'quote' and 'row.names' to FALSE.

Peter Ehlers
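
A minimal sketch of the write.table() suggestion, using the data frame from the question. The temporary file path and the tab separator are assumptions made for illustration; the original post does not say which separator was wanted.

```r
df <- data.frame(person = c("John", "Bob", "Mary"),
                 team   = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Assumption: a temporary file stands in for the poster's output file.
testfile <- tempfile(fileext = ".txt")

# quote = FALSE and row.names = FALSE keep the file free of quotes and row numbers.
write.table(df, file = testfile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to check that rows and columns kept their order.
back <- read.table(testfile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
```

write() fails here because it hands its argument to cat(), which cannot handle a list; write.table() is built for data frames.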
Re: [R] Writing a .pdf file within a function - what do I need to return()?
See R FAQ 7.22. In short, you need to print() your plot to the graphics
device: just wrap xyplot() in print() and it should work.

Michael

On Tue, Mar 13, 2012 at 3:55 PM, Dgnn sharkbrain...@gmail.com wrote:
> I am trying to write a function that generates one PDF containing plots
> from several .csv files within a directory. When I manually execute the
> code it seems to work, but not when it is a function. I think I need to
> return() something, but haven't had much luck figuring out what/how.
>
>     plot.isi <- function(csv.path="~/project/csv by cell") {
>       csv.files <- grep('.csv', list.files(path = csv.path, full.names=T), value=T)
>       pdf(file='plots/isi plots.pdf', width=10, height=8)
>       #par(mfrow=c(2,1)) #ideally 2 plots per page, but will work on details after fx. works
>       for (i in 1:length(csv.files)){
>         raw.df <- read.csv(csv.files[i])
>         names(raw.df) <- c('t','isi','logic','cond')
>         xyplot(isi ~ t, raw.df, ylim=c(0,1500), ylab='isi', xlab='time',
>                main=basename(csv.files[i]))
>       }
>       dev.off()
>     }
>
> Thank you all for the help,
> Jason Deignan
> --
> View this message in context: http://r.789695.n4.nabble.com/Writing-a-pdf-file-within-a-function-what-do-I-need-to-return-tp4470165p4470165.html
> Sent from the R help mailing list archive at Nabble.com.
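
A sketch of the print() fix described in this message. It assumes the lattice package is installed (it ships with standard R distributions) and uses small made-up data frames in place of the poster's .csv files; `plot_to_pdf` and `toy` are hypothetical names.

```r
library(lattice)  # provides xyplot()

# Hypothetical helper: one PDF, one page per data frame in data_list.
plot_to_pdf <- function(data_list, pdf_file) {
  pdf(file = pdf_file, width = 10, height = 8)
  on.exit(dev.off())                 # close the device even if a plot fails
  for (nm in names(data_list)) {
    p <- xyplot(isi ~ t, data = data_list[[nm]],
                ylab = "isi", xlab = "time", main = nm)
    print(p)                         # lattice objects draw only when printed
  }
  invisible(pdf_file)
}

# Made-up data standing in for the .csv contents.
toy <- list(a = data.frame(t = 1:10, isi = (1:10)^2),
            b = data.frame(t = 1:10, isi = rev(1:10)))
out <- plot_to_pdf(toy, tempfile(fileext = ".pdf"))
```

At the top level R auto-prints the value of xyplot(), which is why the code works when run by hand; inside a function or a for loop nothing is auto-printed, so nothing is drawn.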
Re: [R] Writing to a file
Thank you a lot for answering so fast! But what do you mean by example?
I mentioned the loop I used above, and I also showed what the file looks
like, because it's huge.

The way I read the file is

    x=read.table("filename.txt",header=FALSE,sep="\t",fill=TRUE)
    y=x[1:45,]

(I use only some rows in order to test if it works.)

--
View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364034.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing to a file
Hi

> Thank you a lot for answering so fast! But what do you mean by example?
> I mentioned the loop I used above, and I also showed what the file
> looks like

I do not see any loop. I do not archive all posts from R-help, only
those with interesting answers :-) If you do not keep the context in
future mails, it is lost for those not using Nabble, and it would be
necessary to dig in the R-help archive.

> because it's huge. The way I read the file is
>
>     x=read.table("filename.txt",header=FALSE,sep="\t",fill=TRUE)
>     y=x[1:45,]

Maybe you can use an even smaller fraction for a data example:

    y <- x[1:10,]
    dput(y)

Copying the output from dput() into your mail is the easiest way.

Regards
Petr

> (I use only some rows in order to test if it works.)
> --
> View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364034.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing to a file
Thanks a lot for the interest :)

My loop is the following:

    counter = 0
    for (i in 1:nrow(y)) {
      for (j in 1:ncol(y)) {
        if (y[i,j]=="Func_0005634") { counter = counter + 1 }
        if (y[i,j]=="Func_0005737") { counter = counter + 1 }
        if (y[i,j]=="Func_0005515") { counter = counter + 1 }
      }
      if (counter == 3) {
        cat(y[i,1], file = "foo.csv", "\n")
      }
      counter = 0
    }

and after read.table("foo.csv") I get

      V1
    1 45

which is the last result. Why does it overwrite? How can I have all the
results?

Eager for a reply from you!

--
View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364149.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing to a file
As I said to you a while back, use append = TRUE.

Michael

On Tue, Feb 7, 2012 at 4:18 AM, Felicity felicity...@hotmail.com wrote:
> Thanks a lot for the interest :)
>
> My loop is the following:
>
>     counter = 0
>     for (i in 1:nrow(y)) {
>       for (j in 1:ncol(y)) {
>         if (y[i,j]=="Func_0005634") { counter = counter + 1 }
>         if (y[i,j]=="Func_0005737") { counter = counter + 1 }
>         if (y[i,j]=="Func_0005515") { counter = counter + 1 }
>       }
>       if (counter == 3) {
>         cat(y[i,1], file = "foo.csv", "\n")
>       }
>       counter = 0
>     }
>
> and after read.table("foo.csv") I get
>
>       V1
>     1 45
>
> which is the last result. Why does it overwrite? How can I have all
> the results?
>
> Eager for a reply from you!
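
A small sketch of the append = TRUE advice. The output path is a temporary file chosen for illustration, and `hits` is a hypothetical vector standing in for the rows that pass the counter test; the protein names are the ones shown elsewhere in this thread.

```r
# Assumption: a temporary file replaces the poster's foo.csv.
out_file <- tempfile(fileext = ".csv")
hits <- c("Prot_10035", "Prot_10041", "Prot_10045")

for (h in hits) {
  # Without append = TRUE every call would truncate the file first,
  # leaving only the last value behind.
  cat(h, "\n", file = out_file, sep = "", append = TRUE)
}

result <- readLines(out_file)
```

Because appending never clears the file, remove or overwrite it once before the loop if the script may be run more than once.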
Re: [R] Writing to a file
Hi

Now you omitted data, but never mind :-)

> My loop is the following:
>
>     counter = 0
>     for (i in 1:nrow(y)) {
>       for (j in 1:ncol(y)) {
>         if (y[i,j]=="Func_0005634") { counter = counter + 1 }
>         if (y[i,j]=="Func_0005737") { counter = counter + 1 }
>         if (y[i,j]=="Func_0005515") { counter = counter + 1 }
>       }
>       if (counter == 3) {
>         cat(y[i,1], file = "foo.csv", "\n")
>       }
>       counter = 0
>     }

If I remember correctly, you want to inspect each row to see whether it
contains any of the Func values, and how many of them.

    dput(y)
    structure(list(prot = c(1, 2, 3, 4),
      X1 = structure(c(1L, 1L, 1L, 2L), .Label = c("a", ""), class = "factor"),
      X2 = structure(c(3L, 2L, 3L, 3L), .Label = c("a", "b", ""), class = "factor"),
      X3 = structure(c(3L, 3L, 3L, 2L), .Label = c("b", "c", ""), class = "factor"),
      X4 = structure(c(1L, 1L, 1L, 3L), .Label = c("c", "d", ""), class = "factor"),
      X5 = structure(c(2L, 1L, 1L, 1L), .Label = c("d", ""), class = "factor")),
      .Names = c("prot", "X1", "X2", "X3", "X4", "X5"),
      row.names = c(NA, 4L), class = "data.frame")

So

    rowSums((y=="a") | (y=="b") | (y=="d"))
    1 2 3 4
    1 3 2 1

gives you the number of values ("a", "b", "d") in each row. Your
construction comes from some different programming world.

Regards
Petr

> and after read.table("foo.csv") I get
>
>       V1
>     1 45
>
> which is the last result. Why does it overwrite? How can I have all
> the results?
>
> Eager for a reply from you!
> --
> View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364149.html
> Sent from the R help mailing list archive at Nabble.com.
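
The rowSums() idea can be sketched on data shaped like the protein file. The data frame below is made up for illustration (it is not the poster's data), and `targets`, `funcs` and `n_hits` are hypothetical names.

```r
# Made-up example: protein name in column 1, function IDs in the other columns.
y <- data.frame(
  prot = c("Prot_1", "Prot_2", "Prot_3"),
  X1   = c("Func_0005634", "Func_0005739", "Func_0005515"),
  X2   = c("Func_0005737", "Func_0005515", ""),
  X3   = c("Func_0005515", "", ""),
  stringsAsFactors = FALSE
)
targets <- c("Func_0005634", "Func_0005737", "Func_0005515")

# %in% returns a plain logical vector, so it is reshaped to the original
# rows-by-columns layout before the hits are counted per row.
funcs  <- as.matrix(y[, -1])
n_hits <- rowSums(matrix(funcs %in% targets, nrow = nrow(funcs)))

# Proteins that carry all three target functions.
all_three <- y$prot[n_hits == 3]
```

This replaces both nested loops with one vectorised count per row, which should scale better to the roughly 8500 rows mentioned in the thread.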
Re: [R] Writing to a file
Dear All!!

I am also new to R and am trying to write my results into a file. I am
posting here; hopefully it is the proper place.

To be more specific, I have this loop:

    counter = 0
    for (i in 1:nrow(y)) {
      for (j in 1:ncol(y)) {
        if (y[i,j]=="Func_0005634") { counter = counter + 1 }
        if (y[i,j]=="Func_0005737") { counter = counter + 1 }
        if (y[i,j]=="Func_0005515") { counter = counter + 1 }
      }
      if (counter == 2) {
        k <- structure(list(print(y[i,1])), class = "data.frame")
      }
      if (counter == 3) {
        l <- structure(list(print(y[i,1])), class = "data.frame")
      }
      counter = 0
    }

For counter == 2 or counter == 3 I want to get print(y[i,1]). Column 1
holds the name of the protein, whereas the strings I'm looking for can
appear anywhere in the remaining columns. I want to get the names of the
proteins in a file, and those that have either 2 or 3 of the functions
should be labelled as cancer.

This part of the code gives me the following result on the command line
(it is a sample, because I'm working on 8500 lines):

    [1] Prot_10035
    8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996
    [1] Prot_10041
    8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996
    [1] Prot_10045
    8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996

which is fine: I can see the names of the proteins, but I can't use them
to label them. When I try to write them to a file, only the last result
is kept, because unfortunately it overwrites itself :(

How can I use those data? How can I write them to a file and add, as an
extra column, the word cancer for those containing the specific
functions?

Any hint you may give me would be more than helpful! Thank you a lot in
advance! Looking forward to your reply :)

--
View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4360889.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing to a file
You don't say how you are writing to a file, but some methods have an
append = TRUE option that might be helpful.

Your code looks really inefficient as well. I don't have time to look at
it fully now, but it seems to me that you can vectorize the inner loops
quite directly:

    for(j in 1:ncol(y)){
      if(y[i,j]=="Func_0005515"){
        counter = counter + 1
      }
    }

could become

    counter = counter + sum(y[i, ] == "Func_0005515")

Michael

On Mon, Feb 6, 2012 at 5:50 AM, Felicity felicity...@hotmail.com wrote:
> Dear All!!
>
> I am also new to R and am trying to write my results into a file.
> [...]
Re: [R] Writing to a file
Maybe I could keep each line (having the strings) in a file or somewhere
and then call a print function that prints them all together from where
I saved them?

Please let me know as soon as possible!! Thank you!

--
View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4362340.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing to a file
You can easily do that, but the question is: what is the problem you are
trying to solve? What do you want to do with the lines you are writing
out? Are you going to read them back in, or process them with some other
program?

So save them in a character vector and then write them out with 'cat'.

On Mon, Feb 6, 2012 at 1:49 PM, Felicity felicity...@hotmail.com wrote:
> Maybe I could keep each line (having the strings) in a file or
> somewhere and then call a print function that prints them all together
> from where I saved them?
>
> Please let me know as soon as possible!! Thank you!

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
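
One way to read the "collect, then write once" suggestion is sketched below. The scores are made-up numbers, and `scores`, `keep` and `out_file` are hypothetical names introduced only for this example.

```r
# Made-up hit counts per protein (not taken from the poster's data).
scores <- c(Prot_10035 = 3, Prot_10036 = 0, Prot_10037 = 1, Prot_10041 = 2)

# Collect the matching names in a character vector inside the loop ...
keep <- character(0)
for (nm in names(scores)) {
  if (scores[[nm]] >= 2) keep <- c(keep, nm)
}

# ... and write them out once at the end, one name per line.
out_file <- tempfile(fileext = ".txt")   # assumption: temporary output path
cat(paste0(keep, "\n"), file = out_file, sep = "")

written <- readLines(out_file)
```

Writing once after the loop avoids the overwrite problem without needing append = TRUE, and the vector `keep` remains available in the session for later labelling.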
Re: [R] Writing to a file
Honestly, thank you for the prompt response, and you are right: I will
tell you what I want to do and not the way, since I don't know much R.

I have a txt with proteins

    Prot_10035 Func_0005874 Func_0016787 Func_0003774 Func_0006898 Func_0005856 Func_0005525 Func_0005737 Func_0003924 Func_0005515 Func_166
    Prot_10036 Func_0005739 Func_0003735 Func_0006412 Func_0005763 Func_0005840
    Prot_10037 Func_0005739 Func_0005515
    Prot_10039 Func_0005576 Func_0009615 Func_0050832 Func_0005615 Func_0006955 Func_0042742 Func_0031640 Func_0006935
    Prot_1004
Re: [R] Writing to a file
Hi

> Honestly, thank you for the prompt response, and you are right: I will
> tell you what I want to do and not the way, since I don't know much R.
>
> I have a txt with proteins
>
>     Prot_10035 Func_0005874 Func_0016787 Func_0003774 Func_0006898 Func_0005856 Func_0005525 Func_0005737 Func_0003924 Func_0005515 Func_166
>     Prot_10036 Func_0005739 Func_0003735 Func_0006412 Func_0005763 Func_0005840
>     Prot_10037 Func_0005739 Func_0005515
>     Prot_10039 Func_0005576 Func_0009615 Func_0050832 Func_0005615 Func_0006955 Func_0042742 Func_0031640 Func_0006935
>     Prot_1004 Func_0046872 Func_0003887 Func_0003684 Func_0016740 Func_0006281 Func_0006260 Func_0016779 Func_0005634
>     Prot_10040 Func_0005886 Func_0046488 Func_0016301 Func_0007409 Func_0005524 Func_0016740 Func_0016308 Func_166
>
> which is 8527 lines and 145 columns (not all the proteins have the
> same number of proteins)

functions?

First of all, you need to read this file into R properly. I would try
readLines() with some further polishing to build a list structure with
the protein names as labels for each element of the list. After that,
some cycle/lapply() checking with regular expressions could be a way to
populate a data frame with protein names in the first column and a score
in the second. After that you can compare that score with other values
in another data frame.

However, without an example you will hardly get detailed help.

Regards
Petr

> What I want is to predict whether those proteins are related to cancer
> or not, depending on whether they have some functions. I found that
> there are 3 functions very often related to cancer, and in case a
> protein has 2/3 or 3/3 of them I want to label it (somehow, maybe by
> adding an extra column) as cancer related.
>
> The names of the proteins are always in the 1st column, but the names
> of the functions can be in any of the next columns.
>
> So what I did is to use this loop, but I can't get it to print the
> results the way I want so that I can use them again. (I need to know
> the names of the proteins having the functions in a column, so that as
> a next step I can compare them with another file, a test data set, and
> conclude true positive, false positive, true negative, false negative.)
>
> It can't be as hard as I see it :):)
> --
> View this message in context: http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4363940.html
> Sent from the R help mailing list archive at Nabble.com.
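
The readLines() route suggested above might look roughly like the sketch below. Several things are assumptions: the file is taken to be tab-separated (the poster reads it with sep="\t"), a three-line made-up file stands in for the real one, and the "other" label, like the names `fields`, `score` and `res`, is invented for illustration.

```r
# Made-up ragged input: protein name first, then a varying number of functions.
lines_in <- c(
  "Prot_10035\tFunc_0005874\tFunc_0005737\tFunc_0005515",
  "Prot_10036\tFunc_0005739\tFunc_0003735",
  "Prot_1004\tFunc_0046872\tFunc_0005634"
)
in_file <- tempfile(fileext = ".txt")
writeLines(lines_in, in_file)

targets <- c("Func_0005634", "Func_0005737", "Func_0005515")

# One character vector per line; its length follows the number of functions.
fields <- strsplit(readLines(in_file), "\t", fixed = TRUE)
prot   <- vapply(fields, function(f) f[1], character(1))
score  <- vapply(fields, function(f) sum(f[-1] %in% targets), integer(1))

# Label proteins with at least 2 of the 3 target functions.
res <- data.frame(protein = prot,
                  score   = score,
                  label   = ifelse(score >= 2, "cancer", "other"),
                  stringsAsFactors = FALSE)
```

The resulting data frame can be written with write.table() and later merged with a test data set for the true/false positive comparison mentioned in the post.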
Re: [R] Writing a summary file in R
Just a very simple follow-up. In the summary table (listed as summ
below), I would like the TR column to display the total number of rows
(i.e. counts), which I have done via the NROW() function. However, in
RG1 I would only like to count the number of rows with a 'totalreads'
count >= 1 (i.e. rows that don't contain zero). This may be confusing
given the data I've provided, but values in the 'totalreads' column
don't have to be 1 or 0; they can be any value. Therefore using sum()
won't work in every case.

As you can see, I've tried using NROW() below for RG1, but it didn't
work out as I had planned. For example, given the input data,
chr4 100 300 should have RG1=1 and percent=0.5. Instead, it just counts
every row regardless of value.

The solution is probably something very simple I'm overlooking, but if
you could help I'd appreciate it.

Below is the code I've slightly modified from David's reply:

    colnames(data) <- c("chr","start","end","base1","base2","totalreads","methylation","strand")
    data #this is the input file
         chr start end base1 base2 totalreads methylation strand
    1   chr1   100 159   104   104          1        0.05      +
    2   chr1   100 159   145   145          1        0.04      +
    3   chr1   200 260   205   205          1        0.12      +
    4   chr1   500 750   600   600          1        0.09      +
    5   chr3   450 700   500   500          1        0.03      +
    6   chr4   100 300   150   150          1        0.05      +
    7   chr4   100 300   175   175          0        0.00      +
    8   chr7   350 600   400   400          1        0.06      +
    9   chr7   350 600   550   550          0        0.00      +
    10  chr9   100 125   100   100          1        0.10      +
    11 chr11   679 687   680   680          1        0.07      +
    12 chr11   679 687   681   681          0        0.00      +
    13 chr22   100 200   105   105          1        0.03      +
    14 chr22   100 200   110   110          1        0.08      +
    15 chr22   300 400   350   350          0        0.00      +

    splinp <- split(data, paste(data$chr, data$start))
    df <- as.data.frame(t(sapply(splinp, function(x) list(end=x$end[1],
            TR=NROW(x[['totalreads']]),
            RG1=NROW(x[['totalreads']]>=1),
            percent=(NROW(x[['totalreads']]>=1)/NROW(x[['totalreads']]))))))
    df
              end TR RG1 percent
    chr1 100  159  2   2       1
    chr1 200  260  1   1       1
    chr1 500  750  1   1       1
    chr11 679 687  2   2       1
    chr22 100 200  2   2       1
    chr22 300 400  1   1       1
    chr3 450  700  1   1       1
    chr4 100  300  2   2       1
    chr7 350  600  2   2       1
    chr9 100  125  1   1       1

    df.summ <- as.data.frame(t(sapply(splinp, function(x) summary(x$methylation))))
    summ <- cbind(df,df.summ)
    summ #the finished output
              end TR RG1 percent Min. 1st Qu. Median  Mean 3rd Qu. Max.
    chr1 100  159  2   2       1 0.04  0.0425  0.045 0.045  0.0475 0.05
    chr1 200  260  1   1       1 0.12  0.1200  0.120 0.120  0.1200 0.12
    chr1 500  750  1   1       1 0.09  0.0900  0.090 0.090  0.0900 0.09
    chr11 679 687  2   2       1 0.00  0.0175  0.035 0.035  0.0525 0.07
    chr22 100 200  2   2       1 0.03  0.0425  0.055 0.055  0.0675 0.08
    chr22 300 400  1   1       1 0.00  0.0000  0.000 0.000  0.0000 0.00
    chr3 450  700  1   1       1 0.03  0.0300  0.030 0.030  0.0300 0.03
    chr4 100  300  2   2       1 0.00  0.0125  0.025 0.025  0.0375 0.05
    chr7 350  600  2   2       1 0.00  0.0150  0.030 0.030  0.0450 0.06
    chr9 100  125  1   1       1 0.10  0.1000  0.100 0.100  0.1000 0.10

David Winsemius wrote:
> On Jul 27, 2011, at 9:42 PM, Dennis Murphy wrote:
>> Hi:
>>
>> Is this more or less what you're after?
>> [...]
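
A possible fix for the RG1 column, sketched on three rows taken from the input table above. NROW() applied to a comparison returns the length of the logical vector, so it counts every row; summing the logical vector counts only the TRUE entries, whatever the actual read values are. The do.call(rbind, ...) layout is just one way to assemble the result and is not taken from the thread.

```r
# Three rows from the posted input: two for chr4 100-300, one for chr9 100-125.
data <- data.frame(chr        = c("chr4", "chr4", "chr9"),
                   start      = c(100, 100, 100),
                   end        = c(300, 300, 125),
                   totalreads = c(1, 0, 1),
                   stringsAsFactors = FALSE)

splinp <- split(data, paste(data$chr, data$start))

summ <- do.call(rbind, lapply(splinp, function(x) {
  data.frame(end     = x$end[1],
             TR      = nrow(x),                  # all rows in the region
             RG1     = sum(x$totalreads >= 1),   # rows with at least one read
             percent = mean(x$totalreads >= 1))  # RG1 / TR
}))
```

For the chr4 100-300 region this gives TR = 2, RG1 = 1 and percent = 0.5, which is the result the post asks for. Note that sum() is applied to the comparison, not to the read counts themselves, so reads larger than 1 are still counted once per row.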
Re: [R] Writing a summary file in R
On Jul 27, 2011, at 7:02 PM, a217 wrote:
> Hello,
>
> I have an input file:
> http://r.789695.n4.nabble.com/file/n3700031/testOut.txt testOut.txt
>
> where column 1 is chromosome, column 2 is start of region, column 3 is
> end of region, columns 4 and 5 are base position, column 6 is total
> reads, column 7 is methylation data, and column 8 is the strand.
>
> I would like a summary output file such as:
> http://r.789695.n4.nabble.com/file/n3700031/out.summary.txt out.summary.txt
>
> where column 1 is chromosome, column 2 is start of region, column 3 is
> end of region, column 4 is total reads in general, column 5 is total
> reads >= 1, column 6 is (col4/col5) or the percentage, and at the end
> I'd like to list 6 more columns based on summary results from the
> summary() function in R.
>
> The summary() function will be used to analyze all of the methylation
> data (col7 from input) for each region (bounded by col2 and col3).
>
> For example, for chr1 100 159 summary() gives:
>
>       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     0.0400  0.0425  0.0450  0.0450  0.0475  0.0500
>
> which is simply the methylation data input into summary() only in the
> region of chr1 100 159.
>
> I know how to perform all of the required functions line by line, but
> the hard part for me is essentially taking the input data with
> multiple positions in each region and assigning all of the summary
> results to one line identified by the region.
>
> If any of you have any suggestions I would appreciate it.

So essentially you want to drop columns 4:5 and column 8, calculate the
proportion of counts >= 1, and get summary stats within separate
categories of start-of-region. Is that correct?

This is probably a job for aggregate(), or for ddply() in plyr if I felt
comfortable with it, which I don't in general. Its documentation through
the help pages is not great IMO, but there are those who love it. And I
admit the melt function is a major contributor to human happiness.

Why don't you read up on aggregate, which is a base function (in the R
sense, not in the biological sense)? I will see what I can come up with
in the meantime.

--
David Winsemius, MD
West Hartford, CT
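
As a rough illustration of the aggregate() route mentioned above (not the solution David later posted), the sketch below uses four rows from the posted data. `inp`, `by_region` and `prop_reads` are hypothetical names.

```r
# Four rows taken from the posted input: two regions, two positions each.
inp <- data.frame(chr         = c("chr1", "chr1", "chr4", "chr4"),
                  start       = c(100, 100, 100, 100),
                  end         = c(159, 159, 300, 300),
                  totalreads  = c(1, 1, 1, 0),
                  methylation = c(0.05, 0.04, 0.05, 0.00),
                  stringsAsFactors = FALSE)

# One row per region; the 'methylation' column becomes a matrix whose
# columns are the six summary() statistics.
by_region <- aggregate(methylation ~ chr + start + end, data = inp, FUN = summary)

# Proportion of rows with at least one read, per region.
prop_reads <- aggregate(totalreads ~ chr + start + end, data = inp,
                        FUN = function(v) mean(v >= 1))
```

Because both calls group on the same three columns, the two results line up row by row and can be combined with merge() before being written out with write.table().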
Re: [R] Writing a summary file in R
Yes, that is the general objective. I'll look into aggregate() in R and
see if anything helps.

Thanks,
a217

--
View this message in context: http://r.789695.n4.nabble.com/Writing-a-summary-file-in-R-tp3700031p3700071.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing a summary file in R
Hi:

Is this more or less what you're after?

    ## Note: This is the preferred way to send your data by e-mail.
    ## I used dput(data-frame-name) to produce this,
    ## where data-frame-name = 'df' on my end.
    df <- structure(list(
      V1 = c("chr1", "chr1", "chr1", "chr1", "chr3", "chr4", "chr4", "chr7",
             "chr7", "chr9", "chr11", "chr11", "chr22", "chr22", "chr22"),
      V2 = c(100L, 100L, 200L, 500L, 450L, 100L, 100L, 350L, 350L, 100L,
             679L, 679L, 100L, 100L, 300L),
      V3 = c(159L, 159L, 260L, 750L, 700L, 300L, 300L, 600L, 600L, 125L,
             687L, 687L, 200L, 200L, 400L),
      V4 = c(104L, 145L, 205L, 600L, 500L, 150L, 175L, 400L, 550L, 100L,
             680L, 681L, 105L, 110L, 350L),
      V5 = c(104L, 145L, 205L, 600L, 500L, 150L, 175L, 400L, 550L, 100L,
             680L, 681L, 105L, 110L, 350L),
      V6 = c(1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L),
      V7 = c(0.05, 0.04, 0.12, 0.09, 0.03, 0.05, 0, 0.06, 0, 0.1, 0.07, 0,
             0.03, 0.08, 0),
      V8 = c("+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+",
             "+", "+", "+")),
      .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"),
      class = "data.frame", row.names = c(NA, -15L))

    # This is the structure you should see:
    str(df)
    'data.frame':   15 obs. of  8 variables:
     $ V1: chr  "chr1" "chr1" "chr1" "chr1" ...
     $ V2: int  100 100 200 500 450 100 100 350 350 100 ...
     $ V3: int  159 159 260 750 700 300 300 600 600 125 ...
     $ V4: int  104 145 205 600 500 150 175 400 550 100 ...
     $ V5: int  104 145 205 600 500 150 175 400 550 100 ...
     $ V6: int  1 1 1 1 1 1 0 1 0 1 ...
     $ V7: num  0.05 0.04 0.12 0.09 0.03 0.05 0 0.06 0 0.1 ...
     $ V8: chr  "+" "+" "+" "+" ...

    # Method 1: Write a function and call ddply()
    summfun <- function(d) {
      dsum <- as.data.frame(as.list(summary(d[['V7']])))
      names(dsum) <- c('Min', 'Q1', 'Median', 'Mean', 'Q3', 'Max')
      data.frame(V3 = d[1, 'V3'], dsum)
    }

    library('plyr')
    ddply(df, .(V1, V2), summfun)

The idea behind summfun is this: ddply() prefers functions that take a
data frame as input and return a data frame (or scalar) as output. dsum
converts summary(V7) to a data frame by first coercing it into a list
and then into a data frame. The names are changed for convenience. dsum
has one row, so we add V3 to the data frame before outputting it.
ddply() will attach the grouping variables to the output automatically;
however, you can put them into the output data frame and ddply() will
not duplicate the grouping variables in the output.

The alternative in ddply(), which is simpler code, outputs the results
from summary() in different rows for each grouping. In this event, it is
useful to carry along the names of the summaries so that one can recast
the data with the cast() function from the reshape package:

    # Method 2: Summarize and reshape
    # V3 is unnecessary but it is useful to carry it along for the output
    u <- ddply(df, .(V1, V2, V3), summarise,
               summ = summary(V7), summtype = names(summary(V7)))
    library('reshape')
    cast(u, V1 + V2 + V3 ~ summtype, value = 'summ')

HTH,
Dennis

PS: I may be one of those folks to whom David was referring in relation
to plyr :)

On Wed, Jul 27, 2011 at 4:02 PM, a217 aj...@case.edu wrote:
> Hello,
>
> I have an input file:
> http://r.789695.n4.nabble.com/file/n3700031/testOut.txt testOut.txt
>
> where column 1 is chromosome, column 2 is start of region, column 3 is
> end of region, columns 4 and 5 are base position, column 6 is total
> reads, column 7 is methylation data, and column 8 is the strand.
>
> I would like a summary output file such as:
> http://r.789695.n4.nabble.com/file/n3700031/out.summary.txt out.summary.txt
>
> where column 1 is chromosome, column 2 is start of region, column 3 is
> end of region, column 4 is total reads in general, column 5 is total
> reads >= 1, column 6 is (col4/col5) or the percentage, and at the end
> I'd like to list 6 more columns based on summary results from the
> summary() function in R.
>
> The summary() function will be used to analyze all of the methylation
> data (col7 from input) for each region (bounded by col2 and col3).
>
> For example, for chr1 100 159 summary() gives:
>
>       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     0.0400  0.0425  0.0450  0.0450  0.0475  0.0500
>
> which is simply the methylation data input into summary() only in the
> region of chr1 100 159.
>
> I know how to perform all of the required functions line by line, but
> the hard part for me is essentially taking the input data with
> multiple positions in each region and assigning all of the summary
> results to one line identified by the region.
>
> If any of you have any suggestions I would appreciate it.
> --
> View this message in context: http://r.789695.n4.nabble.com/Writing-a-summary-file-in-R-tp3700031p3700031.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Writing a summary file in R
On Jul 27, 2011, at 9:42 PM, Dennis Murphy wrote:

Hi: Is this more or less what you're after?

## Note: This is the preferred way to send your data by e-mail.
## I used dput(data-frame-name) to produce this,
## where data-frame-name = 'df' on my end.
df <- structure(list(
V1 = c("chr1", "chr1", "chr1", "chr1", "chr3", "chr4", "chr4", "chr7", "chr7", "chr9", "chr11", "chr11", "chr22", "chr22", "chr22"),
V2 = c(100L, 100L, 200L, 500L, 450L, 100L, 100L, 350L, 350L, 100L, 679L, 679L, 100L, 100L, 300L),
V3 = c(159L, 159L, 260L, 750L, 700L, 300L, 300L, 600L, 600L, 125L, 687L, 687L, 200L, 200L, 400L),
V4 = c(104L, 145L, 205L, 600L, 500L, 150L, 175L, 400L, 550L, 100L, 680L, 681L, 105L, 110L, 350L),
V5 = c(104L, 145L, 205L, 600L, 500L, 150L, 175L, 400L, 550L, 100L, 680L, 681L, 105L, 110L, 350L),
V6 = c(1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L),
V7 = c(0.05, 0.04, 0.12, 0.09, 0.03, 0.05, 0, 0.06, 0, 0.1, 0.07, 0, 0.03, 0.08, 0),
V8 = c("+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+")),
.Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"),
class = "data.frame", row.names = c(NA, -15L))

# This is the structure you should see:
str(df)
'data.frame': 15 obs. of 8 variables:
 $ V1: chr "chr1" "chr1" "chr1" "chr1" ...
 $ V2: int 100 100 200 500 450 100 100 350 350 100 ...
 $ V3: int 159 159 260 750 700 300 300 600 600 125 ...
 $ V4: int 104 145 205 600 500 150 175 400 550 100 ...
 $ V5: int 104 145 205 600 500 150 175 400 550 100 ...
 $ V6: int 1 1 1 1 1 1 0 1 0 1 ...
 $ V7: num 0.05 0.04 0.12 0.09 0.03 0.05 0 0.06 0 0.1 ...
 $ V8: chr "+" "+" "+" "+" ...

# Method 1: Write a function and call ddply()
summfun <- function(d) {
  dsum <- as.data.frame(as.list(summary(d[['V7']])))
  names(dsum) <- c('Min', 'Q1', 'Median', 'Mean', 'Q3', 'Max')
  data.frame(V3 = d[1, 'V3'], dsum)
}
library('plyr')
ddply(df, .(V1, V2), summfun)

The idea behind summfun is this: ddply() prefers functions that take a data frame as input and a data frame (or scalar) as output.
dsum converts summary(V7) to a data frame by first coercing it into a list and then to a data frame. The names are changed for convenience. dsum has one line, so we add V3 to the data frame before outputting it. ddply() will attach the grouping variables to the output automatically; however, you can put them into the output data frame and ddply() will not duplicate the grouping variables in the output.

The alternative in ddply(), which is simpler code, outputs the results from summary() in different rows for each grouping. In this event, it is useful to carry along the names of the summaries so that one can recast the data with the cast() function from the reshape package:

# Method 2: Summarize and reshape
# V3 is unnecessary but it is useful to carry it along for the output
u <- ddply(df, .(V1, V2, V3), summarise, summ = summary(V7), summtype = names(summary(V7)))
library('reshape')
cast(u, V1 + V2 + V3 ~ summtype, value = 'summ')

HTH,
Dennis

PS: I may be one of those folks to whom David was referring in relation to plyr :)

I've been really impressed at Dennis' facility with plyr, reshape, and reshape2. Note that the 'reshape' function has nothing to do with the 'reshape' package. Here's what I came up with using base functions:

str(inpdat)
'data.frame': 15 obs. of 8 variables:
 $ chromosome : chr "chr1" "chr1" "chr1" "chr1" ...
 $ startreg   : int 100 100 200 500 450 100 100 350 350 100 ...
 $ endreg     : int 159 159 260 750 700 300 300 600 600 125 ...
 $ base1      : int 104 145 205 600 500 150 175 400 550 100 ...
 $ base2      : int 104 145 205 600 500 150 175 400 550 100 ...
 $ totalreads : int 1 1 1 1 1 1 0 1 0 1 ...
 $ methylation: num 0.05 0.04 0.12 0.09 0.03 0.05 0 0.06 0 0.1 ...
 $ strand     : chr "+" "+" "+" "+" ...
# The split into distinct 'chromosome' and 'startreg' categories:
splinp <- split(inpdat, paste(inpdat$chromosome, inpdat$startreg) )

# Process within separate categories: the tapply, aggregate and by functions are all related
df <- as.data.frame( t(sapply(splinp, function(x) list(chr=x$chromosome[1], strt=x$startreg[1], end=x$endreg[1], frac=sum(x[['totalreads']]>=1)/nrow(x) )) ) )

# You often need the t() function when working with apply functions
df
            chr strt end frac
chr1 100   chr1  100 159    1
chr1 200   chr1  200 260    1
chr1 500   chr1  500 750    1
chr11 679 chr11  679 687  0.5
chr22 100 chr22  100 200    1
chr22 300 chr22  300 400    0
chr3 450   chr3  450 700    1
chr4 100   chr4  100 300  0.5
chr7 350   chr7  350 600  0.5
chr9 100   chr9  100 125    1

as.data.frame(t(sapply(splinp, function(x) summary(x$methylation )) ) )
          Min. 1st Qu. Median  Mean 3rd Qu. Max.
chr1 100  0.04  0.0425  0.045 0.045  0.0475 0.05
chr1 200  0.12  0.1200  0.120 0.120  0.1200 0.12
chr1 500  0.09  0.0900  0.090 0.090  0.0900 0.09
chr11 679 0.00  0.0175  0.035 0.035  0.0525 0.07
chr22 100 0.03  0.0425  0.055 0.055  0.0675 0.08
chr22 300 0.00  0.0000  0.000 0.000  0.0000 0.00
chr3 450  0.03  0.0300  0.030
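The two base-R tables above can be joined into one row per region and written to disk, which is what the original question asked for. The following is only a sketch under stated assumptions: the small `inpdat` is a stand-in with the column names from the str() output, the output column names (`total`, `covered`, `Min`, `Q1`, ...) and the file path are made up for illustration, and "reads >= 1" is read from the question's description of column 5.

```r
# Small stand-in for the thread's input (names follow the str(inpdat) output).
inpdat <- data.frame(
  chromosome  = c("chr1", "chr1", "chr1", "chr4", "chr4"),
  startreg    = c(100L, 100L, 200L, 100L, 100L),
  endreg      = c(159L, 159L, 260L, 300L, 300L),
  totalreads  = c(1L, 1L, 1L, 1L, 0L),
  methylation = c(0.05, 0.04, 0.12, 0.05, 0),
  stringsAsFactors = FALSE
)

# One group per region, using the same split() key as in the reply above.
splinp <- split(inpdat, paste(inpdat$chromosome, inpdat$startreg))

# Build a one-row data frame per region, then stack the rows.
region_summary <- do.call(rbind, lapply(splinp, function(x) {
  s <- as.numeric(summary(x$methylation))  # Min, Q1, Median, Mean, Q3, Max
  data.frame(
    chr = x$chromosome[1], strt = x$startreg[1], end = x$endreg[1],
    total = nrow(x),                        # hypothetical column name
    covered = sum(x$totalreads >= 1),       # hypothetical column name
    frac = sum(x$totalreads >= 1) / nrow(x),
    Min = s[1], Q1 = s[2], Median = s[3], Mean = s[4], Q3 = s[5], Max = s[6],
    stringsAsFactors = FALSE
  )
}))

# Hypothetical output path; tab-separated, no quotes or row names.
outfile <- file.path(tempdir(), "out.summary.txt")
write.table(region_summary, file = outfile, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

Building each region as a one-row data frame keeps the columns as ordinary character and numeric vectors, so write.table() can write them directly; the sapply() plus list() form in the reply produces list columns, which may need unlisting before they can be written.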
Re: [R] Writing a summary file in R
Thank you both very much! The codes are pretty slick and should greatly help me in my task.
Re: [R] Writing to a file
Hi Thomas,

If x contains your current results, one way to do what you want is the following:

# data
x <- read.table(textConnection("X2403,0.006049271
X2403,0.000118622
X2403,50.99600705
X2403,7.62E-150
X2419,0.012464215
X2419,9.07E-05
X2419,137.4022573
X2419,6.45E-273"), sep = ",")
closeAllConnections()
x

# results
data.frame(output = with(x, tapply(V2, V1, paste, sep = "", collapse = ',')))

HTH,
Jorge

On Thu, Dec 2, 2010 at 11:00 PM, Thomas Parr wrote:

From: Thomas Parr [mailto:thomas.p...@maine.edu]
Sent: Thursday, December 02, 2010 10:52 PM
To: r-help-requ...@stat.math.ethz.ch
Subject: Writing to a file

I am trying to get my script to write to a file from the for loop. It is working, but the problem is that it is outputting to two columns and I want it to output to 5.

Current results

X2403,0.006049271
X2403,0.000118622
X2403,50.99600705
X2403,7.62E-150
X2419,0.012464215
X2419,9.07E-05
X2419,137.4022573
X2419,6.45E-273
...

Desired/expected results

X2403,0.006049271,0.000118622,50.99600705,7.62E-150
X2419,0.012464215,9.07E-05,137.4022573,6.45E-273
...

Data is being extracted from nls output with summary, nls uses fit.

a <- summary(nls(acoeff ~ aref*exp(-S*(alam-375)), trace=T, start=list(S=0.0015)))
cat(sites[v-1],a$coefficients[1,1],a$coefficients[1,2],a$coefficients[1,3],a$coefficients[1,4],sep=",",append=TRUE, file=paste(dirpath,"/results.csv",sep=""))

The idea is that it is looping through the data sites and as nls generates parameter estimates, summary extracts them and cat writes them to a CSV file.

Note: I have tried write.csv, write.table, and write. I think they all call cat at some point.

Any help would be appreciated and if you have a different solution I am all ears.

Thomas
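Another way to get one line per site, without reshaping afterwards, is to paste the site name and its four values into a single string inside the loop and end it with an explicit newline. This is only a sketch: `sites` and `coefs` are hypothetical stand-ins for the loop's site names and for the first row of a$coefficients from each fit, and the output path is made up.

```r
# Hypothetical stand-ins: two site names and, per site, the four numbers that
# a$coefficients[1, ] would hold (estimate, std. error, t value, p value).
sites <- c("X2403", "X2419")
coefs <- list(
  c(0.006049271, 0.000118622, 50.99600705, 7.62e-150),
  c(0.012464215, 9.07e-05, 137.4022573, 6.45e-273)
)

outfile <- file.path(tempdir(), "results.csv")  # hypothetical path
if (file.exists(outfile)) file.remove(outfile)  # start from an empty file

for (v in seq_along(sites)) {
  # c() coerces the numbers to character; collapse joins all five fields.
  line <- paste(c(sites[v], coefs[[v]]), collapse = ",")
  # One string plus "\n" per iteration gives exactly one row per site.
  cat(line, "\n", sep = "", file = outfile, append = TRUE)
}

readLines(outfile)
```

If the loop does not need to write incrementally, collecting the rows in a data frame and calling write.csv() once after the loop is usually simpler.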
Re: [R] writing to a file
The HELP page for 'sink' is pretty clear about this:

sink() or sink(file=NULL) ends the last diversion (of the specified type). There is a stack of diversions for normal output, so output reverts to the previous diversion (if there was one). The stack is of up to 21 connections (20 diversions).

On Sat, Nov 13, 2010 at 11:12 PM, Gregory Ryslik rsa...@comcast.net wrote:

Hi,

I have a fairly complex object that I have written a print function for. Thus when I do print(results), the R console shows me a whole bunch of stuff already formatted. What I want to do is to take whatever print(results) shows to console and then put that in a file. I am doing this using the sink command. However, I am unsure as to how to unsink. Eg, how do I restore output to the normal console?

Thanks,
Greg

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
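As a concrete illustration of the help text quoted above, here is a minimal sketch of the sink/unsink round trip. The `results` object and the file path are hypothetical stand-ins for the poster's own object and destination.

```r
results <- list(a = 1:3, b = "text")            # stand-in for the complex object
outfile <- file.path(tempdir(), "results.txt")  # hypothetical path

sink(outfile)    # start diverting console output to the file
print(results)   # whatever print() would show now goes to the file
sink()           # end the last diversion; output returns to the console

sink.number()    # how many diversions are still active
```

If only one call needs to be captured, capture.output(print(results), file = outfile) does the same job without leaving a diversion open when an error interrupts the code.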