Re: [R] Writing to a file

2013-05-21 Thread arun
Hi,
Try this:
# you can change the numbers
lst1 <- lapply(1:5, function(i) {
  pdf(paste0(i, ".pdf"))
  hist(rnorm(100), main = paste0("Histogram_", i))
  dev.off()
})
A.K.
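
A minimal sketch of the same idea, assuming the goal is one histogram per numbered file. The file name has to be built as a single string with paste0(); in the quoted loop, ".pdf" is passed to pdf() as a separate second argument instead of being appended to the name. Note also that pdf() opens a graphics device, so only plots end up in the file; a printed value such as `a` goes to the console. The names `out_dir` and `n_files` are illustrative, and the files are written to a temporary directory.

```r
# Write one histogram per PDF file: 1.pdf, 2.pdf, ..., n_files.pdf.
# out_dir and n_files are illustrative names, not from the thread.
out_dir <- tempdir()
n_files <- 3

pdf_paths <- vapply(seq_len(n_files), function(i) {
  path <- file.path(out_dir, paste0(i, ".pdf"))  # build the full file name
  pdf(path)                                      # open a PDF device for this file
  hist(rnorm(100), main = paste0("Histogram_", i))
  dev.off()                                      # close the device so the file is written
  path
}, character(1))
```

Each call opens and closes its own device, so every file holds exactly one page.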


I'm trying to generate PDFs called 1.pdf, 2.pdf, 3.pdf, etc., and it isn't
working. My code is:
x <- 0
for(i in 1:1000){
x <- x + 1
pdf(as.character(x), ".pdf") # writes out to pdf
for(i in 1:100){
hist(rnorm(1)) # graphs histogram, written to the file
}

dev.off()
}

Also, I tried to just do
a <- 1
a
between the pdf() and dev.off() lines, and it wouldn't add it to the file, even
with a name such as foo.pdf.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] writing data to file

2012-03-22 Thread Gaurav Sood
look up dump
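
A short sketch of what dump() does, assuming the aim is to recreate the same data frame in a later session. dump() writes R code that rebuilds the object, and source() runs that code. The temporary file path and the environment name `restored` are illustrative choices.

```r
df <- data.frame(person = c("John", "Bob", "Mary"),
                 team = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

dump_file <- tempfile(fileext = ".R")   # hypothetical output path
dump("df", file = dump_file)            # note: the object name is passed as a string

# Re-create the object in a separate environment to show the round trip.
restored <- new.env()
source(dump_file, local = restored)
```

The result is an R script, not a delimited table, so it suits reloading into R more than sharing with other programs.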

On Thu, Mar 22, 2012 at 11:35 AM, mail me mailme...@googlemail.com wrote:
 Hi:

 I created a data frame

 df <- data.frame( person = c('John','Bob','Mary'), team =
 c('a','b','c'), stringsAsFactors = F);

 and obtained the expected output

 > df
   person team
 1   John    a
 2    Bob    b
 3   Mary    c

 now I want to save the whole content of df, preserving its row and
 column order, to a file on disk with the following command:

 write(df, file = "testfile", append=FALSE, sep=" ");

 and I get the error message

 Error in cat(list(...), file, sep, fill, labels, append) :   argument
 1 (type 'list') cannot be handled by 'cat'

 Can you help to solve the problem? Thanks in advance.

 deb



Re: [R] writing data to file

2012-03-22 Thread Peter Ehlers

On 2012-03-22 08:35, mail me wrote:

Hi:

I created a data frame

df <- data.frame( person = c('John','Bob','Mary'), team =
c('a','b','c'), stringsAsFactors = F);

and obtained the expected output

> df
  person team
1   John    a
2    Bob    b
3   Mary    c

now I want to save the whole content of df, preserving its row and
column order, to a file on disk with the following command:

write(df, file = "testfile", append=FALSE, sep=" ");

and I get the error message

Error in cat(list(...), file, sep, fill, labels, append) :   argument
1 (type 'list') cannot be handled by 'cat'

Can you help to solve the problem? Thanks in advance.

deb


You're using the wrong function; use write.table() instead.
You may want to set either or both of the arguments 'quote'
and 'row.names' to FALSE.

Peter Ehlers
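
A minimal sketch of the write.table() suggestion, assuming a space-separated text file is wanted. write() hands its input to cat(), which cannot handle a list such as a data frame; write.table() is designed for them. The temporary output path is an illustrative choice.

```r
df <- data.frame(person = c("John", "Bob", "Mary"),
                 team = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

out_file <- tempfile(fileext = ".txt")   # hypothetical output path
write.table(df, file = out_file, sep = " ",
            quote = FALSE,               # no quotes around strings
            row.names = FALSE)           # drop the 1, 2, 3 row labels

# Read the file back to confirm that row and column order survived.
back <- read.table(out_file, header = TRUE, stringsAsFactors = FALSE)
```

With quote = FALSE, values that themselves contain the separator would be split on reading, so keep quoting on for free-text columns.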



Re: [R] Writing a .pdf file within a function - what do I need to return()?

2012-03-13 Thread R. Michael Weylandt
See R FAQ 7.22 -- in short, you need to print() your plot to the
graphics device -- just wrap xyplot() in print() and it should work.

Michael
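
A hedged sketch of the fix, assuming the lattice package is installed. At the console a lattice object is drawn because R auto-prints it; inside a function or a for loop nothing is auto-printed, so print() has to be called explicitly. `plot_to_pdf` is a hypothetical helper, and the data are simulated rather than read from .csv files.

```r
library(lattice)

# plot_to_pdf() is a hypothetical helper; data_list stands in for the
# data frames that would normally come from read.csv().
plot_to_pdf <- function(data_list, pdf_path) {
  pdf(file = pdf_path, width = 10, height = 8)
  on.exit(dev.off())                 # close the device even if a plot fails
  for (nm in names(data_list)) {
    p <- xyplot(isi ~ t, data = data_list[[nm]],
                ylab = "isi", xlab = "time", main = nm)
    print(p)                         # lattice objects draw only when printed
  }
  invisible(pdf_path)
}

set.seed(1)
fake <- list(a.csv = data.frame(t = 1:20, isi = runif(20, 0, 1500)),
             b.csv = data.frame(t = 1:20, isi = runif(20, 0, 1500)))
out <- plot_to_pdf(fake, tempfile(fileext = ".pdf"))
```

Nothing needs to be returned for the file to be written; the path is returned invisibly only as a convenience.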

On Tue, Mar 13, 2012 at 3:55 PM, Dgnn sharkbrain...@gmail.com wrote:
 I am trying to write a function that generates one PDF containing plots from
 several .csv files within a directory.  When I manually execute the code it
 seems to work, but not when it is a function. I think I need to return()
 something, but haven't had much luck figuring out what/how.

 plot.isi <- function(csv.path="~/project/csv by cell") {
        csv.files <- grep('.csv', list.files(path = csv.path, full.names=T),
                          value=T)
        pdf(file='plots/isi plots.pdf', width=10, height=8)
        #par(mfrow=c(2,1)) #ideally 2 plots per page, but will work on details
        #after fx. works
        for (i in 1:length(csv.files)){
                raw.df <- read.csv(csv.files[i])
                names(raw.df) <- c('t','isi','logic','cond')
                xyplot(isi ~ t, raw.df, ylim=c(0,1500), ylab='isi',
                       xlab='time', main=basename(csv.files[i]))
        }
        dev.off()
 }

 Thank you all for the help,

 Jason Deignan



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Writing-a-pdf-file-within-a-function-what-do-I-need-to-return-tp4470165p4470165.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2012-02-07 Thread Felicity
Thank you very much for answering so fast!
But what do you mean by an example?
I mentioned the loop I used above, and I also showed what the file looks
like, because it's huge.

The way I read the file is
x=read.table("filename.txt",header=FALSE,sep="\t",fill=TRUE)
y=x[1:45,]
(I use only some rows in order to test whether it works.)

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364034.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2012-02-07 Thread Petr PIKAL
Hi
 
 Thank you a lot for answering so fast!
 but..what do you mean by example?
 I 've mentioned above the loop I used and I also show how the file looks
 like

I do not see any loop. I do not archive all posts from R-help, only those
with interesting answers :-) If you do not keep the context in future
mails, it is lost for those not using Nabble, and it would be necessary to
dig in the R-help archive.

 'cause its huge.

 the way i read the file is
 x=read.table("filename.txt",header=FALSE,sep="\t",fill=TRUE)
 y=x[1:45,]

Maybe you can use an even smaller fraction for a data example:
y <- x[1:10,]
dput(y)

Copying the output from dput() into your mail is the easiest way.

Regards
Petr

 (i use only some rows in order to test if it works )
 


Re: [R] Writing to a file

2012-02-07 Thread Felicity
Thanks a lot for the interest :)

My loop is the following:
counter = 0
for (i in 1:nrow(y))
{
  for (j in 1:ncol(y))
  {
    if (y[i,j]=="Func_0005634") {
      counter = counter + 1 }
    if (y[i,j]=="Func_0005737") {
      counter = counter + 1 }
    if (y[i,j]=="Func_0005515") {
      counter = counter + 1 }
  }
  if (counter == 3) {
    cat(y[i,1], file = "foo.csv", "\n")
  }
  counter = 0
}

and after read.table("foo.csv")

I get
  V1
1 45

which is the last result.

Why does it overwrite? How can I have all the results?

Eager for a reply from you!

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4364149.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2012-02-07 Thread R. Michael Weylandt
As I said to you a while back, use append = TRUE.

Michael
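
A small sketch of what append = TRUE changes, assuming the values are written one per line. By default each cat(..., file = ) call truncates the file before writing, which is why only the last value survived. The vector `ids` holds made-up values standing in for y[i, 1], and the output path is temporary.

```r
out_file <- tempfile(fileext = ".csv")   # hypothetical output path
ids <- c(12, 31, 45)                     # made-up values standing in for y[i, 1]

for (id in ids) {
  # append = TRUE adds to the end of the file instead of overwriting it.
  cat(id, "\n", file = out_file, append = TRUE)
}

result <- read.table(out_file)
```

Because appending never clears the file, an old foo.csv from an earlier run should be deleted (for example with unlink()) before the loop starts.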



Re: [R] Writing to a file

2012-02-07 Thread Petr PIKAL
Hi

now you omitted data, but never mind :-)

 My loop is the following
 counter = 0
 for (i in 1:nrow(y))
 {
   for (j in 1:ncol(y))
   {
     if (y[i,j]=="Func_0005634") {
       counter = counter + 1 }
     if (y[i,j]=="Func_0005737") {
       counter = counter + 1 }
     if (y[i,j]=="Func_0005515") {
       counter = counter + 1 }
   }
   if (counter == 3) {
     cat(y[i,1], file = "foo.csv", "\n")
   }
   counter = 0
 }
 

If I remember correctly, you want to inspect each row to see whether it
contains any of the Func values, and how many of them.

> dput(y)
structure(list(prot = c(1, 2, 3, 4), X1 = structure(c(1L, 1L,
1L, 2L), .Label = c("a", " "), class = "factor"), X2 = structure(c(3L,
2L, 3L, 3L), .Label = c("a", "b", " "), class = "factor"), X3 =
structure(c(3L,
3L, 3L, 2L), .Label = c("b", "c", " "), class = "factor"), X4 =
structure(c(1L,
1L, 1L, 3L), .Label = c("c", "d", " "), class = "factor"), X5 =
structure(c(2L,
1L, 1L, 1L), .Label = c("d", " "), class = "factor")), .Names = c("prot",
"X1", "X2", "X3", "X4", "X5"), row.names = c(NA, 4L), class =
"data.frame")


So

> rowSums((y=="a") | (y=="b") | (y=="d"))
1 2 3 4
1 3 2 1

gives you the number of values ("a", "b", "d") in each row.

Your construction comes from some different programming world.

Regards
Petr
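
The rowSums() idea can be sketched on a plain character data frame as follows. This is an illustration under assumptions: the toy data mirror the dput() output above but use empty strings for blanks, and `targets` and `hits` are illustrative names.

```r
# Toy data shaped like the example above: an id column plus value columns.
y <- data.frame(prot = 1:4,
                X1 = c("a", "a", "a", ""),
                X2 = c("", "b", "", ""),
                X3 = c("", "", "", "c"),
                X4 = c("c", "c", "c", ""),
                X5 = c("", "d", "d", "d"),
                stringsAsFactors = FALSE)

targets <- c("a", "b", "d")

# Test every value column at once, then count the TRUEs in each row.
is_hit <- sapply(y[-1], function(col) col %in% targets)
hits <- rowSums(is_hit)
```

Using %in% keeps the code the same length however many target values there are, and it treats missing values as non-matches.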



Re: [R] Writing to a file

2012-02-06 Thread Felicity
Dear All!!

I am also new to R
and am trying to write my results into a file. I am posting here; hopefully
this is the proper place.
To be more specific, I have this loop:


counter = 0
for (i in 1:nrow(y))
{
  for (j in 1:ncol(y))
  {
    if (y[i,j]=="Func_0005634") {
      counter = counter + 1 }
    if (y[i,j]=="Func_0005737") {
      counter = counter + 1 }
    if (y[i,j]=="Func_0005515") {
      counter = counter + 1 }
  }
  if (counter == 2) {
    k <- structure(list(print(y[i,1])), class = "data.frame")
  }
  if (counter == 3) {
    l <- structure(list(print(y[i,1])), class = "data.frame")
  }
  counter = 0
}


For counter == 2 or counter == 3
I want to get print(y[i,1]),
where column 1 holds the name of the protein,
whereas the strings I'm looking for appear somewhere, at random, in the
remaining columns.

I want to get the names of the proteins in a file, with those that have either
2 or 3 of the functions labelled as cancer.

This part of the code gives me the following result on the command line (this
is a sample, because I'm working on 8500 lines):
[1] Prot_10035
8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996
[1] Prot_10041
8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996
[1] Prot_10045
8527 Levels: Prot_0 Prot_1 Prot_10 Prot_100 Prot_1000 Prot_1 ... Prot_9996


This is fine: I can see the names of the proteins, but I can't use them to
label them.
When I try to write it to a file, only the last result is kept because,
unfortunately, it overwrites itself :(

How can I use those data? How can I write them to a file and add, as an extra
column, the word cancer
for those containing the specific functions?

Any hint you can give me would be more than helpful!
Thank you very much in advance!
Looking forward to your reply :)


--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4360889.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2012-02-06 Thread R. Michael Weylandt
You don't say how you are writing to a file, but some methods have an
append = TRUE option that might be helpful.

Your code looks really inefficient as well: I don't have time to look
at it fully now, but it seems to me that you can vectorize the inner
loops quite directly:

for(j in 1:ncol(y)){
   if(y[i,j]=="Func_0005515"){
   counter = counter + 1 }
}

could become

counter = counter + sum(y[i, ] == "Func_0005515")

Michael
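
A sketch of the vectorised inner loop applied to all three function names at once. This is an illustration on made-up data, assuming the protein name is in column 1 and the function names are in the remaining columns; `wanted` and `counts` are illustrative names.

```r
# Made-up data: protein name first, then a ragged set of function names.
y <- data.frame(
  prot = c("Prot_1", "Prot_2", "Prot_3"),
  f1 = c("Func_0005634", "Func_0005737", "Func_0000001"),
  f2 = c("Func_0005737", "Func_0000002", ""),
  f3 = c("Func_0005515", "", ""),
  stringsAsFactors = FALSE
)
wanted <- c("Func_0005634", "Func_0005737", "Func_0005515")

counts <- integer(nrow(y))
for (i in seq_len(nrow(y))) {
  # One vectorised comparison replaces the inner loop over columns.
  counts[i] <- sum(unlist(y[i, -1]) %in% wanted)
}

# Proteins that carry at least two of the three functions.
selected <- y$prot[counts >= 2]
```

Keeping the counts in a vector, rather than resetting a counter, also makes it easy to label or write out the matching proteins afterwards.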



Re: [R] Writing to a file

2012-02-06 Thread Felicity
Maybe I could keep each line (having the strings)
in a file or somewhere and then
call a print function that prints them all together
from where I saved them?
Please let me know as soon as possible!
Thank you!

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4362340.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2012-02-06 Thread jim holtman
You can easily do that, but the question is what is the problem you
are trying to solve?  What do you want to do with the lines you are
writing out?  Are you going to read them back in or process them with
some other program?  So save them in a character vector and then write
them out with 'cat'.
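
A minimal sketch of that approach, assuming the matching protein names are collected during the loop and written once at the end. The names in `candidates` are made up, the selection test is omitted, and the output path is temporary.

```r
candidates <- c("Prot_10035", "Prot_10041", "Prot_10045")  # made-up names

hits <- character(0)
for (p in candidates) {
  # In real code this line would sit inside the test for 2 or 3 functions.
  hits <- c(hits, p)
}

out_file <- tempfile(fileext = ".txt")   # hypothetical output path
# One cat() call writes every collected name, one per line.
cat(paste0(hits, "\n"), file = out_file, sep = "")

written <- readLines(out_file)
```

Writing once after the loop avoids the overwrite problem entirely, because the file is opened a single time.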




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Writing to a file

2012-02-06 Thread Felicity
Honestly, thank you for the prompt response.
You are right: I will tell you what I want to do
and not the way, since I don't know much about R.


I have a txt file with proteins:

Prot_10035  Func_0005874  Func_0016787  Func_0003774  Func_0006898
Func_0005856  Func_0005525  Func_0005737  Func_0003924  Func_0005515
Func_166
Prot_10036  Func_0005739  Func_0003735  Func_0006412  Func_0005763
Func_0005840
Prot_10037  Func_0005739  Func_0005515
Prot_10039  Func_0005576  Func_0009615  Func_0050832  Func_0005615
Func_0006955  Func_0042742  Func_0031640  Func_0006935
Prot_1004

Re: [R] Writing to a file

2012-02-06 Thread Petr PIKAL
Hi
 
 Honestly thank you for the prompt responding
 and you are right I will tellyou what I want to do 
 and not the way ..since I dont know much from R
 
 
 I have a txt with Proteins
 
 Prot_10035   Func_0005874   Func_0016787   Func_0003774   Func_0006898
 Func_0005856   Func_0005525   Func_0005737   Func_0003924   Func_0005515
 Func_166
 Prot_10036   Func_0005739   Func_0003735   Func_0006412   Func_0005763
 Func_0005840
 Prot_10037   Func_0005739   Func_0005515
 Prot_10039   Func_0005576   Func_0009615   Func_0050832   Func_0005615
 Func_0006955   Func_0042742   Func_0031640   Func_0006935
 Prot_1004   Func_0046872   Func_0003887   Func_0003684   Func_0016740
 Func_0006281   Func_0006260   Func_0016779   Func_0005634
 Prot_10040   Func_0005886   Func_0046488   Func_0016301   Func_0007409
 Func_0005524   Func_0016740   Func_0016308   Func_166
 
 which is 8527 lines and 145 columns (not all the proteins have the same
 number of proteins)
functions?

First of all you need to read this file into R properly. I would try
readLines() with some further polishing to build a list whose elements are
labelled with the protein names. After that, a loop or lapply() that checks
each element with a regular expression could populate a data frame with
protein names in the first column and a score in the second. You can then
compare that score with values in another data frame.

However, without an example you will hardly get detailed help.

Regards
Petr
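
The steps outlined above can be sketched as follows. This is an illustration under assumptions, not a tested solution for the real file: it assumes one protein per line with tab-separated fields, uses three made-up input lines, matches with %in% instead of a regular expression, and the label "other" for non-matching proteins is invented here.

```r
# Three made-up lines in a tab-separated layout; the real file is not shown.
raw <- c("Prot_10035\tFunc_0005874\tFunc_0005737\tFunc_0005515",
         "Prot_10037\tFunc_0005739\tFunc_0005515",
         "Prot_1004\tFunc_0046872\tFunc_0005634")
in_file <- tempfile(fileext = ".txt")
writeLines(raw, in_file)

lines  <- readLines(in_file)
fields <- strsplit(lines, "\t", fixed = TRUE)
prot   <- vapply(fields, `[`, character(1), 1)      # first field: protein name
funcs  <- setNames(lapply(fields, `[`, -1), prot)   # remaining fields, by protein

wanted <- c("Func_0005634", "Func_0005737", "Func_0005515")
score  <- vapply(funcs, function(f) sum(f %in% wanted), integer(1))

res <- data.frame(protein = prot,
                  score = unname(score),
                  label = unname(ifelse(score >= 2, "cancer", "other")),
                  stringsAsFactors = FALSE)

out_file <- tempfile(fileext = ".txt")   # hypothetical output path
write.table(res, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Reading with readLines() sidesteps the ragged column count, and the resulting data frame can be compared with a test set through merge() on the protein column.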


 What I want is to predict whether those proteins are related to cancer or
 not, depending on whether they have some functions. I found that there are 3
 functions very often related to cancer,
 and in case a protein has 2/3 or 3/3, to label it (somehow, maybe adding an
 extra column) as cancer related.
 The names of the proteins are always in the 1st column, but the names of the
 functions can be in any of the next columns.

 So what I did is to use this loop, but I can't properly write the way I want
 it to print the results so as to use them again.
 (I need to know the names of the proteins having the functions in a column so
 as, in the next step, to compare it with another file
 -test data set- and conclude true positive, false positive, true
 negative, false negative.)

 It can't be as hard as I see it :):)

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4363940.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Writing a summary file in R

2011-08-03 Thread a217
Just a very simple follow-up. In the summary table (listed as summ below),
the TR column I would like to display the total number of rows (i.e.
counts) which I have done via NROW() function.

However, in the RG1 column I would only like to count the number of rows with a
'totalreads' count >= 1 (i.e. rows that don't contain zero).

This may be confusing given the data I've provided, but values in the
'totalreads' column don't have to be 1 or 0, they can be any value.
Therefore using sum() won't work in every case.

As you can see I've tried using NROW() below for RG1 but it didn't work
out like I had planned.

For example, given the input data, chr4 100 300 should have RG1=1 and
percent=0.5. Instead, it just counts every row regardless of value.

The solution is probably something very simple I'm overlooking, but if you
could help I'd appreciate it.

Below is the code I've slightly modified from David's reply:
###Code##


> colnames(data) <-
  c("chr","start","end","base1","base2","totalreads","methylation","strand")
> data
#this is the input file

     chr start end base1 base2 totalreads methylation strand
1   chr1   100 159   104   104          1        0.05      +
2   chr1   100 159   145   145          1        0.04      +
3   chr1   200 260   205   205          1        0.12      +
4   chr1   500 750   600   600          1        0.09      +
5   chr3   450 700   500   500          1        0.03      +
6   chr4   100 300   150   150          1        0.05      +
7   chr4   100 300   175   175          0        0.00      +
8   chr7   350 600   400   400          1        0.06      +
9   chr7   350 600   550   550          0        0.00      +
10  chr9   100 125   100   100          1        0.10      +
11 chr11   679 687   680   680          1        0.07      +
12 chr11   679 687   681   681          0        0.00      +
13 chr22   100 200   105   105          1        0.03      +
14 chr22   100 200   110   110          1        0.08      +
15 chr22   300 400   350   350          0        0.00      +


> splinp <- split(data, paste(data$chr, data$start))
> df <- as.data.frame(t(sapply(splinp, function(x) list(end=x$end[1],
  TR=NROW(x[['totalreads']]), RG1=NROW(x[['totalreads']]>=1),
  percent=(NROW(x[['totalreads']]>=1)/NROW(x[['totalreads']]))))))
> df

###
          end TR RG1 percent
chr1 100  159  2   2       1
chr1 200  260  1   1       1
chr1 500  750  1   1       1
chr11 679 687  2   2       1
chr22 100 200  2   2       1
chr22 300 400  1   1       1
chr3 450  700  1   1       1
chr4 100  300  2   2       1
chr7 350  600  2   2       1
chr9 100  125  1   1       1
###

> df.summ <- as.data.frame(t(sapply(splinp, function(x)
  summary(x$methylation))))
> summ <- cbind(df,df.summ)
> summ

#the finished output
###
          end TR RG1 percent Min. 1st Qu. Median  Mean 3rd Qu. Max.
chr1 100  159  2   2       1 0.04  0.0425  0.045 0.045  0.0475 0.05
chr1 200  260  1   1       1 0.12  0.1200  0.120 0.120  0.1200 0.12
chr1 500  750  1   1       1 0.09  0.0900  0.090 0.090  0.0900 0.09
chr11 679 687  2   2       1 0.00  0.0175  0.035 0.035  0.0525 0.07
chr22 100 200  2   2       1 0.03  0.0425  0.055 0.055  0.0675 0.08
chr22 300 400  1   1       1 0.00  0.0000  0.000 0.000  0.0000 0.00
chr3 450  700  1   1       1 0.03  0.0300  0.030 0.030  0.0300 0.03
chr4 100  300  2   2       1 0.00  0.0125  0.025 0.025  0.0375 0.05
chr7 350  600  2   2       1 0.00  0.0150  0.030 0.030  0.0450 0.06
chr9 100  125  1   1       1 0.10  0.1000  0.100 0.100  0.1000 0.10

##
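
A hedged sketch of one way to get the RG1 count asked about above. NROW() applied to a logical vector returns its length, so it counts every row regardless of the comparison; sum() counts only the TRUE values, and mean() gives their proportion. The data below are a reduced, made-up subset of the input, and the per-group data frames are combined with rbind() rather than sapply().

```r
data <- data.frame(
  chr = c("chr4", "chr4", "chr7", "chr7", "chr9"),
  start = c(100, 100, 350, 350, 100),
  end = c(300, 300, 600, 600, 125),
  totalreads = c(1, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

splinp <- split(data, paste(data$chr, data$start))

summ <- do.call(rbind, lapply(splinp, function(x) {
  has_reads <- x$totalreads >= 1          # TRUE for rows with at least one read
  data.frame(end = x$end[1],
             TR = nrow(x),                # all rows in the region
             RG1 = sum(has_reads),        # rows with totalreads >= 1
             percent = mean(has_reads))   # RG1 / TR
}))
```

For the chr4 100 300 region this gives RG1 = 1 and percent = 0.5, the values the post expects.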








Re: [R] Writing a summary file in R

2011-07-27 Thread David Winsemius


On Jul 27, 2011, at 7:02 PM, a217 wrote:


Hello,

I have an input file:
http://r.789695.n4.nabble.com/file/n3700031/testOut.txt testOut.txt

where column 1 is chromosome, column 2 is start of region, column 3 is end of
region, columns 4 and 5 are base position, column 6 is total reads, column 7
is methylation data, and column 8 is the strand.

I would like a summary output file such as:
http://r.789695.n4.nabble.com/file/n3700031/out.summary.txt out.summary.txt

where column 1 is chromosome, column 2 is start of region, column 3 is end
of region, column 4 is total reads in general, column 5 is total reads >=1,
column 6 is (col4/col5) or the percentage, and at the end I'd like to list 6
more columns based on summary results from summary() function in R.

The summary() function will be used to analyze all of the methylation data
(col7 from input) for each region (bounded by col2 and col3).

For example for chr1 100 159 summary() gives:
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.0400  0.0425  0.0450  0.0450  0.0475  0.0500

which is simply the methylation data input into summary() only in the region
of chr1 100 159.

I know how to perform all of the required functions line-by-line, but the
hard part for me is essentially taking the input data with multiple
positions in each region and assigning all of the summary results to one
line identified by the region.

If any of you have any suggestions I would appreciate it.


So essentially you want to drop columns 4:5 and column 8, calculate
a proportion of counts >= 1, and get summary stats within separate
categories of start-of-region. Is that correct?


This is probably a job for aggregate(), or for ddply() in plyr if I felt
comfortable with it, which I don't in general. Its documentation
through the help pages is not great IMO, but there are those who love
it. And I admit the melt() function is a major contributor to human
happiness. Why don't you read up on aggregate(), which is a base
function (in the R sense, not in the biological sense)? I will see
what I can come up with in the meantime.
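
A minimal sketch of the aggregate() route, assuming one summary row per region is wanted. The data are a reduced, made-up subset with illustrative column names. When FUN returns several values, aggregate() stores them in a single matrix column, so that column is unpacked afterwards.

```r
d <- data.frame(
  chr = c("chr1", "chr1", "chr1", "chr4", "chr4"),
  start = c(100, 100, 200, 100, 100),
  end = c(159, 159, 260, 300, 300),
  methylation = c(0.05, 0.04, 0.12, 0.05, 0),
  stringsAsFactors = FALSE
)

# One row per (chr, start, end) region; summary() gives six statistics.
agg <- aggregate(methylation ~ chr + start + end, data = d, FUN = summary)

# agg$methylation is a matrix; spread it into ordinary columns.
out <- cbind(agg[c("chr", "start", "end")], as.data.frame(agg$methylation))
```

The result can be written out with write.table() once the count columns have been added.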


--
David Winsemius, MD
West Hartford, CT



Re: [R] Writing a summary file in R

2011-07-27 Thread a217
Yes, that is the general objective. I'll look into aggregate() in R and see if
anything helps.

Thanks,
a217

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-a-summary-file-in-R-tp3700031p3700071.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing a summary file in R

2011-07-27 Thread Dennis Murphy
Hi:

Is this more or less what you're after?

## Note: This is the preferred way to send your data by e-mail.
## I used dput(data-frame-name) to produce this,
## where data-frame-name = 'df' on my end.
df <- structure(list(V1 = c("chr1", "chr1", "chr1", "chr1", "chr3",
"chr4", "chr4", "chr7", "chr7", "chr9", "chr11", "chr11", "chr22",
"chr22", "chr22"), V2 = c(100L, 100L, 200L, 500L, 450L, 100L,
100L, 350L, 350L, 100L, 679L, 679L, 100L, 100L, 300L), V3 = c(159L,
159L, 260L, 750L, 700L, 300L, 300L, 600L, 600L, 125L, 687L, 687L,
200L, 200L, 400L), V4 = c(104L, 145L, 205L, 600L, 500L, 150L,
175L, 400L, 550L, 100L, 680L, 681L, 105L, 110L, 350L), V5 = c(104L,
145L, 205L, 600L, 500L, 150L, 175L, 400L, 550L, 100L, 680L, 681L,
105L, 110L, 350L), V6 = c(1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L,
1L, 1L, 0L, 1L, 1L, 0L), V7 = c(0.05, 0.04, 0.12, 0.09, 0.03,
0.05, 0, 0.06, 0, 0.1, 0.07, 0, 0.03, 0.08, 0), V8 = c("+", "+",
"+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+"
)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"
), class = "data.frame", row.names = c(NA, -15L))


# This is the structure you should see:
> str(df)
'data.frame':   15 obs. of  8 variables:
 $ V1: chr  "chr1" "chr1" "chr1" "chr1" ...
 $ V2: int  100 100 200 500 450 100 100 350 350 100 ...
 $ V3: int  159 159 260 750 700 300 300 600 600 125 ...
 $ V4: int  104 145 205 600 500 150 175 400 550 100 ...
 $ V5: int  104 145 205 600 500 150 175 400 550 100 ...
 $ V6: int  1 1 1 1 1 1 0 1 0 1 ...
 $ V7: num  0.05 0.04 0.12 0.09 0.03 0.05 0 0.06 0 0.1 ...
 $ V8: chr  "+" "+" "+" "+" ...


# Method 1: Write a function and call ddply()
summfun <- function(d)  {
    dsum <- as.data.frame(as.list(summary(d[['V7']])))
    names(dsum) <- c('Min', 'Q1', 'Median', 'Mean', 'Q3', 'Max')
    data.frame(V3 = d[1, 'V3'], dsum)
  }
library('plyr')
ddply(df, .(V1, V2), summfun)

The idea behind summfun is this: ddply() prefers functions that take a
data frame as input and a data frame (or scalar) as output. dsum
converts summary(V7) to a data frame by first coercing it into a list
and then to a data frame. The names are changed for convenience. dsum
has one line, so we add V3 to the data frame before outputting it.
ddply() will attach the grouping variables to the output
automatically; however, you can put them into the output data frame
and ddply() will not duplicate the grouping variables in the output.

The alternative in ddply(), which is simpler code, outputs the results
from summary() in different rows for each grouping. In this event, it
is useful to carry along the names of the summaries so that one can
recast the data with the cast() function from the reshape package:

# Method 2: Summarize and reshape
# V3 is unnecessary but it is useful to carry it along for the output
u <- ddply(df, .(V1, V2, V3), summarise, summ = summary(V7),
   summtype = names(summary(V7)))
library('reshape')
cast(u, V1 + V2 + V3 ~ summtype, value = 'summ')
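
Since the goal of the thread is a summary file on disk, here is a minimal
sketch of the last step, writing a per-region result with write.table().
The data frame 'res' and the output path are made up for illustration;
substitute the result of either method above.

```r
# Hypothetical stand-in for the per-region summary produced above
res <- data.frame(V1 = c("chr1", "chr1"),
                  V2 = c(100L, 200L),
                  V3 = c(159L, 260L),
                  Min = c(0.04, 0.12),
                  Max = c(0.05, 0.12),
                  stringsAsFactors = FALSE)

# Hypothetical output location; any writable path will do
outfile <- file.path(tempdir(), "out.summary.txt")

# Tab-separated, no quotes around strings, no row-name column
write.table(res, file = outfile, sep = "\t", quote = FALSE,
            row.names = FALSE)

# Read it back to check what landed in the file
back <- read.table(outfile, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE)
```

quote = FALSE and row.names = FALSE keep the file free of quote marks and
of an extra unnamed first column.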

HTH,
Dennis

PS: I may be one of those folks to whom David was referring in
relation to plyr :)

On Wed, Jul 27, 2011 at 4:02 PM, a217 aj...@case.edu wrote:
 Hello,

 I have an input file:
 http://r.789695.n4.nabble.com/file/n3700031/testOut.txt testOut.txt

 where col 1 is chromosome, column2 is start of region, column 3 is end of
 region, column 4 and 5 is base position, column 6 is total reads, column 7
 is methylation data, and column 8 is the strand.


 I would like a summary output file such as:
 http://r.789695.n4.nabble.com/file/n3700031/out.summary.txt out.summary.txt

 where column 1 is chromosome, column 2 is start of region, column 3 is end
 of region, column 4 is total reads in general, column 5 is total reads >=1,
 column 6 is (col4/col5) or the percentage, and at the end I'd like to list 6
 more columns based on summary results from summary() function in R.

 The summary() function will be used to analyze all of the methylation data
 (col7 from input) for each region (bounded by col2 and col3).

 For example for chr1 100 159 summary() gives:
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.0400  0.0425  0.0450  0.0450  0.0475  0.0500

 which is simply the methylation data input into summary() only in the region
 of chr1 100 159.

 I know how to perform all of the required functions line-by-line, but the
 hard part for me is essentially taking the input data with multiple
 positions in each region and assigning all of the summary results to one
 line identified by the region.

 If any of you have any suggestions I would appreciate it.

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Writing-a-summary-file-in-R-tp3700031p3700031.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Re: [R] Writing a summary file in R

2011-07-27 Thread David Winsemius


On Jul 27, 2011, at 9:42 PM, Dennis Murphy wrote:


 Hi:

 Is this more or less what you're after?

 [...]

 PS: I may be one of those folks to whom David was referring in
 relation to plyr :)


I've been really impressed at Dennis' facility with plyr, reshape, and  
reshape2. Note that the 'reshape' function has nothing to do with the  
'reshape' package.  Here's what I came up with using base functions:


> str(inpdat)
'data.frame':   15 obs. of  8 variables:
 $ chromosome : chr  "chr1" "chr1" "chr1" "chr1" ...
 $ startreg   : int  100 100 200 500 450 100 100 350 350 100 ...
 $ endreg     : int  159 159 260 750 700 300 300 600 600 125 ...
 $ base1      : int  104 145 205 600 500 150 175 400 550 100 ...
 $ base2      : int  104 145 205 600 500 150 175 400 550 100 ...
 $ totalreads : int  1 1 1 1 1 1 0 1 0 1 ...
 $ methylation: num  0.05 0.04 0.12 0.09 0.03 0.05 0 0.06 0 0.1 ...
 $ strand     : chr  "+" "+" "+" "+" ...
# The split into distinct 'chromosome' and 'startreg' categories:
splinp <- split(inpdat, paste(inpdat$chromosome, inpdat$startreg) )

# Process within separate categories: the tapply, aggregate and by
# functions are all related

> df <- as.data.frame( t(sapply(splinp, function(x) list(
      chr=x$chromosome[1], strt=x$startreg[1], end=x$endreg[1],
      frac=sum(x[['totalreads']] >= 1)/nrow(x) )) ) )

# You often need the t() function when working with apply functions
> df
            chr strt end frac
chr1 100   chr1  100 159    1
chr1 200   chr1  200 260    1
chr1 500   chr1  500 750    1
chr11 679 chr11  679 687  0.5
chr22 100 chr22  100 200    1
chr22 300 chr22  300 400    0
chr3 450   chr3  450 700    1
chr4 100   chr4  100 300  0.5
chr7 350   chr7  350 600  0.5
chr9 100   chr9  100 125    1

> as.data.frame(t(sapply(splinp, function(x) summary(x$methylation )) ) )

          Min. 1st Qu. Median  Mean 3rd Qu. Max.
chr1 100  0.04  0.0425  0.045 0.045  0.0475 0.05
chr1 200  0.12  0.1200  0.120 0.120  0.1200 0.12
chr1 500  0.09  0.0900  0.090 0.090  0.0900 0.09
chr11 679 0.00  0.0175  0.035 0.035  0.0525 0.07
chr22 100 0.03  0.0425  0.055 0.055  0.0675 0.08
chr22 300 0.00  0.0000  0.000 0.000  0.0000 0.00
chr3 450  0.03  0.0300  0.030 
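
The two base-R pieces above (the region counts and the methylation
summaries) can be produced in one pass. The following is only a sketch
under assumptions: it uses a four-row stand-in for 'inpdat', computes the
quartiles with quantile() rather than summary() (so the values are not
rounded for display), and the output column names are chosen here for
illustration.

```r
# Small stand-in for inpdat, using the column names from the str() output
inpdat <- data.frame(
  chromosome  = c("chr1", "chr1", "chr4", "chr4"),
  startreg    = c(100L, 100L, 100L, 100L),
  endreg      = c(159L, 159L, 300L, 300L),
  totalreads  = c(1L, 1L, 1L, 0L),
  methylation = c(0.05, 0.04, 0.05, 0),
  stringsAsFactors = FALSE
)

# One data frame per chromosome/start combination
splinp <- split(inpdat, paste(inpdat$chromosome, inpdat$startreg))

# Build one output row per region
rows <- lapply(splinp, function(x) {
  q <- quantile(x$methylation, c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
  data.frame(chr    = x$chromosome[1],
             strt   = x$startreg[1],
             end    = x$endreg[1],
             n      = nrow(x),                      # positions in the region
             n_ge1  = sum(x$totalreads >= 1),       # positions with reads >= 1
             frac   = sum(x$totalreads >= 1) / nrow(x),
             Min    = q[1], Q1 = q[2], Median = q[3],
             Mean   = mean(x$methylation),
             Q3     = q[4], Max = q[5],
             stringsAsFactors = FALSE)
})

# Stack the one-row data frames into the final table
res <- do.call(rbind, rows)
```

The result has one line per region and can be passed to write.table() as is.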

Re: [R] Writing a summary file in R

2011-07-27 Thread a217
Thank you both very much! The code is pretty slick and should greatly help
me in my task.

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-a-summary-file-in-R-tp3700031p3700382.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing to a file

2010-12-02 Thread Jorge Ivan Velez
Hi Thomas,

If x contains your current results, one way to do what you want is the
following:

# data
x <- read.table(textConnection("X2403,0.006049271
X2403,0.000118622
X2403,50.99600705
X2403,7.62E-150
X2419,0.012464215
X2419,9.07E-05
X2419,137.4022573
X2419,6.45E-273"), sep = ",")
closeAllConnections()
x

# results
data.frame(output = with(x, tapply(V2, V1, paste, sep = "", collapse =
',')))
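
If you would rather write one line per site from inside the loop, a
minimal sketch follows. It assumes the four coefficient values for each
site are already in hand; the 'results' list and the temporary file are
hypothetical stand-ins for the nls output and for results.csv. The point
is that each record is built with paste() and ended with an explicit
newline, because cat() does not add one.

```r
# Hypothetical stand-in: estimate, std. error, t value, p value per site
results <- list(
  X2403 = c(0.006049271, 0.000118622, 50.99600705, 7.62e-150),
  X2419 = c(0.012464215, 9.07e-05, 137.4022573, 6.45e-273)
)

out <- tempfile(fileext = ".csv")   # stand-in for results.csv

for (site in names(results)) {
  # site name plus its four values, comma-separated, as one string
  line <- paste(c(site, results[[site]]), collapse = ",")
  # cat() writes no trailing newline by itself, so add it explicitly
  cat(line, "\n", sep = "", file = out, append = TRUE)
}

written <- readLines(out)
```

Each element of 'written' then holds one site with its four values.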

HTH,
Jorge


On Thu, Dec 2, 2010 at 11:00 PM, Thomas Parr  wrote:

 From: Thomas Parr [mailto:thomas.p...@maine.edu]
 Sent: Thursday, December 02, 2010 10:52 PM
 To: r-help-requ...@stat.math.ethz.ch
 Subject: Writing to a file

 I am trying to get my script to write to a file from the for loop.  It is
 working, but the problem is that it is outputting to two columns and I want
 it to output to 5.

 Current results
 X2403,0.006049271
 X2403,0.000118622
 X2403,50.99600705
 X2403,7.62E-150
 X2419,0.012464215
 X2419,9.07E-05
 X2419,137.4022573
 X2419,6.45E-273
 ...

 Desired/expected results
 X2403,0.006049271,0.000118622,50.99600705,7.62E-150
 X2419,0.012464215,9.07E-05,137.4022573,6.45E-273
 ...

 Data is being extracted from nls output with summary, nls uses fit.

 a <- summary(nls(acoeff ~ aref*exp(-S*(alam-375)), trace=T,
 start=list(S=0.0015)))

 cat(sites[v-1],a$coefficients[1,1],a$coefficients[1,2],a$coefficients[1,3],
 a$coefficients[1,4],sep=",",append=TRUE,
 file=paste(dirpath,"/results.csv",sep=""))

 The idea is that it is looping through the data sites and, as nls generates
 parameter estimates, summary extracts them and cat writes them to a CSV
 file.
 Note: I have tried write.csv, write.table, and write; I think they all call
 cat at some point.

 Any help would be appreciated and if you have a different solution I am all
 ears.

 Thomas



Re: [R] writing to a file

2010-11-13 Thread jim holtman
The HELP page for 'sink' is pretty clear about this:

sink() or sink(file=NULL) ends the last diversion (of the specified
type). There is a stack of diversions for normal output, so output
reverts to the previous diversion (if there was one). The stack is of
up to 21 connections (20 diversions).
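
In practice that means a sink(file) call is paired with a bare sink() call.
A minimal sketch, with a temporary file and a summary object standing in
for your file and your 'results' object:

```r
out <- tempfile(fileext = ".txt")   # hypothetical output file
results <- summary(c(1, 2, 3))      # stand-in for the object with a print method

depth <- sink.number()   # diversions active before we start (normally 0)
sink(out)                # start diverting console output to the file
print(results)
sink()                   # end the diversion; output returns to the console

captured <- readLines(out)

# Alternative without sink(): capture.output(print(results), file = out)
```

sink.number() reports how many diversions are still active, which helps if
you lose track of how many sink() calls are outstanding.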



On Sat, Nov 13, 2010 at 11:12 PM, Gregory Ryslik rsa...@comcast.net wrote:
 Hi,

 I have a fairly complex object that I have written a print function for.

 Thus when I do print(results), the R console shows me a whole bunch of stuff 
 already formatted. What I want to do is to take whatever print(results) shows 
 to console and then put that in a file. I am doing this using the sink 
 command.

 However, I am unsure as to how to unsink. Eg, how do I restore output to 
 the normal console?

 Thanks,
 Greg




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
