[R] Unusual separators
Hi all,

I have a list that I got from a web page that I would like to crunch. Unfortunately, the list has some unusual separators in it. I believe the columns are separated by one space followed by one tab. I tried passing this to read.table(..., sep=" \t", ...) but got an error that said something like 'only one-byte separators can be used'. I have thought about using gsub() to swap out the space + tab and replace it with commas, etc., but thought there might be another way.

Any suggestions?
M

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
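One possible approach, sketched here under the assumption that every column really is separated by exactly one space followed by one tab: read the raw lines, collapse the two-character separator to a single tab, and hand the result to read.table() through its text argument. The sample lines and column names below are made up for illustration; in practice raw_lines would come from readLines() on the downloaded file.

```r
# Hypothetical sample standing in for readLines("your_file.txt")
raw_lines <- c("gene \tvalue \tgroup",
               "abc \t1.5 \tx",
               "def \t2.25 \ty")

# Replace the literal two-character separator (space + tab) with one tab
clean_lines <- gsub(" \t", "\t", raw_lines, fixed = TRUE)

# read.table() accepts only single-byte separators, so a lone tab works
dat <- read.table(text = clean_lines, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)
print(dat)
```

If the stray spaces are not perfectly regular, read.table() also has a strip.white argument that may be worth trying together with sep="\t".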
[R] reshape::rename package unable to install !?!
Greetings all,

I have been working with RStudio and R for only a little while. I came across a package called 'reshape' that helped me 'rename' columns. Unfortunately, my computer got hosed (too much playing with Linux too late at night) and I had to re-install everything, BUT when I tried to reinstall 'reshape' or 'reshape2' I couldn't. Is there a way to get over this hurdle with reshape, or is there another command I can use? I am stuck because my programs up to this point used 'rename' and now I have to redo some work.

M

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
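As a stopgap that needs no extra package, columns can be renamed in base R by assigning to names(). This is only a sketch; the data frame and the column names "val" and "amount" are invented for illustration.

```r
# Hypothetical data frame with a column we want to rename
dat <- data.frame(gene = c("a", "b"), val = c(1.2, 3.4))

# Base-R rename: find the old name and overwrite it in place
names(dat)[names(dat) == "val"] <- "amount"

print(names(dat))
```

This does not address why the installation fails, but it may let existing scripts be adapted while that is sorted out.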
[R] Which is more efficient?
Greetings all,

I am curious to know whether either of these two sets of code is more efficient.

Example 1:

    ## t-test ##
    colA <- temp[, j]
    colB <- temp[, k]
    ttr <- t.test(colA, colB, var.equal=TRUE)
    tt_pvalue[i] <- ttr$p.value

or Example 2:

    tt_pvalue[i] <- t.test(temp[, j], temp[, k], var.equal=TRUE)$p.value

I have three loops: i, j, k. One to test all of the i files in a directory. One to tease out column j and compare it, by means of a t-test, to column k in each of the files.

    for (i in 1:num_files) {
      temp <- read.table(files_to_test[i], header=TRUE, sep="\t")
      num_cols <- ncol(temp)
      ## Define Columns To Compare ##
      for (j in 2:num_cols) {
        for (k in 3:num_cols) {
          ## t-test ##
          colA <- temp[, j]
          colB <- temp[, k]
          ttr <- t.test(colA, colB, var.equal=TRUE)
          tt_pvalue[i] <- ttr$p.value
        }
      }
    }

I am a novice writer of code and am interested to hear if there are any (dis)advantages to one way or the other.

M

Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
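A quick way to compare the two styles is to run both on the same data and check that they agree. The sketch below uses simulated columns (the data frame and its column names are invented) and only demonstrates that the results are identical; any speed difference is probably small next to the cost of read.table() and t.test() themselves, though that would need timing on the real files to confirm.

```r
set.seed(1)
# Simulated stand-in for one file's table; column 1 plays the role of an id
temp <- data.frame(id = 1:20,
                   a  = rnorm(20),
                   b  = rnorm(20, mean = 0.5))
j <- 2
k <- 3

# Style 1: intermediate variables
colA <- temp[, j]
colB <- temp[, k]
ttr  <- t.test(colA, colB, var.equal = TRUE)
p1   <- ttr$p.value

# Style 2: one expression
p2 <- t.test(temp[, j], temp[, k], var.equal = TRUE)$p.value

# Both styles compute the same p-value
print(identical(p1, p2))
```

Since the results match, the choice can reasonably be made on readability: the intermediate variables are easier to inspect while debugging.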
[R] Error message for MCC
Greetings all,

I am getting an error message that has me stumped. Any ideas?

    ## Define Directories ##
    load_from <- "/home/mcc/Dropbox/abrodsky/kegg_combine_data/"
    save_to <- "/home/mcc/Dropbox/abrodsky/ttest_results/"

    ## Define Columns To Compare ##
    compareA <- "log_b_rich"
    compareB <- "Fc_cdt_rich_tot"

    ## Collect Files To Compare ##
    setwd(load_from)
    files_to_test <- list.files(pattern = "combine.kegg")

    ## Initialize Variables ##
    vl <- length(files_to_test)
    temp <- vector(mode="numeric", length = vl)
    colA <- vector(mode="numeric", length = vl)
    colB <- vector(mode="numeric", length = vl)
    tt <- vector(mode="numeric", length = vl)

    ## Calculate P-values ##
    for (i in 1:3){
      temp1 <- read.table(files_to_test[i], header=TRUE, sep=" ")
      numrows <- nrow(temp1)
      tt_pvalue <- matrix(data=temp, nrow=numrows, ncol=vl)
      colA <- temp[,compareA]
      colB <- temp[,compareB]
      tt <- t.test(colA, colB, var.equal=TRUE)
      tt_pvalue <- tt$p.value
    }

    Error in temp[, compareA] : incorrect number of dimensions

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
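A likely cause, judging only from the posted code: temp is created as a plain numeric vector, while the table read from disk is stored in temp1. Indexing a vector with [, name] has no column dimension to work with, which produces exactly this message. The sketch below reproduces the error and shows the same indexing working on a data frame; the data values are invented.

```r
# A plain numeric vector has no columns, so [, name] fails
temp <- vector(mode = "numeric", length = 3)
res  <- try(temp[, "log_b_rich"], silent = TRUE)
print(inherits(res, "try-error"))

# The same indexing works on a data frame (hypothetical values)
temp1 <- data.frame(log_b_rich      = c(1, 2, 3, 4),
                    Fc_cdt_rich_tot = c(2, 3, 5, 4))
colA <- temp1[, "log_b_rich"]
colB <- temp1[, "Fc_cdt_rich_tot"]
print(t.test(colA, colB, var.equal = TRUE)$p.value)
```

If that diagnosis is right, indexing temp1 instead of temp inside the loop should remove the error.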
Re: [R] Use dump or write? or what?
Greetings all,

Thanks for all your help so far. Let me give a better idea of what I am doing. I have hundreds of files that I need to plow through with a t-test and a correlation test. BTW, 'tempA' and 'tempB' are simply columns of numbers from a gene-chip experiment that spits out DNA 'amounts'. So I have set up a loop to read the files and carry out the tests, but I need to save the results for later inspection (and Jim H, you are probably right, for later inspection). By inspection I mean I don't know what I want to do with it yet. Remember: that's why they call it research. So it seems that 'save/load' might be a good alternative for my work.

Any suggestions,
M

On Sun, Jul 31, 2011 at 11:41 PM, Matt Curcio <matt.curcio...@gmail.com> wrote:
> [snip]

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
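For reference, a minimal sketch of the save/load route mentioned above. It stores the complete t.test() result objects, so nothing has to be decided in advance about what to inspect later. The simulated tempA and tempB and the temporary file are stand-ins, not the real data or paths.

```r
set.seed(42)
# Simulated stand-ins for two columns of gene-chip amounts
tempA <- rnorm(10)
tempB <- rnorm(10, mean = 1)

two_sample_ttest <- t.test(tempA, tempB, var.equal = TRUE)
welch_ttest      <- t.test(tempA, tempB, var.equal = FALSE)

# save() writes the named objects to a binary file
rdata_file <- tempfile(fileext = ".RData")
save(two_sample_ttest, welch_ttest, file = rdata_file)

# Later (or in another session): load() restores them under the same names
rm(two_sample_ttest, welch_ttest)
load(rdata_file)
print(welch_ttest$p.value)
```

With hundreds of files, one option would be to collect the results in a list inside the loop and save that list once at the end.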
[R] Errors, driving me nuts
Greetings all,

I am getting this error that is driving me nuts... (not a long trip, haha). I have a set of files, and in these files I want to calculate t-tests on columns 'compareA' and 'compareB' (these will change over time, so I want a variable here). Also, these files are in many different directories, so I want a way to filter out the junk... Anyway, I don't believe that this is related to my errors, but I mention it nonetheless.

    files_to_test <- list.files(pattern = "kegg.combine")
    for (i in 1:length(files_to_test)) {
      raw_data <- read.table(files_to_test[i], header=TRUE, sep=" ")
      tmpA <- raw_data[,compareA]
      tmpB <- raw_data[,compareB]
      tt <- t.test(tmpA, tmpB, var.equal=TRUE)
      tt_pvalue[i] <- tt$p.value
    }

    Error in tt_pvalue[i] <- tt$p.value : object 'tt_pvalue' not found

    # I tried setting up a vector...
    # as.vector(tt_pvalue, mode="any")
    ### but NO GO

    file.name = paste("ttest.results.", compareA, compareB, "")
    setwd(save_to)
    write.table(tt_pvalue, file=file.name, sep="\t")

    Error in inherits(x, "data.frame") : object 'tt_pvalue' not found

# No idea?? What is going wrong??

M

Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
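The message suggests that tt_pvalue does not exist before the loop first assigns into it; R can extend an existing vector by index, but it cannot create one that way. A sketch of one fix is to allocate the vector up front with numeric(). The list of simulated tables below is a hypothetical stand-in for the files that read.table() would load.

```r
set.seed(7)
compareA <- "log_b_rich"
compareB <- "Fc_cdt_rich_tot"

# Hypothetical stand-in for the tables read from files_to_test
tables <- lapply(1:3, function(i) {
  data.frame(log_b_rich = rnorm(8), Fc_cdt_rich_tot = rnorm(8))
})

# Create the result vector BEFORE the loop so that [i] <- has a target
tt_pvalue <- numeric(length(tables))

for (i in seq_along(tables)) {
  raw_data <- tables[[i]]
  tt <- t.test(raw_data[, compareA], raw_data[, compareB], var.equal = TRUE)
  tt_pvalue[i] <- tt$p.value
}

print(tt_pvalue)
```

The second error in the post would follow from the first: write.table() cannot find tt_pvalue because the loop never managed to create it.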
[R] Appending 4 Digits On A File Name
Greetings all,

I would like to append a 4-digit number suffix to the names of my files for later use. What I am using now only produces 1 or 2 or 3 or 4 digits.

    for (i in 1:1000) {
      temp <- (kegg[i,])
      temp <- merge(temp, subrichcdt, by="gene")
      file.name <- paste("kegg.subrichcdt.", i, ".txt", sep="")
      write.table(temp, file=file.name)
    }

### But I want:

    kegg.subrichcdt.0001.txt
    kegg.subrichcdt.0002.txt, ...

Any suggestions,
M

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
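Zero-padding can be done in base R with either sprintf() or formatC(). The sketch below builds the file names both ways for a few sample indices and checks that they agree; only the name construction is shown, not the merge or the write.

```r
i <- c(1L, 2L, 10L, 1000L)

# sprintf: %04d pads an integer with leading zeros to width 4
names_sprintf <- sprintf("kegg.subrichcdt.%04d.txt", i)

# formatC: the same padding via width, format and flag
names_formatC <- paste("kegg.subrichcdt.",
                       formatC(i, width = 4, format = "d", flag = "0"),
                       ".txt", sep = "")

print(names_sprintf)
print(identical(names_sprintf, names_formatC))
```

Inside the original loop, either expression could replace the paste() call that builds file.name.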
Re: [R] Appending 4 Digits On A File Name
Hmmm... Got this error:

    Error in formatC(i, width = 4, format = "d", flat = 0) :
      unused argument(s) (flat = 0)

Any ideas,
M

On Sun, Jul 31, 2011 at 1:30 PM, Matt Curcio <matt.curcio...@gmail.com> wrote:
> [snip]

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
Re: [R] Appending 4 Digits On A File Name
Michael,

Got it, thanks. Looking over the man page, I realized the argument is 'flag', not 'flat'.

Cheers,
M

On Sun, Jul 31, 2011 at 2:26 PM, Matt Curcio <matt.curcio...@gmail.com> wrote:
> [snip]
> On Sun, Jul 31, 2011 at 1:30 PM, Matt Curcio <matt.curcio...@gmail.com> wrote:
>> [snip]

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
[R] Use dump or write? or what?
Greetings all,

I am calculating two t-test values for each of many files, then saving them to a file, then calculating another set and appending, and so on. But I can't figure out how to write the results to a file and then append subsequent t-tests. (Maybe too tired ;} ) I have tried to use dump and file.append to no avail.

    ttest_results = tempfile()
    two_sample_ttest <- t.test(tempA, tempB, var.equal = TRUE)
    welch_ttest <- t.test(tempA, tempB, var.equal = FALSE)
    dump(two_sample_ttest, file = "dumpdata.txt", append=TRUE)
    ttest_results <- file.append(ttest_results, two_sample_ttest)

Any suggestions,
M

--
Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com
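One way to get an append-as-you-go text file, sketched under the assumption that only the p-values need to be kept: build a one-row data frame per file and write it with write.table(), using append=TRUE for every row after the first and writing the header only once. The simulated data, the column names and the temporary output file are all invented for illustration.

```r
set.seed(3)
out_file <- tempfile(fileext = ".txt")  # stand-in for a real results path

for (i in 1:3) {
  # Simulated stand-ins for the two columns taken from file i
  tempA <- rnorm(10)
  tempB <- rnorm(10, mean = 0.5)

  row <- data.frame(
    file_index   = i,
    two_sample_p = t.test(tempA, tempB, var.equal = TRUE)$p.value,
    welch_p      = t.test(tempA, tempB, var.equal = FALSE)$p.value
  )

  # Write the header only the first time; append on later iterations
  first <- !file.exists(out_file)
  write.table(row, file = out_file, append = !first, col.names = first,
              row.names = FALSE, sep = "\t", quote = FALSE)
}

results <- read.table(out_file, header = TRUE, sep = "\t")
print(results)
```

Note that file.append() joins two files on disk and dump() expects the names of objects as character strings, so neither is a direct fit for appending a result object.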
[R] Movie Question
Greetings all,

I am wondering if anyone is aware of any studies that draw a relationship between an actor and the box office gross of their movies. In other words, is anybody aware of any databases that contain box office grosses, actor and director info, advertising budgets, etc.? [I did a quick Google search and did not find much right off, but will keep looking.] I would assume that movie companies, and even actors' managers, must have done or (more realistically) have access to statistical analyses of the average returns for any one actor, director, etc. I would think that this would be a great little project.

Cheers,
M

Matt Curcio
E: matt.curcio...@gmail.com
[R] Data.frame Vs Matrix Vs Array: Definitions Please
Hi All,

I am learning R and having a little trouble with the usage and proper definitions of data.frames vs. matrices vs. vectors. I have read many R tutorials and looked over umpteen 'cheat' sheets, and have found that no one has articulated a really good definition of the differences between 'data.frames', 'matrix', and 'arrays', and even 'factors'. I realize that I might have missed someone's R tutorial, and would actually like to receive 'your' most concise or most useful tutorial. Any help would be appreciated.

My particular favorite explanation and helpful hint is from the 'R-Inferno'. Don't get me wrong... I think this pdf is great and some tables are excellent. Overall it is a very good primer, but this one section leaves me puzzled. This quote shows the lack of hard and fast rules for what and when to use 'data.frames', 'matrix', and 'arrays'. It discusses ways in which to simplify your work.

Here are a few possibilities for simplifying:
• Don’t use a list when an atomic vector will do.
• Don’t use a data frame when a matrix will do.
• Don’t try to use an atomic vector when a list is needed.
• Don’t try to use a matrix when a data frame is needed.

Cheers,
Matt C
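A small sketch may help make the distinctions concrete. It is not a full definition, just one object of each kind with the property that usually decides which to use: whether all elements must share one type, and how many dimensions there are. All values are invented.

```r
# Atomic vector: one dimension, every element the same type
v <- c(1.5, 2, 3)

# Matrix: two dimensions, still a single type throughout
m <- matrix(1:6, nrow = 2)

# Array: like a matrix but with any number of dimensions
a <- array(1:24, dim = c(2, 3, 4))

# List: elements may differ in type and length
l <- list(1, "a", TRUE)

# Data frame: a list of equal-length columns, each column with its own type
df <- data.frame(gene = c("a", "b"), amount = c(1.2, 3.4),
                 stringsAsFactors = FALSE)

# Factor: a vector of category labels drawn from a fixed set of levels
f <- factor(c("low", "high", "low"))

print(sapply(df, class))  # columns keep different types
print(class(as.matrix(df)[1, 2]))  # forcing df into a matrix makes it all character
print(levels(f))
```

Read against the quoted hints: a matrix is the simpler choice when every column is numeric, and a data frame is needed as soon as columns of different types have to sit side by side.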
[R] read.xls??
Greetings all,

I am having a little trouble finding the 'right' package to read in .xls Excel spreadsheets. My Ubuntu installation does not seem to have the ability to read them. Any suggestions?

Cheers,
M