[R] Unusual separators

2011-08-16 Thread Matt Curcio
Hi all,
I have a list that I got from a web page that I would like to crunch.
Unfortunately, the list has some unusual separators in it.  I believe
the columns are separated by 1 space and 1 tab.  I tried to pass this
to read.table( ..., sep=" \t", ...) but got an error that said
something like 'only one-byte separators can be used'.
I have thought about using gsub to 'swap out' the space + tab and
replace it with commas, etc., but thought there might be another way.
Any suggestions?
M
-- 


Matt Curcio
M: 401-316-5358
E: matt.curcio...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
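
One possible way around the one-byte limit on `sep`, sketched under the assumption that every field boundary really is exactly one space followed by one tab: read the raw lines, collapse the two-character separator to a single tab, and hand the cleaned text to read.table. The sample lines and column names below are hypothetical; for a real file the lines would come from readLines() on that file.

```r
# Hypothetical sample data: fields separated by one space followed by one tab.
raw_lines <- c("gene \tvalue \tgroup",
               "abc \t1.5 \tx",
               "def \t2.5 \ty")

# Collapse the two-character separator " \t" into a single tab.
clean_lines <- gsub(" \t", "\t", raw_lines, fixed = TRUE)

# Parse the cleaned text as ordinary tab-separated data.
dat <- read.table(text = clean_lines, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)
print(dat)
```

If the stray spaces only ever sit at the edges of fields, read.table(..., sep = "\t", strip.white = TRUE) may be enough on its own.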


[R] reshape::rename package unable to install !?!

2011-08-07 Thread Matt Curcio
Greetings all,
I have been working with RStudio and R for only a little while.  I
came across a package called 'reshape' that helped me 'rename'
columns.  Unfortunately, my computer got hosed (too much playing with
Linux too late at night) and I had to re-install everything, BUT when I
tried to reinstall 'reshape' or 'reshape2' I COULDN'T.  Is there a way
to get over this hurdle with reshape, or is there another command I can
use?  I am stuck because my programs up to this point used 'rename'
and now I have to redo some work.
M
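
This does not address the installation failure itself, but if the only thing needed from 'reshape' is column renaming, base R can do that without any package. A minimal sketch, with a hypothetical data frame and hypothetical column names:

```r
# Hypothetical data frame standing in for the poster's data.
df <- data.frame(old_name = 1:3, other = c("a", "b", "c"))

# Rename one column by matching on its current name.
names(df)[names(df) == "old_name"] <- "new_name"

print(names(df))
```

Matching on the current name rather than on a position keeps the rename correct even if the column order changes.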


[R] Which is more efficient?

2011-08-04 Thread Matt Curcio
Greetings all,
I am curious to know if either of these two sets of code is more efficient?

Example1:
## t-test ##
colA <- temp [ , j ]
colB <- temp [ , k ]
ttr <- t.test ( colA, colB, var.equal=TRUE)
tt_pvalue [ i ] <- ttr$p.value

or
Example2:
tt_pvalue [ i ] <- t.test ( temp[ , j ], temp[ , k ], var.equal=TRUE)
-
I have three loops, i, j, k.
One steps through all of the i files in a directory.  The other two
tease out column j and compare it by means of a t-test to column k in
each of the files.
---
for ( i in 1:num_files ) {
   temp <- read.table ( files_to_test [ i ], header=TRUE, sep="\t")
   num_cols <- ncol ( temp )
   ## Define Columns To Compare ##
   for ( j in 2 : num_cols ) {
      for ( k in 3 : num_cols ) {
         ## t-test ##
         colA <- temp [ , j ]
         colB <- temp [ , k ]
         ttr <- t.test ( colA, colB, var.equal=TRUE)
         tt_pvalue [ i ] <- ttr$p.value
      }
   }
}

I am a novice writer of code and am interested to hear if there are
any (dis)advantages to one way or the other.
M
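
As a rough illustration (not a benchmark): the two forms compute the same test, so any speed difference from the intermediate variables is likely to be negligible next to the cost of t.test itself. Note that in the nested loop above, tt_pvalue[i] is overwritten on every (j, k) pass, so only the last pair survives for each file. The sketch below uses a hypothetical data frame in place of a file and keeps one p-value per pair of columns; combn is a swapped-in alternative to the j/k loops, not the poster's method.

```r
set.seed(1)
# Hypothetical stand-in for one file: an id column plus three data columns.
temp <- data.frame(id = 1:10, a = rnorm(10), b = rnorm(10), c = rnorm(10))

# Example1 style: intermediate variables.
colA <- temp[, 2]
colB <- temp[, 3]
ttr <- t.test(colA, colB, var.equal = TRUE)
p1 <- ttr$p.value

# Example2 style: everything in one expression (note the $p.value).
p2 <- t.test(temp[, 2], temp[, 3], var.equal = TRUE)$p.value

# One p-value per unordered pair of data columns (columns 2..ncol).
pairs <- combn(2:ncol(temp), 2)
pvals <- apply(pairs, 2, function(jk) {
  t.test(temp[, jk[1]], temp[, jk[2]], var.equal = TRUE)$p.value
})
print(pvals)
```

Choosing between the two styles is then mostly a matter of readability.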




[R] Error message for MCC

2011-08-03 Thread Matt Curcio
Greetings all,
I am getting an error message that is stifling me.
Any ideas?

> ## Define Directories ##
> load_from <- "/home/mcc/Dropbox/abrodsky/kegg_combine_data/"
> save_to <- "/home/mcc/Dropbox/abrodsky/ttest_results/"

> ###
> ## Define Columns To Compare ##
> compareA <- "log_b_rich"
> compareB <- "Fc_cdt_rich_tot"

> ## Collect Files To Compare ##
> setwd(load_from)
> files_to_test <- list.files(pattern = "combine.kegg")

> ##
> ## Initialize Variables ##
> vl <- length(files_to_test)
> temp <- vector(mode="numeric", length = vl)
> colA <- vector(mode="numeric", length = vl)
> colB <- vector(mode="numeric", length = vl)
> tt <- vector(mode="numeric", length = vl)

> ## Calculate P-values ##
> for (i in 1:3){
+    temp1 <- read.table(files_to_test[i], header=TRUE, sep=" ")
+    numrows <- nrow(temp1)
+    tt_pvalue <- matrix(data=temp, nrow=numrows, ncol=vl)
+    colA <- temp[,compareA]
+    colB <- temp[,compareB]
+    tt <- t.test(colA, colB, var.equal=TRUE)
+    tt_pvalue <- tt$p.value
+ }
Error in temp[, compareA] : incorrect number of dimensions
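
A likely reading of the message, judging only from the code as posted: `temp` is the plain numeric vector created during initialization, while the table that was actually read in is `temp1`. A vector has no columns, so `temp[, compareA]` fails. The sketch below reproduces the error with a hypothetical data frame standing in for the read.table result, then indexes the data frame instead.

```r
compareA <- "log_b_rich"
compareB <- "Fc_cdt_rich_tot"

# A plain vector, as created in the initialization step of the post.
temp <- vector(mode = "numeric", length = 3)

# Hypothetical stand-in for what read.table would return.
temp1 <- data.frame(log_b_rich      = c(1.1, 2.3, 3.2, 4.8),
                    Fc_cdt_rich_tot = c(0.9, 2.8, 3.1, 5.5))

# Indexing the vector with [row, column] reproduces the error.
err <- tryCatch(temp[, compareA], error = function(e) conditionMessage(e))
print(err)

# Indexing the data frame works.
colA <- temp1[, compareA]
colB <- temp1[, compareB]
p <- t.test(colA, colB, var.equal = TRUE)$p.value
print(p)
```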



Re: [R] Use dump or write? or what?

2011-08-01 Thread Matt Curcio
Greetings all,
Thanks for all your help so far.
Let me give a better idea of what I am doing.  I have hundreds of
files that I need to plow through with a t-test and correlation test.
BTW, 'tempA' and 'tempB' are simply columns of numbers from a gene-chip
experiment that spits out DNA 'amounts'.  So I have set up a loop to
read the files and carry out the tests, but I need to save the results
for later inspection (and Jim H, you are probably right, for later
inspection).  By inspection I mean I don't know what I want to do with
them yet.  Remember: that's why they call it Research.

So it seems that 'save/load' might be a good alternative for my work.
Any suggestions?
M
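
A minimal sketch of the save/load idea, assuming the complete test objects should be kept for inspection later: collect them in a named list and save the list once. The data, the list name `results`, and the entry name "file1" are all hypothetical.

```r
set.seed(42)
# Hypothetical stand-ins for two columns from one gene-chip file.
tempA <- rnorm(20)
tempB <- rnorm(20, mean = 0.5)

# Keep the full test objects, keyed by a (hypothetical) file label.
results <- list()
results[["file1"]] <- list(
  two_sample = t.test(tempA, tempB, var.equal = TRUE),
  welch      = t.test(tempA, tempB, var.equal = FALSE)
)

# Save to disk, drop from memory, and restore under the same name.
out_file <- tempfile(fileext = ".RData")
save(results, file = out_file)
rm(results)
load(out_file)

print(results$file1$welch$p.value)
```

Because the whole object is stored, statistics, confidence intervals, and p-values all remain available after reloading.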



[R] Errors, driving me nuts

2011-08-01 Thread Matt Curcio
Greetings all,
I am getting this error that is driving me nuts... (not a long trip, haha)

I have a set of files and in these files I want to calculate t-tests on
columns 'compareA' and 'compareB' (these will change over time, so I
want a variable here).  Also, these files are in many different
directories, so I want a way to filter out the junk...  Anyway, I don't
believe that this is related to my errors but I mention it
nonetheless.

> files_to_test <- list.files (pattern = "kegg.combine")
> for (i in 1:length (files_to_test)) {
+    raw_data <- read.table (files_to_test[i], header=TRUE, sep=" ")
+    tmpA <- raw_data[,compareA]
+    tmpB <- raw_data[,compareB]
+    tt <- t.test (tmpA, tmpB, var.equal=TRUE)
+    tt_pvalue[i] <- tt$p.value
+ }
Error in tt_pvalue[i] <- tt$p.value : object 'tt_pvalue' not found
# I tried setting up a vector...
# as.vector(tt_pvalue, mode="any") ### but NO GO
> file.name = paste("ttest.results.", compareA, compareB, "")
> setwd(save_to)
> write.table(tt_pvalue, file=file.name, sep="\t" )
Error in inherits(x, "data.frame") : object 'tt_pvalue' not found
# No idea??

What is going wrong??
M
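
The first error says that `tt_pvalue` does not exist yet when the loop tries to assign into element i. R cannot index-assign into an object that has never been created, and as.vector() only converts an existing object. A sketch of the usual fix, creating the vector before the loop; the list of data frames and the column names "a" and "b" are hypothetical stand-ins for the files on disk.

```r
set.seed(7)
compareA <- "a"   # hypothetical column names
compareB <- "b"

# Hypothetical stand-in for the files: three small data frames.
tables <- lapply(1:3, function(i) data.frame(a = rnorm(8), b = rnorm(8)))

# Create the result vector BEFORE the loop so tt_pvalue[i] has a target.
tt_pvalue <- numeric(length(tables))

for (i in seq_along(tables)) {
  raw_data <- tables[[i]]
  tt <- t.test(raw_data[, compareA], raw_data[, compareB], var.equal = TRUE)
  tt_pvalue[i] <- tt$p.value
}

# sep = "" keeps paste from inserting spaces into the file name.
file.name <- paste("ttest.results.", compareA, ".", compareB, ".txt", sep = "")
print(tt_pvalue)
print(file.name)
```

The second error in the post is the same problem seen again: write.table is handed an object that was never created.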




[R] Appending 4 Digits On A File Name

2011-07-31 Thread Matt Curcio
Greetings all,
I would like to append a 4 digit number suffix to the names of my
files for later use.  What I am using now only produces 1 or 2 or 3 or
4 digits.


for (i in 1:1000) {
   temp <- (kegg [i,])
   temp <- merge (temp, subrichcdt, by="gene")
   file.name <- paste ("kegg.subrichcdt.", i, ".txt", sep="")
   write.table(temp, file=file.name)
}
###
But I want:
kegg.subrichcdt.0001.txt
kegg.subrichcdt.0002.txt, ...


Any suggestions?
M
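
One way to get fixed-width numbers is sprintf with a zero-padded integer format. A minimal sketch that only builds the file names; the three index values are arbitrary examples:

```r
# "%04d" pads an integer with leading zeros to a width of 4.
idx <- c(1L, 25L, 1000L)
file_names <- sprintf("kegg.subrichcdt.%04d.txt", idx)
print(file_names)
# "kegg.subrichcdt.0001.txt" "kegg.subrichcdt.0025.txt" "kegg.subrichcdt.1000.txt"
```

Inside the loop, the paste call would become file.name <- sprintf("kegg.subrichcdt.%04d.txt", i).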


Re: [R] Appending 4 Digits On A File Name

2011-07-31 Thread Matt Curcio
Hmmm...
Got this error:

Error in formatC(i, width = 4, format = "d", flat = 0) :
  unused argument(s) (flat = 0)

Any ideas?
M



Re: [R] Appending 4 Digits On A File Name

2011-07-31 Thread Matt Curcio
Michael,
Got it, thanks.  Looking over the man page, I realized it is 'flag', not 'flat'.
Cheers,
M
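
For reference, a short sketch of the corrected call with the argument spelled `flag`; the three input values are arbitrary examples:

```r
# flag = "0" asks formatC to pad with zeros instead of spaces.
padded <- formatC(c(1, 25, 1000), width = 4, format = "d", flag = "0")
print(padded)
# "0001" "0025" "1000"

file_names <- paste("kegg.subrichcdt.", padded, ".txt", sep = "")
print(file_names)
```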



[R] Use dump or write? or what?

2011-07-31 Thread Matt Curcio
Greetings all,
I am calculating two t-test values for each of many files, saving them
to a file, then calculating another set and appending, and so on.
But I can't figure out how to write the results to a file and then
append subsequent t-tests.
(maybe too tired ;} )
I have tried to use dump and file.append, to no avail.

ttest_results = tempfile()

two_sample_ttest <- t.test (tempA, tempB, var.equal = TRUE)
welch_ttest <- t.test (tempA, tempB, var.equal = FALSE)

dump (two_sample_ttest, file = "dumpdata.txt", append=TRUE)
ttest_results <- file.append (ttest_results, two_sample_ttest)

Any suggestions?
M
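
Some background that may explain the trouble: dump() expects a character vector of object names, and file.append() expects two file names, so neither accepts a test object directly. If a growing text file of p-values is enough, one option is to pull the numbers out of each test and append one row per run with write.table. A sketch under that assumption; the data and the column names `run`, `two_sample_p`, and `welch_p` are hypothetical.

```r
set.seed(3)
out_file <- tempfile(fileext = ".txt")

for (run in 1:2) {
  # Hypothetical data standing in for two columns of one input file.
  tempA <- rnorm(15)
  tempB <- rnorm(15)

  two_sample <- t.test(tempA, tempB, var.equal = TRUE)
  welch      <- t.test(tempA, tempB, var.equal = FALSE)

  row <- data.frame(run          = run,
                    two_sample_p = two_sample$p.value,
                    welch_p      = welch$p.value)

  # Write the header only when the file is first created, append afterwards.
  first <- !file.exists(out_file)
  write.table(row, file = out_file, sep = "\t", append = !first,
              col.names = first, row.names = FALSE, quote = FALSE)
}

saved <- read.table(out_file, header = TRUE, sep = "\t")
print(saved)
```

Keeping only the p-values makes the file easy to read back as a table; anything else needed from the test objects would have to be added as extra columns.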


[R] Movie Question

2011-01-22 Thread Matt Curcio
Greetings all,
I am wondering if anyone is aware of any studies that draw a
relationship between an actor and their box office gross for a movie.
In other words, is anybody aware of any databases that contain box
office movie grosses, actor & director info, advertising budgets, etc.?
[I did a quick Google search and did not find much right off but
will keep looking.]  I would assume that movie companies, and even
actors' managers, must have done or (more realistically) have access to
statistical analyses of the average returns for any one actor,
director, etc.

I would think that this would be a great little project.
Cheers,
M



[R] Data.frame Vs Matrix Vs Array: Definitions Please

2010-10-26 Thread Matt Curcio
Hi All,
I am learning R and having a little trouble with the usage and proper
definitions of data.frames vs. matrices vs. vectors.  I have read many R
tutorials and looked over umpteen 'cheat' sheets, and have found that
no one has articulated a really good definition of the differences
between 'data.frames', 'matrix', and 'arrays', or even 'factors'.  I
realize that I might have missed someone's R tutorial, and would
actually like to receive 'your' most concise or most useful tutorial.
Any help would be appreciated.

My particular favorite explanation and helpful hint is from the
'R-Inferno'.  Don't get me wrong...  I think this pdf is great and
some tables are excellent.  Overall it is a very good primer, but this
one section leaves me puzzled.  This quote reveals the lack of hard and
fast rules for what and when to use 'data.frames', 'matrix', and
'arrays'.  It discusses ways in which to simplify your work.

Here are a few possibilities for simplifying:
• Don’t use a list when an atomic vector will do.
• Don’t use a data frame when a matrix will do.
• Don’t try to use an atomic vector when a list is needed.
• Don’t try to use a matrix when a data frame is needed.

Cheers,
Matt C
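
A compact side-by-side may help with the definitions. This is only a sketch with made-up values: atomic vectors, matrices, and arrays hold a single type and differ in the number of dimensions; a factor stores category labels as integer codes; a list can mix types; a data frame is a list of equal-length columns whose types may differ.

```r
v  <- c(1.5, 2.5, 3.5)                  # atomic vector: one type, no dim
m  <- matrix(1:6, nrow = 2)             # matrix: one type, two dimensions
a  <- array(1:24, dim = c(2, 3, 4))     # array: one type, any number of dims
f  <- factor(c("low", "high", "low"))   # factor: integer codes plus labels
l  <- list(1, "a", TRUE)                # list: elements may differ in type
df <- data.frame(id = 1:3, grp = f, score = v)  # columns may differ in type

str(m)
str(df)

# A data frame is a list underneath, not a matrix.
print(is.list(df))
print(is.matrix(df))

# Forcing mixed columns into a matrix coerces everything to one type.
print(typeof(as.matrix(df)))
```

That coercion is the practical side of the quoted advice: a matrix is simpler when all values share a type, and a data frame is needed when they do not.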



[R] read.xls??

2010-10-09 Thread Matt Curcio
Greetings all,
I am having a little trouble finding the 'right' package that will
read in .xls Excel spreadsheets. My Ubuntu base does not seem to have
the ability to read them.

Any suggestions?
Cheers,
M
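
One option, offered as an assumption about what is available rather than a recommendation for any particular setup: the 'readxl' package provides read_excel(), which reads both .xls and .xlsx files. The wrapper name `read_first_sheet` and the file name in the usage line are hypothetical.

```r
# Hypothetical helper: read the first sheet of a spreadsheet into a
# data frame, assuming the 'readxl' package has been installed.
read_first_sheet <- function(path) {
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("package 'readxl' is not installed; try install.packages('readxl')")
  }
  as.data.frame(readxl::read_excel(path, sheet = 1))
}

# Usage (hypothetical file name):
# dat <- read_first_sheet("my_data.xls")
```

The 'gdata' package also has a read.xls() function, which relies on Perl being installed.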
