Re: [R] Help with ape - read.GenBank()

2011-01-28 Thread Ben Bolker
James Meadow <jfmeadow at gmail.com> writes:

 
 Hi,
 
 I am trying to work with the ape package, and there is one thing I am
 struggling with.  When calling the *read.GenBank()* function, I can get it
 to work with an object created like this:
 
 x <- c("AY395554", "AY611035", ...)
 read.GenBank(x)
 
 However, I am trying to use the function to fetch several hundred sequences
 at once.  So I have been testing with small files, but cannot get the right
 format.  I tried to enter accession numbers with or without headers as a
 .csv and just as a text file, and I have tried *as.vector*, but I always get
 this error:
 
 x <- read.csv("numbers.csv", head=T)
 attach(x)
 x.vect <- as.vector(t(x))
 x.seqs <- read.GenBank(x.vect)
 
 Error in FI[i]:LA[i] : NA/NaN argument
 
 I have been trying this with an identical set of accession numbers, and it
 always works with the c() vector but not with the table, so the content is
 not the problem.  I assume I am just missing a simple data structure detail,
 but I am at a loss.  I have even tried a c() vector of more than a hundred
 accession numbers, and it worked quite well, so the length is not a
 problem.

It's not entirely clear how your CSV file is structured, but I'm supposing
that you have all your accession numbers in a single row (with appropriately
matching headers).  (It would be a little bit more R-ish if you had
them all in a column, and would make life a little easier, but let's
work with what you have.)
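
For comparison, here is a minimal sketch of the column layout mentioned above. The column name "accession" and the inline CSV text are made up for illustration (they stand in for a real file), and only the two accession numbers from the original post are used:

```r
# Hypothetical one-column layout: a header line, then one accession per line.
# The inline text stands in for a real file on disk.
acc_text <- "accession\nAY395554\nAY611035"

# stringsAsFactors = FALSE keeps the column as character rather than factor
# (this is already the default in R >= 4.0).
acc <- read.csv(text = acc_text, header = TRUE, stringsAsFactors = FALSE)

# A single column of a data frame is already a plain vector.
ids <- acc$accession
str(ids)

# ape::read.GenBank(ids)  # needs the ape package and network access
```

With this layout no matrix conversion is needed, because extracting one column gives a character vector directly.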

After you use read.csv(), try str(x)

This should tell you that you have a data frame with a lot of columns,
each of which is a factor with a single element and a single level.

From there try 

m <- as.matrix(x)

str(m) should tell you that you have a 1-row matrix of type 'character'

then 

v <- c(m)

should drop the dimensions and leave you with a character vector
(try str(v) again to be sure)

then

read.GenBank(v)

should work
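
Put together, the steps above might look like the sketch below. The inline CSV text and the header names "id1" and "id2" are assumptions standing in for the real numbers.csv, whose exact layout is not known, and the read.GenBank() call is commented out because it needs the ape package and a network connection:

```r
# Hypothetical stand-in for numbers.csv: one header row, one row of accessions.
csv_text <- "id1,id2\nAY395554,AY611035"

x <- read.csv(text = csv_text, header = TRUE)
str(x)   # a data frame with one row and one column per accession
         # (factor columns in older R, character columns in R >= 4.0)

m <- as.matrix(x)
str(m)   # a 1-row matrix of type 'character'

v <- c(m)
str(v)   # dimensions dropped: a plain character vector

# x.seqs <- ape::read.GenBank(v)  # needs ape and network access
```

Checking with str() after each step shows where the object stops being a data frame and becomes a plain character vector.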

For future phylogenetic/comparative-methods type questions,
it's probably better to write to the r-sig-phylo mailing list ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with ape - read.GenBank()

2011-01-28 Thread James Meadow
Thanks, Ben!  That worked.  I will join r-sig-phylo, I didn't know it was
there.

James

-- 
James Meadow
Land Resources and Environmental Sciences
Montana State University
(406) 370-7157
jfmea...@gmail.com
