Re: [R-sig-phylo] subset DNAbin

Liam J. Revell Mon, 05 Sep 2016 05:25:54 -0700

Hi Kirston.

Note that read.fas is in the package ips.

This solution will work only if your sequences are all of the same length, in which case they are stored in a matrix and the same result would be obtained using:

obj<-read.dna("filename",format="fasta")
obj<-obj[grep("1011",rownames(obj)),]
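
A minimal sketch of the matrix case, assuming the ape package is installed. The bases below are made up for illustration and stand in for a FASTA file; only the sequence names are taken from the original question. It also shows why grep alone only gives numbers: it returns the indices of the matching names, which then have to be used to index the rows.

```r
library(ape)

# Made-up aligned sequences (equal length), so the "DNAbin" object is a matrix
# whose row names are the sequence names.
seqs <- rbind(
  "01011-DNA1.Contig1"    = c("a", "c", "g", "t"),
  "01011-RNA10.Contig1"   = c("a", "c", "g", "a"),
  "0102A-CRNA103.Contig1" = c("a", "t", "g", "t")
)
obj <- as.DNAbin(seqs)

# grep() returns the indices of the row names that contain "1011" ...
idx <- grep("1011", rownames(obj))

# ... and those indices select the rows; the result is still a "DNAbin" matrix
sub <- obj[idx, ]

# Pairwise distances among the retained sequences (raw proportion of differing sites)
d <- dist.dna(sub, model = "raw")
```

The subset keeps its row names, so the labels of the distance matrix are the original sequence names.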

read.dna(...,format="fasta") or read.fas will return a list if sequences are of different lengths, in which case you can do:

obj<-obj[grep("1011",names(obj))]
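
The list case can be sketched the same way, again assuming ape is installed and using made-up sequences of unequal length in place of a real file. Here the names live in names(obj) rather than rownames(obj), and a single index is used.

```r
library(ape)

# Made-up sequences of different lengths, so the "DNAbin" object is a list
seqs <- list(
  "01011-DNA1.Contig1"    = c("a", "c", "g", "t"),
  "01011-RNA10.Contig1"   = c("a", "c", "g"),
  "0102A-CRNA103.Contig1" = c("a", "t", "g", "t", "t")
)
obj <- as.DNAbin(seqs)

# Keep the list elements whose names contain "1011"
sub <- obj[grep("1011", names(obj))]
```

Note that sequences of unequal length would still need to be aligned before computing distances with dist.dna.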

To check if your "DNAbin" object is a list or a matrix do:

is.list(obj)

It is also possible to convert between the two object modes using as.list or as.matrix, provided your sequences match in length.
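
As a rough sketch of how the check and the conversion fit together, the two cases could be wrapped in one function. The name subset_dnabin is hypothetical (it is not part of ape), and the sequences are made up for illustration.

```r
library(ape)

# Hypothetical helper (not part of ape): subset a "DNAbin" object by a
# pattern in the sequence names, whichever way the object is stored.
subset_dnabin <- function(obj, pattern) {
  if (is.list(obj)) {
    obj[grep(pattern, names(obj))]         # list: names(), single index
  } else {
    obj[grep(pattern, rownames(obj)), ]    # matrix: rownames(), row index
  }
}

# Made-up aligned sequences, stored as a matrix
m <- as.DNAbin(rbind(
  "01011-DNA1.Contig1"    = c("a", "c", "g", "t"),
  "0102A-CRNA103.Contig1" = c("a", "t", "g", "t")
))

l  <- as.list(m)     # matrix -> list
m2 <- as.matrix(l)   # list -> matrix (needs sequences of equal length)

sub_m <- subset_dnabin(m, "1011")
sub_l <- subset_dnabin(l, "1011")
```

Both calls keep the same single sequence; only the storage mode of the result differs.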


All the best, Liam

Liam J. Revell, Associate Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 9/5/2016 1:38 AM, Andreas Kolter wrote:

Hello Kirston,
try this:

x <- read.fas("example.fasta")
y <- x[grep("1011",rownames(x)),]

Greetings,
Andreas

On 2016-09-05 08:07, Kirston Barton wrote:

Hi,

I have my data in a fasta file and am importing it into R using
read.dna, which creates a DNAbin matrix object. I would like to subset
my file depending on the sequence name so that I can generate the
nucleotide pairwise distance using dist.dna. I have attempted to do
this using grep, but all I get is a list of the numbers of the
sequences with the correct name and no sequences or sequence names.
Does anyone have any suggestions for an easy way to do this?

For example, my DNAbin object has the following row names:

 [1] "01011-DNA1.Contig1" "01011-DNA11.Contig1" "01011-DNA12.Contig1" "01011-DNA13.Contig1" "01011-DNA14.Contig1"
 [6] "01011-DNA16.Contig1" "01011-DNA17.Contig1" "01011-DNA18.Contig1" "01011-DNA19.Contig1" "01011-DNA2.Contig1"
[11] "01011-DNA20.Contig1" "01011-DNA21.Contig1" "01011-DNA22.Contig1" "01011-DNA23.Contig1" "01011-DNA24.Contig1"
[16] "01011-DNA25.Contig1" "01011-DNA26.Contig1" "0103-PRNA2.Contig1" "01011-DNA3.Contig1" "01011-DNA33.Contig1"
[21] "01011-DNA4.Contig1" "01011-DNA5.Contig1" "01011-DNA6.Contig1" "01011-DNA7.Contig1" "01011-DNA8.Contig1"
[26] "01011-DNA9.Contig1" "01011-RNA10.Contig1" "01011-RNA13.Contig1" "01011-RNA14.Contig1" "01011-RNA17.Contig1"
[31] "01011-RNA18.Contig1" "01011-RNA19.Contig1" "01011-RNA21.Contig1" "01011-RNA23.Contig1" "01011-RNA24.Contig1"
[36] "01011-RNA26.Contig1" "01011-RNA28.Contig1" "01011-RNA29.Contig1" "01011-RNA30.Contig1" "01011-RNA31.Contig1"
[41] "01011-RNA32.Contig1" "01011-RNA33.Contig1" "01011-RNA35.Contig1" "01011-RNA38.Contig1" "01011-RNA4.Contig1"
[46] "01011-RNA5.Contig1" "01011-RNA6.Contig1" "01011-RNA8.Contig1" "01011-RNA9.Contig1" "0102A-CRNA103.Contig1"
[51] "0102A-CRNA105.Contig1" "0102A-CRNA110.Contig1" "0102A-CRNA113.Contig1" "0102A-CRNA115.Contig1" "0102A-CRNA118.Contig1"
[56] "0102a-DNA10.Contig1"


I would like a new DNAbin object with sequences that have 1011
anywhere in their row name.

Please let me know if I have left out any pertinent information. Thank
you in advance for any suggestions or help with this matter.

Kind regards,
Kirston
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/

