Yes, Fabricia, read.dna will only read your first dataset, but it has an
option that allows you to skip a certain number of lines before the
dataset. If you know exactly how many datasets (not lines) you have in
your input file, you could try this function:
library(ape) ## provides read.dna
read.multi.dna<-function(file,N){
  X<-vector("list",N)
  skip<-0
  for(i in seq_len(N)){
    ## each dataset is one header line plus nrow() sequence lines
    if(i>1) skip<-skip+nrow(X[[i-1]])+1
    X[[i]]<-read.dna(file,format="sequential",skip=skip)
  }
  X
}
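If you do not know N in advance, one option is to locate the header lines yourself. The following is only a sketch: dataset.skips and read.all.dna are hypothetical helper names (not part of ape), and it assumes every dataset starts with a header line holding exactly two integers, as in the example file.

```r
# Hypothetical helper: number of lines to skip before each dataset.
# Assumes each dataset begins with a header line of two integers
# (number of sequences, number of sites).
dataset.skips <- function(file) {
  lines <- readLines(file)
  header <- grepl("^[[:space:]]*[0-9]+[[:space:]]+[0-9]+[[:space:]]*$", lines)
  which(header) - 1
}

# Usage sketch: read every dataset without knowing N beforehand.
# Requires the ape package for read.dna.
read.all.dna <- function(file) {
  lapply(dataset.skips(file), function(s)
    ape::read.dna(file, format = "sequential", skip = s))
}
```

The length of the vector returned by dataset.skips(file) is the number of datasets, so it could also be passed as N to the function above.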
Then, if you want to get a NJ tree for each dataset, you could do:
library(phangorn) ## also loads ape, which provides read.dna and dist.dna
X<-read.multi.dna("examplefile.dna",N=100) ## for instance
trees<-lapply(X,function(x) NJ(dist.dna(x)))
class(trees)<-"multiPhylo"
As far as I know, dist.dna uses the K80 model by default, so pass
model="JC69" to dist.dna if you want JC distances.
Of course, remember that the estimated trees are unrooted.
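To avoid any doubt about which substitution model is used, the model can be named explicitly in dist.dna (its documented default is "K80"). Here is a small sketch on a toy alignment, typed in by hand from the first dataset of the example file; it uses nj from ape rather than NJ from phangorn, and the object names are only illustrative.

```r
library(ape)

# Toy alignment: the first dataset from the example file in this thread.
seqs <- c(Seq0  = "ATCGTTATTA",
          Seq1  = "AGCGTTATTA",
          Seq15 = "ATCCTTATTA",
          Seq20 = "ATCGATATTA",
          Seq30 = "ATCGTAATTA")
# One row per sequence, one column per site, converted to DNAbin.
x <- as.DNAbin(do.call(rbind, strsplit(tolower(seqs), "")))

d.jc  <- dist.dna(x, model = "JC69")  # Jukes-Cantor, requested explicitly
d.k80 <- dist.dna(x, model = "K80")   # Kimura 2-parameter
tree  <- nj(d.jc)                     # neighbor-joining tree from ape
is.rooted(tree)                       # check whether the tree is rooted
```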
All the best, Liam
Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org
On 3/13/2014 1:30 PM, Fabricia Nascimento wrote:
Hi Gwennaël,
Thanks very much!
I looked at what R is actually reading, and I get the following:
50 DNA sequences in binary format stored in a matrix.
All sequences of same length: 10000
Labels: Seq0 Seq49 Seq48 Seq47 Seq46 Seq45 ...
Base composition:
   a    c    g    t
0.254 0.244 0.251 0.251
It does appear to be reading only the first set of sequences...
But thanks!
Fabricia.
________________________________
From: Gwennaël Bataille <gwennael.batai...@uclouvain.be>
Sent: Thursday, 13 March 2014 13:18
Subject: Re: [R-sig-phylo] Reading sets of multiple DNA sequence alignments
Dear Fabricia,
I don't know whether this will help, since I'm not used to importing
multiple alignments in R.
Let's say that you import your alignment into a variable called
"alignment":
alignment <- read.dna("NAME_OF_YOUR_FILE.extension")
Then you could try:
alignment[1]
or
alignment[[1]]
If this does not work, type:
alignment
and see what you get. R's printed output usually shows the syntax you
should use. For example, imagine a list called alignment with
different components and sequences 1, 2, 3... If R prints:
alignment
$names
... (whatever names are stored)
$seq
$seq[[1]]
aaaactgcg....
$seq[[2]]
actgggac....
...
Then, to get all the sequences, type:
alignment$seq
To get only the second sequence: alignment$seq[[2]]
and so on.
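A small sketch of this kind of inspection, using a made-up list that mimics the printed structure above (the contents are hypothetical and not what read.dna returns):

```r
# Hypothetical list mimicking the printed structure above.
alignment <- list(names = c("Seq0", "Seq1"),
                  seq = list("aaaactgcg", "actgggac"))

str(alignment)                # compact overview of the structure
names(alignment)              # names of the top-level components

second <- alignment$seq[[2]]  # [[ ]] extracts the element itself
sub    <- alignment$seq[2]    # [ ] returns a list of length one
```

The difference between [ ] and [[ ]] matters here: the first keeps the list wrapper, the second gives you the sequence itself.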
Hope this helps!
Gwennaël
On 13/03/2014 17:56, Fabricia Nascimento wrote:
Hi,
I am new to R and I am trying to read a file containing sets of
multiple sequence alignments, as in the example below. The difference
is that in my files each set may contain up to 50 sequences of
10,000 bp each.
Example:
 5 10
Seq0    ATCGTTATTA
Seq1    AGCGTTATTA
Seq15   ATCCTTATTA
Seq20   ATCGATATTA
Seq30   ATCGTAATTA
 5 10
Seq1    ATCGTTATTA
Seq2    ATCGTTATTA
Seq4    ATCCTTATTA
Seq5   ATCGTTGTTA
Seq7   ATCGTTACCA
 5 10
Seq1    ATCGTTATTA
Seq2    ATTGTTATTA
Seq4    ATCGTTATTA
Seq19   AACGTTATTA
Seq2    ATCGAAATTA
I tried read.dna from the ape package. Although it seems to read the
whole file, I don't know how to access the individual alignments and
create an NJ tree for each of them.
I would appreciate it very much if someone knows of a function that
can do that, or can tell me whether I will need to pre-process my
files first.
I need to create an NJ tree for each set of alignments, and then I
will use the apTreeshape package in R to calculate a tree-shape index.
Fabricia.
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/