[R] duplicate rows with rbind in a loop

2015-10-07 Thread Cara Fiore
Dear R users,

I wrote a simple script to change the header lines in a fasta file that
contains DNA sequences in the following format:

>header1
sequence1
>header2
sequence2

I am basically trying to replace the "header" in this file with a line from
another file (taxonomy file). In order to do that I have to find the
matching header in the taxonomy file.

The output should be in fasta format and it is, but the rows repeat so the
output file is huge and it looks like:

>header1
sequence1
>header1
sequence1
>header2
sequence2

The code I have is:

tax=read.table("taxonomy_file.txt", header=F, quote="", sep="\t")
tax2=data.frame(tax)

library("Biostrings")
seqs=readDNAStringSet("File.fasta")
header=names(seqs)
seqs2=paste(seqs)

new.final=NULL
i=1

#Go through tax file and match the header in tax file to header in seqs file
for(i in 1:length(tax[,1])){
  sampleID=NULL
  match=NULL
  sampleID=as.character(tax2[i,1])  #sample ID in taxonomy header
  match=which(sampleID==header) #index for match in header file
  if(match>0){
    newH1=NULL
    newH2=NULL
    seqline=NULL
    new.header=NULL
    newH1=as.character(tax2[i,1])
    newH2=as.character(tax2[i,2])
    seqline=seqs2[match]
    new.header=paste(">",newH1,"|",newH2, sep="")
    new.final=rbind(new.final, new.header, seqline)
  }
  print(paste("percent complete =", round((i/length(tax2[,1]))*100,3),
              "%",sep=" "))
  write.table(new.final, file="Test_output.txt", quote=FALSE, sep="\n",
              col.names=FALSE, row.names=FALSE, append=TRUE)
  i=i+1
}


Something about rbind seems to repeat all of the rows every time the script
writes to the output file. I have not been able to find anything about this
online or in the R help for rbind, although perhaps I am missing something
obvious.

I greatly appreciate any help with this!
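
A possible explanation, offered as a sketch rather than a confirmed
diagnosis: write.table() is called inside the loop with append=TRUE while
new.final keeps growing, so every pass appends all rows collected so far on
top of what is already in the file. rbind() itself does not appear to
duplicate anything. A second issue is that which() returns integer(0) when
there is no match, and if(match>0) then stops with an error. The version
below builds all lines first and writes them once. The toy inputs (tax, seqs,
and their values) and the temporary output path are made up for illustration;
with Biostrings, as.character(seqs) is assumed to give a comparable named
character vector.

```r
# Toy stand-ins for the real inputs (all values are invented for illustration).
tax <- data.frame(
  id       = c("seqA", "seqB", "seqC"),
  taxonomy = c("Bacteria;Proteobacteria", "Archaea;Euryarchaeota",
               "Bacteria;Firmicutes"),
  stringsAsFactors = FALSE
)
seqs <- c(seqB = "GGCC", seqA = "ACGT")  # named character vector; seqC is absent

# match() gives, for each taxonomy ID, the position of its sequence, or NA.
idx  <- match(tax$id, names(seqs))
keep <- !is.na(idx)

new_headers <- paste0(">", tax$id[keep], "|", tax$taxonomy[keep])

# Interleave headers and sequences: rbind() makes a 2-row matrix and
# as.vector() reads it column by column (header, sequence, header, ...).
fasta_lines <- as.vector(rbind(new_headers, unname(seqs[idx[keep]])))

# Write once, after everything has been assembled.
out_file <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, out_file)
```

Because the file is written a single time, no row can be appended twice, and
taxonomy entries without a matching sequence are simply skipped.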

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] distance matrix from metaMDS

2014-08-29 Thread Cara Fiore
Yes, I looked at this. I tried metaMDSdist() but got an error, and for some
reason I didn't try metaMDSredist(), which seems to be the right thing. The
main thing I was confused about was what to call dist() on, i.e., how to get
the correct ordination distance. If I assume that the NMDS scores are the
coordinates, which I believe they are, then how do I call dist() on one
column? I just found the answer in a translation from MATLAB to R: you have
to use drop=FALSE (and hopefully I am calling dist() on the right thing).

euc.dist.axis1=dist(NMDS2[,1, drop=FALSE], method="euclidean")
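
To make the effect of drop=FALSE concrete, here is a small sketch with a
made-up score matrix standing in for NMDS2 (the object name scores_mat and its
values are hypothetical). It only shows that the one-column selection stays a
matrix with its row names, and what dist() then returns.

```r
# Hypothetical score matrix: 3 sites (rows) by 2 ordination axes (columns).
scores_mat <- matrix(c(0, 3, 4,
                       1, 1, 5),
                     ncol = 2,
                     dimnames = list(c("s1", "s2", "s3"),
                                     c("NMDS1", "NMDS2")))

v <- scores_mat[, 1]                # dimensions dropped: a plain named vector
m <- scores_mat[, 1, drop = FALSE]  # dimensions kept: a 3 x 1 matrix

# Pairwise Euclidean distances between sites along the first axis only.
d_axis1 <- dist(m, method = "euclidean")
```

With a single column, the Euclidean distance reduces to the absolute
difference between two sites' scores on that axis.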


Maybe this is obvious to other folks, but in case there is anyone like me out
there, I figured I'd write back. Thanks for the info. I have never written to
this list before because I always found what I needed online. I appreciate
your help and patience.


Cara



On Thu, Aug 28, 2014 at 10:19 PM, David L Carlson dcarl...@tamu.edu wrote:

 Don't the functions metaMDSdist() and metaMDSredist() that are documented
 on the metaMDS manual page give you the distance matrix? If you want to
 compute the distances based on a single axis, you could use vegdist().

 David C

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Cara Fiore
 Sent: Thursday, August 28, 2014 7:02 PM
 To: r-help@r-project.org
 Subject: [R] distance matrix from metaMDS

 Dear R users,

 I would like to access the distance matrix generated by metaMDS as well as
 use the dist function to calculate the Euclidean distance for each axis in
 the NMDS. I am having trouble finding a way to access these variables and
 any help is greatly appreciated.

 For the distance matrix I know I could just calculate the Bray-Curtis
 distance, but it would be nice to know how to get it from the NMDS function.
 For the Euclidean distance, the only thing I can find within metaMDS is
 the scores function, but there must be some way for me to call on/access the
 ordination distance for one axis, right?

 The reason for this is I would like to do something like the stressplot
 function but for each axis.

 Thank you,
 Cara


[R] distance matrix from metaMDS

2014-08-28 Thread Cara Fiore
Dear R users,

I would like to access the distance matrix generated by metaMDS as well as
use the dist function to calculate the Euclidean distance for each axis in
the NMDS. I am having trouble finding a way to access these variables and
any help is greatly appreciated.

For the distance matrix I know I could just calculate the Bray-Curtis
distance, but it would be nice to know how to get it from the NMDS function.
For the Euclidean distance, the only thing I can find within metaMDS is the
scores function, but there must be some way for me to call on/access the
ordination distance for one axis, right?

The reason for this is I would like to do something like the stressplot
function but for each axis.

Thank you,
Cara
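
One way the request above might be approached, as a rough sketch and not a
vetted recipe: metaMDSredist() from vegan returns the dissimilarities the
ordination was fitted to, scores() returns the site coordinates, and dist()
on a single column gives the distance along one axis. The dune example data
set, the seed, and the object names (nmds, d_obs, site_scores, d_axis) are
assumptions chosen for illustration.

```r
# Sketch only: requires the vegan package; 'dune' is example data shipped with it.
library(vegan)
data(dune)

set.seed(1)  # metaMDS uses random starts, so fix the seed for repeatability
nmds <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

d_obs       <- metaMDSredist(nmds)              # dissimilarities used in the fit
site_scores <- scores(nmds, display = "sites")  # one row per site, one column per axis

# Distance along each axis separately, computed from a one-column matrix.
d_axis <- lapply(seq_len(ncol(site_scores)), function(j) {
  dist(site_scores[, j, drop = FALSE], method = "euclidean")
})

# Rough per-axis analogue of stressplot(): observed dissimilarity vs. axis distance.
plot(as.vector(d_obs), as.vector(d_axis[[1]]),
     xlab = "Observed dissimilarity", ylab = "Distance on axis 1")
```

Both distance objects cover the same site pairs in the same order, which is
what allows them to be plotted against each other directly.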
