[R] duplicate rows with rbind in a loop
Dear R users,

I wrote a simple script to change the header lines in a fasta file that contains DNA sequences in the format:

>header1
sequence1
>header2
sequence2

I am basically trying to replace the "header" in this file with a line from another file (taxonomy file). In order to do that I have to find the matching header in the taxonomy file. The output should be in fasta format, and it is, but the rows repeat, so the output file is huge and looks like:

>header1
sequence1
>header1
sequence1
>header2
sequence2

The code I have is:

tax=read.table("taxonomy_file.txt", header=F, quote="", sep="\t")
tax2=data.frame(tax)
library("Biostrings")
seqs=readDNAStringSet("File.fasta")
header=names(seqs)
seqs2=paste(seqs)
new.final=NULL
i=1
#Go through tax file and match the header in tax file to header in seqs file
for(i in 1:length(tax[,1])){
  sampleID=NULL
  match=NULL
  sampleID=as.character(tax2[i,1]) #sample ID in taxonomy header
  match=which(sampleID==header) #index for match in header file
  if(match>0){
    newH1=NULL
    newH2=NULL
    seqline=NULL
    new.header=NULL
    newH1=as.character(tax2[i,1])
    newH2=as.character(tax2[i,2])
    seqline=seqs2[match]
    new.header=paste(">",newH1,"|",newH2, sep="")
    new.final=rbind(new.final, new.header, seqline)
  }
  print(paste("percent complete =", round((i/length(tax2[,1]))*100,3), "%",sep=" "))
  write.table(new.final, file="Test_output.txt", quote=FALSE, sep="\n", col.names=FALSE, row.names=FALSE, append=TRUE)
  i=i+1
}

Something about rbind is repeating all of the rows every time it writes to the output file. I have not been able to find anything about this online or in the R help for rbind, although perhaps I am missing something obvious. I greatly appreciate any help with this!
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
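The repeated rows in the post above are consistent with where write.table() sits rather than with rbind itself: new.final keeps every record seen so far, and the whole object is appended to the file on each pass through the loop, so record 1 is written again on pass 2, and so on. Separately, which() returns a zero-length vector when nothing matches, and if() stops with an error on a zero-length condition. A minimal sketch of one way around both, assuming the taxonomy IDs are unique and using small made-up vectors in place of the real files (header, seqs2 and tax2 mirror the names in the post; the data and out_file are hypothetical):

```r
# Toy stand-ins for the objects built in the post (hypothetical data).
header <- c("seqA", "seqB", "seqC")          # would be names(seqs)
seqs2  <- c("ACGT", "GGCC", "TTAA")          # would be paste(seqs)
tax2   <- data.frame(V1 = c("seqB", "seqX", "seqA"),
                     V2 = c("Bacteria", "Archaea", "Eukaryota"),
                     stringsAsFactors = FALSE)

# match() gives, for each taxonomy ID, its position in header (NA if absent),
# so no loop and no zero-length condition are needed.
idx  <- match(tax2$V1, header)
keep <- !is.na(idx)

new.header <- paste0(">", tax2$V1[keep], "|", tax2$V2[keep])
seqline    <- seqs2[idx[keep]]

# Interleave header and sequence lines: rbind() builds a 2-row matrix and
# as.vector() reads it column by column (header1, seq1, header2, seq2, ...).
new.final <- as.vector(rbind(new.header, seqline))

# Write once, after all records have been assembled.
out_file <- tempfile(fileext = ".txt")
writeLines(new.final, out_file)
```

If the loop is kept instead, moving the single write.table() call to after the closing brace of the for loop (and dropping append=TRUE) should have the same effect.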
Re: [R] distance matrix from metaMDS
Yes, I looked at this and tried metaMDSdist but got an error, and for some reason I didn't try metaMDSredist, which seems to be the right thing.

So the main thing I was confused about was what to call dist() on, i.e., getting the correct ordination distance. If I assume that the NMDS scores are the coordinates, which I believe they are, then how do I call dist() on one column? I just found the answer in a translation from Matlab to R: you have to use drop=FALSE (and hopefully I am calling dist() on the right thing):

euc.dist.axis1=dist(NMDS2[,1, drop=FALSE], method="euclidean")

Maybe this is obvious to other folks, but just in case there is anyone like me out there I figured I'd write back.

Thanks for the info. I have never written to this list before because I always found what I needed online. I appreciate your help and patience.

Cara

On Thu, Aug 28, 2014 at 10:19 PM, David L Carlson dcarl...@tamu.edu wrote:

Don't the functions metaMDSdist() and metaMDSredist() that are documented on the metaMDS manual page give you the distance matrix? If you want to compute the distances based on a single axis, you could use vegdist().

David C

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Cara Fiore
Sent: Thursday, August 28, 2014 7:02 PM
To: r-help@r-project.org
Subject: [R] distance matrix from metaMDS

Dear R users,

I would like to access the distance matrix generated by metaMDS as well as use the dist function to calculate the euclidean distance for each axis in the NMDS. I am having trouble finding a way to access these variables and any help is greatly appreciated.

For the distance matrix I know I could just calculate the bray-curtis distance but it would be nice to know how to get it from the NMDS function. For the euclidean distance, the only thing I can find within metaMDS is the score function but there must be some way for me to call on/access the ordination distance for one axis, right? The reason for this is I would like to do something like the stressplot function but for each axis.

Thank you,
Cara
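The drop=FALSE point in Cara's reply can be illustrated with a small sketch. The score matrix below is made up (the name NMDS2 follows the reply, the values and site labels are hypothetical); it only shows that drop=FALSE keeps the selected axis as a one-column matrix and that, on a single axis, the Euclidean distance is the absolute difference between scores. As far as I can tell dist() also accepts a plain numeric vector, so drop=FALSE may matter more for keeping the matrix shape than for dist() itself.

```r
# Hypothetical 4-site, 2-axis score matrix standing in for NMDS2.
NMDS2 <- matrix(c(0.0, 1.0, 3.0, 6.0,
                  2.0, 2.5, 1.0, 0.0),
                ncol = 2,
                dimnames = list(paste0("site", 1:4), c("NMDS1", "NMDS2")))

# drop = FALSE keeps the first axis as a 4 x 1 matrix instead of a vector.
axis1 <- NMDS2[, 1, drop = FALSE]

# Pairwise Euclidean distances between sites along axis 1 only.
euc.dist.axis1 <- dist(axis1, method = "euclidean")

# Distance between site1 and site3 on axis 1, i.e. |0 - 3|.
d13 <- as.matrix(euc.dist.axis1)["site1", "site3"]
```

The same call with 2 in place of 1 would give the distances along the second axis.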
[R] distance matrix from metaMDS
Dear R users,

I would like to access the distance matrix generated by metaMDS as well as use the dist function to calculate the euclidean distance for each axis in the NMDS. I am having trouble finding a way to access these variables and any help is greatly appreciated.

For the distance matrix I know I could just calculate the bray-curtis distance but it would be nice to know how to get it from the NMDS function. For the euclidean distance, the only thing I can find within metaMDS is the score function but there must be some way for me to call on/access the ordination distance for one axis, right? The reason for this is I would like to do something like the stressplot function but for each axis.

Thank you,
Cara