Thank you, Andreas. Yes, I am trying to manipulate an alignment. This is a nice trick, although it returns an empty alignment regardless of the threshold value used (I do have some data in the alignment :-)...
Have a nice weekend,
V.
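For reference, dropping columns by their proportion of N can be sketched as below. This is a minimal sketch, not the method from the quoted message. It assumes the alignment has already been turned into a character matrix of lowercase bases (rows are sequences, columns are sites) and that there are no NA cells. The helper name cols_below_threshold and the toy data are hypothetical, made up for illustration.

```r
# Hypothetical helper (not part of ape or ips): flag the alignment columns
# whose proportion of a given character is at or below a threshold.
# `chars` is a character matrix: rows = sequences, columns = sites.
cols_below_threshold <- function(chars, base = "n", thresh = 0.005) {
  stopifnot(is.matrix(chars), is.character(chars))
  # Proportion of `base` in each column (assumes no NA cells)
  prop <- colMeans(chars == base)
  # TRUE for columns to keep, FALSE for columns over the threshold
  prop <= thresh
}

# Toy alignment for illustration only: 4 sequences x 5 sites
toy <- rbind(
  s1 = c("a", "n", "g", "t", "n"),
  s2 = c("a", "n", "g", "t", "c"),
  s3 = c("a", "c", "g", "n", "c"),
  s4 = c("a", "n", "g", "t", "c")
)

# Keep columns with at most 25% n; only the second column (75% n) is dropped
keep <- cols_below_threshold(toy, base = "n", thresh = 0.25)
filtered <- toy[, keep, drop = FALSE]
```

For a DNAbin matrix aln, the same logical vector could probably be obtained from as.character(aln) and then used as aln[, keep], on the assumption that as.character returns a lowercase character matrix for a DNAbin alignment stored as a matrix.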

On Friday, 27 October 2017, 17:02:45 CEST, you wrote:
> Hello V.
> Because you speak of columns I assume you are handling an alignment,
> right? If you handle an alignment all sequences have the same length and
> you can do as.matrix
>
> Like this?
>
> library(magrittr)
> #maximum number of n's
> thresh <- 0.005 #0.5%
> seq <- as.matrix(seq)
> temp <- seq %>% sapply(.,grep,pattern="n") %>% unlist(.,use.names=F) %>%
> table
> seq[,-(names(temp)[which(temp/ncol(seq)>thresh)] %>% as.integer)]
>
> Greetings,
> Andreas
>
> On 2017-10-27 16:25, Vojtěch Zeisek wrote:
> > Hello,
> > I checked ape::del.colgapsonly, ips::deleteGaps and
> > ips::deleteEmptyCells.
> > They delete columns containing missing values, but I also need to
> > delete columns containing base "N" (all columns with an amount of Ns
> > over a certain threshold).
> > Actually, ips::deleteEmptyCells has the option nset=c("-", "n", "?"),
> > so it is supposed to remove columns/rows containing only the given
> > characters, but if I use it and export the data (ape::write.dna or
> > ape::write.nexus.data), some samples consist only of N characters...
> > The DNAbin object being processed was originally imported from VCF
> > using vcfR (read.vcfR(file="my.vcf")) and converted:
> > vcfR2DNAbin(x=myvcf, consensus=TRUE, extract.haps=FALSE,
> > unphased_as_NA=FALSE).
> > I checked the source code of the above functions, but they seem to
> > only count NAs and then drop the respective columns. And as sequences
> > in DNAbin are stored in binary format, I'm struggling a bit here... :(
> > Any idea how to remove columns with a given proportion of "N" in the
> > sequences?
> > Sincerely,
> > V.

--
Vojtěch Zeisek
https://trapa.cz/en/

Department of Botany, Faculty of Science
Charles University, Prague, Czech Republic
https://www.natur.cuni.cz/biology/botany/

Institute of Botany, Czech Academy of Sciences
Průhonice, Czech Republic
http://www.ibot.cas.cz/en/
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/