Hi, I'm new to samtools and have some naive questions. I checked the forum but didn't find an appropriate answer (or at least I did not search with the right words).
I'm running a metagenomics project (Illumina 2x100 paired-end). I have filtered out the low-quality reads and now want to get rid of possible contaminants. In metagenomics projects it can happen that the sample contains human contamination. I used bwa to get all the possible alignments from my dataset as follows:

    bwa mem -M -t10 ../db/ensembl_homo_sapiens_genome ../custom/2/trimmed/2.R1.trimmed.fq.gz ../custom/2/trimmed/2.R2.trimmed.fq.gz > ../custom/2/SCREEN/2.screen.sam.gz

The idea is then to get the reads that didn't align to the human genome. That's where I'm unsure which strategy to follow.

My first idea is to get the unmapped reads with unmapped mates.

My second idea: among the alignments to the human genome there are probably some bad-quality alignments, so I could rescue some of the reads that give bad-quality alignments, as they are potentially good reads for the later metagenome assembly.

What I have done so far: use the flag 13 (if I understood correctly).

    0x0001  Paired in sequencing
    0x0004  Query unmapped
    0x0008  Mate unmapped

    1 + 4 + 8 = -f 13 for unmapped paired reads with unmapped mates

    samtools view -S -h -f 13 -b ../custom/2/SCREEN/2.screen.sam.gz | samtools sort -n -o - deleteme > ../custom/2/SCREEN/2.screen.sorted.unmappedall.bam

Is it safe to say that I'm getting what I want? What do you think of this strategy for getting rid of possible human contaminants in my sample? I guess others have used similar strategies?

Thanks for your help.

david
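
To make the flag arithmetic above concrete, here is a minimal sketch of the test that -f applies: a read passes only if every bit of the mask is set in its FLAG field. The helper name passes_f and the example flag values are illustrative assumptions, not part of samtools and not taken from the actual data.

```shell
# SAM flag bits named in the post (decimal values of 0x1, 0x4, 0x8).
PAIRED=1
UNMAPPED=4
MATE_UNMAPPED=8

# Mask for -f: a read is kept only if ALL of these bits are set.
REQUIRED=$(( PAIRED | UNMAPPED | MATE_UNMAPPED ))

# passes_f FLAG MASK (hypothetical helper, mimics the -f test):
# succeeds when every bit of MASK is set in FLAG.
passes_f() {
    [ $(( $1 & $2 )) -eq "$2" ]
}

echo "mask for -f: $REQUIRED"

# Illustrative FLAG values:
#   77, 141: read 1 / read 2 of a pair with both ends unmapped
#   69:      read unmapped, mate mapped
#   73:      read mapped, mate unmapped
#   99:      properly paired, both ends mapped
for flag in 77 141 69 73 99; do
    if passes_f "$flag" "$REQUIRED"; then
        echo "$flag: kept"
    else
        echo "$flag: dropped"
    fi
done
```

Under these assumptions only 77 and 141 pass, so -f 13 keeps pairs in which neither end aligned to the human genome and drops pairs in which one end aligned (69 and 73). A mask of 4 on its own would also keep the 69-type reads, which is the looser of the two choices.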
