Hi Alison,


On Sat, 2015-11-28 at 13:04 +0000, Wright, Alison wrote:
> Hi Petr,
> 
> Thanks very much for all of your help.
> 
> I would be very grateful for some help with the questions I have. Thanks again
> 
> 1. Why I have read that setting -q 1  when running mpileup means only
> uniquely mapping reads are considered?

> I have this read
> HISEQ2500-09:92:H8PJKADXX:1:1101:5415:2846      99      NC_024331.1     
> 24859997        4       76M3D1M2D23M    =       24860151        255     
> GATATTTGAGTTAATGTATGATGTGAAATGTGACTTTTTATTACATACTGTATCGATTATGGGACTATAACTCAACATCAAGTAAGGCTGCTGTCACTTA
>     
> CCCFFFFFHHHHHJJJIJJJJJJIJJJJJJJJJJJJJJJJJJJJJJJJJJIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJHHHHHHFFFFEEEEEEED
>     AS:i:-37        XS:i:-58        XN:i:0  XM:i:2  XO:i:2  XG:i:5  NM:i:7  
> MD:Z:48C23T3^TCA1^GG23  YS:i:-8 YT:Z:CP

> This is clearly not a uniquely mapping read (due to presence of XS
> flag) but it has a MAPQ score of 4. Therefore, this would not be
> filtered out using -q 1?


Some (most?) aligners set mapping quality to 0 when the placement of a
read is ambiguous. The XS flag is not checked by samtools.



> 2. I see the option -A, --count-orphans. If you don't set this then I
> understand anomalous read pairs are skipped in variant calling. If
> mpileup isn't reading the flags about mate status then how does it know
> which are anomalous read pairs?

> I use the following mpileup command after sorting the filtered bam file
> with samtools: samtools mpileup -BAd 10000000

> With this command, is it ok to use my custom approach of filtering the
> bam files? The UNMAP,SECONDARY,QCFAIL,DUP should not be altered, the
> only thing that changes is the status of the mate. By specifying -A
> then presumably I am telling mpileup to ignore the status of the mate?

With the -A option, all reads with the PAIRED bit set (paired-end
sequencing technology) are required to have also the PROPER_PAIR bit
set. This filter is performed after the --rf and --ff filters are
applied.


> 3.  does pileup ignore reads with the flags UNMAP,SECONDARY,QCFAIL,DUP
> by default or only when --ff, --excl-flags STR|INT is specified?

--excl-flags UNMAP,SECONDARY,QCFAIL,DUP is the default behaviour.


> 4. I get the following message when I run mpileup
> samtools mpileup -BAd 10000000 -f reference file sample1.bam sample2.bam 
> sample3.bam etc >cohort.pileup
> 
> [mpileup] 19 samples in 19 input files
> (mpileup) Max depth is above 1M. Potential memory hog!
> 
> Will I be missing sites with high depth?


If the depth is too big, some reads may be ignored, but all sites are
considered.

Petr



> Thanks very much!
> 
> Alison
> 
> ---
> 
> Dr Alison Wright
> 
> Postdoctoral Research Associate
> University College London
> Department of Genetics, Evolution and Environment
> The Darwin Building
> Gower Street
> London WC1E 6BT
> 
> ​www.ucl.ac.uk/mank-group/




-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to