Ah thanks for the clarification.

OK, almost there...

The header of the output provides the following explanation:
##FORMAT=<ID=DPR,Number=R,Type=Integer,Description="Number of high-quality
bases observed for each allele">
##INFO=<ID=DPR,Number=R,Type=Integer,Description="Number of high-quality
bases observed for each allele">


Here is a possible SNP:

1  150900178 .  A G,<X> 0       .
DP=8;I16=4,3,1,0,246,13780,41,1681,140,2800,20,400,60,1286,25,625;QS=0.84375,0.15625,0;SGB=-0.379885;RPB=1;MQB=1;MQSB=0.900802;BQB=1;MQ0F=0;DPR=7,1,85
 PL:DPR  0,4,64,21,67,75:7,1,85

Can you explain why are there three integers: 7,1,85, when (if I've parsed
the output correctly) there are 2 nucleotides present (ref, A and alt, G).
Also, what is the <X>?  Please let me know if I'm missing some
documentation that describes these fields, as it would be great to get a
bit more oriented.

Here is another one, with four integers corresponding to 3 possible alleles:

1  171509318 .  A C,G,<X>       0       .
DP=151;I16=69,80,0,2,5229,229197,7,49,2980,59600,40,800,2696,60022,36,746;QS=0.995903,0.00260708,0.00148976,0;VDB=0.36;SGB=-0.453602;RPB=0.369128;MQB=1;MQSB=1;BQB=0.177852;MQ0F=0;DPR=149,1,1,1546
    PL:DPR  0,255,215,255,218,215,255,218,218,215:149,1,1,1546



Cheers,
Jonathan





On Fri, Oct 3, 2014 at 3:15 AM, Petr Danecek <[email protected]> wrote:

> That looks like output from bcftools, not samtools mpileup:
>
> $ samtools mpileup 2>&1 | grep output-tags
>   -t, --output-tags LIST  optional tags to output:
> DP,DPR,DV,DP4,INFO/DPR,SP []
>
> Petr
>
>
> On Thu, 2014-10-02 at 19:35 -0400, jbr950 wrote:
> > Hi Petr,
> >     Pardon me, I upgraded to the latest (htslib_1.0) and the -t option
> > you suggested didn't result in any output.  Appears that
> >
> >
> >     -t, --targets [^]<region>           similar to -r but streams
> > rather than index-jumps. Exclude regions with "^" prefix
> >
> >     -r, --regions <region>              restrict to comma-separated
> > list of regions
> >
> >
> >
> > Any thoughts?
> >
> >
> > Jonathan
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Oct 2, 2014 at 3:49 AM, Petr Danecek <[email protected]> wrote:
> >         Hi Jonathan,
> >
> >         new versions of mpileup have the -t DPR,INFO/DPR option which
> >         outputs
> >         depth for each observed base.
> >
> >         Cheers,
> >         petr
> >
> >
> >         On Wed, 2014-10-01 at 12:49 -0400, jbr950 wrote:
> >         > Hi Petr,
> >         >     Thanks for your time thus far.  I'm trying to query some
> >         bam files
> >         > at specified locations to have a look at the allele
> >         frequencies.  The
> >         > motivation is to 'validate' variants called in one dataset,
> >         in a novel
> >         > dataset using a more simple approach.  Ideally, I'd get the
> >         raw counts
> >         > of the number of reads mapping to A,T,C,G, and/or indel (if
> >         detected).
> >         >
> >         >
> >         > I tried mpileup two ways (with -cg in bcftools view to call
> >         variants,
> >         > and without), and  neither seems to provide me the data I
> >         would like
> >         > in the corresponding output:
> >         >
> >         > samtools mpileup  -BQ0 -d10000000 -l $variantBed -uf ${ref}
> >         $bamFile |
> >         > $bcftools view - > no_cg.vcf
> >         >
> >         > samtools mpileup  -BQ0 -d10000000 -l $variantBed -uf ${ref}
> >         $bamFile |
> >         > $bcftools view -cg -  > cg.vcf
> >         >
> >         >
> >         >
> >         > Then I check the context around 2 known variants:
> >         >
> >         >
> >         > Variant 1
> >         > grep -C 1 105213319 cg.vcf
> >         >
> >         > 14      105213318       .      A        .       117  .
> >         > DP=29;AF1=0;AC1=0;DP4=14,15,0,0;MQ=20;FQ=-114   PL      0
> >         > 14      105213319       .      G        C       25   .
> >         >
> >
>  DP=28;VDB=0.0398;AF1=0.5;AC1=1;DP4=10,9,4,5;MQ=20;FQ=28;PV4=1,1,1,1
> >         > GT:PL:GQ        0/1:55,0,128:58
> >         > 14      105213320       .      T        .       114  .
> >         > DP=28;AF1=0;AC1=0;DP4=14,14,0,0;MQ=20;FQ=-111   PL      0
> >         >
> >         >
> >         > Here, the variant was called and the DP4 field tells me the
> >         number of
> >         > reference and non-reference (presumably "C") alleles.  I am
> >         assuming
> >         > that there are 0 T's and A's.
> >         >
> >         >
> >         >
> >         >
> >         > grep -C 1 105213319 no_cg.vcf
> >         > 14      105213318       .      A        X       0    .
> >         >
> >         DP=29;I16=14,15,0,0,979,34421,0,0,580,11600,0,0,511,11223,0,0
>  PL
> >         >    0,87,198
> >         > 14      105213319       .      G        C,T,X   0    .
> >         >
> >
>  
> DP=28;I16=10,9,4,5,645,23047,338,12742,380,7600,180,3600,339,7223,177,4179;VDB=0.0398
>  PL      55,0,128,96,140,195,112,152,198,201
> >         > 14      105213320       .      T        X       0    .
> >         >
> >         DP=28;I16=14,14,0,0,924,32326,0,0,560,11200,0,0,520,11560,0,0
>  PL
> >         >    0,84,195
> >         >
> >         >
> >         > Here, I see that that the non-ref alleles consist of C's ant
> >         T's,
> >         > although I don't see any way to determine the relative
> >         proportion or
> >         > absolute number of each.
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > Trying Variant 2:
> >         >  grep -C 1 154574018 cg.vcf
> >         > 1       154574017       .      T        .       283  .
> >         >
> >         DP=274;VDB=0.0365;AF1=0;AC1=0;DP4=146,128,0,0;MQ=20;FQ=-282
> >            PL
> >         >    0
> >         > 1       154574018       .      C        .       113  .
> >         >
> >
>  DP=273;VDB=0.0365;AF1=0;AC1=0;DP4=110,100,31,32;MQ=20;FQ=-110;PV4=0.67,1,1,1
>   PL      0
> >         > 1       154574019       .      T        .       283  .
> >         >
> >         DP=278;VDB=0.0365;AF1=0;AC1=0;DP4=142,136,0,0;MQ=20;FQ=-282
> >            PL
> >         >    0
> >         >
> >         >
> >         > This variant was not called at all.  I may be willing to
> >         "assume" that
> >         > the 63 'non-ref' reads reported in the DP4 field correspond
> >         to my
> >         > variant, but it would be nice to be sure.
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > grep -C 1 154574018 no_cg.vcf
> >         > 1       154574017       .      T        X       0    .
> >         >
> >
>  
> DP=274;I16=146,128,0,0,10042,376816,0,0,5463,109209,0,0,5159,117099,0,0;VDB=0.0365
>     PL      0,255,255
> >         > 1       154574018       .      C        G,X     0    .
> >         >
> >
>  
> DP=273;I16=110,100,31,32,7610,283160,2311,86963,4183,83609,1260,25200,3918,88780,1242,28290;VDB=0.0365
> PL      0,83,117,255,255,220
> >         > 1       154574019       .      T        X       0    .
> >         >
> >
>  
> DP=278;I16=142,136,0,0,10104,376490,0,0,5543,110809,0,0,5167,116999,0,0;VDB=0.0365
>     PL      0,255,255
> >         >
> >         >
> >         >
> >         >
> >         > Here, the C to G is reported, but I again I only have total
> >         read depth
> >         > via the DP field, no breakdown by nucleotide (or am I
> >         missing it?).
> >         >
> >         >
> >         > It's true I'm using Samtools 0.1.18.  If upgrading to the
> >         latest would
> >         > give me a better option, I'm happy to do that, but after
> >         scouring the
> >         > available documentation, I don't see a way to do this.
> >         >
> >         >
> >         > In other words, I'd like to inquire at many positions the
> >         proportion
> >         > of A's, T's, C's, G's and/or detected indels.  Mpileup (at
> >         least how
> >         > I'm currently using it) gets me close but not all the way
> >         there.
> >         >
> >         >
> >         >
> >         >
> >         > Did it make sense?    I'd appreciate any suggestions.
> >         >
> >         >
> >         >
> >         > Best regards,
> >         > Jonathan
> >         >
> >         >
> >         >
> >         > On Tue, Sep 23, 2014 at 11:47 AM, Petr Danecek
> >         <[email protected]>
> >         > wrote:
> >         >         Hi Jonathan,
> >         >
> >         >         On Tue, 2014-09-23 at 11:05 -0400, jbr950 wrote:
> >         >         > Hi Petr,
> >         >         >    The workflow documentation looks great - this
> >         effort is
> >         >         undoubtedly
> >         >         > going to help a lot of researchers.  Thanks for
> >         doing it.
> >         >
> >         >         the credits for the workflow should not go to me,
> >         the
> >         >         htslib.org page
> >         >         has been put together by Martin Pollard, Thomas
> >         Keane and
> >         >         others.
> >         >         :-)
> >         >
> >         >         > In this instance, I actually don't need statistics
> >         and the
> >         >         raw pileups
> >         >         > are fine.  Which brings me back to the original
> >         issue.  With
> >         >         the new
> >         >         > version, the vcf outputs from mpileup, using the
> >         same .bed
> >         >         as input,
> >         >         > are still of varying  lengths.  Do you know of a
> >         way to
> >         >         force mpileup
> >         >         > to report data at every position in the .bed file
> >         for every
> >         >         run?  Or
> >         >         > alternatively, do you know why some positions
> >         report a DP=0
> >         >         and other
> >         >         > positions are omitted altogether?
> >         >
> >         >         These DP=0 sites are emitted at places where there
> >         is an
> >         >         aligned read
> >         >         with an indel. The indel may not appear in the
> >         output when the
> >         >         mpileup
> >         >         machinery decides it is not significant enough.
> >         >
> >         >         I hope this helps,
> >         >         Petr
> >         >
> >         >
> >         >         > Thanks again,
> >         >         > Jonathan
> >         >         >
> >         >         >
> >         >         >
> >         >         > On Tue, Sep 23, 2014 at 9:26 AM, Petr Danecek
> >         >         <[email protected]>
> >         >         > wrote:
> >         >         >         There is a slowly growing documentation on
> >         the new
> >         >         website,
> >         >         >         please check
> >         >         >         the variant calling workflow
> >         >         >         http://www.htslib.org/workflow/
> >         >         >
> >         >         >         You need to use `bcftools call`, not
> >         `bcftools
> >         >         view`.
> >         >         >
> >         >         >         petr
> >         >         >
> >         >         >
> >         >         >         On Tue, 2014-09-23 at 09:25 -0400, jbr950
> >         wrote:
> >         >         >         > Good catch - I was trying to use an old
> >         version of
> >         >         bcftools
> >         >         >         with the
> >         >         >         > new version of samtools..
> >         >         >         >
> >         >         >         >
> >         >         >         > So I changed that to point to the new
> >         version of
> >         >         bcftools,
> >         >         >         and now:
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         samtools-bcftools-htslib-1.0_x64-linux/bin/samtools
> >         mpileup
> >         >         >         -BQ0
> >         >         >         > -d10000000 -l $variantBed -uf ${ref}
> >         $bamFile >
> >         >         test.bcf
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         > So far so good..
> >         >         >         >
> >         >         >         >
> >         >         >         > And then:
> >         >         >         > cat test.bcf |
> >         >         >
> >          samtools-bcftools-htslib-1.0_x64-linux/bin/bcftools
> >         >         >         > view -cg -
> >         >         >         > Error: Could not parse --min-ac g
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         > On the other hand, calling: .
> >         >         >         > cat test.bcf |
> >         >         >
> >          samtools-bcftools-htslib-1.0_x64-linux/bin/bcftools
> >         >         >         > view-
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         > does create a nice output, so I'm
> >         inclined to let
> >         >         it go with
> >         >         >         the
> >         >         >         > defaults. I don't see equivalent
> >         arguments for the
> >         >         -c and -g
> >         >         >         flags and
> >         >         >         > I'm not particularly attached to the use
> >         of
> >         >         Bayesian
> >         >         >         inference,
> >         >         >         > although it would be good to understand
> >         what
> >         >         differences I
> >         >         >         can expect
> >         >         >         > between the old method (using -c) and
> >         the new
> >         >         (sans -c)?
> >         >         >         >
> >         >         >         >
> >         >         >         > Thanks for your time and assistance!
> >         >         >         >
> >         >         >         >
> >         >         >         > Jonathan
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         > On Tue, Sep 23, 2014 at 2:40 AM, Petr
> >         Danecek
> >         >         >         <[email protected]>
> >         >         >         > wrote:
> >         >         >         >         This does not look like output
> >         from
> >         >         samtools 1.0 -
> >         >         >         it outputs
> >         >         >         >         VCFv4.2
> >         >         >         >
> >         >         >         >         petr
> >         >         >         >
> >         >         >         >         On Mon, 2014-09-22 at 14:53
> >         -0400, jbr950
> >         >         wrote:
> >         >         >         >         > Hi Petr,
> >         >         >         >         >     Thanks for your reply.  I
> >         grabbed
> >         >         the
> >         >         >         >         >
> >         samtools-bcftools-htslib-1.0_x64-linux
> >         >         binary and
> >         >         >         tried
> >         >         >         >         again.
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > When I'd used Samtools 0.1.18,
> >         the issue
> >         >         I emailed
> >         >         >         about was
> >         >         >         >         that the
> >         >         >         >         > number of lines in output
> >         varied by .bam
> >         >         file, and
> >         >         >         I didn't
> >         >         >         >         understand
> >         >         >         >         > why lines were being ommitted,
> >         and why
> >         >         not in a
> >         >         >         common
> >         >         >         >         manner.
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > Using version 1.0 and the same
> >         command
> >         >         (I checked
> >         >         >         on the
> >         >         >         >         older
> >         >         >         >         > samtools and it is producing
> >         output), I
> >         >         get no
> >         >         >         output at
> >         >         >         >         all.  Mpileup
> >         >         >         >         > writes a header and then
> >         stops.  I have
> >         >         copied and
> >         >         >         pasted
> >         >         >         >         the output.
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > My .bed is 3 columns: chr
> >          start   stop
> >         >         >         >         > with no header.
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > Any advice appreciated,
> >         thanks!
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > Jonathan
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > [mpileup] 1 samples in 1 input
> >         files
> >         >         >         >         > (mpileup) Max depth is above
> >         1M.
> >         >         Potential memory
> >         >         >         hog!
> >         >         >         >         > [bcf_sync] incorrect number of
> >         fields
> >         >         (0 != 5) at
> >         >         >         0:0
> >         >         >         >         > [afs] 0:0.000
> >         >         >         >         > ##fileformat=VCFv4.1
> >         >         >         >         >
> >         >         >
> >          ##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw
> >         >         read
> >         >         >         >         depth">
> >         >         >         >         >
> >         >         >
> >          ##INFO=<ID=DP4,Number=4,Type=Integer,Description="#
> >         >         >         >         high-quality
> >         >         >         >         > ref-forward bases,
> >         ref-reverse,
> >         >         alt-forward and
> >         >         >         alt-reverse
> >         >         >         >         bases">
> >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >
> ##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square
> >         >         >         >         > mapping quality of covering
> >         reads">
> >         >         >         >         >
> >         >         >
> >          ##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred
> >         >         >         >         probability of
> >         >         >         >         > all samples being the same">
> >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >          ##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood
> >         >         >         >         > estimate of the first ALT
> >         allele
> >         >         frequency
> >         >         >         (assuming HWE)">
> >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >          ##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood
> >         >         >         >         > estimate of the first ALT
> >         allele count
> >         >         (no HWE
> >         >         >         assumption)">
> >         >         >         >         >
> >         >         ##INFO=<ID=G3,Number=3,Type=Float,Description="ML
> >         >         >         estimate
> >         >         >         >         of genotype
> >         >         >         >         > frequencies">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based
> >         >         >         >         HWE test
> >         >         >         >         > P-value based on G3">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio
> >         >         >         >         of
> >         >         >         >         > genotype likelihoods with and
> >         without
> >         >         the
> >         >         >         constraint">
> >         >         >         >         >
> >         >         >
> >          ##INFO=<ID=UGT,Number=1,Type=String,Description="The
> >         >         most
> >         >         >         >         probable
> >         >         >         >         > unconstrained genotype
> >         configuration in
> >         >         the trio">
> >         >         >         >         >
> >         >         >
> >          ##INFO=<ID=CGT,Number=1,Type=String,Description="The
> >         >         most
> >         >         >         >         probable
> >         >         >         >         > constrained genotype
> >         configuration in
> >         >         the trio">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for
> >         >         >         >         strand
> >         >         >         >         > bias, baseQ bias, mapQ bias
> >         and tail
> >         >         distance
> >         >         >         bias">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates
> >         >         >         >         that the
> >         >         >         >         > variant is an INDEL.">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred
> >         >         >         >         probability of
> >         >         >         >         > the nonRef allele frequency in
> >         group1
> >         >         samples
> >         >         >         being larger
> >         >         >         >         (,smaller)
> >         >         >         >         > than in group2.">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior
> >         >         >         >         weighted
> >         >         >         >         > chi^2 P-value for testing the
> >         >         association between
> >         >         >         group1 and
> >         >         >         >         group2
> >         >         >         >         > samples.">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred
> >         >         >         >         scaled
> >         >         >         >         > PCHI2.">
> >         >         >         >         >
> >         >         ##INFO=<ID=PR,Number=1,Type=Integer,Description="#
> >         >         >         >         permutations
> >         >         >         >         > yielding a smaller PCHI2.">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant
> >         >         >         >         Distance
> >         >         >         >         > Bias">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype
> >         >         >         >         Quality">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods
> >         >         >         >         for
> >         >         >         >         > RR,RA,AA genotypes
> >         (R=ref,A=alt)">
> >         >         >         >         >
> >         >         >
> >          ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="#
> >         >         >         >         high-quality
> >         >         >         >         > bases">
> >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >          ##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled
> strand
> >         >         >         >         > bias P-value">
> >         >         >         >         >
> >         >         >
> >         >
> >         ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of
> >         >         >         >         > Phred-scaled genotype
> >         likelihoods">
> >         >         >         >         > #CHROM  POS     ID      REF
> >            ALT
> >         >          QUAL
> >         >         >         FILTER  INFO
> >         >         >         >         FORMAT
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         > On Thu, Sep 4, 2014 at 5:20
> >         AM, Petr
> >         >         Danecek
> >         >         >         >         <[email protected]> wrote:
> >         >         >         >         >         Hi Jonathan,
> >         >         >         >         >
> >         >         >         >         >         these are good
> >         questions. Could
> >         >         you please
> >         >         >         try with
> >         >         >         >         the latest
> >         >         >         >         >         release,
> >         >         >         >         >         I am happy to answer
> >         any
> >         >         remaining issues
> >         >         >         not solved
> >         >         >         >         by the
> >         >         >         >         >         upgrade.
> >         >         >         >         >
> >         >         >         >         >         Cheers,
> >         >         >         >         >         Petr
> >         >         >         >         >
> >         >         >         >         >         On Sun, 2014-08-17 at
> >         01:34
> >         >         -0400, Jo R
> >         >         >         wrote:
> >         >         >         >         >         > Hello,
> >         >         >         >         >         >     I'm attemping to
> >         get
> >         >         nucleotide
> >         >         >         frequency
> >         >         >         >         information at
> >         >         >         >         >         a number
> >         >         >         >         >         > of positions across
> >         a number
> >         >         of samples,
> >         >         >         and am
> >         >         >         >         having
> >         >         >         >         >         difficulty
> >         >         >         >         >         > interpreting some
> >         output.  Any
> >         >         insights
> >         >         >         would be
> >         >         >         >         >         appreciated.
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > I'm running the
> >         following
> >         >         command:
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > samtools mpileup
> >         -BQ0
> >         >         -d10000000 -l
> >         >         >         >         VariantBed.bed -uf
> >         >         >         >         >         $refFile $bam
> >         >         >         >         >         > | bcftools view -bcg
> >         - |
> >         >         bcftools view -
> >         >         >         >
> >         >         >         >         >         >
> >         ${sampleName}_validation.vcf
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > I notice that this
> >         command
> >         >         creates an
> >         >         >         output file
> >         >         >         >         with
> >         >         >         >         >         > an unpredictable
> >         number of
> >         >         rows.
> >         >         >         Running the
> >         >         >         >         command using
> >         >         >         >         >         the same
> >         >         >         >         >         > bed file on a set of
> >         >         different .bam
> >         >         >         files creates
> >         >         >         >         a set of
> >         >         >         >         >         output vcf
> >         >         >         >         >         > files with a wide
> >         distribution
> >         >         in
> >         >         >         numbers of rows.
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > I presumed that the
> >         difference
> >         >         in row
> >         >         >         numbers
> >         >         >         >         means that
> >         >         >         >         >         some
> >         >         >         >         >         > positions drop out
> >         on
> >         >         some .bam files
> >         >         >         because
> >         >         >         >         those samples
> >         >         >         >         >         lacked
> >         >         >         >         >         > coverage where other
> >         samples
> >         >         had
> >         >         >         coverage.
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > If that's the case,
> >         though, I
> >         >         don't know
> >         >         >         what to
> >         >         >         >         make of
> >         >         >         >         >         lines like
> >         >         >         >         >         > the following one:
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > 1       2160881 .
> >              G
> >         >            .
> >         >         >          28.2    .
> >         >         >         >         >         >
> >         >         DP=0;VDB=0.0003;;AC1=2;FQ=-30   PL
> >         >         >         0
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > here, it looks like
> >         DP=0, but
> >         >         this
> >         >         >         position still
> >         >         >         >         got
> >         >         >         >         >         reported in the
> >         >         >         >         >         > vcf output.  I also
> >         don't see
> >         >         AC1 in the
> >         >         >         legend
> >         >         >         >         for INFO
> >         >         >         >         >         tags in the
> >         >         >         >         >         > samtools
> >         specification page,
> >         >         so I don't
> >         >         >         know what
> >         >         >         >         to make of
> >         >         >         >         >         a value
> >         >         >         >         >         > of 2.
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > So, I am confused.
> >         Positions
> >         >         with a
> >         >         >         positive
> >         >         >         >         value of DP
> >         >         >         >         >         and DP4 make
> >         >         >         >         >         > sense to me.  But
> >         why are some
> >         >         positions
> >         >         >         >         completely ommitted
> >         >         >         >         >         from the
> >         >         >         >         >         > vcf output, and
> >         other
> >         >         positions
> >         >         >         reporting a DP=0?
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > Thanks for any
> >         advice.
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         > Best regards,
> >         >         >         >         >         > Jonathan
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >         >
> >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >
>  
> ------------------------------------------------------------------------------
> >         >         >         >         >         >
> >         >         >
> >          _______________________________________________
> >         >         >         >         >         > Samtools-help
> >         mailing list
> >         >         >         >         >         >
> >         >         [email protected]
> >         >         >         >         >         >
> >         >         >         >
> >         >         >
> >         >
> >          https://lists.sourceforge.net/lists/listinfo/samtools-help
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >         >         --
> >         >         >         >         >          The Wellcome Trust
> >         Sanger
> >         >         Institute is
> >         >         >         operated by
> >         >         >         >         Genome
> >         >         >         >         >         Research
> >         >         >         >         >          Limited, a charity
> >         registered
> >         >         in England
> >         >         >         with
> >         >         >         >         number 1021457
> >         >         >         >         >         and a
> >         >         >         >         >          company registered in
> >         England
> >         >         with number
> >         >         >         2742969,
> >         >         >         >         whose
> >         >         >         >         >         registered
> >         >         >         >         >          office is 215 Euston
> >         Road,
> >         >         London, NW1
> >         >         >         2BE.
> >         >         >         >         >
> >         >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >         >         --
> >         >         >         >          The Wellcome Trust Sanger
> >         Institute is
> >         >         operated by
> >         >         >         Genome
> >         >         >         >         Research
> >         >         >         >          Limited, a charity registered
> >         in England
> >         >         with
> >         >         >         number 1021457
> >         >         >         >         and a
> >         >         >         >          company registered in England
> >         with number
> >         >         2742969,
> >         >         >         whose
> >         >         >         >         registered
> >         >         >         >          office is 215 Euston Road,
> >         London, NW1
> >         >         2BE.
> >         >         >         >
> >         >         >         >
> >         >         >         >
> >         >         >
> >         >         >
> >         >         >
> >         >         >
> >         >         >         --
> >         >         >          The Wellcome Trust Sanger Institute is
> >         operated by
> >         >         Genome
> >         >         >         Research
> >         >         >          Limited, a charity registered in England
> >         with
> >         >         number 1021457
> >         >         >         and a
> >         >         >          company registered in England with number
> >         2742969,
> >         >         whose
> >         >         >         registered
> >         >         >          office is 215 Euston Road, London, NW1
> >         2BE.
> >         >         >
> >         >         >
> >         >         >
> >         >
> >         >
> >         >
> >         >
> >         >         --
> >         >          The Wellcome Trust Sanger Institute is operated by
> >         Genome
> >         >         Research
> >         >          Limited, a charity registered in England with
> >         number 1021457
> >         >         and a
> >         >          company registered in England with number 2742969,
> >         whose
> >         >         registered
> >         >          office is 215 Euston Road, London, NW1 2BE.
> >         >
> >         >
> >         >
> >
> >
> >
> >
> >         --
> >          The Wellcome Trust Sanger Institute is operated by Genome
> >         Research
> >          Limited, a charity registered in England with number 1021457
> >         and a
> >          company registered in England with number 2742969, whose
> >         registered
> >          office is 215 Euston Road, London, NW1 2BE.
> >
> >
> >
>
>
>
>
> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
>
------------------------------------------------------------------------------
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to