This does not look like output from samtools 1.0, which writes VCFv4.2; the header you pasted says VCFv4.1.

petr

On Mon, 2014-09-22 at 14:53 -0400, jbr950 wrote: 
> Hi Petr,
>     Thanks for your reply.  I grabbed the
> samtools-bcftools-htslib-1.0_x64-linux binary and tried again.  
> 
> 
> When I'd used Samtools 0.1.18, the issue I emailed about was that the
> number of lines in the output varied by .bam file, and I didn't
> understand why lines were being omitted, or why the omissions were not
> consistent across files.
> 
> 
> Using version 1.0 and the same command (I checked on the older
> samtools and it is producing output), I get no output at all.  Mpileup
> writes a header and then stops.  I have copied and pasted the output.
>  
> 
> 
> My .bed is 3 columns: chr   start   stop
> with no header.  
> 
> 
> Any advice appreciated, thanks!
> 
> 
> Jonathan
> 
> 
> 
> 
> 
> 
> [mpileup] 1 samples in 1 input files
> (mpileup) Max depth is above 1M. Potential memory hog!
> [bcf_sync] incorrect number of fields (0 != 5) at 0:0
> [afs] 0:0.000
> ##fileformat=VCFv4.1
> ##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
> ##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
> ##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
> ##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
> ##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
> ##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
> ##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
> ##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
> ##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
> ##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
> ##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
> ##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
> ##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
> ##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
> ##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
> ##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
> ##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
> ##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
> ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
> ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
> ##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
> ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
> ##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
> ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
> #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT
> 
> 
> 
> On Thu, Sep 4, 2014 at 5:20 AM, Petr Danecek <[email protected]> wrote:
>         Hi Jonathan,
>         
>         these are good questions. Could you please try with the latest
>         release? I am happy to answer any remaining issues not solved by
>         the upgrade.
>         
>         Cheers,
>         Petr 
>         
>         On Sun, 2014-08-17 at 01:34 -0400, Jo R wrote:
>         > Hello,
>         >     I'm attempting to get nucleotide frequency information at a number
>         > of positions across a number of samples, and am having difficulty
>         > interpreting some output.  Any insights would be appreciated.
>         >
>         >
>         > I'm running the following command:
>         >
>         >
>         > samtools mpileup  -BQ0 -d10000000 -l VariantBed.bed -uf $refFile $bam
>         > | bcftools view -bcg - | bcftools view - >
>         > ${sampleName}_validation.vcf
>         >
>         >
>         >
>         > I notice that this command creates an output file with
>         > an unpredictable number of rows.  Running the command using the same
>         > bed file on a set of different .bam files creates a set of output vcf
>         > files with a wide distribution in numbers of rows.
>         >
>         >
>         > I presumed that the difference in row numbers means that some
>         > positions drop out on some .bam files because those samples lacked
>         > coverage where other samples had coverage.
>         >
>         >
>         > If that's the case, though, I don't know what to make of lines like
>         > the following one:
>         >
>         >
>         > 1       2160881 .       G       .       28.2    .       DP=0;VDB=0.0003;;AC1=2;FQ=-30   PL      0
>         >
>         >
>         >
>         > Here, it looks like DP=0, but this position still got reported in the
>         > vcf output.  I also don't see AC1 in the legend for INFO tags in the
>         > samtools specification page, so I don't know what to make of a value
>         > of 2.
>         >
>         >
>         > So, I am confused.  Positions with a positive value of DP and DP4 make
>         > sense to me.  But why are some positions completely omitted from the
>         > vcf output, while other positions are reported with DP=0?
>         >
>         >
>         > Thanks for any advice.
>         >
>         >
>         > Best regards,
>         > Jonathan
>         >
>         >
>         >
>         >
>         > ------------------------------------------------------------------------------
>         > _______________________________________________
>         > Samtools-help mailing list
>         > [email protected]
>         > https://lists.sourceforge.net/lists/listinfo/samtools-help
>         
>         
>         
>         
>         --
>          The Wellcome Trust Sanger Institute is operated by Genome Research
>          Limited, a charity registered in England with number 1021457 and a
>          company registered in England with number 2742969, whose registered
>          office is 215 Euston Road, London, NW1 2BE.
> 
> 
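One quick way to tell which toolchain wrote a VCF is to read its ##fileformat line: per the note at the top of this thread, samtools 1.0 writes VCFv4.2, while the pasted header says VCFv4.1. The Python below is a minimal sketch of that check, not part of samtools or bcftools. The function name vcf_fileformat is hypothetical, and stripping a leading "> " quote marker is an assumption made so the pasted header can be fed in directly.

```python
import re


def vcf_fileformat(lines):
    """Return the version string from the ##fileformat line, or None.

    Hypothetical helper for illustration only.
    """
    for line in lines:
        # Drop an email quote marker ("> ") and surrounding whitespace, if any.
        text = line.lstrip("> ").strip()
        match = re.match(r"##fileformat=(VCFv[\d.]+)$", text)
        if match:
            return match.group(1)
        if text.startswith("#CHROM"):
            # The column header ends the meta-information block.
            break
    return None


# Two header lines copied from the output quoted in this thread.
header = [
    "> ##fileformat=VCFv4.1",
    '> ##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">',
]
print(vcf_fileformat(header))
```

If this reports VCFv4.1 for a file that was supposedly written by the 1.0 release, it may be worth checking which samtools and bcftools binaries are actually first on the PATH.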