On Thu, 28 Aug 2014 16:36:11 -0400
Nils Homer <[email protected]> wrote:

> Could it be because we search for the associated sequence dictionary
> using the fasta file name, and the dictionary it finds for one is
> missing or mismatching? You could search for *.dict or *.fai files
> for each path.

Right again. It's actually the presence rather than the absence of
a *.fai index file which trips up picard, since if I rename it to
something else the command runs fine. The *.fai index was generated by
samtools, but picard must not like the look of it. The dictionary is
embedded in the header of my BAM file. Thanks again for the explanation.
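
For anyone hitting the same thing, here is a minimal sketch (plain Python, not part of picard) that lists which index and dictionary files sit next to a reference path as it is written on the command line. The candidate names checked (<fasta>.fai, <fasta>.dict, and the FASTA name with its extension swapped for .dict) are an assumption about what gets picked up, and sidecar_report is a hypothetical helper name. It also reports the resolved path. That may be why a symlink with a different name behaves differently: the lookup would presumably use the link's name rather than the target's.

```python
import os
import sys


def sidecar_report(fasta_path):
    """Report which candidate index/dictionary files exist beside a FASTA path.

    The candidate names below are assumptions for illustration, not a
    statement of what picard actually searches for.
    """
    stem, _ext = os.path.splitext(fasta_path)
    candidates = [
        fasta_path + ".fai",   # samtools faidx style index
        fasta_path + ".dict",  # dictionary named after the full file name
        stem + ".dict",        # dictionary with the FASTA extension replaced
    ]
    return {
        "path": fasta_path,
        # Where the path really points once symlinks are followed.
        "resolved": os.path.realpath(fasta_path),
        # Sidecars are looked up by the name as given, not the resolved name.
        "sidecars": {c: os.path.exists(c) for c in candidates},
    }


if __name__ == "__main__":
    for path in sys.argv[1:]:
        report = sidecar_report(path)
        print(report["path"], "->", report["resolved"])
        for name, present in report["sidecars"].items():
            print("   ", "found  " if present else "missing", name)
```

Running it once on each R= path that was tried should show at a glance which spellings have a *.fai beside them.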

Jeremy



> 
> N
> 
> 
> On Thu, Aug 28, 2014 at 3:50 PM, Jeremy Volkening <[email protected]>
> wrote:
> 
> > Well, that did the trick. I had given up on providing the reference
> > sequence because it always crashed picard with:
> >
> > > Exception in thread "main" java.lang.NullPointerException
> > >         at htsjdk.samtools.reference.ReferenceSequenceFileWalker.get(ReferenceSequenceFileWalker.java:87)
> > >         at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:113)
> > >         at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
> > >         at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
> > >         at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
> > >         at picard.analysis.CollectAlignmentSummaryMetrics.main(CollectAlignmentSummaryMetrics.java:94)
> >
> >
> >
> > and it wasn't clear to me why it would need the full reference when
> > all of the relevant info should be in the BAM. After your
> > suggestion, however, I kept trying and it seems picard didn't like
> > the path I was providing:
> >
> > R=../oar_3.1/Ovis_aries.Oar_v3.1.dna.toplevel.fa
> >
> > > Interestingly, the following symlinks worked just fine:
> >
> > R=../oar_3.1/ref.fa
> > R=./Ovis_aries.Oar_v3.1.dna.toplevel.fa
> >
> > Is this perhaps a bug, feature, or user error? Maybe a length
> > limitation on the path? In any case, thanks for the help and the
> > easy fix.
> >
> > Jeremy
> >
> >
> >
> >
> > On Thu, 28 Aug 2014 14:54:01 -0400
> > Nils Homer <[email protected]> wrote:
> >
> > > Try giving it a reference sequence (R=...) to see if the
> > > alignment-based metrics are output.
> > >
> > > N
> > >
> > >
> > > On Thu, Aug 28, 2014 at 2:35 PM, Jeremy Volkening
> > > <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I have a set of ~ 1 billion gDNA paired reads mapped to a
> > > > reference genome with BWA-MEM. I tried to use picard's
> > > > CollectAlignmentSummaryMetrics to generate a summary of the
> > > > mapping, but it reports zero aligned reads:
> > > >
> > > > FIRST_OF_PAIR   549154663   549154663   1   0   0   0   0   0
> > > > 0 0   0 0   0   0   99.936337   0   0   0   0   0   0.000001
> > > > SECOND_OF_PAIR  549154663   549154663   1   0   0   0   0   0
> > > > 0 0   0 0   0   0   98.611736   0   0   0   0   0   0
> > > > PAIR    1098309326  1098309326  1   0   0   0   0   0   0   0
> > > > 0 0   0 0   99.274036   0   0   0   0   0   0
> > > >
> > > > > A look at the SAM flags manually seems to indicate that most
> > > > > pairs are properly aligned, and the output of samtools flagstat
> > > > > seems to agree:
> > > >
> > > > 1102426754 + 0 in total (QC-passed reads + QC-failed reads)
> > > > 0 + 0 duplicates
> > > > 1097354292 + 0 mapped (99.54%:-nan%)
> > > > 1102426754 + 0 paired in sequencing
> > > > 551276790 + 0 read1
> > > > 551149964 + 0 read2
> > > > 1042528490 + 0 properly paired (94.57%:-nan%)
> > > > 1096261806 + 0 with itself and mate mapped
> > > > 1092486 + 0 singletons (0.10%:-nan%)
> > > > 36064341 + 0 with mate mapped to a different chr
> > > > 14493831 + 0 with mate mapped to a different chr (mapQ>=5)
> > > >
> > > > > I don't mind using samtools flagstat to evaluate the mapping,
> > > > > but I'm puzzled by this and concerned that the problems with
> > > > > the picard output signal further potential issues using picard
> > > > > for duplicate removal, etc. I've seen a few users report
> > > > > similar behavior in the past, and one was able to pin it to
> > > > > their combination of OS and java, but I've tried multiple
> > > > > combinations of java package (Sun, OpenJDK) and picard version
> > > > > and get the same results.
> > > >
> > > > To generate the above results I used:
> > > >
> > > > Debian 'Wheezy'
> > > > java version 1.7.0_67
> > > > picard version 1.119
> > > > samtools version 0.1.18
> > > >
> > > > and the picard command was:
> > > >
> > > > java -jar /opt/picard/CollectAlignmentSummaryMetrics.jar
> > > > MAX_INSERT_SIZE=500 I=bwa.map.w_rg.bam O=bwa.picard.summary
> > > >
> > > > > As per BWA the library insert size distribution has mean and sd
> > > > > of ~ 180 and 20, respectively. I am attaching a pruned SAM file
> > > > > with just a pair of reads that appear to be mapped correctly
> > > > > but which picard reports as unaligned as per above. Any
> > > > > suggestions or help would be much appreciated.
> > > >
> > > > Thanks,
> > > > Jeremy
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > ------------------------------------------------------------------------------
> > > > Slashdot TV.
> > > > Video for Nerds.  Stuff that matters.
> > > > http://tv.slashdot.org/
> > > > _______________________________________________
> > > > Samtools-help mailing list
> > > > [email protected]
> > > > https://lists.sourceforge.net/lists/listinfo/samtools-help
> > > >
> > > >
> >
> >


