I made a bug report on github for this (
https://github.com/samtools/samtools/issues/538). The actual cause is
lacking LB tags in the read group, which while not technically required in
the specification will likely cause a few things to break if they're
missing.
Devon
--
Devon Ryan, Ph.D.
Email: [email protected]
Data Manager/Bioinformatician
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51
79108 Freiburg
Germany
<[email protected]>
On Wed, Feb 17, 2016 at 12:22 AM, Friedman, Ryan <[email protected]>
wrote:
> I'm running into an issue trying to remove duplicate reads with Samtools
> and think it might be a bug in the latest version.
>
> I have a pipeline to process my aligned data before putting it into
> Freebayes to call variants that works as such:
> Code:
>
> samtools view -S -b foo.sam > foo.bamnorg # convert from sam to bam
> bamaddrg -b foo.bamnorg > foo.bam # ensure read groups are attached
> samtools sort foo.bam foo.sorted .bam # sorts
> samtools index foo.sorted.bam # index the
> sorted bam
> samtools rmdup foo.sorted.bam foo.sorted.1.bam # remove duplicate reads
> freebayes ...
>
> The entire pipeline works fine up until rmdup where it immediately results
> in a segmentation fault.
>
> What's particularly interesting is that this seg fault only occurs in some
> files. I originally thought this had something to do with size or memory
> allocation, since rmdup worked fine with RNA seq data but not with this
> genome seq data. However, the seg fault still occurred even when I
> allocated more memory.
>
> Furthermore, I think this might be a bug. The lab group I work with
> recently switched to a new cluster system. I run into the seg fault on the
> new cluster, which is running samtools 1.3, but the seg fault does not
> occur on the old cluster, which is running version 0.1.18.
>
> The error has been narrowed down to a line number in the bam.c file, with
> debug output below:
> Code:
>
> $ gdb samtools core
> GNU gdb (Ubuntu 7.9-1ubuntu1) 7.9
> Copyright (C) 2015 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from samtools...done.
>
> [New LWP 11878]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `samtools rmdup NAT_B_9.sorted.bam
> NAT_B_9.sorted.1.bam'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 bam_get_library (h=h@entry=0x8bb1c0, b=b@entry=0x8db9c0) at bam.c:112
> 112 for (cp = LB; *cp && *cp != '\t' && *cp != '\n'; cp++)
> (gdb) list
> 107 if (strncmp(rg, ID, strlen(rg)) != 0 || ID[strlen(rg)] != '\t')
> 108 continue;
> 109
> 110 // Valid until next query
> 111 static char LB_text[1024];
> 112 for (cp = LB; *cp && *cp != '\t' && *cp != '\n'; cp++)
> 113 ;
> 114 strncpy(LB_text, LB, MIN(cp-LB, 1023));
> 115 LB_text[MIN(cp-LB, 1023)] = 0;
> 116
> (gdb)
>
> Does anyone have any suggestions/potential fixes?
>
>
> Ryan Friedman
> Washington University in St. Louis
> B.A. Candidate in Genomics & Computational Biology, Computer Science
> Class of 2017
> [email protected] | (440) 708-4736
>
>
>
> ------------------------------------------------------------------------------
> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> Monitor end-to-end web transactions and take corrective actions now
> Troubleshoot faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> _______________________________________________
> Samtools-help mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
>
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help