On Mon, Oct 10, 2016 at 9:37 AM, John Marshall <[email protected]> wrote:
> On 10 Oct 2016, at 05:35, Juan Daniel Montenegro Cabrera
> <[email protected]> wrote:
>> I did a few test in my spare time. All samtools version from 0.1.19
>> have the same sorting problem, with or without the use of (-@) multiple
>> threads.
>> Version 0.1.18 is able to sort the file correctly, but is slower than
>> sambamba,
>> especially for really big bam files.
>> I have a reduce unsorted bam file of ~500Mb that can be used to reproduce
>> this issue.
>
> Thanks for the sample file. Looking at the properly-sorted and badly-sorted
> 6B_concat reads, it turns out that the wrongly-processed ones are those
> that have positions greater than 2^30.
>
> The problem is some code in samtools sort that was written back when
> chromosomes were limited to 2^29 bases, and that doesn't work for
> positions beyond 2^30. Having identified the outdated code, this will
> be easy to fix. Thanks for the bug report.
>
> John
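
For illustration only, here is one way positions above 2^30 could break a sort. This is an assumption, not the actual samtools code: it supposes the comparison key packs (pos + 1) shifted left by one bit, plus a strand bit, into a signed 32-bit integer. The names to_int32 and packed_key are hypothetical. Under that assumption the key wraps negative once the position reaches about 2^30, so those reads sort ahead of everything else on the chromosome.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value (simulates typical C int wraparound)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def packed_key(pos, reverse_strand):
    """Hypothetical sort key: (pos + 1) << 1 with the strand in the low bit,
    held in a signed 32-bit integer. Not taken from samtools."""
    return to_int32(((pos + 1) << 1) | int(reverse_strand))


# 0-based positions on one long chromosome, some beyond 2^30.
positions = [100, 2**29, 2**30 - 2, 2**30, 2**30 + 5]

# Sorting by the packed key puts the large positions first,
# because their keys have wrapped around to negative values.
by_key = sorted(positions, key=lambda p: packed_key(p, False))

for p in by_key:
    print(p, packed_key(p, False))
```

A wider key (64-bit) or comparing the reference ID and position as separate fields would avoid the wraparound in this sketch.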
Good work John & Juan for explaining this. Does the sorting bug
introduced in samtools 0.1.19 continue through to the current release,
samtools 1.3.1, as well?

Peter

_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help
