On Aug 21, 2014, at 15:29, Thomas Sibley <[email protected]> wrote:
> On Aug 21, 2014, at 15:10, Thomas Sibley <[email protected]> wrote:
>> The runtime is still dramatically different from my other input, and I
>> wonder if you have any ideas why? Do you expect it's some property of the
>> data? Or some pathological edge case?
>>
>> I can't help but think the crash and runtime leap are related...
>
> Bingo. After putting together my jstack/strace observations with a reading
> of the MarkDuplicates source, I wrote the one-line patch attached, recompiled
> MarkDuplicates.jar, and the runtime dropped to 2-3 minutes. I started a
> tight loop running and re-running MarkDuplicates on the same problematic
> input and will see if I can provoke a JVM crash still.
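For anyone who wants to repeat this kind of stress test, here is a minimal sketch of a rerun loop. It is not the script that was actually used. The function names (run_once, run_repeatedly, summarize) are hypothetical, the command is whatever you pass in, and a run is simply assumed to have "crashed" if it exits non-zero.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd):
    """Run cmd once and return (exit_code, wall_clock_seconds)."""
    start = time.monotonic()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc.returncode, time.monotonic() - start


def run_repeatedly(cmd, runs=100, parallel=4):
    """Run cmd `runs` times with at most `parallel` copies running at once."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(run_once, [cmd] * runs))


def summarize(results):
    """Return (number_of_non_zero_exits, mean_run_time_in_minutes)."""
    # Assumption: any non-zero exit code is counted as a failed or crashed run.
    failures = sum(1 for code, _ in results if code != 0)
    mean_minutes = sum(secs for _, secs in results) / len(results) / 60
    return failures, mean_minutes


# Example with a placeholder command; substitute the real invocation:
#   failures, mean_minutes = summarize(run_repeatedly([sys.executable, "-c", "pass"]))
```

Threads are enough here because each worker only waits on a child process; the parallelism comes from the child processes themselves.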
100 runs later, doing 4 at a time, and I haven't provoked a JVM crash with my
patched MarkDuplicates. I also tried with the IntelDeflater re-enabled and
did not trigger a crash. Run times averaged 1.38 minutes.

_Something_ is bogging down in findOpticalDuplicates and going off the deep
end. I'd guess the Collections.sort call, since I saw mention of the sort
implementation guts in the crash logs, but again, I'm no expert on the JVM.

Thomas

_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help
