On Sunday, 26 August 2018 at 05:55:47 UTC, Pjotr Prins wrote:
Artem wrote Sambamba as a student

    https://github.com/biod/sambamba

and it is now running around the world in sequencing centers. Many, many CPU hours and a resulting huge carbon footprint. The large competing C++ samtools project has been trying for 8 years to catch up with an almost unchanged student project and they are still slower in many cases.

[snip]

Note that Artem used the GC and only took GC out for critical sections in parallel code. I don't buy these complaints about GC.

I don't see the complaints about breaking code that much either. Sambamba pretty much kept compiling over the years, and with the latest LDC/LLVM we see a 20% performance increase. For free (at least from our perspective). Kudos to LDC/LLVM efforts!!

This sounds very similar to my experiences with the tsv utilities, on most of the same points (development simplicity, comparative performance, GC use, LDC). Data processing apps may well be a sweet spot. See my DConf talk for an overview (https://github.com/eBay/tsv-utils/blob/master/docs/dconf2018.pdf).

Though not mentioned in the talk, I also haven't had any significant issues with new compiler releases. That may be related to the type of code being written. Regarding the GC: the throughput-oriented nature of data processing tools like the tsv utilities looks like a very good fit for the current GC. Applications where low GC latency is needed may have different results. It'd be great to hear an experience report from development of an application where GC was used and low GC latency was a priority.

--Jon
