Re: They wrote the fastest parallelized BAM parser in D

Chris via Digitalmars-d Tue, 31 Mar 2015 02:27:03 -0700

On Monday, 30 March 2015 at 18:23:31 UTC, Russel Winder wrote:

On Mon, 2015-03-30 at 18:04 +0000, george via Digitalmars-d wrote:
> .NET actually already has a foothold in bioinformatics,
> specially in user facing software and steering of reading
> equipments and robots.
>
> So D's needs a story over C# and F# (alongside WPF for data
> visualization) use cases.
>
> --
> Paulo
Paulo,

Can you send me some pointers to this stuff?
Though when it comes to open source bioinformatics projects, Perl and Python have a large foothold among most bioinformaticians. Most utilities that require speed are often written in C and C++ (BLAST, HMMER, SAMTOOLS etc).
I think D stands a good chance as a language of choice for bioinformatics projects.
George
My "prejudice", based on training people in Python and C++ over the last few years, is that Python and C++ have a very strong position in the bioinformatics community, with the use of IPython (now becoming Jupyter) increasing and solidifying the Python position.
D's position is quite weak here because one of the important things is visualising data, something SciPy/Matplotlib are very good at. D has no real play in this arena and so there is no way (currently) of creating a foothold. Sad, but…

As Andrew Brown pointed out, visualization is not behind Python's success. Its success lies in the fact that it's a language you can hack away in easily. Almost everybody who has to do some data processing (most researchers do these days) and has limited or no experience with programming will opt for Python: easy (at first!), well-documented and everyone else uses it. However, the initial euphoria of being able to automatically rename files and extract value X from file Y soon gives way to frustration when it comes to performance.

The paper shows well that in a world where data processing is of utmost importance, and we're talking about huge sets of data, languages like Python don't cut it anymore. Two things are happening at the moment: on the one hand people still use Python for various reasons (see above and hundreds of posts on this forum), while at the same time there's growing discontent among researchers, scientists and engineers as regards performance, simply because the data sets are becoming bigger and bigger every day and the algorithms are getting more and more refined. Sooner or later people will have to find new ways, out of sheer necessity.

Don't forget that "the state of the art" can change very quickly in IT and the name of the game is anticipating new developments rather than taking snapshots of the current state of the art and framing them. D really has a lot to offer for data processing and I wouldn't rule out that more and more programmers will turn to it for this task.
