Re: dmdtags 1.0.0: an accurate tag generator for D source code

2021-08-31 Thread H. S. Teoh via Digitalmars-d-announce
On Fri, Aug 27, 2021 at 09:38:58PM +, Paul Backus via 
Digitalmars-d-announce wrote:
> `dmdtags` is a tags file generator for D source code that uses the DMD
> compiler frontend for accurate parsing.
> 
> This release supports 100%-accurate parsing of arbitrary D code
> (tested on DMD and Phobos sources), as well as the most commonly-used
> command line options, `-R`, `-o`, and `-a`. The generated tags file
> has been tested for compatibility with Vim and is compliant with the
> [POSIX standard for `ctags`][posix], so any editor with `ctags`
> support should be able to use it.
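
For readers unfamiliar with the format mentioned above: a POSIX `ctags`
line has three tab-separated fields (tag name, file name, and an address
that is either a search pattern or a line number), and Vim-style files
may append `;"` followed by extension fields. The following is a minimal
sketch in Python of reading one such line. The names `Tag` and
`parse_tag_line` and the sample line are hypothetical illustrations, not
actual `dmdtags` output or code.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    name: str     # the symbol being tagged
    path: str     # file that defines it
    address: str  # search pattern (/.../ or ?...?) or a line number


def parse_tag_line(line: str) -> Optional[Tag]:
    """Parse one tags-file line; return None for blank or metadata lines."""
    line = line.rstrip("\r\n")
    # Lines starting with !_TAG_ are pseudo-tags (file metadata), not symbols.
    if not line or line.startswith("!_TAG_"):
        return None
    parts = line.split("\t", 2)
    if len(parts) != 3:
        return None
    name, path, rest = parts
    # Simplification: treat the first ;" as the start of the extension
    # fields. A pattern that itself contains ;" would need real parsing.
    address = rest.split(';"', 1)[0]
    return Tag(name, path, address)


# Hypothetical sample line, made up for illustration.
sample = 'main\tsource/app.d\t/^void main()$/;"\tf\n'
print(parse_tag_line(sample))
```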

This is AWESOME!!!  Thanks a ton for this... I'll definitely be using
this in the near future!


[...]
> [`universal-ctags`][uctags], the current most-popular and
> best-maintained tags file generator, claims support for many
> programming languages, including D. However, its D parser is not
> well-maintained, and it often excludes large numbers of symbols from
> its output due to parsing failures.
> 
> Because `dmdtags` uses the DMD frontend for parsing, its results will
> always be accurate and up-to-date. For pure D projects, it can be used
> as a replacement for `universal-ctags`. For mixed-language projects,
> it can be used together with other tag generators with the `--append`
> option.

Is there any hope of merging this back to upstream ctags?

Regardless, this is awesome.


T

-- 
Long, long ago, the ancient Chinese invented a device that lets them see 
through walls. It was called the "window".


Re: SAOC 2021 Projects Summarized

2021-08-31 Thread Ali Çehreli via Digitalmars-d-announce

On 8/30/21 5:47 AM, Mike Parker wrote:

> Ahmet Sait KoC’ak

Being a fellow Turk, I am curious why his last name is spelled that 
way. Unless it was specially requested by him, I would use the 
following obviously correct spelling:


 Ahmet Sait Koçak

Ali




dhtslib v0.12.0 (high-throughput sequencing library)

2021-08-31 Thread James Blachly via Digitalmars-d-announce
I'm delighted to finally post an official announcement of our package 
for high-throughput sequencing (HTS), also called Next-generation 
sequencing (NGS): `dhtslib`. It's not a very clever name, and we are 
working on a new one. ;)


https://github.com/blachlylab/dhtslib/
https://code.dlang.org/packages/dhtslib

Once upon a time, BioD[1] was fairly active, but I am afraid D is not 
heavily used in bioinformatics and computational biology, especially in 
high-throughput (genome) sequencing applications when compared to its 
peers.[2] However, our group (cancer genomics) has found D an ideal 
language which is easy to pick up for Python programmers and yet retains 
powerful features for C/C++ programmers.


`dhtslib` began as a thin wrapper over the ubiquitous, but very 
low-level and hard-to-use, `htslib` C library 
(https://github.com/samtools/htslib/). We use `dhtslib` extensively in 
both public and private projects for computational biology, and over the 
years it has grown from simply a (huge) set of `extern (C)` definitions 
to a fully featured, RAII-enabled bioinformatics package focused on 
genome sequencing. If you are working in this field, or know 
someone open to D who works in this field, I strongly encourage you to 
point them at `dhtslib`!


 * `htslib` namespace with complete bindings to htslib
 * `dhtslib` namespace with high level object-oriented interfaces, many 
using underlying htslib calls for high performance, but via convenient 
and idiomatic D including RAII, Forward ranges, etc.

 * htslib-backed read/write of SAM/BAM/CRAM, VCF/BCF
 * Readers for BED and GFF3/GTF (not part of htslib)
 * FASTQ streamer
 * CIGAR manipulations

The next version, v0.13.0, adds a novel feature, "Typesafe Coordinates", 
which I'll post about separately in a moment!


Kind regards

James S Blachly, MD
The Ohio State University

[0] https://github.com/blachlylab/dhtslib/
https://code.dlang.org/packages/dhtslib
[1] https://github.com/biod/BioD
[2] Here is a contemporary example of D used in high-throughput 
sequencing: DENTIST by Arne Ludwig at Max Planck institute
https://github.com/a-ludi/dentist -- if you know of more, please 
let me know!


Announcement and Request: Typesafe Coordinate Systems for High-Throughput Sequencing Applications

2021-08-31 Thread James Blachly via Digitalmars-d-announce
In another post, I've just announced our D-based high-throughput 
sequencing library, dhtslib.


One feature that is, AFAIK, novel in the field is leveraging the 
compiler's type system to enforce correctness regarding different 
genome/reference sequence coordinate systems. Clearly, the encoding of 
domain specific knowledge in a language's type system is nothing new, 
but it is surprising that this has not been done before in 
bioinformatics, and it is an idea that IMO is long overdue given the 
trainwreck of different coordinate systems in our field.
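
To make the idea concrete, here is a minimal sketch, written in Python
rather than D and not taken from dhtslib. Each coordinate system gets
its own type, and conversion between systems has to be requested
explicitly. All names (`ZeroBasedHalfOpen`, `OneBasedClosed`, `overlaps`)
are hypothetical. Python only reports a mismatch at run time (or through
an external static checker reading the annotations), whereas the point
of the D implementation is that the compiler rejects it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroBasedHalfOpen:
    """Interval [start, end) counted from 0, the convention BED uses."""
    start: int
    end: int

    def __post_init__(self) -> None:
        # Empty intervals are disallowed to keep the sketch simple.
        if not 0 <= self.start < self.end:
            raise ValueError("need 0 <= start < end")

    def to_one_based_closed(self) -> "OneBasedClosed":
        # Shift the start by one; the exclusive end becomes the inclusive end.
        return OneBasedClosed(self.start + 1, self.end)


@dataclass(frozen=True)
class OneBasedClosed:
    """Interval [start, end] counted from 1, the convention GFF3 uses."""
    start: int
    end: int

    def __post_init__(self) -> None:
        if not 1 <= self.start <= self.end:
            raise ValueError("need 1 <= start <= end")

    def to_zero_based_half_open(self) -> ZeroBasedHalfOpen:
        return ZeroBasedHalfOpen(self.start - 1, self.end)


def overlaps(a: ZeroBasedHalfOpen, b: ZeroBasedHalfOpen) -> bool:
    """Overlap test that only accepts zero-based, half-open intervals."""
    for arg in (a, b):
        if not isinstance(arg, ZeroBasedHalfOpen):
            raise TypeError("convert to ZeroBasedHalfOpen first")
    return a.start < b.end and b.start < a.end


bed = ZeroBasedHalfOpen(10, 20)
gff = bed.to_one_based_closed()  # same ten bases, written as 11..20
print(gff)
print(overlaps(bed, gff.to_zero_based_half_open()))
```

Passing `gff` straight to `overlaps` raises `TypeError` instead of
silently producing an off-by-one result, which is the class of bug the
typed approach is meant to rule out.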


You can find dhtslib's develop branch, with Typesafe Coordinates merged 
and ready to use, here:


https://github.com/blachlylab/dhtslib/


**Now the request:**
We've drafted a manuscript describing Typesafe Coordinates as a sort of 
low-key endorsement of the D language and our library package `dhtslib`. 
You can find the manuscript here:


https://github.com/blachlylab/typesafe-coordinates/

We would be very grateful to those of you who would take the time to 
read the manuscript and post comments (publicly or privately), 
_especially if we have made any incorrect statements_ or our language 
regarding type systems is awkward or nonstandard.


We praised D and gently criticized Rust and OCaml*, as it appeared to 
me that they lacked the features required to implement Typesafe 
Coordinate Systems as ergonomically as we could in D. However, as I am 
a true novice in both of these other languages, it is possible that 
I've missed something significant and that the Rust and OCaml 
implementations could be retooled to match the D implementation. I'd 
be glad to hear it if that's the case.


I plan to make a few minor cleanups and submit this to a preprint server 
as well as a scientific journal in the next week or so.


Kind regards

James S Blachly, MD
The Ohio State University


* as a side note, I actually find the OCaml code quite attractive in its 
terseness: `let j = cl_interval_of_ho (ob_interval_of_zb i)`


Re: SAOC 2021 Projects Summarized

2021-08-31 Thread Mike Parker via Digitalmars-d-announce

On Wednesday, 1 September 2021 at 04:56:28 UTC, Ali Çehreli wrote:

> On 8/30/21 5:47 AM, Mike Parker wrote:
>
>> Ahmet Sait KoC’ak
>
> Being a fellow Turk, I am curious why his last name is 
> spelled that way. Unless it was specially requested by him, I 
> would use the following obviously correct spelling:
>
>  Ahmet Sait Koçak
>
> Ali


That's how it was submitted in the application. I simply 
copy-pasted.