Hi Sandro,

I've discussed this matter on the debian-science mailing list, asking for a CBLAS abstraction. I got a reply from Mo Zhou (https://lists.debian.org/debian-science/2019/01/msg00002.html). In short, he suggested filing a bug against numpy (severity: important).
I repeat the main part of the e-mail here:

"""
Debian's default (generic, no optimization) Atlas is really SLOW, such that I can write a faster BLAS by simply following this guide: https://github.com/flame/how-to-optimize-gemm

Debian's reference BLAS (netlib) contains both BLAS and CBLAS API/ABI, which should be able to be linked against by numpy, and then we can make numpy use a high-performance BLAS backend by simply switching alternatives.

Any user of a modern x86_64 CPU can easily find the giant performance gap between the default Atlas (generic code) and the default OpenBLAS (generic+optimized, runtime-dispatch). MKL would be even faster if the user doesn't mind installing non-free.

I guess "cblas" is supposed to mean a backend with "cblas_*" API/ABI, and "blas" for "*_" (Fortran) API/ABI. If this is true, then at least MKL and openblas are expected to work well.

As for numpy's linkage to libcblas.so ... I guess Sandro isn't aware of the fact that BLAS implementations such as OpenBLAS squashed both C and Fortran ABI into a single shared object. And AFAIK only Atlas split the C ABI into an individual library libcblas.so .
"""

Regards,
Jörg
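
P.S. The point about the C and Fortran ABIs living in one shared object can be checked on a given machine. Below is a minimal sketch, assuming a Linux system where Python's ctypes can locate the libraries by short name. The helper name probe_blas_symbols is hypothetical, and dgemm is used only as a representative routine. It reports whether a library exports the C entry point (cblas_dgemm) and the Fortran one (dgemm_).

```python
import ctypes
import ctypes.util


def probe_blas_symbols(name):
    """Report which dgemm entry points the shared library `name` exports.

    `name` is the short library name, e.g. "blas" for libblas.so.
    Returns None if the library cannot be located or loaded.
    (Hypothetical helper, for illustration only.)
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # Attribute access on a CDLL performs a symbol lookup; a missing
    # symbol raises AttributeError, which hasattr() reports as False.
    return {
        "cblas_dgemm": hasattr(lib, "cblas_dgemm"),  # C (CBLAS) ABI
        "dgemm_": hasattr(lib, "dgemm_"),            # Fortran ABI
    }


if __name__ == "__main__":
    # Which of these names resolve depends on the installed packages
    # and on the currently selected alternatives.
    for name in ("blas", "cblas", "openblas"):
        print(name, probe_blas_symbols(name))
```

A library that reports True for both symbols provides the two ABIs in one shared object. A result of None only means the library was not found under that name on this system.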