Hi,

I've been playing around recently with linking Numpy to different BLAS
implementations, particularly Eigen and ACML 6 (with OpenCL support!). I've
successfully linked Numpy to both of these libraries, but I found the
process overly difficult and confusing. I'm interested in either writing a
blog post or adding to the Numpy docs to make it easier for others, but I'm
hoping to clear up some of my own confusion first.

I'll start with a rough outline of what I did to link Numpy with Eigen and
ACML.

*(1)*
Modify numpy/core/setup.py. Change

- blas_info = get_info('blas_opt', 0)
+ blas_info = get_info('blas', 0)

and change get_dotblas_sources to

def get_dotblas_sources(ext, build_dir):
    if blas_info:
        return ext.depends[:3]
    return None

(to remove the check for ('NO_ATLAS_INFO', 1)).


*(2)*
Compile CBLAS with BLLIB in Makefile.in pointing to the shared object for
your BLAS. Make a shared object (not a static library) out of CBLAS. This
requires adding -fPIC to the CFLAGS and FFLAGS.

*Question: Is it a bug that I couldn't get Numpy working with a static
CBLAS library and a shared object BLAS?*


*(3)*
Modify site.cfg at the top level of the Numpy directory with

[blas]
library_dirs = /path/to/directory/containing/shared_objects
include_dirs = /path/to/headers/from/CBLAS
blas_libs = cblas, your_blas_lib

where the headers from CBLAS are cblas_f77.h and cblas.h. For the
blas_libs variable, the library name "foo" loads libfoo.so, so with the
above example the libraries should be called libcblas.so and
libyour_blas_lib.so and lie in the directory listed under library_dirs.
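
As a rough sanity check of the naming rule above, here is a small sketch
that reads a [blas] section and lists the shared objects it implies. The
function name expected_shared_objects is made up for illustration, and
splitting library_dirs on os.pathsep is my assumption about how multiple
directories would be written; it is not taken from the Numpy build code.

```python
import configparser
import os


def expected_shared_objects(site_cfg_text, section="blas"):
    """Return the lib<name>.so paths implied by a site.cfg section.

    Hypothetical helper: it only applies the "foo" -> libfoo.so rule,
    it does not reproduce what the Numpy build actually does.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(site_cfg_text)
    # Assumption: several directories would be separated by os.pathsep.
    lib_dirs = [d.strip()
                for d in parser.get(section, "library_dirs").split(os.pathsep)]
    names = [n.strip() for n in parser.get(section, "blas_libs").split(",")]
    return [os.path.join(d, "lib%s.so" % n) for d in lib_dirs for n in names]


SITE_CFG = """
[blas]
library_dirs = /path/to/directory/containing/shared_objects
include_dirs = /path/to/headers/from/CBLAS
blas_libs = cblas, your_blas_lib
"""

for path in expected_shared_objects(SITE_CFG):
    status = "found" if os.path.exists(path) else "MISSING"
    print("%-8s %s" % (status, path))
```

With the placeholder paths above, both files are reported as missing;
with real paths, anything still missing is worth fixing before building.
After a build, numpy.show_config() is one way to see which libraries
were picked up.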

Finally, run "python setup.py build" from the root of the Numpy codebase
(same directory that site.cfg lives in).


*My questions about this:*
CBLAS questions:
What does CBLAS do, and why/when is it necessary? For both ACML 6 and
Eigen, I could not link directly to the library but could with CBLAS. My
understanding is that the BLAS interface is a Fortran ABI, and that CBLAS
provides a C ABI (cdecl?) to BLAS.
Why can't the user link Numpy directly to the Fortran ABI? How are ATLAS
and OpenBLAS handled?
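
To make the question concrete, here is how I currently picture the
difference, sketched with ctypes. This is only my understanding, not a
statement of how Numpy calls BLAS. Assumptions: a shared object that
ctypes.util.find_library can locate under the name "cblas", and a
Fortran symbol named with gfortran's trailing-underscore convention
(ddot_). The helper names are made up for illustration.

```python
import ctypes
import ctypes.util


def load_library(name="cblas"):
    """Try to load lib<name>; return None if it cannot be found or loaded."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def dot_via_cblas(lib, x, y):
    """C interface: scalar arguments (n, increments) are passed by value."""
    n = len(x)
    vec = ctypes.c_double * n
    lib.cblas_ddot.restype = ctypes.c_double
    lib.cblas_ddot.argtypes = [
        ctypes.c_int, ctypes.POINTER(ctypes.c_double), ctypes.c_int,
        ctypes.POINTER(ctypes.c_double), ctypes.c_int,
    ]
    return lib.cblas_ddot(n, vec(*x), 1, vec(*y), 1)


def dot_via_fortran(lib, x, y):
    """Fortran interface: every argument, even n, is passed by reference."""
    n = ctypes.c_int(len(x))
    inc = ctypes.c_int(1)
    vec = ctypes.c_double * len(x)
    lib.ddot_.restype = ctypes.c_double
    return lib.ddot_(ctypes.byref(n), vec(*x), ctypes.byref(inc),
                     vec(*y), ctypes.byref(inc))


lib = load_library("cblas")
if lib is None:
    print("no CBLAS shared object found; nothing to call")
else:
    x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    if hasattr(lib, "cblas_ddot"):
        print("cblas_ddot:", dot_via_cblas(lib, x, y))
    if hasattr(lib, "ddot_"):
        print("ddot_:", dot_via_fortran(lib, x, y))
```

If that picture is right, the two entry points compute the same thing and
differ only in calling convention and symbol naming, which is why I'm
unsure why the CBLAS layer is required.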

My procedure questions:
Is the installation procedure I outlined above reasonable, or does it
contain steps that could/should be removed? Having to edit Numpy source
seems sketchy to me. I largely came up with this procedure by looking up
tutorials online and by trial and error. I don't want to write
documentation that encourages people to do something in a non-optimal way,
so if there is a better way to do this, please let me know.


*Some final thoughts:*
Although I linked properly to the library, I discovered ACML 6 didn't work
at all on my computer (the ACML 6 example code didn't even work). This is
very disappointing, as OpenCL support in ACML 6 + integrated GPU on laptop +
OpenCL on Intel integrated GPUs on Linux through beignet seemed like a
potentially very promising performance boost for all of us running Numpy on
newer laptops.

Eigen has excellent performance. On my i5-5200U (Broadwell) CPU, I found
Eigen BLAS compiled with AVX and FMA instructions to take 3.93s to multiply
two 4000x4000 double matrices with a single thread, while my install of
Numpy from Ubuntu took 9s (and used 4 threads on my 2 cores). My Ubuntu
Numpy appears to be built against "libblas", which I think is the reference
implementation.

Eigen gave about 32 GFLOPS of 64-bit performance from a single laptop core,
which I find quite impressive!
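
For reference, the GFLOPS figure follows from the timings above, assuming
the usual count of 2*n^3 floating point operations for an n x n matrix
product. The function name matmul_gflops is just for illustration.

```python
def matmul_gflops(n, seconds):
    """GFLOPS for multiplying two n x n matrices, counting 2*n**3 flops."""
    flops = 2.0 * n ** 3  # n**3 multiplies plus n**3 adds
    return flops / seconds / 1e9


eigen = matmul_gflops(4000, 3.93)   # Eigen BLAS, single thread
ubuntu = matmul_gflops(4000, 9.0)   # Ubuntu Numpy, 4 threads

print("Eigen:  %.1f GFLOPS" % eigen)
print("Ubuntu: %.1f GFLOPS" % ubuntu)
```

By the same count, the Ubuntu build comes out at roughly 14 GFLOPS in
total across its threads.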


Thanks for any feedback and response to the questions!
-Eric Martin