Hi,

I've been playing around recently with linking Numpy to different BLAS implementations, particularly Eigen and ACML 6 (with OpenCL support!). I've successfully linked Numpy to both of these libraries, but I found the process overly difficult and confusing. I'm interested in either writing a blog post or adding to the Numpy docs to make it easier for others, but I'm hoping to clear up some of my own confusion first.

I'll start with a rough outline of what I did to link Numpy with Eigen and ACML.

*(1)* Modify numpy/core/setup.py. Change

    - blas_info = get_info('blas_opt', 0)
    + blas_info = get_info('blas', 0)

and change get_dotblas_sources to

    def get_dotblas_sources(ext, build_dir):
        if blas_info:
            return ext.depends[:3]
        return None

(to remove the check for ('NO_ATLAS_INFO', 1)).

*(2)* Compile CBLAS with BLLIB in Makefile.in pointing to the shared object for your BLAS. Make a shared object (not a static library) out of CBLAS. This requires adding -fPIC to the CFLAGS and FFLAGS.

*Question: Is it a bug that I couldn't get Numpy working with a static CBLAS library and a shared object BLAS?*

*(3)* Modify site.cfg at the top level of the Numpy directory with

    [blas]
    library_dirs = /path/to/directory/containing/shared_objects
    include_dirs = /path/to/headers/from/CBLAS
    blas_libs = cblas, your_blas_lib

where the headers from CBLAS are cblas_f77.h and cblas.h. For the blas_libs variable, the library name "foo" loads libfoo.so, so with the above example the libraries should be called libcblas.so and libyour_blas_lib.so and lie in the directory listed under library_dirs.

Finally, run "python setup.py build" from the root of the Numpy codebase (the same directory that site.cfg lives in).

*My questions about this:*

CBLAS questions: What does CBLAS do, and why/when is it necessary? For both ACML 6 and Eigen, I could not link Numpy directly to the library, but I could link through CBLAS. My understanding is that the BLAS interface is a Fortran ABI, and that CBLAS provides a C ABI (cdecl?) to BLAS. Why can't the user link Numpy directly to the Fortran ABI? How are ATLAS and OpenBLAS handled?

My procedure questions: Is the installation procedure I outlined above reasonable, or does it contain steps that could/should be removed? Having to edit Numpy source seems sketchy to me. I largely came up with this procedure by looking up tutorials online and by trial and error.
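
One way to sanity-check the result of the steps above, as far as I can tell, is to ask Numpy what it thinks it was built against and to confirm that dot still gives correct answers. The following is only a minimal sketch: the function name dot_matches_reference is made up for illustration, and what show_config() prints depends on the Numpy version and on how site.cfg was picked up.

```python
import numpy as np


def dot_matches_reference(n=64, seed=0):
    """Compare np.dot against a product computed without matrix multiply.

    dot_matches_reference is a hypothetical helper name. For 2-D float64
    inputs, np.dot is the call that is handed to BLAS when Numpy has been
    linked against one.
    """
    rng = np.random.RandomState(seed)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    # Explicit sum over the shared index k: c[i, j] = sum_k a[i, k] * b[k, j].
    # This uses broadcasting and a reduction rather than a matrix product.
    reference = (a[:, :, None] * b[None, :, :]).sum(axis=1)
    return np.allclose(np.dot(a, b), reference)


if __name__ == "__main__":
    # Prints the build-time library information Numpy recorded.
    np.show_config()
    print("dot matches reference:", dot_matches_reference())
```

If the library really was linked, I would expect the [blas] entries from site.cfg to show up in the printed configuration, but I have not checked how this looks across Numpy versions.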
I don't want to write documentation that encourages people to do something in a non-optimal way, so if there is a better way to do this, please let me know.

*Some final thoughts:*

Although I linked properly to the library, I discovered that ACML 6 didn't work at all on my computer (even the ACML 6 example code didn't work). This is very disappointing, as OpenCL support in ACML 6 + integrated GPU on laptop + OpenCL on Intel integrated GPUs on Linux through Beignet seemed like a potentially very promising performance boost for all of us running Numpy on newer laptops.

Eigen has excellent performance. On my i5-5200U (Broadwell) CPU, Eigen BLAS compiled with AVX and FMA instructions took 3.93s to multiply two 4000x4000 double matrices with a single thread, while my install of Numpy from Ubuntu took 9s (and used 4 threads on my 2 cores). My Ubuntu Numpy appears to be built against "libblas", which I think is the reference implementation. Eigen gave 32 GFLOPS of 64-bit performance from a single laptop core, which I find quite impressive!

Thanks for any feedback and response to the questions!

-Eric Martin
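
P.S. For anyone who wants to compare their own build, here is a rough sketch of how a timing like the one above can be turned into a GFLOPS figure. It assumes the usual convention of counting roughly 2*n**3 floating point operations for an n x n matrix product; the names gflops and time_matmul are just illustrative, and this is not the exact script I ran.

```python
import time

import numpy as np


def gflops(n, seconds):
    """GFLOPS for an n x n by n x n product, counting about 2*n**3 flops
    (n**3 multiplies plus roughly n**3 adds)."""
    return 2.0 * n ** 3 / seconds / 1e9


def time_matmul(n, repeats=3, seed=0):
    """Return the best wall-clock time in seconds over `repeats` runs of
    np.dot on two random n x n float64 matrices."""
    rng = np.random.RandomState(seed)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # The timings quoted in this message used n = 4000; a smaller size is
    # used here so the sketch finishes quickly even with a slow BLAS.
    n = 1000
    seconds = time_matmul(n)
    print("n=%d: %.3fs, %.2f GFLOPS" % (n, seconds, gflops(n, seconds)))
```

With the numbers from this message, gflops(4000, 3.93) comes out at about 32.6, which is where the 32 GFLOPS figure comes from. The thread count is not controlled here; that depends on the BLAS library and its environment settings.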
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion