Hi Michał,

Michał Górny <mgo...@gentoo.org> writes:

> On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
>> Different BLAS/LAPACK implementations are expected to be compatible
>> with each other at both the API and ABI level. They can be used as
>> drop-in replacements for one another. This sounds nice, but the
>> difference in SONAME has hampered the Gentoo integration of the
>> well-optimized ones.
>
> If SONAMEs are different, then they are not compatible by definition.

This BLAS/LAPACK SONAME difference is a special case.  They are partially
compatible in the sense that every alternative implementation of BLAS is
a superset of the reference one.

Therefore, linking against the reference implementation at build time
ensures compatibility with the alternative implementations, even though
their SONAMEs differ.
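
To make the superset argument concrete, here is a minimal sketch in
Python.  The symbol sets are hand-written, hypothetical stand-ins rather
than anything extracted from the real libraries; in practice they could
be collected with something like `nm -D --defined-only`.  The helper
names are made up for illustration as well.

```python
def missing_symbols(reference, candidate):
    """Return the reference symbols that the candidate does not export."""
    return sorted(set(reference) - set(candidate))


def is_drop_in(reference, candidate):
    """True if the candidate exports every symbol of the reference."""
    return not missing_symbols(reference, candidate)


# Hypothetical, heavily shortened symbol lists (assumption, not real data).
reference_blas = {"sgemm_", "dgemm_", "saxpy_"}
optimized_blas = {"sgemm_", "dgemm_", "saxpy_", "openblas_set_num_threads"}

# A program linked against the reference only needs the reference symbols,
# so an implementation exporting a superset can stand in for it.
print(is_drop_in(reference_blas, optimized_blas))

# The reverse does not hold: the extra symbol is missing from the reference.
print(missing_symbols(optimized_blas, reference_blas))
```

This only compares symbol names.  It says nothing about calling
conventions or argument types, which still have to match.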

>> [...]
>> 
>> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
>> is good at dealing with drop-in replacements as well. My preliminary
>> investigation suggests that eselect is enough for enabling BLAS/LAPACK
>> runtime switching. Hence, the proposed solution is eselect-based:
>> 
>>   * Every BLAS/LAPACK implementation should provide generic library
>>     and eselect candidate libraries at the same time. Taking netlib,
>>     BLIS and OpenBLAS as examples:
>> 
>>     reference:
>> 
>>       usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>>         -- default BLAS provider
>>         -- candidate of the eselect "blas" unit
>>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>
> /usr/lib64 is not supposed to be modified by eselect, it's package
> manager area.  Yes, I know a lot of modules still do that but that's no
> reason to make things worse when people are putting significant effort
> to actually improve things.

Sorry, I didn't see your reply before mine.

We are going to use the LDPATH and ld.so.conf mechanism that you
suggested.

>>       usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>>         -- default LAPACK provider
>>         -- candidate of the eselect "lapack" unit
>>         -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>> 
>>     blis (doesn't provide LAPACK):
>>       
>>       usr/lib64/libblis.so.2  (SONAME=libblis.so.2)
>>         -- general purpose
>> 
>>       usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>>         -- candidate of the eselect "blas" unit
>>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>>         -- compiled from the same set of object files as libblis.so.2
>> 
>>     openblas:
>>           
>>       usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>>         -- general purpose
>> 
>>       usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>>         -- candidate of the eselect "blas" unit
>>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>>         -- compiled from the same set of object files as libopenblas.so.0
>> 
>>       usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>>         -- candidate of the eselect "lapack" unit
>>         -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>>         -- compiled from the same set of object files as libopenblas.so.0
>> 
>> This solution is similar to Debian's[3]. This solution achieves our
>> goal, and it requires us to patch upstream build systems (as Debian
>> does). Preliminary demonstration for this solution is available, see
>> below.
>
> So basically the three walls of text say in round-about way that you're
> going to introduce custom hacks to recompile libraries with different
> SONAME.  Ok.

>> Is this solution reliable?
>> --------------------------
>> 
>> * A similar solution has been used by Debian for many years.
>> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>>   (See Julia's standard library: LinearAlgebra)
>> 
>> Proposed Changes
>> ----------------
>> 
>> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>>    main repo. They use exactly the same source tarball. It's not quite
>>    helpful to package these components in a fine-grained manner. A
>>    single sci-libs/lapack package is enough.
>
> Where's the gain in that?

>> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>>    harmful when "blas" and "cblas" point to different implementations.
>>    That means "app-eselect/eselect-cblas" should be deprecated.
>> 
>> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>>    will be registered in their dependency information.
>> 
>> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to work
>> with these changes correctly without change. For example, my local
>> numpy-1.16.1 compilation was successful without change.
>> 
>> Preliminary Demonstration
>> -------------------------
>> 
>> The preliminary implementation is available in my personal overlay[4].
>> A simple sanity test script `check-cpp.sh` is provided to illustrate
>> the effectiveness of the proposed solution.
>> 
>> The script `check-cpp.sh` compiles two C++ programs -- one calls general
>> matrix-matrix multiplication from BLAS, while another one calls general
>> singular value decomposition from LAPACK. Once compiled, this script
>> will switch different BLAS/LAPACK implementations and run the C++
>> programs without recompilation.
>> 
>> The preliminary result is available here[5]. (CPU=Power9, ARCH=ppc64le)
>> From the experimental results, we find that
>> 
>>   For (512x512) single precision matrix multiplication:
>>    * reference BLAS takes ~360 ms
>>    * BLIS takes ~70 ms
>>    * OpenBLAS takes ~10 ms
>> 
>>   For (512x512) single precision singular value decomposition:
>>    * reference LAPACK takes ~1900 ms
>>    * BLIS (+reference LAPACK) takes ~1500 ms
>>    * OpenBLAS takes ~1100 ms
>> 
>> The difference in computation speed illustrates the effectiveness of
>> the proposed solution. Theoretically, any other package could take
>> advantage of this solution without any recompilation as long as
>> it's linked against a library with SONAME.
>
> An actual ABI compliance test, e.g. done using abi-compliance-checker
> would be more interesting.

As noted above, the symbols don't need to be a 1:1 copy of each other.
Any library that is a superset of the reference one will work.

>> Acknowledgement
>> ---------------
>> This is an on-going GSoC-2019 project:
>> https://summerofcode.withgoogle.com/projects/?sp-page=2#6268942782300160
>
> It would probably have been better if the project was discussed before
> GSoC.  I'm really against pushing a bad idea forward just because
> someone set it for GSoC without discussing it first.

We should have involved you in the preparation phase.  By no means are
we going to use GSoC as a shield to push forward bad ideas.  If you
have objections, just speak up.  It's never too late to do so.

Yours,
Benda
