Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)
Hi Michał,

Sorry for the late reply. I just ran into a severe hardware failure.

On 2019-06-13 07:49, Michał Górny wrote:
>>
>> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
>> are based on exactly the same source tarball, and maintaining 4 ebuild
>> files for a single tarball is not a good choice IMHO. Those old ebuild
>> files seem to leverage the flexibility of the upstream build system
>> because it enables one to, for example, skip the reference blas build
>> and use an existing optimized BLAS implementation and hence introduce
>> flexibility. That flexibility is hard to maintain and is not necessary
>> anymore with the new runtime switching mechanism.
>>
>> That's why I propose to merge the 4 ebuilds into a single one:
>> sci-libs/lapack. We don't need to add the "reference" postfix
>> because no upstream will take the name "lapack". When talking
>> about "lapack" it's always the reference implementation.
>
> What's the real gain here, and how does it compare to loss of
> flexibility of being able to build only what the package in question
> needs?

First let's see what these 4 components are:

1. blas: written in Fortran, provides fundamental linear algebra
   routines. libblas.so can work alone.
2. cblas: a thin C wrapper around the Fortran blas. That means
   libcblas.so calls libblas.so for the real calculation.
3. lapack: written in Fortran, frequently calls BLAS to implement
   higher-level linear algebra routines. liblapack.so needs
   libblas.so (Fortran).
4. lapacke: a thin C wrapper around the Fortran lapack.
   liblapacke.so needs liblapack.so.

The real gain from merging 4 ebuilds into 1 ebuild:

1. It is easier to maintain. Updating 4 ebuilds on every single version
   bump is much harder than updating only 1. This will also make it
   easier to provide and maintain the virtual-* features in the long run.

2. It avoids confusing or even potentially problematic setups. For
   example, suppose a user happened to compile OpenBLAS as the libblas
   provider and BLIS as the libcblas provider:

     appA -> libblas (OpenBLAS)
     appB -> libcblas (BLIS)
     appC -> liblapacke (Ref) -> liblapack (Ref) -> libblas (OpenBLAS)
          -> libcblas.so (BLIS)

   The user will be confused about which BLAS is really doing the
   calculation. Plus, mixing threading models may cause poor
   performance (e.g. openmp + pthread) or even silent corruption
   (e.g. GNU openmp + Intel openmp). Merging cblas into blas, and
   lapacke into lapack, will make it harder to get things wrong.

IMHO the flexibility mentioned above is not really necessary. Any
scientific computing user who needs performance and dislikes the
virtual-* solution can link their programs directly against MKL or
OpenBLAS without thinking about the reference blas, because both MKL
and OpenBLAS provide the full set of blas, cblas, lapack and lapacke
API and ABI via a single shared object.

Plus, that flexibility can be replaced by the proposed runtime
switching solution: by changing the blas (cblas) selection,
liblapack.so can be dynamically linked against different optimized
implementations. Discarding this flexibility will only affect users
who insist on linking an unoptimized lapack against a specific blas
implementation.

One may also get into trouble with such flexibility, e.g.:

  libcblas (Reference) -> libblas.so (reference)
  liblapack (Reference) -> libopenblas.so
  appC -> (liblapacke, libcblas)
    --> liblapacke -> liblapack -> libopenblas
    --> libcblas (reference)

libopenblas's ABI is a superset of libcblas's, so both libraries export
the same symbols, which invites confusion and a race over which
definition gets used at run time. With the proposed (redesigned)
solution, these potentially bad cases can be avoided because the
solution tries to keep the backend consistent.

Some people had headaches over the BLAS/LAPACK flexibility, and they
created flexiblas.

In a word, the (4->1) change can reduce the maintenance cost of
(blas,cblas,lapack,lapacke) and make the virtual-* feature easier to
implement and maintain in the long run. Additionally, the flexibility
mentioned before is not really necessary once the virtual-* feature is
fully implemented.

Best,
Mo.
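The mixed-backend case described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `NEEDED` and `PROVIDER` are hypothetical hand-written data that mirror the first example (OpenBLAS selected for libblas, BLIS for libcblas), and nothing here inspects real shared objects.

```python
def reachable(start, needed):
    """Return every library reachable from `start` through `needed` edges."""
    seen, stack = set(), list(start)
    while stack:
        lib = stack.pop()
        if lib not in seen:
            seen.add(lib)
            stack.extend(needed.get(lib, ()))
    return seen


def blas_backends(start, needed, provider):
    """Collect the providers of libblas/libcblas an application ends up using."""
    return {provider[lib] for lib in reachable(start, needed)
            if lib in ("libblas", "libcblas")}


# Hypothetical data mirroring the example: which library needs which,
# and which implementation currently provides libblas and libcblas.
NEEDED = {"liblapacke": ["liblapack"], "liblapack": ["libblas"]}
PROVIDER = {"libblas": "OpenBLAS", "libcblas": "BLIS"}

if __name__ == "__main__":
    apps = {"appA": ["libblas"], "appC": ["liblapacke", "libcblas"]}
    for app, links in apps.items():
        backends = blas_backends(links, NEEDED, PROVIDER)
        status = "mixed" if len(backends) > 1 else "consistent"
        print(app, sorted(backends), status)
```

With a single merged "blas" selection, `PROVIDER` would map both libblas and libcblas to the same implementation, so the set returned for any application would contain one backend.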
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)
On Thu, 2019-06-13 at 00:15 -0700, Mo Zhou wrote:
> Hi Gentoo devs,
>
> I redesigned the solution for BLAS/LAPACK runtime switching.
> The new solution is based on eselect + ld.so.conf. See below.
>
> > Goal
> > ----
> >
> > * When a program is linked against libblas.so or liblapack.so
> >   provided by any BLAS/LAPACK provider, the eselect-based solution
> >   will allow user to switch the underlying library without recompiling
> >   anything.
>
> Instead of manipulating symlinks, I wrote a dedicated eselect module
> for BLAS:
>
> https://github.com/cdluminate/my-overlay/blob/master/app-eselect/eselect-blas/files/blas.eselect-0.2
>
> This implementation generates a corresponding ld.so.conf file
> on switching and refreshes the ld.so cache.
>
> Advantages:
>
> 1. It no longer manipulates symlinks under package manager directories,
>    see https://bugs.gentoo.org/531842 and https://bugs.gentoo.org/632624
>
> 2. We don't have to think about static lib and header switching like
>    Debian does.
>
> > * When a program is linked against a specific implementation, e.g.
> >   libmkl_rt.so, the solution doesn't break anything.
>
> This still holds with the new solution.
>
> > Solution
> > --------
> >
> > Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> > is good at dealing with drop-in replacements as well. My preliminary
>
> The redesigned solution diverges completely from Debian's solution.
>
> * sci-libs/lapack provides the standard API and ABI for BLAS/CBLAS/LAPACK
>   and LAPACKE. It provides a default set of libblas.so, libcblas.so,
>   and liblapack.so. Reverse dependencies linked against these three
>   libraries (reference blas) will take advantage of the runtime
>   switching mechanism through USE="virtual-blas virtual-lapack".
>   Reverse dependencies linked to specific implementations such as
>   libopenblas.so won't be affected at all.
>
> * Every non-standard BLAS/LAPACK implementation can be registered
>   as an alternative via USE="virtual-blas virtual-lapack". Once the
>   virtual-* flags are enabled, the ebuild will build some
>   extra shared objects with the correct SONAME.
>
>   For example:
>
>     /usr/lib64/libblis.so.2 (SONAME=libblis.so.2, general purpose)
>     /usr/lib64/blas/blis/libblas.so.3 (USE="virtual-blas",
>                                        SONAME=libblas.so.3)
>     /usr/lib64/blas/blis/libcblas.so.3 (USE="virtual-blas",
>                                         SONAME=libcblas.so.3)
>
> * Reverse dependencies of BLAS/LAPACK can optionally provide the
>   "virtual-blas virtual-lapack" USE flags.
>
>     if use virtual-*:
>         link against reference blas/lapack
>     else:
>         link against whatever the ebuild maintainer likes and skip
>         the switching mechanism
>
> > Proposed Changes
> > ----------------
> >
> > 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
> >    main repo. They use exactly the same source tarball. It's not quite
> >    helpful to package these components in a fine-grained manner. A
> >    single sci-libs/lapack package is enough.
>
> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
> are based on exactly the same source tarball, and maintaining 4 ebuild
> files for a single tarball is not a good choice IMHO. Those old ebuild
> files seem to leverage the flexibility of the upstream build system
> because it enables one to, for example, skip the reference blas build
> and use an existing optimized BLAS implementation and hence introduce
> flexibility. That flexibility is hard to maintain and is not necessary
> anymore with the new runtime switching mechanism.
>
> That's why I propose to merge the 4 ebuilds into a single one:
> sci-libs/lapack. We don't need to add the "reference" postfix
> because no upstream will take the name "lapack". When talking
> about "lapack" it's always the reference implementation.

What's the real gain here, and how does it compare to loss of
flexibility of being able to build only what the package in question
needs?

-- 
Best regards,
Michał Górny
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)
Hi Gentoo devs,

I redesigned the solution for BLAS/LAPACK runtime switching.
The new solution is based on eselect + ld.so.conf. See below.

> Goal
> ----
>
> * When a program is linked against libblas.so or liblapack.so
>   provided by any BLAS/LAPACK provider, the eselect-based solution
>   will allow user to switch the underlying library without recompiling
>   anything.

Instead of manipulating symlinks, I wrote a dedicated eselect module
for BLAS:

https://github.com/cdluminate/my-overlay/blob/master/app-eselect/eselect-blas/files/blas.eselect-0.2

This implementation generates a corresponding ld.so.conf file
on switching and refreshes the ld.so cache.

Advantages:

1. It no longer manipulates symlinks under package manager directories,
   see https://bugs.gentoo.org/531842 and https://bugs.gentoo.org/632624

2. We don't have to think about static lib and header switching like
   Debian does.

> * When a program is linked against a specific implementation, e.g.
>   libmkl_rt.so, the solution doesn't break anything.

This still holds with the new solution.

> Solution
> --------
>
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary

The redesigned solution diverges completely from Debian's solution.

* sci-libs/lapack provides the standard API and ABI for BLAS/CBLAS/LAPACK
  and LAPACKE. It provides a default set of libblas.so, libcblas.so,
  and liblapack.so. Reverse dependencies linked against these three
  libraries (reference blas) will take advantage of the runtime
  switching mechanism through USE="virtual-blas virtual-lapack".
  Reverse dependencies linked to specific implementations such as
  libopenblas.so won't be affected at all.

* Every non-standard BLAS/LAPACK implementation can be registered
  as an alternative via USE="virtual-blas virtual-lapack". Once the
  virtual-* flags are enabled, the ebuild will build some
  extra shared objects with the correct SONAME.

  For example:

    /usr/lib64/libblis.so.2 (SONAME=libblis.so.2, general purpose)
    /usr/lib64/blas/blis/libblas.so.3 (USE="virtual-blas",
                                       SONAME=libblas.so.3)
    /usr/lib64/blas/blis/libcblas.so.3 (USE="virtual-blas",
                                        SONAME=libcblas.so.3)

* Reverse dependencies of BLAS/LAPACK can optionally provide the
  "virtual-blas virtual-lapack" USE flags.

    if use virtual-*:
        link against reference blas/lapack
    else:
        link against whatever the ebuild maintainer likes and skip
        the switching mechanism

> Proposed Changes
> ----------------
>
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
are based on exactly the same source tarball, and maintaining 4 ebuild
files for a single tarball is not a good choice IMHO. Those old ebuild
files seem to leverage the flexibility of the upstream build system
because it enables one to, for example, skip the reference blas build
and use an existing optimized BLAS implementation and hence introduce
flexibility. That flexibility is hard to maintain and is not necessary
anymore with the new runtime switching mechanism.

That's why I propose to merge the 4 ebuilds into a single one:
sci-libs/lapack. We don't need to add the "reference" postfix
because no upstream will take the name "lapack". When talking
about "lapack" it's always the reference implementation.

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.

eselect-cblas should be deprecated. That affects gsl because it is
registered as a cblas alternative. gsl doesn't provide the standard BLAS
(Fortran) API+ABI, so it will not be added as a runtime switching
candidate.

Does this redesigned solution look acceptable now?

Best,
Mo.
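The "generate an ld.so.conf file on switching" step can be pictured with a small Python sketch. It is not the eselect module linked above (that one is a shell script); the function name `render_ld_conf` and its parameters are made up for illustration, and the directory layout `<libdir>/<kind>/<provider>` is assumed from the example paths in this thread. The sketch only builds the text; writing it to a file and refreshing the cache with ldconfig are left out.

```python
from pathlib import PurePosixPath


def render_ld_conf(provider, kinds=("blas", "lapack"), libdir="/usr/lib64"):
    """Return ld.so.conf text with one candidate directory per line."""
    dirs = [str(PurePosixPath(libdir) / kind / provider) for kind in kinds]
    return "\n".join(dirs) + "\n"


if __name__ == "__main__":
    # BLIS ships no LAPACK, so only its "blas" directory is listed.
    print(render_ld_conf("blis", kinds=("blas",)), end="")
    # OpenBLAS provides both, so both directories are listed.
    print(render_ld_conf("openblas"), end="")
```

Because the dynamic loader searches the listed directory for libblas.so.3, switching providers amounts to rewriting this one fragment, and nothing under the package manager's directories has to be touched.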
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
Hi David,

David Seifert writes:

>> > An actual ABI compliance test, e.g. done using
>> > abi-compliance-checker would be more interesting.
>>
>> As said above, the symbols don't need to be 1-1 copy of each other.
>> Any library which is a superset of the reference one will work.
>
> Again, I'm willing to accept this under a USE="lapack_targets_virtual"
> configuration,

I hear you, and I used to hold the same opinion. Nevertheless, if a
runtime switch works, it is simpler than USE_EXPAND. We already have a
couple of them: PYTHON_TARGETS, RUBY_TARGETS, etc. I don't want to add
more to this list unless absolutely necessary.

Alternatively, it could be exclusive USE flags instead of USE_EXPAND.
That's possible, but we would need to add a lot of USE flags: OpenBLAS,
blis, reference, atlas, nvBLAS, etc. I don't think it is a clean
solution.

> but wholesale editing of DT_NEEDED entries is definitely too scary and
> too invasive for most non-sci/hpc users of Gentoo.

Sorry if I confused you. There is no hacking of DT_NEEDED involved. An
application is compiled against the reference, then it is pointed by
LDPATH and ld.so.conf to a drop-in replacement library, and it profits.
That's no more than upgrading a dynamic library. The scheme was shown
to work with gcc runtime, libstdc++, and opengl in Gentoo.

> Again, for 99% of users, OpenBLAS will be the right trade-off between
> performance and customizability. Every recompilation of libreoffice or
> chromium will devour more CPU cycles than switching between USE-flag
> implementations.

I understand your point. Let me elaborate:

1. OpenBLAS has quite a bit of hand-tuned assembly, so it is not very
   portable. Special care is needed to manage the default BLAS on
   profiles like ppc64, arm64, arm, riscv, etc.

2. All Gentoo users care about optimization, and there are non-science
   applications that call linear algebra routines.

3. Maintaining different BLAS frameworks in repo/gentoo.git and
   proj/sci.git wastes everybody's time, and causes endless confusion
   to users. That offsets the time saved by making OpenBLAS the global
   default.

Therefore, from the users' point of view, I still think runtime
switching makes more sense.

Yours,
Benda
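The "superset of the reference" condition quoted above can be checked roughly as follows. This is a minimal sketch, assuming the two input strings are the output of `nm -D --defined-only` run on the reference library and on the candidate; the helper names and the sample text (addresses included) are invented for illustration, and a symbol-name comparison says nothing about calling conventions or data layout.

```python
def defined_symbols(nm_output):
    """Extract defined symbol names from `nm -D --defined-only` style text."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Expected shape: <address> <type letter> <name>; keep functions,
        # weak symbols and indirect functions.
        if len(fields) >= 3 and fields[-2] in {"T", "W", "i"}:
            # Drop a symbol-version suffix such as "@@VERSION" if present.
            symbols.add(fields[-1].split("@")[0])
    return symbols


def missing_symbols(reference_nm, candidate_nm):
    """Symbols the reference defines but the candidate does not."""
    return defined_symbols(reference_nm) - defined_symbols(candidate_nm)


# Made-up sample text standing in for real nm output.
REFERENCE = """\
0000000000001000 T daxpy_
0000000000002000 T dgemm_
"""
CANDIDATE = """\
0000000000011000 T daxpy_
0000000000012000 T dgemm_
0000000000013000 T cblas_dgemm
"""

if __name__ == "__main__":
    missing = missing_symbols(REFERENCE, CANDIDATE)
    print("drop-in candidate" if not missing else sorted(missing))
```

An empty result means every symbol of the reference is also exported by the candidate, which is the one-directional check the superset argument relies on; extra symbols in the candidate are ignored on purpose.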
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
On Wed, 2019-05-29 at 22:33 +0800, Benda Xu wrote:
> Hi Michał,
>
> Michał Górny writes:
>
> > On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> > > Different BLAS/LAPACK implementations are expected to be
> > > compatible with each other at both the API and ABI level. They
> > > can be used as drop-in replacements for the others. This sounds
> > > nice, but the difference in SONAME hampered the gentoo
> > > integration of well-optimized ones.
> >
> > If SONAMEs are different, then they are not compatible by
> > definition.
>
> This blas/lapack SONAME difference is a special case. They are
> partially compatible in the sense that every alternative
> implementation of blas is a superset of the reference one.
>
> Therefore linking to the reference at build time will ensure
> compatibility with the alternative implementations, even with
> different SONAMEs.
>
> > > [...]
> > >
> > > Similar to Debian's update-alternatives mechanism, Gentoo's
> > > eselect is good at dealing with drop-in replacements as well. My
> > > preliminary investigation suggests that eselect is enough for
> > > enabling BLAS/LAPACK runtime switching. Hence, the proposed
> > > solution is eselect-based:
> > >
> > > * Every BLAS/LAPACK implementation should provide generic library
> > >   and eselect candidate libraries at the same time. Taking
> > >   netlib, BLIS and OpenBLAS as examples:
> > >
> > >   reference:
> > >
> > >     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- default BLAS provider
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> >
> > /usr/lib64 is not supposed to be modified by eselect, it's package
> > manager area. Yes, I know a lot of modules still do that but
> > that's no reason to make things worse when people are putting
> > significant effort to actually improve things.
>
> Sorry, I didn't see your reply before mine.
>
> We are going to use the LDPATH and ld.so.conf mechanism suggested by
> you.
>
> > >     usr/lib64/lapack/reference/liblapack.so.3
> > >     (SONAME=liblapack.so.3)
> > >       -- default LAPACK provider
> > >       -- candidate of the eselect "lapack" unit
> > >       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> > >
> > >   blis (doesn't provide LAPACK):
> > >
> > >     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
> > >       -- general purpose
> > >
> > >     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > >       -- compiled from the same set of object files as
> > >          libblis.so.2
> > >
> > >   openblas:
> > >
> > >     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
> > >       -- general purpose
> > >
> > >     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > >       -- compiled from the same set of object files as
> > >          libopenblas.so.0
> > >
> > >     usr/lib64/lapack/openblas/liblapack.so.3
> > >     (SONAME=liblapack.so.3)
> > >       -- candidate of the eselect "lapack" unit
> > >       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> > >       -- compiled from the same set of object files as
> > >          libopenblas.so.0
> > >
> > > This solution is similar to Debian's[3]. This solution achieves
> > > our goal, and it requires us to patch upstream build systems
> > > (same as Debian). Preliminary demonstration for this solution is
> > > available, see below.
> >
> > So basically the three walls of text say in round-about way that
> > you're going to introduce custom hacks to recompile libraries with
> > different SONAME. Ok.
> >
> > > Is this solution reliable?
> > > --------------------------
> > >
> > > * A similar solution has been used by Debian for many years.
> > > * Many projects call BLAS/LAPACK libraries through FFI, including
> > >   Julia. (See Julia's standard library: LinearAlgebra)
> > >
> > > Proposed Changes
> > > ----------------
> > >
> > > 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from
> > >    gentoo main repo. They use exactly the same source tarball.
> > >    It's not quite helpful to package these components in a
> > >    fine-grained manner. A single sci-libs/lapack package is
> > >    enough.
> >
> > Where's the gain in that?
> >
> > > 2. Merge the "cblas" eselect unit into "blas" unit. It is
> > >    potentially harmful when "blas" and "cblas" point to different
> > >    implementations. That means "app-eselect/eselect-cblas" should
> > >    be deprecated.
> > >
> > > 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK
> > >    providers
> > >    wi
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
Hi Michał,

Michał Górny writes:

> On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
>> Different BLAS/LAPACK implementations are expected to be compatible
>> with each other at both the API and ABI level. They can be used as
>> drop-in replacements for the others. This sounds nice, but the
>> difference in SONAME hampered the gentoo integration of
>> well-optimized ones.
>
> If SONAMEs are different, then they are not compatible by definition.

This blas/lapack SONAME difference is a special case. They are
partially compatible in the sense that every alternative implementation
of blas is a superset of the reference one.

Therefore linking to the reference at build time will ensure
compatibility with the alternative implementations, even with different
SONAMEs.

>> [...]
>>
>> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
>> is good at dealing with drop-in replacements as well. My preliminary
>> investigation suggests that eselect is enough for enabling BLAS/LAPACK
>> runtime switching. Hence, the proposed solution is eselect-based:
>>
>> * Every BLAS/LAPACK implementation should provide generic library
>>   and eselect candidate libraries at the same time. Taking netlib,
>>   BLIS and OpenBLAS as examples:
>>
>>   reference:
>>
>>     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>>       -- default BLAS provider
>>       -- candidate of the eselect "blas" unit
>>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>
> /usr/lib64 is not supposed to be modified by eselect, it's package
> manager area. Yes, I know a lot of modules still do that but that's no
> reason to make things worse when people are putting significant effort
> to actually improve things.

Sorry, I didn't see your reply before mine. We are going to use the
LDPATH and ld.so.conf mechanism suggested by you.

>>     usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>>       -- default LAPACK provider
>>       -- candidate of the eselect "lapack" unit
>>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>>
>>   blis (doesn't provide LAPACK):
>>
>>     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
>>       -- general purpose
>>
>>     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>>       -- candidate of the eselect "blas" unit
>>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>>       -- compiled from the same set of object files as libblis.so.2
>>
>>   openblas:
>>
>>     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>>       -- general purpose
>>
>>     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>>       -- candidate of the eselect "blas" unit
>>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>>       -- compiled from the same set of object files as
>>          libopenblas.so.0
>>
>>     usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>>       -- candidate of the eselect "lapack" unit
>>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>>       -- compiled from the same set of object files as
>>          libopenblas.so.0
>>
>> This solution is similar to Debian's[3]. This solution achieves our
>> goal, and it requires us to patch upstream build systems (same as
>> Debian). Preliminary demonstration for this solution is available,
>> see below.
>
> So basically the three walls of text say in round-about way that you're
> going to introduce custom hacks to recompile libraries with different
> SONAME. Ok.
>
>> Is this solution reliable?
>> --------------------------
>>
>> * A similar solution has been used by Debian for many years.
>> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>>   (See Julia's standard library: LinearAlgebra)
>>
>> Proposed Changes
>> ----------------
>>
>> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>>    main repo. They use exactly the same source tarball. It's not quite
>>    helpful to package these components in a fine-grained manner. A
>>    single sci-libs/lapack package is enough.
>
> Where's the gain in that?
>
>> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>>    harmful when "blas" and "cblas" point to different implementations.
>>    That means "app-eselect/eselect-cblas" should be deprecated.
>>
>> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>>    will be registered in their dependency information.
>>
>> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to work
>> with these changes correctly without change. For example, my local
>> numpy-1.16.1 compilation was successful without change.
>>
>> Preliminary Demonstration
>> -------------------------
>>
>> The preliminary implementation is available in my personal overlay[4].
>> A simple sanity test script `check-cpp.sh` is provided to illustrate
>> the effectiveness of the proposed solu
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
Dear David,

David Seifert writes:

> We already have such a solution in the sci-overlay. It has proven
> extremely brittle and shaky.

What's more, using eselect to set which library to link to was regarded
as harmful.

> The plan is to do this via USE flags similar to python-single-r1
> flags.

Yes, that was the conclusion in 2017. But now there is something new
to consider, see below.

> Optionally, we can leave a "virtual" USE flag, where users can switch
> implementation at runtime, but this will not be a supported
> configuration.

[...]

> While I understand that runtime-switching sounds like a great feature,
> it has proven too difficult to get right and too hard to enforce
> invariants on correct symlinks.

I thought so, too. Back in 2017, I investigated Debian's runtime
switching and concluded the patches were too heavy for us. That was
the main reason we did not consider the Debian runtime switching
way.[0, 1]

However, I have made a deeper study this year, and concluded it is
quite doable in a simple way in Gentoo. Historically, in
alternative{,-2}.eclass of the science overlay, "alternative" is a name
from Debian[2]. We have made it too complex over the years, and now
there is a good chance to make it simple again.

> People that want this can go the virtual+eselect approach in the
> overlay,

Which is brittle and shaky, as we agree.

> but 99% of Gentoo users will be happy with just linking against
> OpenBLAS/reference-lapack and not having to fix weird stale symlinks
> that eselect-alternatives somehow lost track of.

With the simplified approach Mo is after, there will no longer be weird
stale symlink issues. The approach has been proven by the
update-alternatives mechanism in Debian. Actually, by adopting mgorny's
LDPATH way of handling gcc runtime, we don't need symlinks at all![3]

For more details, please refer to
https://people.debian.org/~lumin/gsoc19-gentoo.pdf

> See also:
> https://bugs.gentoo.org/632624
> https://bugs.gentoo.org/348843#c30

Thanks. That reminded me of all the discussions we have had all along.

Yours,
Benda

0. https://github.com/gentoo/sci/issues/805#issuecomment-345690034
1. https://github.com/gentoo/sci/issues/805#issuecomment-345701064
2. https://github.com/gentoo/sci/issues/805#issuecomment-332887691
3. https://bugs.gentoo.org/632618
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> Different BLAS/LAPACK implementations are expected to be compatible
> with each other at both the API and ABI level. They can be used as
> drop-in replacements for the others. This sounds nice, but the
> difference in SONAME hampered the gentoo integration of
> well-optimized ones.

If SONAMEs are different, then they are not compatible by definition.

> Assume a Gentoo user compiled a pile of packages on top of the
> reference BLAS and LAPACK, namely these reverse dependencies are
> linked against libblas.so.3 and liblapack.so.3 . When the user
> discovered that OpenBLAS provides much better performance, they'll
> have to recompile the whole reverse dependency tree in order to take
> advantage of OpenBLAS, because the SONAME of OpenBLAS is
> libopenblas.so.0 . When the user wants to try MKL (libmkl_rt.so),
> they'll have to recompile the whole reverse dependency tree again.
>
> This is not friendly to our earth.
>
> Goal
> ----
>
> * When a program is linked against libblas.so or liblapack.so
>   provided by any BLAS/LAPACK provider, the eselect-based solution
>   will allow user to switch the underlying library without recompiling
>   anything.
>
> * When a program is linked against a specific implementation, e.g.
>   libmkl_rt.so, the solution doesn't break anything.
>
> Solution
> --------
>
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary
> investigation suggests that eselect is enough for enabling BLAS/LAPACK
> runtime switching. Hence, the proposed solution is eselect-based:
>
> * Every BLAS/LAPACK implementation should provide generic library
>   and eselect candidate libraries at the same time. Taking netlib,
>   BLIS and OpenBLAS as examples:
>
>   reference:
>
>     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>       -- default BLAS provider
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect

/usr/lib64 is not supposed to be modified by eselect; it's package
manager area. Yes, I know a lot of modules still do that, but that's no
reason to make things worse when people are putting significant effort
into actually improving things.

>     usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- default LAPACK provider
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>
>   blis (doesn't provide LAPACK):
>
>     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
>       -- general purpose
>
>     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as libblis.so.2
>
>   openblas:
>
>     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>       -- general purpose
>
>     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
>     usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
> This solution is similar to Debian's[3]. This solution achieves our
> goal, and it requires us to patch upstream build systems (same as
> Debian). Preliminary demonstration for this solution is available,
> see below.

So basically the three walls of text say in a roundabout way that
you're going to introduce custom hacks to recompile libraries with a
different SONAME. Ok.

> Is this solution reliable?
> --------------------------
>
> * A similar solution has been used by Debian for many years.
> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>   (See Julia's standard library: LinearAlgebra)
>
> Proposed Changes
> ----------------
>
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

Where's the gain in that?

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.
>
> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>    will be registered in their dependency information.
>
> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to
Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> Hi Gentoo devs,
>
> Classical numerical linear algebra libraries, BLAS[1] and LAPACK[2],
> play important roles in the scientific computing field, as many
> software packages such as Numpy, Scipy, Julia, Octave and R are built
> upon them.
>
> There is a standard implementation of BLAS and LAPACK, named netlib
> or simply "reference implementation". This implementation has been
> provided by gentoo's main repo. However, it has a major problem:
> performance. On the other hand, a number of well-optimized BLAS/LAPACK
> implementations exist, including OpenBLAS (free), BLIS (free),
> MKL (non-free), etc., but none of them has been properly integrated
> into the Gentoo distribution.
>
> I'm writing to propose a good solution to this problem. If no gentoo
> developer objects to this proposal, I'll keep moving forward and
> start submitting PRs to Gentoo main repo.
>
> Historical Obstacle
> -------------------
>
> Different BLAS/LAPACK implementations are expected to be compatible
> with each other at both the API and ABI level. They can be used as
> drop-in replacements for the others. This sounds nice, but the
> difference in SONAME hampered the gentoo integration of
> well-optimized ones.
>
> Assume a Gentoo user compiled a pile of packages on top of the
> reference BLAS and LAPACK, namely these reverse dependencies are
> linked against libblas.so.3 and liblapack.so.3 . When the user
> discovered that OpenBLAS provides much better performance, they'll
> have to recompile the whole reverse dependency tree in order to take
> advantage of OpenBLAS, because the SONAME of OpenBLAS is
> libopenblas.so.0 . When the user wants to try MKL (libmkl_rt.so),
> they'll have to recompile the whole reverse dependency tree again.
>
> This is not friendly to our earth.
>
> Goal
> ----
>
> * When a program is linked against libblas.so or liblapack.so
>   provided by any BLAS/LAPACK provider, the eselect-based solution
>   will allow user to switch the underlying library without
>   recompiling anything.
>
> * When a program is linked against a specific implementation, e.g.
>   libmkl_rt.so, the solution doesn't break anything.
>
> Solution
> --------
>
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary
> investigation suggests that eselect is enough for enabling BLAS/LAPACK
> runtime switching. Hence, the proposed solution is eselect-based:
>
> * Every BLAS/LAPACK implementation should provide generic library
>   and eselect candidate libraries at the same time. Taking netlib,
>   BLIS and OpenBLAS as examples:
>
>   reference:
>
>     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>       -- default BLAS provider
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>
>     usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- default LAPACK provider
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>
>   blis (doesn't provide LAPACK):
>
>     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
>       -- general purpose
>
>     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as libblis.so.2
>
>   openblas:
>
>     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>       -- general purpose
>
>     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
>     usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
> This solution is similar to Debian's[3]. This solution achieves our
> goal, and it requires us to patch upstream build systems (same as
> Debian). Preliminary demonstration for this solution is available,
> see below.
>
> Is this solution reliable?
> --------------------------
>
> * A similar solution has been used by Debian for many years.
> * Many projects call BLAS/LAPACK libraries through FFI, including
>   Julia. (See Julia's standard library: LinearAlgebra)
>
> Proposed Changes
> ----------------
>
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from
>    gentoo main repo. They use exactly the same source tarball. It's
>    not quite helpful to package these components in a fine-grained
>    manner. A single
>    sci-l