Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)

2019-06-17 Thread Mo Zhou
Hi Michał,

Sorry for the late reply. Just encountered some severe hardware failure.

On 2019-06-13 07:49, Michał Górny wrote:
>>
>> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
>> are based on exactly the same source tarball, and maintaining 4 ebuild
>> files for a single tarball is not a good choice IMHO. Those old ebuild
>> files seem to leverage the flexibility of the upstream build system
>> because it enables one to, for example, skip the reference blas build
>> and use an existing optimized BLAS implementation and hence introduce
>> flexibility. That flexibility is hard to maintain and is not necessary
>> anymore with the new runtime switching mechanism.
>>
>> That's why I propose to merge the 4 ebuilds into a single one:
>> sci-libs/lapack. We don't need to add the "reference" suffix
>> because no upstream will claim the name "lapack". When talking
>> about "lapack" it's always the reference implementation.
> 
> What's the real gain here, and how does it compare to loss of
> flexibility of being able to build only what the package in question
> needs?

First, let's see what these 4 components are:
1. blas: written in Fortran, provides fundamental linear algebra
   routines. libblas.so can work alone.
2. cblas: a thin C wrapper around the Fortran blas. That means
   libcblas.so calls libblas.so for the real calculation.
3. lapack: written in Fortran, frequently calls BLAS to
   implement higher-level linear algebra routines.
   liblapack.so needs libblas.so (Fortran).
4. lapacke: a thin C wrapper around the Fortran lapack.
   liblapacke.so needs liblapack.so.

The real gain from merging 4 ebuilds into 1 ebuild:
1. It is easier to maintain: updating 4 ebuilds on every single
   version bump is much harder than updating only 1.
   This will also make it easier to provide and maintain
   the virtual-* features in the long run.
2. It avoids confusing or even potentially problematic
   setups, e.g. a user happens to select OpenBLAS as
   the libblas provider and BLIS as the libcblas provider:

   appA -> libblas (OpenBLAS)
   appB -> libcblas (BLIS)
   appC -> liblapacke (Ref) -> liblapack (Ref) -> libblas (OpenBLAS)
        -> libcblas.so (BLIS)

   The user will be confused about which BLAS
   is really doing the calculation. Plus, mixing
   threading models may cause poor performance
   (e.g. OpenMP + pthread) or even silent corruption
   (e.g. GNU OpenMP + Intel OpenMP).

   Merging cblas into blas, and lapacke into lapack
   will make it harder to get things wrong.
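
The mixed-backend case can be checked mechanically. What follows is only a minimal sketch, not part of any Gentoo tool: the dependency map and provider labels are typed in by hand from the appC example, assuming appC links both liblapacke and libcblas directly, and the names `DEPS`, `PROVIDER` and `blas_providers_reached` are hypothetical.

```python
# Hypothetical hand-written model of the appC example:
# which generic library each node links to ...
DEPS = {
    "appC": ["liblapacke", "libcblas"],
    "liblapacke": ["liblapack"],
    "liblapack": ["libblas"],
    "libblas": [],
    "libcblas": [],
}
# ... and which implementation currently provides each generic library.
PROVIDER = {
    "liblapacke": "reference",
    "liblapack": "reference",
    "libblas": "openblas",
    "libcblas": "blis",
}


def blas_providers_reached(app, deps, provider):
    """Return the set of providers behind libblas/libcblas reachable from app."""
    seen, stack, found = set(), [app], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in ("libblas", "libcblas"):
            found.add(provider[node])
        stack.extend(deps.get(node, []))
    return found


providers = blas_providers_reached("appC", DEPS, PROVIDER)
mixed = len(providers) > 1  # True here: OpenBLAS and BLIS are both in play
```

With a single "blas" unit covering both libblas and libcblas, `providers` could only ever contain one entry.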

IMHO the flexibility mentioned is not really necessary. Any
scientific computing user who needs performance and dislikes
the virtual-* solution could directly link their programs
against MKL or OpenBLAS without thinking about the reference
blas, because both MKL and OpenBLAS provide the full set
of blas, cblas, lapack and lapacke APIs and ABIs via a single
shared object. Plus, that flexibility could be replaced by the
proposed runtime switching solution: by switching
the blas (cblas) selection, liblapack.so can be dynamically
linked against different optimized implementations.

Discarding this flexibility will only affect users who
insist on linking an unoptimized lapack against a specific
blas implementation. And one may also run into trouble
with such flexibility, e.g.:

   libcblas (Reference) -> libblas.so (reference)
   liblapack (Reference) -> libopenblas.so

   appC -> (liblapacke, libcblas)
     --> liblapacke -> liblapack -> libopenblas
     --> libcblas (reference)

   libopenblas's ABI is a superset of libcblas's, so the same
   symbols are provided twice, which leads to confusion and
   symbol conflicts at run time.
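
To make the overlap concrete, here is a small sketch that reports symbols exported by more than one library loaded into the same process. The symbol lists are heavily abbreviated and typed in by hand for illustration (they are not real `nm` output), and `duplicate_symbols` is a hypothetical helper.

```python
def duplicate_symbols(exports_by_lib):
    """Map each symbol exported by more than one library to those libraries."""
    owners = {}
    for lib, symbols in exports_by_lib.items():
        for sym in symbols:
            owners.setdefault(sym, []).append(lib)
    return {sym: sorted(libs) for sym, libs in owners.items() if len(libs) > 1}


# Illustrative, abbreviated export lists (assumption, not measured data).
exports = {
    "libcblas.so.3 (reference)": {"cblas_dgemm", "cblas_ddot"},
    "libopenblas.so.0": {"cblas_dgemm", "cblas_ddot", "dgemm_", "dgesv_"},
}

clashes = duplicate_symbols(exports)
for symbol, libs in sorted(clashes.items()):
    print(f"{symbol} is exported by: {', '.join(libs)}")
```

Every entry in `clashes` is a symbol for which the copy that actually gets called depends on how the dynamic loader orders the libraries.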

With the proposed (redesigned) solution, these potentially
bad cases can be avoided because the solution tries to keep
the backend consistent. Some people got headaches from the
BLAS/LAPACK flexibility and created flexiblas.

In short, the (4->1) change can reduce the maintenance cost
of (blas,cblas,lapack,lapacke) and make the virtual-* feature
easier to implement and maintain in the long run. Additionally,
the flexibility mentioned before is not really necessary once
the virtual-* feature is fully implemented.

Best,
Mo.



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)

2019-06-13 Thread Michał Górny
On Thu, 2019-06-13 at 00:15 -0700, Mo Zhou wrote:
> Hi Gentoo devs,
> 
> I redesigned the solution for BLAS/LAPACK runtime switching.
> New solution is based on eselect+ld.so.conf . See following.
> 
> > Goal
> > 
> > 
> >   * When a program is linked against libblas.so or liblapack.so
> > provided by any BLAS/LAPACK provider, the eselect-based solution
> > will allow user to switch the underlying library without recompiling
> > anything.
> 
> Instead of manipulating symlinks, I wrote a dedicated eselect module
> for BLAS:
> 
> https://github.com/cdluminate/my-overlay/blob/master/app-eselect/eselect-blas/files/blas.eselect-0.2
> 
> This implementation will generate a corresponding ld.so.conf file
> on switching and refresh ld.so cache.
> 
> advantages:
> 
> 1. no longer manipulates symlinks under package manager directory,
>    see https://bugs.gentoo.org/531842 and https://bugs.gentoo.org/632624
> 
> 2. we don't have to think about static lib and header switching like
>    Debian does.
> 
> >   * When a program is linked against a specific implementation, e.g.
> > libmkl_rt.so, the solution doesn't break anything.
> 
> This still holds with the new solution.
> 
> > Solution
> > 
> > 
> > Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> > is good at dealing with drop-in replacements as well. My preliminary
> 
> The redesigned solution totally diverted from Debian's solution.
> 
> * sci-libs/lapack provides standard API and ABI for BLAS/CBLAS/LAPACK
>   and LAPACKE. It provides a default set of libblas.so, libcblas.so,
>   and liblapack.so . Reverse dependencies linked against the three
>   libraries (reference blas) will take advantage of the runtime
>   switching mechanism through USE="virtual-blas virtual-lapack".
>   Reverse dependencies linked to specific implementations such as
>   libopenblas.so won't be affected at all.
> 
> * every non-standard BLAS/LAPACK implementations could be registered
>   as alternatives via USE="virtual-blas virtual-lapack". Once the
>   virtual-* flags are toggled, the ebuild file will build some
>   extra shared objects with correct SONAME.
> 
>   For example:
> 
>   /usr/lib64/libblis.so.2 (SONAME=libblis.so.2, general purpose)
>   /usr/lib64/blas/blis/libblas.so.3 (USE="virtual-blas", SONAME=libblas.so.3)
>   /usr/lib64/blas/blis/libcblas.so.3 (USE="virtual-blas", SONAME=libcblas.so.3)
> 
> * Reverse dependencies of BLAS/LAPACK could optionally provide the
>   "virtual-blas virtual-lapack" USE flags.
> 
>   if use virtual-*:
>       link against reference blas/lapack
>   else:
>       link against whatever the ebuild maintainer likes and get rid
>       of the switching mechanism
> 
> > Proposed Changes
> > 
> > 
> > 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
> >    main repo. They use exactly the same source tarball. It's not quite
> >    helpful to package these components in a fine-grained manner. A
> >    single sci-libs/lapack package is enough.
> 
> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
> are based on exactly the same source tarball, and maintaining 4 ebuild
> files for a single tarball is not a good choice IMHO. Those old ebuild
> files seem to leverage the flexibility of the upstream build system
> because it enables one to, for example, skip the reference blas build
> and use an existing optimized BLAS implementation and hence introduce
> flexibility. That flexibility is hard to maintain and is not necessary
> anymore with the new runtime switching mechanism.
> 
> That's why I propose to merge the 4 ebuilds into a single one:
> sci-libs/lapack. We don't need to add the "reference" suffix
> because no upstream will claim the name "lapack". When talking
> about "lapack" it's always the reference implementation.

What's the real gain here, and how does it compare to loss of
flexibility of being able to build only what the package in question
needs?


-- 
Best regards,
Michał Górny



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)

2019-06-13 Thread Mo Zhou
Hi Gentoo devs,

I redesigned the solution for BLAS/LAPACK runtime switching.
The new solution is based on eselect + ld.so.conf; see below.

> Goal
> 
> 
>   * When a program is linked against libblas.so or liblapack.so
> provided by any BLAS/LAPACK provider, the eselect-based solution
> will allow user to switch the underlying library without recompiling
> anything.

Instead of manipulating symlinks, I wrote a dedicated eselect module
for BLAS:

https://github.com/cdluminate/my-overlay/blob/master/app-eselect/eselect-blas/files/blas.eselect-0.2

This implementation generates a corresponding ld.so.conf file
on switching and refreshes the ld.so cache.

Advantages:

1. It no longer manipulates symlinks under the package manager's directory;
   see https://bugs.gentoo.org/531842 and https://bugs.gentoo.org/632624

2. We don't have to think about static library and header switching like
   Debian does.
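
As a rough illustration of the switching step (this is a sketch under stated assumptions, not the eselect module linked above), the following builds the text of an ld.so.conf fragment for the selected provider and optionally refreshes the cache. It assumes candidates live under /usr/lib64/blas/<provider>, as in the layout proposed in this thread; the function names and the fragment path passed in by the caller are hypothetical.

```python
import subprocess
from pathlib import Path


def ld_so_conf_fragment(provider, libdir="/usr/lib64"):
    """Text of an ld.so.conf fragment that adds one provider's directory."""
    # Assumed layout: <libdir>/blas/<provider>/libblas.so.3 and friends.
    return f"{libdir}/blas/{provider}\n"


def switch_blas(provider, conf_path, run_ldconfig=True):
    """Write the fragment, then refresh the ld.so cache.

    conf_path is chosen by the caller (hypothetical, e.g. a file under
    /etc/ld.so.conf.d/). Running ldconfig needs root on a real system.
    """
    Path(conf_path).write_text(ld_so_conf_fragment(provider))
    if run_ldconfig:
        subprocess.run(["ldconfig"], check=True)


print(ld_so_conf_fragment("openblas"), end="")
```

Only a configuration file and the loader cache change; nothing under the package manager's library directory is touched, which is the point of advantage 1.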

>   * When a program is linked against a specific implementation, e.g.
> libmkl_rt.so, the solution doesn't break anything.

This still holds with the new solution.

> Solution
> 
> 
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary

The redesigned solution diverges completely from Debian's solution.

* sci-libs/lapack provides standard API and ABI for BLAS/CBLAS/LAPACK
  and LAPACKE. It provides a default set of libblas.so, libcblas.so,
  and liblapack.so . Reverse dependencies linked against the three
  libraries (reference blas) will take advantage of the runtime
  switching mechanism through USE="virtual-blas virtual-lapack".
  Reverse dependencies linked to specific implementations such as
  libopenblas.so won't be affected at all.

* Every non-standard BLAS/LAPACK implementation can be registered
  as an alternative via USE="virtual-blas virtual-lapack". Once the
  virtual-* flags are enabled, the ebuild will build some
  extra shared objects with the correct SONAME.

  For example:

  /usr/lib64/libblis.so.2 (SONAME=libblis.so.2, general purpose)
  /usr/lib64/blas/blis/libblas.so.3 (USE="virtual-blas", SONAME=libblas.so.3)
  /usr/lib64/blas/blis/libcblas.so.3 (USE="virtual-blas", SONAME=libcblas.so.3)

* Reverse dependencies of BLAS/LAPACK could optionally provide the
  "virtual-blas virtual-lapack" USE flags.

  if use virtual-*:
      link against reference blas/lapack
  else:
      link against whatever the ebuild maintainer likes and get rid
      of the switching mechanism

> Proposed Changes
> 
> 
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
are based on exactly the same source tarball, and maintaining 4 ebuild
files for a single tarball is not a good choice IMHO. Those old ebuild
files seem to leverage the flexibility of the upstream build system
because it enables one to, for example, skip the reference blas build
and use an existing optimized BLAS implementation and hence introduce
flexibility. That flexibility is hard to maintain and is not necessary
anymore with the new runtime switching mechanism.

That's why I propose to merge the 4 ebuilds into a single one:
sci-libs/lapack. We don't need to add the "reference" suffix
because no upstream will claim the name "lapack". When talking
about "lapack" it's always the reference implementation.

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.

eselect-cblas should be deprecated. That affects gsl because it is
registered as a cblas alternative. gsl doesn't provide the standard
BLAS (Fortran) API+ABI, so it will not be added as a runtime switching
candidate.

Does this redesigned solution look acceptable now?

Best,
Mo.



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-29 Thread Benda Xu
Hi David,

David Seifert  writes:

>> > An actual ABI compliance test, e.g. done using abi-compliance-
>> > checker would be more interesting.
>> 
>> As said above, the symbols don't need to be 1-1 copy of each other.
>> Any library which is a superset of the reference one will work.
>
> Again, I'm willing to accept this under a USE="lapack_targets_virtual"
> configuration, 

I hear you, and I was of the same opinion, too.

Nevertheless, if a runtime switch works, it is simpler than USE_EXPAND.

We already have a couple of them, PYTHON_TARGETS, RUBY_TARGETS, etc.  I
don't want to add more to this list unless absolutely necessary.

Alternatively, it could be exclusive USE flags instead of USE_EXPAND.
That's possible, but we would need to add a lot of USE flags: OpenBLAS, blis,
reference, atlas, nvBLAS, etc.  I don't think it is a clean solution.

> but wholesale editing of DT_NEEDED entries is definitely too scary and
> too invasive for most non-sci/hpc users of Gentoo.

Sorry if I confused you.  There is no hack of DT_NEEDED involved.

An application is compiled against the reference, then it is pointed by
LDPATH and ld.so.conf to a drop-in replacement library, and it profits.
That is no more than upgrading a dynamic library.

The scheme was shown to work with gcc runtime, libstdc++, and opengl, in
Gentoo.
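
A toy model may help show why no DT_NEEDED entry has to change. The sketch below resolves a SONAME against an ordered list of search directories and returns the first match. It is a deliberate simplification (the real ld.so consults the cache built by ldconfig and has more rules than this); the directory contents are hypothetical and follow the /usr/lib64/blas/<provider> layout proposed in this thread.

```python
def resolve(soname, search_dirs, contents):
    """Return the first path (in search order) that holds a file named soname."""
    for directory in search_dirs:
        if soname in contents.get(directory, ()):
            return f"{directory}/{soname}"
    return None


# Hypothetical directory contents, for illustration only.
CONTENTS = {
    "/usr/lib64/blas/reference": {"libblas.so.3", "libcblas.so.3"},
    "/usr/lib64/blas/openblas": {"libblas.so.3", "libcblas.so.3"},
    "/usr/lib64": {"libopenblas.so.0"},
}

NEEDED = "libblas.so.3"  # what the application recorded at link time

before = resolve(NEEDED, ["/usr/lib64/blas/reference", "/usr/lib64"], CONTENTS)
after = resolve(NEEDED, ["/usr/lib64/blas/openblas", "/usr/lib64"], CONTENTS)
print(before)
print(after)
```

The application asks for the same SONAME both times; only the directory list changes, so the binary itself is never edited.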

> Again, for 99% of users, OpenBLAS will be the right trade-off between
> performance and customizability. Every recompilation of libreoffice or
> chromium will devour more CPU cycles than switching between USE-flag
> implementations.

I understand your point.  Let me elaborate:

1. OpenBLAS has quite a bit of hand-tuned assembly, so it is not very
   portable.  Special care is needed to manage the default BLAS on
   profiles like ppc64, arm64, arm, riscv, etc.

2. All the Gentoo users care about optimization.  And there are
   non-science applications that call linear algebra routines.

3. Maintaining different BLAS frameworks in repo/gentoo.git and
   proj/sci.git wastes everybody's time, and causes endless confusion to
   users.  That offsets the time saved to make OpenBLAS the global
   default.

Therefore, from the users' point of view, I still think runtime
switching makes more sense.

Yours,
Benda



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-29 Thread David Seifert
On Wed, 2019-05-29 at 22:33 +0800, Benda Xu wrote:
> Hi Michał,
> 
> Michał Górny  writes:
> 
> > On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> > > Different BLAS/LAPACK implementations are expected to be
> > > compatible
> > > to each other in both the API and ABI level. They can be used as
> > > drop-in replacement to the others. This sounds nice, but the
> > > difference
> > > in SONAME hampered the gentoo integration of well-optimized ones.
> > 
> > If SONAMEs are different, then they are not compatible by
> > definition.
> 
> This blas/lapack SONAME difference is a special case.  They are
> partially
> compatible in the sense that every alternative implementation of blas
> is
> a superset of the reference one.
> 
> Therefore linking to the reference at build time will make sure the
> compatibility with the alternative implementations, even with
> different
> SONAME.
> 
> > > [...]
> > > 
> > > Similar to Debian's update-alternatives mechanism, Gentoo's
> > > eselect
> > > is good at dealing with drop-in replacements as well. My
> > > preliminary
> > > investigation suggests that eselect is enough for enabling
> > > BLAS/LAPACK
> > > runtime switching. Hence, the proposed solution is eselect-based:
> > > 
> > >   * Every BLAS/LAPACK implementation should provide generic
> > > library
> > > and eselect candidate libraries at the same time. Taking
> > > netlib,
> > > BLIS and OpenBLAS as examples:
> > > 
> > > reference:
> > > 
> > >   usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
> > > -- default BLAS provider
> > > -- candidate of the eselect "blas" unit
> > > -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > 
> > /usr/lib64 is not supposed to be modified by eselect, it's package
> > manager area.  Yes, I know a lot of modules still do that but
> > that's no
> > reason to make things worse when people are putting significant
> > effort
> > to actually improve things.
> 
> Sorry, I didn't see your reply before mine.
> 
> We are going to use the LDPATH and ld.so.conf mechanism suggested by
> you.
> 
> > >   usr/lib64/lapack/reference/liblapack.so.3
> > > (SONAME=liblapack.so.3)
> > > -- default LAPACK provider
> > > -- candidate of the eselect "lapack" unit
> > > -- will be symlinked to usr/lib64/liblapack.so.3 by
> > > eselect
> > > 
> > > blis (doesn't provide LAPACK):
> > >   
> > >   usr/lib64/libblis.so.2  (SONAME=libblis.so.2)
> > > -- general purpose
> > > 
> > >   usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
> > > -- candidate of the eselect "blas" unit
> > > -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > > -- compiled from the same set of object files as
> > > libblis.so.2
> > > 
> > > openblas:
> > >   
> > >   usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
> > > -- general purpose
> > > 
> > >   usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
> > > -- candidate of the eselect "blas" unit
> > > -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > > -- compiled from the same set of object files as
> > > libopenblas.so.0
> > > 
> > >   usr/lib64/lapack/openblas/liblapack.so.3
> > > (SONAME=liblapack.so.3)
> > > -- candidate of the eselect "lapack" unit
> > > -- will be symlinked to usr/lib64/liblapack.so.3 by
> > > eselect
> > > -- compiled from the same set of object files as
> > > libopenblas.so.0
> > > 
> > > This solution is similar to Debian's[3]. This solution achieves
> > > our
> > > goal,
> > > and it requires us to patch upstream build systems (same to
> > > Debian).
> > > Preliminary demonstration for this solution is available, see
> > > below.
> > 
> > So basically the three walls of text say in round-about way that
> > you're
> > going to introduce custom hacks to recompile libraries with
> > different
> > SONAME.  Ok.
> > > Is this solution reliable?
> > > --
> > > 
> > > * A similar solution has been used by Debian for many years.
> > > * Many projects call BLAS/LAPACK libraries through FFI, including
> > > Julia.
> > >   (See Julia's standard library: LinearAlgebra)
> > > 
> > > Proposed Changes
> > > 
> > > 
> > > > 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from
> > > >    gentoo main repo. They use exactly the same source tarball.
> > > >    It's not quite helpful to package these components in a
> > > >    fine-grained manner. A single sci-libs/lapack package is
> > > >    enough.
> > 
> > Where's the gain in that?
> > > > 2. Merge the "cblas" eselect unit into "blas" unit. It is
> > > >    potentially harmful when "blas" and "cblas" point to different
> > > >    implementations. That means "app-eselect/eselect-cblas" should
> > > >    be deprecated.
> > > 
> > > 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK
> > > providers
> > >

Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-29 Thread Benda Xu
Hi Michał,

Michał Górny  writes:

> On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
>> Different BLAS/LAPACK implementations are expected to be compatible
>> to each other in both the API and ABI level. They can be used as
>> drop-in replacement to the others. This sounds nice, but the difference
>> in SONAME hampered the gentoo integration of well-optimized ones.
>
> If SONAMEs are different, then they are not compatible by definition.

This blas/lapack SONAME difference is a special case.  They are partially
compatible in the sense that every alternative implementation of blas is
a superset of the reference one.

Therefore linking to the reference at build time will make sure the
compatibility with the alternative implementations, even with different
SONAME.
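
The superset claim can be phrased as a set inclusion test. The following is a minimal sketch with abbreviated, hand-typed symbol lists (an assumption, not output from any real library); `is_drop_in` and `missing_symbols` are hypothetical helper names.

```python
def missing_symbols(reference_symbols, candidate_symbols):
    """Symbols the reference exports that the candidate lacks."""
    return set(reference_symbols) - set(candidate_symbols)


def is_drop_in(reference_symbols, candidate_symbols):
    """True if the candidate exports at least everything the reference does."""
    return not missing_symbols(reference_symbols, candidate_symbols)


# Abbreviated example lists; a real check would read the dynamic symbol tables.
reference = {"dgemm_", "daxpy_", "ddot_"}
optimized = {"dgemm_", "daxpy_", "ddot_", "openblas_set_num_threads"}
incomplete = {"dgemm_", "daxpy_"}

print(is_drop_in(reference, optimized))
print(sorted(missing_symbols(reference, incomplete)))
```

A program built against the reference can only use symbols from `reference`, so any candidate that passes this test satisfies it, extra symbols notwithstanding. Matching names alone say nothing about calling conventions, which is where an ABI checker would still add value.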

>> [...]
>> 
>> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
>> is good at dealing with drop-in replacements as well. My preliminary
>> investigation suggests that eselect is enough for enabling BLAS/LAPACK
>> runtime switching. Hence, the proposed solution is eselect-based:
>> 
>>   * Every BLAS/LAPACK implementation should provide generic library
>> and eselect candidate libraries at the same time. Taking netlib,
>> BLIS and OpenBLAS as examples:
>> 
>> reference:
>> 
>>   usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>> -- default BLAS provider
>> -- candidate of the eselect "blas" unit
>> -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>
> /usr/lib64 is not supposed to be modified by eselect, it's package
> manager area.  Yes, I know a lot of modules still do that but that's no
> reason to make things worse when people are putting significant effort
> to actually improve things.

Sorry, I didn't see your reply before mine.

We are going to use the LDPATH and ld.so.conf mechanism suggested by
you.

>>   usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>> -- default LAPACK provider
>> -- candidate of the eselect "lapack" unit
>> -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>> 
>> blis (doesn't provide LAPACK):
>>   
>>   usr/lib64/libblis.so.2  (SONAME=libblis.so.2)
>> -- general purpose
>> 
>>   usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>> -- candidate of the eselect "blas" unit
>> -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>> -- compiled from the same set of object files as libblis.so.2
>> 
>> openblas:
>>   
>>   usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>> -- general purpose
>> 
>>   usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>> -- candidate of the eselect "blas" unit
>> -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>> -- compiled from the same set of object files as
>> libopenblas.so.0
>> 
>>   usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>> -- candidate of the eselect "lapack" unit
>> -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>> -- compiled from the same set of object files as
>> libopenblas.so.0
>> 
>> This solution is similar to Debian's[3]. This solution achieves our
>> goal,
>> and it requires us to patch upstream build systems (same to Debian).
>> Preliminary demonstration for this solution is available, see below.
>
> So basically the three walls of text say in round-about way that you're
> going to introduce custom hacks to recompile libraries with different
> SONAME.  Ok.

>> Is this solution reliable?
>> --
>> 
>> * A similar solution has been used by Debian for many years.
>> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>>   (See Julia's standard library: LinearAlgebra)
>> 
>> Proposed Changes
>> 
>> 
>> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>>    main repo. They use exactly the same source tarball. It's not quite
>>    helpful to package these components in a fine-grained manner. A
>>    single sci-libs/lapack package is enough.
>
> Where's the gain in that?

>> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>>    harmful when "blas" and "cblas" point to different implementations.
>>    That means "app-eselect/eselect-cblas" should be deprecated.
>> 
>> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>>    will be registered in their dependency information.
>> 
>> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to work
>> with these changes correctly without change. For example, my local
>> numpy-1.16.1 compilation was successful without change.
>> 
>> Preliminary Demonstration
>> -
>> 
>> The preliminary implementation is available in my personal overlay[4].
>> A simple sanity test script `check-cpp.sh` is provided to illustrate
>> the effectiveness of the proposed 

Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-29 Thread Benda Xu
Dear David,

David Seifert  writes:

> We already have such a solution in the sci-overlay. It has proven
> extremely brittle and shaky.

What's more, using eselect to set which library to link against was
regarded as harmful.

> The plan is to do this via USE flags similar to python-single-r1
> flags. 

Yes, that was the conclusion in 2017.  

But now there is something to be renewed, see below.

> Optionally, we can leave a "virtual" USE flag, where users can switch
> implementation at runtime, but this will not be a supported
> configuration.

[...]

> While I understand that runtime-switching sounds like a great feature,
> it has proven too difficult to get right and too hard to enforce
> invariants on correct symlinks.

I thought so, too. Back in 2017, I investigated Debian's runtime
switching and concluded the patches were too heavy for us. That was the
main reason we did not consider the Debian runtime switching way.[0, 1]

However, I have made a deeper study this year, and concluded that it is
quite doable in a simple way in Gentoo.


Historically, in alternative{,-2}.eclass of the science overlay,
"alternative" is a name from Debian[2]. We have made it too complex over
the years, and now there is a good chance to make it simple again.

> People that want this can go the virtual+eselect approach in the
> overlay,

Which is brittle and shaky as we agree.

> but 99% of Gentoo users will be happy with just linking against
> OpenBLAS/reference-lapack and not having to fix weird stale symlinks
> that eselect-alternatives somehow lost track of.

With the simplified approach Mo is after, there will no longer be weird
stale symlink issues.  The approach has been proven in practice by the
update-alternatives mechanism in Debian.  Actually, by adopting mgorny's
LDPATH way of handling the gcc runtime, we don't need symlinks at all![3]

For more details, please refer to
https://people.debian.org/~lumin/gsoc19-gentoo.pdf

> See also:
> https://bugs.gentoo.org/632624
> https://bugs.gentoo.org/348843#c30

Thanks. That reminded me of all the discussions we have had all along.

Yours,
Benda

0. https://github.com/gentoo/sci/issues/805#issuecomment-345690034
1. https://github.com/gentoo/sci/issues/805#issuecomment-345701064
2. https://github.com/gentoo/sci/issues/805#issuecomment-332887691
3. https://bugs.gentoo.org/632618



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-29 Thread Michał Górny
On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> Different BLAS/LAPACK implementations are expected to be compatible
> to each other in both the API and ABI level. They can be used as
> drop-in replacement to the others. This sounds nice, but the difference
> in SONAME hampered the gentoo integration of well-optimized ones.

If SONAMEs are different, then they are not compatible by definition.

> Assume a Gentoo user compiled a pile of packages on top of the reference
> BLAS and LAPACK, namely these reverse dependencies are linked against
> libblas.so.3 and liblapack.so.3 . When the user discovered that
> OpenBLAS provides much better performance, they'll have to recompile
> the whole reverse dependency tree in order to take advantage from
> OpenBLAS,
> because the SONAME of OpenBLAS is libopenblas.so.0 . When the user
> wants to try MKL (libmkl_rt.so), they'll have to recompile the whole
> reverse dependency tree again.
> 
> This is not friendly to our earth.
> 
> Goal
> 
> 
>   * When a program is linked against libblas.so or liblapack.so
> provided by any BLAS/LAPACK provider, the eselect-based solution
> will allow user to switch the underlying library without recompiling
> anything.
> 
>   * When a program is linked against a specific implementation, e.g.
> libmkl_rt.so, the solution doesn't break anything.
> 
> Solution
> 
> 
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary
> investigation suggests that eselect is enough for enabling BLAS/LAPACK
> runtime switching. Hence, the proposed solution is eselect-based:
> 
>   * Every BLAS/LAPACK implementation should provide generic library
> and eselect candidate libraries at the same time. Taking netlib,
> BLIS and OpenBLAS as examples:
> 
> reference:
> 
>   usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
> -- default BLAS provider
> -- candidate of the eselect "blas" unit
> -- will be symlinked to usr/lib64/libblas.so.3 by eselect

/usr/lib64 is not supposed to be modified by eselect, it's package
manager area.  Yes, I know a lot of modules still do that but that's no
reason to make things worse when people are putting significant effort
to actually improve things.

>   usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
> -- default LAPACK provider
> -- candidate of the eselect "lapack" unit
> -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> 
> blis (doesn't provide LAPACK):
>   
>   usr/lib64/libblis.so.2  (SONAME=libblis.so.2)
> -- general purpose
> 
>   usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
> -- candidate of the eselect "blas" unit
> -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> -- compiled from the same set of object files as libblis.so.2
> 
> openblas:
>   
>   usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
> -- general purpose
> 
>   usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
> -- candidate of the eselect "blas" unit
> -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> -- compiled from the same set of object files as
> libopenblas.so.0
> 
>   usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
> -- candidate of the eselect "lapack" unit
> -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> -- compiled from the same set of object files as
> libopenblas.so.0
> 
> This solution is similar to Debian's[3]. This solution achieves our
> goal,
> and it requires us to patch upstream build systems (same to Debian).
> Preliminary demonstration for this solution is available, see below.

So basically the three walls of text say in round-about way that you're
going to introduce custom hacks to recompile libraries with different
SONAME.  Ok.

> 
> Is this solution reliable?
> --
> 
> * A similar solution has been used by Debian for many years.
> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>   (See Julia's standard library: LinearAlgebra)
> 
> Proposed Changes
> 
> 
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

Where's the gain in that?

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.
> 
> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>    will be registered in their dependency information.
> 
> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected 

Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-28 Thread David Seifert
On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> Hi Gentoo devs,
> 
> Classical numerical linear algebra libraries, BLAS[1] and LAPACK[2]
> play important roles in the scientific computing field, as much
> software such as Numpy, Scipy, Julia, Octave, R is built upon them.
> 
> There is a standard implementation of BLAS and LAPACK, named netlib
> or simply "reference implementation". This implementation had been
> provided by gentoo's main repo. However, it has a major problem:
> performance. On the other hand, a number of well-optimized BLAS/LAPACK
> implementations exist, including OpenBLAS (free), BLIS (free),
> MKL (non-free), etc., but none of them has been properly integrated
> into the Gentoo distribution.
> 
> I'm writing to propose a good solution to this problem. If no gentoo
> developer is object to this proposal, I'll keep moving forward and
> start submitting PRs to Gentoo main repo.
> 
> Historical Obstacle
> ---
> 
> Different BLAS/LAPACK implementations are expected to be compatible
> to each other in both the API and ABI level. They can be used as
> drop-in replacement to the others. This sounds nice, but the difference
> in SONAME hampered the gentoo integration of well-optimized ones.
> 
> Assume a Gentoo user compiled a pile of packages on top of the reference
> BLAS and LAPACK, namely these reverse dependencies are linked against
> libblas.so.3 and liblapack.so.3 . When the user discovered that
> OpenBLAS provides much better performance, they'll have to recompile
> the whole reverse dependency tree in order to take advantage from OpenBLAS,
> because the SONAME of OpenBLAS is libopenblas.so.0 . When the user
> wants to try MKL (libmkl_rt.so), they'll have to recompile the whole
> reverse dependency tree again.
> 
> This is not friendly to our earth.
> 
> Goal
> 
> 
>   * When a program is linked against libblas.so or liblapack.so
>     provided by any BLAS/LAPACK provider, the eselect-based solution
>     will allow user to switch the underlying library without recompiling
>     anything.
> 
>   * When a program is linked against a specific implementation, e.g.
>     libmkl_rt.so, the solution doesn't break anything.
> 
> Solution
> 
> 
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary
> investigation suggests that eselect is enough for enabling BLAS/LAPACK
> runtime switching. Hence, the proposed solution is eselect-based:
> 
>   * Every BLAS/LAPACK implementation should provide generic library
>     and eselect candidate libraries at the same time. Taking netlib,
>     BLIS and OpenBLAS as examples:
> 
>     reference:
> 
>       usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>         -- default BLAS provider
>         -- candidate of the eselect "blas" unit
>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> 
>       usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>         -- default LAPACK provider
>         -- candidate of the eselect "lapack" unit
>         -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> 
>     blis (doesn't provide LAPACK):
> 
>       usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
>         -- general purpose
> 
>       usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>         -- candidate of the eselect "blas" unit
>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>         -- compiled from the same set of object files as libblis.so.2
> 
>     openblas:
> 
>       usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>         -- general purpose
> 
>       usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>         -- candidate of the eselect "blas" unit
>         -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>         -- compiled from the same set of object files as libopenblas.so.0
> 
>       usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>         -- candidate of the eselect "lapack" unit
>         -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>         -- compiled from the same set of object files as libopenblas.so.0
> 
> This solution is similar to Debian's[3]. This solution achieves our goal,
> and it requires us to patch upstream build systems (same to Debian).
> Preliminary demonstration for this solution is available, see below.
> 
> Is this solution reliable?
> --
> 
> * A similar solution has been used by Debian for many years.
> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>   (See Julia's standard library: LinearAlgebra)
> 
> Proposed Changes
> 
> 
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from
> gentoo
>main repo. They use exactly the same source tarball. It's not
> quite
>helpful to package these components in a fine-grained manner. A
> single
>

[gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-28 Thread Mo Zhou
Hi Gentoo devs,

The classical numerical linear algebra libraries BLAS[1] and LAPACK[2]
play an important role in scientific computing, as many software
packages such as NumPy, SciPy, Julia, Octave, and R are built upon them.

There is a standard implementation of BLAS and LAPACK, known as netlib
or simply the "reference implementation". This implementation has been
provided by Gentoo's main repo. However, it has a major problem:
performance. On the other hand, a number of well-optimized BLAS/LAPACK
implementations exist, including OpenBLAS (free), BLIS (free),
MKL (non-free), etc., but none of them has been properly integrated
into the Gentoo distribution.

I'm writing to propose a solution to this problem. If no Gentoo
developer objects to this proposal, I'll keep moving forward and
start submitting PRs to the Gentoo main repo.

Historical Obstacle
-------------------

Different BLAS/LAPACK implementations are expected to be compatible
with each other at both the API and ABI level, so they can be used as
drop-in replacements for one another. This sounds nice, but the difference
in SONAME has hampered the Gentoo integration of the well-optimized ones.

Assume a Gentoo user has compiled a pile of packages on top of the
reference BLAS and LAPACK, that is, these reverse dependencies are linked
against libblas.so.3 and liblapack.so.3. When the user discovers that
OpenBLAS provides much better performance, they have to recompile
the whole reverse dependency tree in order to take advantage of OpenBLAS,
because the SONAME of OpenBLAS is libopenblas.so.0. When the user
wants to try MKL (libmkl_rt.so), they have to recompile the whole
reverse dependency tree again.
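
To make the SONAME problem concrete, here is a minimal sketch in Python. It is
only a toy model: the real dynamic loader does much more, and the names
`resolve`, `needed` and `installed` are made up for this illustration. The one
assumption it encodes is that a dependency is satisfied by a file whose name
equals the SONAME recorded in the program at link time.

```python
# Toy model (not the real loader): a dependency is satisfied only by a file
# whose name equals the SONAME recorded in the program at link time.

def resolve(needed, installed):
    """Map each needed SONAME to the provider owning that file, or None."""
    return {soname: installed.get(soname) for soname in needed}

# A program linked against the reference libraries records these names.
needed = ["libblas.so.3", "liblapack.so.3"]

# Files in the library directory, mapped to the (hypothetical) owning provider.
installed = {
    "libblas.so.3": "reference",
    "liblapack.so.3": "reference",
    "libopenblas.so.0": "openblas",  # installed, but nothing asks for this name
}

resolved = resolve(needed, installed)
print(resolved)
```

In this model, installing OpenBLAS changes nothing for the existing program,
because no recorded dependency is named libopenblas.so.0. Only relinking, or
providing OpenBLAS under the name libblas.so.3, would make a difference.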

This is not friendly to our earth.

Goal
----

  * When a program is linked against libblas.so or liblapack.so
    provided by any BLAS/LAPACK provider, the eselect-based solution
    will allow the user to switch the underlying library without
    recompiling anything.

  * When a program is linked against a specific implementation, e.g.
    libmkl_rt.so, the solution doesn't break anything.

Solution
--------

Similar to Debian's update-alternatives mechanism, Gentoo's eselect
is good at dealing with drop-in replacements as well. My preliminary
investigation suggests that eselect is enough for enabling BLAS/LAPACK
runtime switching. Hence, the proposed solution is eselect-based:

  * Every BLAS/LAPACK implementation should provide its generic library
    and the eselect candidate libraries at the same time. Taking netlib,
    BLIS and OpenBLAS as examples:

reference:

  usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
    -- default BLAS provider
    -- candidate of the eselect "blas" unit
    -- will be symlinked to usr/lib64/libblas.so.3 by eselect

  usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
    -- default LAPACK provider
    -- candidate of the eselect "lapack" unit
    -- will be symlinked to usr/lib64/liblapack.so.3 by eselect

blis (doesn't provide LAPACK):

  usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
    -- general purpose

  usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
    -- candidate of the eselect "blas" unit
    -- will be symlinked to usr/lib64/libblas.so.3 by eselect
    -- compiled from the same set of object files as libblis.so.2

openblas:

  usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
    -- general purpose

  usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
    -- candidate of the eselect "blas" unit
    -- will be symlinked to usr/lib64/libblas.so.3 by eselect
    -- compiled from the same set of object files as libopenblas.so.0

  usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
    -- candidate of the eselect "lapack" unit
    -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
    -- compiled from the same set of object files as libopenblas.so.0
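
The layout and symlink switching described above can be mimicked in a
throwaway directory. The following Python sketch is not an eselect module and
does not touch the real system: the helpers `install_candidate`, `select` and
`current_provider` are hypothetical names, and the "libraries" are small text
files standing in for the real shared objects.

```python
import os
import tempfile

def install_candidate(root, unit, provider, soname):
    """Create a stand-in for <root>/usr/lib64/<unit>/<provider>/<soname>."""
    directory = os.path.join(root, "usr", "lib64", unit, provider)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, soname)
    with open(path, "w") as handle:
        handle.write(provider)  # placeholder content, not a real library
    return path

def select(root, unit, provider, soname):
    """Point <root>/usr/lib64/<soname> at the chosen provider's candidate."""
    link = os.path.join(root, "usr", "lib64", soname)
    target = os.path.join(unit, provider, soname)  # relative to usr/lib64
    temporary = link + ".tmp"
    if os.path.lexists(temporary):
        os.remove(temporary)
    os.symlink(target, temporary)
    os.replace(temporary, link)  # swap the symlink in a single rename
    return link

def current_provider(root, soname):
    """Read the stand-in file that the generic name currently resolves to."""
    with open(os.path.join(root, "usr", "lib64", soname)) as handle:
        return handle.read()

root = tempfile.mkdtemp()
for provider in ("reference", "blis", "openblas"):
    install_candidate(root, "blas", provider, "libblas.so.3")

select(root, "blas", "reference", "libblas.so.3")
before = current_provider(root, "libblas.so.3")
select(root, "blas", "openblas", "libblas.so.3")
after = current_provider(root, "libblas.so.3")
print(before, "->", after)
```

A program that opens usr/lib64/libblas.so.3 never sees the switch happen; it
simply gets whichever candidate the symlink points to at load time.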

This solution is similar to Debian's[3]. It achieves our goal, and it
requires us to patch upstream build systems (the same as in Debian).
A preliminary demonstration of this solution is available, see below.

Is this solution reliable?
--------------------------

* A similar solution has been used by Debian for many years.
* Many projects call BLAS/LAPACK libraries through FFI, including Julia.
  (See Julia's standard library: LinearAlgebra)
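
As an illustration of the FFI point, here is a hedged Python sketch (not
Julia's actual mechanism) that loads whatever library the generic "blas" name
resolves to and calls the double-precision dot product. It assumes the common
gfortran symbol name `ddot_`, 32-bit integer arguments, and that
`ctypes.util.find_library` can locate a BLAS; `blas_ddot` is a made-up helper
that returns None when any of this does not hold.

```python
import ctypes
import ctypes.util

def blas_ddot(x, y):
    """Dot product via the Fortran BLAS routine ddot_, or None if unavailable."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    name = ctypes.util.find_library("blas")  # generic name, not a provider name
    if name is None:
        return None
    try:
        ddot = ctypes.CDLL(name).ddot_  # assumption: trailing-underscore symbol
    except (OSError, AttributeError):
        return None
    int_p = ctypes.POINTER(ctypes.c_int)
    dbl_p = ctypes.POINTER(ctypes.c_double)
    ddot.restype = ctypes.c_double
    ddot.argtypes = [int_p, dbl_p, int_p, dbl_p, int_p]
    n = ctypes.c_int(len(x))
    inc = ctypes.c_int(1)  # unit stride for both vectors
    xs = (ctypes.c_double * len(x))(*x)
    ys = (ctypes.c_double * len(y))(*y)
    # Fortran passes every argument by reference.
    return ddot(ctypes.byref(n), xs, ctypes.byref(inc), ys, ctypes.byref(inc))

result = blas_ddot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result)
```

The caller never names a provider, so whichever implementation sits behind the
generic library name answers the call. That is the property the runtime
switching relies on.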

Proposed Changes
----------------

1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference in the Gentoo
   main repo. They use exactly the same source tarball, and it is not
   very helpful to package these components in such a fine-grained
   manner. A single sci-libs/lapack package is enough.

2. Merge the "cblas" eselect unit into the "blas" unit. It is potentially
   harmful when "blas" and "cblas" point to different implementations.
   That means "app-eselect/eselect-cblas" should be deprecated.

3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers