Github user fommil commented on the pull request:
https://github.com/apache/incubator-spark/pull/575#issuecomment-35879891
Hi all,
The discussions with ASF on the LEGAL ticket have exposed some concerns -
**unrelated to the LGPL** -
that I think everybody needs to be aware of regarding native BLAS/LAPACK
libraries.
Basically, the ASF need to bundle and license their projects in a way that
is easy for distributors and end
users to understand. They have gone to a lot of effort to authorise
"Category A" and "Category B" licenses
so that there are no surprises.
However, native loading of system-provided BLAS/LAPACK **is** a surprise in
this context.
I don't want ASF's commercial distributors to get into a flap about
dynamically loading
binaries that were created by Apple, Intel, NVIDIA or AMD. These binaries
would not be
explicitly listed in the software license list that Apache carefully
construct.
I propose a simple solution, which really is just the ASF's recommendation:
we make the native
components "optional" and make it very easy for distributors to turn them
on if they understand
the additional legal and technical implications.
In fact, `netlib-java` already supports this: conservative downstream
projects need only depend on the `core`
artefact, and then end-users who want the native performance improvements
can depend on `all` (or a more
specific artefact, including their own): natives are an optional runtime
dependency.
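
As a rough sketch of that split in an sbt build (the version number is an assumption for illustration; use whichever release is current), a conservative project declares only `core`, and an end user opts in to the natives with `all`:

```scala
// build.sbt of a conservative library: no native loaders on the classpath.
// The version "1.1.2" is an assumption for illustration.
libraryDependencies += "com.github.fommil.netlib" % "core" % "1.1.2"

// build.sbt of an end user who wants the native speedups.
// To my understanding `all` is published as a POM-only aggregate, hence pomOnly().
libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
```

With only `core` present there are no native artefacts to load, so nothing outside the declared dependency list is pulled in at runtime.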
@dlwh would you be happy enough to change breeze's dependency to depend on
`com.github.fommil.netlib:core`
and give easy instructions to your downstream users to depend on
`com.github.fommil.netlib:all`
in order to get the native speedups? (I can even write this up to be
included in your `README`).
To be honest, it would actually help clean my inbox because I get a lot of
bug reports from users of
Breeze who are confused about logging messages regarding natives failing to
load because they have not
followed the system natives instructions (e.g. they haven't installed
ATLAS, so they get a warning message
and then it harmlessly falls back to the Fortran or F2J implementations).
Note that the Fortran reference natives - or ATLAS binaries - are not
necessarily a problem from a licensing
perspective, because we can explicitly list them. But, from a
technical point of view I don't think it's really worth the extra effort
to give them
special attention. The performance results (above) agree with all industry
benchmarks (including my own
and the Java Matrix Benchmarks) that say system optimised natives greatly
outperform generically tuned
implementations. Also, the Fortran implementation is only marginally faster
than the F2J implementation
(JVM JIT for the win!).
BTW, it might be interesting for you to run the performance tests when
using the F2J backend of
`netlib-java` to convince yourself of the benefit of the system natives.
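
One way to force the F2J backend for such a comparison, sketched here for an sbt build: `netlib-java` reads a system property naming the implementation class to use. The property and class names below are given as I understand them from the `netlib-java` README; treat them as assumptions and check them against the version you use.

```scala
// build.sbt: run the benchmarks in a forked JVM so the system properties apply.
fork := true

// Assumed property and class names; they select the pure-JVM F2J implementations.
javaOptions += "-Dcom.github.fommil.netlib.BLAS=com.github.fommil.netlib.F2jBLAS"
javaOptions += "-Dcom.github.fommil.netlib.LAPACK=com.github.fommil.netlib.F2jLAPACK"
```

Removing these options and rerunning the same benchmark gives the system-native numbers to compare against.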
Does this sound sensible?