On 2021-10-26 16:20, Vitaly Davidovich wrote:
Hi Magnus,
On Tue, Oct 26, 2021 at 6:44 AM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
On 2021-10-26 01:43, Vitaly Davidovich wrote:
> Hi all,
>
> We're testing some of our code on Java 17 (17.0.1)/linux and hit an issue
> related to libsvml.so. It seems this library is now part of 17 to support
> the (incubating) Vector API. We have a java library backed (via JNI) by
> NAG, which itself links against libsvml.so. The issue arises due to
> java.lang.UnsatisfiedLinkError when our java library is trying to call into
> a NAG function which in turn is looking for a certain symbol from libsvml
> (__svml_exp2_ha_mask in particular).
>
> It looks like the JDK is eagerly loading symbols from its packaged libsvml
> (is there a way to disable that for now?). That version of the library is
> also in conflict with the one we want to load, as witnessed by the missing
> symbol (there're probably others but we stopped testing at this point).
>
> Is this a known issue/compatibility hazard? Happy to hear thoughts/opinions
> on this and provide further info, if needed.
It sounds like there are two completely different libsvml.so around,
muddying the waters. The libsvml.so shipped with the JDK is a core part
of the vector functionality in the JDK. This is highly unlikely to be
installed as a system library (basically, if anyone has ever managed to
do that, they've worked hard to do the wrong thing).
A quick googling indicates that there is a separate libsvml.so shipped
by Intel as part of their compiler. My guess is that your JNI library is
using this.
It's using libsvml from a NAG (vendor,
https://www.nag.com/content/nag-library) library; the NAG library
itself ships a version of Intel's MKL library, which is where libsvml
comes from. It looks like the JDK has its own subset of libsvml:
https://github.com/openjdk/jdk/tree/master/src/jdk.incubator.vector/linux/native/libsvml.
These all have Intel copyrights, so I assume they're based on the same
product. However, JDK's libsvml is a subset (in terms of total # of
symbols) of MKL's libsvml.
Okay, maybe I'm mistaken here. I helped to create libsvml in the JDK by
moving files out from src/hotspot, so I just assumed that these were
independent products just happening to share the same name.
I've cc:ed Sandhya, who was involved in getting libsvml into the JDK, to
shed some light on this.
We've run into similar issues in the past. Unfortunately, there is no
really good way of resolving this. :(
Sigh :(
One long term solution to minimize this kind of problem would be to
rename our library "libjsvml.so", in the hope that this is less likely
to clash with another library. This is similar to the approach we've
taken in the past with other name-clashing JDK libraries.
This would need to be coupled with JDK dlopen()'ing this "private" lib
with RTLD_LOCAL, right? Otherwise, it seems like one may end up with a
hybrid situation, where some symbols would be resolved from libjsvml.so
and others from libsvml.so.
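For readers unfamiliar with the flag being discussed, here is a minimal C sketch of loading a library with RTLD_LOCAL and resolving symbols only through the returned handle. The helper names (load_private, lookup_private) are hypothetical and are not taken from the JDK sources; whether this alone would avoid the clash described in this thread is exactly the open question above.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper, not JDK code: open a library "privately".
 * With RTLD_LOCAL the library's symbols are not made available for
 * resolving references in libraries that are loaded later. */
void *load_private(const char *path)
{
    return dlopen(path, RTLD_NOW | RTLD_LOCAL);
}

/* Hypothetical helper: look up a symbol through the handle only,
 * so the caller never depends on the global symbol scope.
 * Returns NULL if the library was not loaded or the symbol is absent. */
void *lookup_private(void *handle, const char *name)
{
    if (handle == NULL) {
        return NULL;
    }
    return dlsym(handle, name);
}
```

A caller would keep the handle and pass it to lookup_private() for each entry point it needs, for example lookup_private(handle, "__svml_exp2_ha_mask") if that symbol is expected to exist in the loaded library.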
If this is the case, then we might need to rename the symbols in
libjsvml as well. Normally, if we export symbols, they are given a
unique (or hopefully unique...) prefix, like JVM_ or
Java_<class>_<method>. This consideration was apparently not made for
the SVML lib, and if the code was copied with function names unchanged,
this went from "unlikely to collide" to "destined to collide". :-(
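As an illustration of the prefixing idea, the following C sketch gives an exported function a library-specific prefix through a macro. The prefix jsvml_ and the function twice are purely hypothetical; the SVML sources in the JDK may need a different mechanism, so this only shows the general pattern.

```c
/* Hypothetical prefix, chosen for illustration only. */
#define JSVML_PREFIX jsvml_

/* Two-level concatenation so that JSVML_PREFIX is expanded before pasting. */
#define JSVML_CONCAT2(a, b) a##b
#define JSVML_CONCAT(a, b) JSVML_CONCAT2(a, b)
#define JSVML_NAME(name) JSVML_CONCAT(JSVML_PREFIX, name)

/* Expands to: double jsvml_twice(double x)
 * The exported symbol is "jsvml_twice", which is much less likely to
 * collide with a symbol from an unrelated library than plain "twice". */
double JSVML_NAME(twice)(double x)
{
    return x + x;
}
```

Changing the prefix in one place renames every symbol declared through JSVML_NAME, which is the property that makes this kind of scheme manageable for a library with many entry points.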
In the short term (say for a 17.0.2 release), could there be a JVM
flag added that could skip loading libsvml.so? I understand that
effectively would disable the Vector API, but that's ok for us - we're
not currently experimenting with that API. Just trying to get 17
working with existing "stock" code :).
Hm...
Actually, from looking at the code, I think you can just delete the
libsvml.so from the JDK installation. It is already dlload:ed, and if
that fails, I think it just silently ignores this.
/Magnus
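The "load it if present, otherwise carry on" behaviour suggested above can be sketched in C roughly as follows. This is not the JDK's actual loading code; the function and variable names are hypothetical, and the sketch only shows the general pattern of treating a failed dlopen() as "feature unavailable" rather than as an error.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: NULL means the optional library is not available. */
static void *optional_lib = NULL;

/* Try to load an optional library. A missing file is not treated as an
 * error: the handle stays NULL and the dependent feature stays off. */
bool init_optional_lib(const char *path)
{
    optional_lib = dlopen(path, RTLD_LAZY);
    return optional_lib != NULL;
}

/* Callers check this before taking the code path that needs the library. */
bool optional_feature_enabled(void)
{
    return optional_lib != NULL;
}
```

Under this pattern, deleting the library file simply makes init_optional_lib() return false, and the rest of the program continues on its fallback path.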