On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
>
>
> On 2024-04-26 03:15, Jiangli Zhou wrote:
> > On Thu, Apr 25, 2024 at 9:28 AM Magnus Ihse Bursie
> > <magnus.ihse.bur...@oracle.com> wrote:
> >>
> >> Just to be more clear, that's with using `objcopy` to localize
> >> non-exported symbols for all JDK static libraries and libjvm.a, not just
> >> libjvm.a, right?
> >>
> >> Yes.
> >>
> >>
> >> Can you please include the compiler or linker errors on linux/clang?
> >>
> >> It is a bit tricky. The problem arises at the partial linking step. The
> >> problem seems to arise out of how clang converts a request to link into an
> >> actual call to ld. I enabled debug code (printing the command line, and
> >> running clang with `-v` to get it to print the actual command line used to
> >> run ld) and ran it on GHA, where it worked fine. This is how it looks
> >> there:
> >>
> >> WILL_RUN: /usr/bin/clang -v -m64 -r -o
> >> /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o
> >>
> >> /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
> >> Ubuntu clang version 14.0.0-1ubuntu1.1
> >> Target: x86_64-pc-linux-gnu
> >> Thread model: posix
> >> InstalledDir: /usr/bin
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
> >> Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
> >> Candidate multilib: .;@m64
> >> Selected multilib: .;@m64
> >> "/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m
> >> elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o
> >> /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o
> >> -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13
> >> -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../lib64
> >> -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu
> >> -L/usr/lib/../lib64 -L/usr/lib/llvm-14/bin/../lib -L/lib -L/usr/lib -r
> >> /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
> >>
> >> In contrast, on my machine it looks like this:
> >>
> >> WILL_RUN:
> >> /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/clang -v
> >> -m64 -r -o
> >> /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o
> >>
> >> /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
> >> clang version 13.0.1
> >> Target: x86_64-unknown-linux-gnu
> >> Thread model: posix
> >> InstalledDir:
> >> /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin
> >> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
> >> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
> >> Candidate multilib: .;@m64
> >> Candidate multilib: 32;@m32
> >> Candidate multilib: x32;@mx32
> >> Selected multilib: .;@m64
> >> "/usr/bin/ld" --hash-style=both --eh-frame-hdr -m elf_x86_64
> >> -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o
> >> /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o
> >> /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o
> >> /usr/lib/gcc/x86_64-linux-gnu/9/crtbegin.o
> >> -L/usr/lib/gcc/x86_64-linux-gnu/9
> >> -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64
> >> -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu
> >> -L/usr/lib/../lib64
> >> -L/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/../lib
> >> -L/lib -L/usr/lib -r
> >> /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
> >> -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s
> >> --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtend.o
> >> /lib/x86_64-linux-gnu/crtn.o
> >> /usr/bin/ld: cannot find -lgcc_s
> >> /usr/bin/ld: cannot find -lgcc_s
> >> clang-13: error: linker command failed with exit code 1 (use -v to see
> >> invocation)
> >>
> >> I don't understand what makes clang think it should include "-lgcc
> >> --as-needed -lgcc_s" and the crt*.o files when doing a partial link. In
> >> fact, the entire process of how clang (and gcc) builds up the linker
> >> command line is bordering on black magic to me. I think it can be affected
> >> by variables set at compile time (at least this was the case for gcc, last
> >> I checked), or maybe it picks up some kind of script from the environment.
> >> That's why I believe my machine could just be messed up.
> >>
> >> I could get a bit further by passing "-nodefaultlibs" (or whatever it
> >> was), but then the generated .o files were messed up with respect to
> >> library symbols and it failed dramatically when trying to do the final
> >> link of the static java launcher.
> >>
> >>
> > Looks like you are using /usr/bin/ld and not lld. I haven't run into
> > this type of issue. Have you tried -fuse-ld=lld?
>
> I am not sure why clang insisted on picking up ld and not lld. I remember
> trying with -fuse-ld=lld, and that it did not work either.
> Unfortunately, I don't remember exactly what the problems were.
>
> I started reinstalling my Linux workstation yesterday, but something
> went wrong, and it failed so hard that it got semi-bricked by the new
> installation, so I need to redo everything from scratch. :-( After that
> is done, I'll re-test. Hopefully this was just my old installation that
> was too broken.
>
> >
> >>>
> >>> I have also tried to extract all the changes (and only the changes)
> >>> related to static build from the hermetic-java-runtime branch (ignoring
> >>> the JavaHome/resource loading changes), to see if I could get something
> >>> like StaticLink.gmk in mainline. I thought I was doing quite fine, but
> >>> after a while I realized my testing was botched since the launcher had
> >>> actually loaded the libraries dynamically instead, even though they were
> >>> statically linked. :-( I am currently trying to bisect my way through my
> >>> repo to understand where things went wrong.
> >>
> >> Did you run with `bin/javastatic`? The system automatically detects if the
> >> binary contains statically linked native libraries and avoids loading the
> >> dynamic libraries. Can you please share which test(s) ran into the library
> >> loading issue? I'll see if I can reproduce the problem that you are
> >> running into.
> >>
> >> It was in fact not a problem. I was fooled by an error message. To be sure
> >> I was not loading any dynamically linked libraries, I removed the jdk/lib
> >> directory. Now the launcher failed, saying something like:
> >>
> >> "Error: Cannot locate dynamic library libjava.dylib".
> >>
> >> which was a bit of a jump scare.
> >>
> >> However, it turned out that it actually tried to load lib/jvm.cfg, and
> >> failed in loading this (since I had removed the entire lib directory), and
> >> this failure caused the above error message to be printed. When I restored
> >> lib/jvm.cfg (but not any dynamic libraries), the launcher worked.
> >>
> > Sounds like you are running into problems immediately during startup.
> > Does the problem occur with just running bin/javastatic using a simple
> > HelloWorld? Can you please send me your command line for reproducing?
>
> Maybe I was not clear enough: I did resolve the problem.
>
> > For the static Java support, I changed CreateExecutionEnvironment to
> > return immediately if it executes statically. jvm.cfg is not loaded.
> > Please see
> > https://github.com/openjdk/leyden/blob/c1c5fc686c1452550e1b3663a320fba652248505/src/java.base/unix/native/libjli/java_md.c#L296.
> > Sounds like the JLI_IsStaticJDK() check is not working properly in
> > your case.
>
> I've been trying to extract from your port a minimal set of patches that
> is needed to get static build to work. In that process, JavaHome and
> JLI_IsStaticJDK have been removed. It might be that this issue arose
> only in my slimmed-down branch, and not on your leyden branch (at this
> point I don't recall exactly). But, we need to fix this separately,
> since we must be able to build a static launcher without the hermetic
> changes.
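
[Illustration] The early-return behaviour described above can be sketched roughly as follows. This is a minimal sketch and not the code in the leyden branch: every `sketch_*` name is hypothetical, the flag merely stands in for whatever JLI_IsStaticJDK() consults, and the "dynamic" path only checks that a jvm.cfg file can be opened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state behind JLI_IsStaticJDK();
 * the real launcher may determine this differently. */
static bool sketch_static_jdk = false;

void sketch_set_static_jdk(bool value) { sketch_static_jdk = value; }
bool sketch_is_static_jdk(void) { return sketch_static_jdk; }

/* Returns  0 if the launcher is static and jvm.cfg was never touched,
 *          1 if jvm.cfg was found on the dynamic path,
 *         -1 if jvm.cfg was required but could not be opened. */
int sketch_create_execution_environment(const char *jvm_cfg_path) {
    assert(jvm_cfg_path != NULL);

    if (sketch_is_static_jdk()) {
        /* A static launcher has exactly one JVM linked in, so there is
         * nothing to select and no reason to read jvm.cfg. */
        return 0;
    }

    /* Dynamic launcher: jvm.cfg is needed to pick a JVM variant. */
    FILE *cfg = fopen(jvm_cfg_path, "r");
    if (cfg == NULL) {
        return -1;
    }
    fclose(cfg);
    return 1;
}
```

With the flag set, a missing lib/jvm.cfg is harmless; with the flag cleared, the same missing file is an error, which matches the symptom seen in the slimmed-down branch.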
The JDK and VM code has pre-existing assumptions about the JDK directories
and dynamic linking (e.g. the .so files). JLI_IsStaticJDK, JLI_SetStaticJDK,
JVM_IsStaticJDK and JVM_SetStaticJDK are needed for static JDK support to
handle those cases correctly. The CreateExecutionEnvironment change that I
mentioned earlier is one example. I'm quite certain the issue that you are
running into is due to incorrect static check/handling in
CreateExecutionEnvironment.

>
> In my branch, I am only using compile-time #ifdef checks for static vs
> dynamic. In the long run, the runtime checks that you have done are a
> good thing, but at the moment they are just adding intrusive changes
> without providing any benefit -- if we can't reuse .o files between
> dynamic and static compilation, there is no point in introducing a
> runtime check when we already have a working compile-time check.

I haven't seen your branch/code. I'd suggest not going with the #ifdef
checks, as that's the opposite direction of what we want to achieve. It
doesn't seem worth your effort to add more #ifdef checks in order to make
the static linking build work, even if those are for temporary testing
reasons.

>
> I did think I correctly changed every dynamic check that you had added
> to a compile-time check, so it bewilders me somewhat when you say that
> jvm.cfg is not needed in your branch.
>
> Can you verify and confirm that the static launcher actually works in
> your branch, if there is no "lib/jvm.cfg" present?

In my <path>/leyden/build/linux-x86_64-server-slowdebug/images/jdk directory:

$ mv lib/jvm.cfg lib/jvm.cfg.no_used
$ find . | grep jvm.cfg
./lib/jvm.cfg.no_used
$ bin/javastatic -cp <my_jar> HelloWorld
HelloWorld

Thanks!
Jiangli

>
> /Magnus
>
> >
> > Best,
> > Jiangli
> >
> >> There are several bugs lurking here. For one, the error message is
> >> incorrect and should be corrected. Secondly, a statically linked launcher
> >> has just a single JVM included and should not have to look for the
> >> lib/jvm.cfg file at all.
> >>
> >> After looking around a bit in the launcher/jli code, my conclusion is that
> >> this code will need some additional care and loving attention to make it
> >> properly adjusted to function as a static launcher. We can't have a static
> >> launcher that tries to load a jvm.cfg file it does not need, and when it
> >> fails, complains that it is missing a dynamic library that it should not
> >> load.
> >>
> >> I'll try to get this fixed as part of my efforts to get the static
> >> launcher into mainline.
> >>> This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime
> >>> branch, where an arbitrary subset of external libraries were hard-coded.
> >>> Before integration in mainline can be possible, this information needs
> >>> to be collected correctly and automatically for all included JDK
> >>> libraries. Fortunately, it is not likely to be too hard. I basically
> >>> just need to store the information from the LIBS provided to the
> >>> NativeCompilation, and pick that up for all static libraries we include
> >>> in the static launcher. (A complication is that we need to de-duplicate
> >>> the list, and that some libraries are specified using two words, like
> >>> "-framework Application" on macos, so it will take some care getting it
> >>> right.)
> >>
> >> Right, currently the hermetic-java-runtime branch specifies a list of
> >> hard-coded dependency libraries for linking. One of the goals of the
> >> hermetic prototype was avoiding/reducing refactoring/restructuring the
> >> existing code whenever possible. The reason is to reduce merge overhead
> >> when integrating with new changes from the mainline. We can do the proper
> >> refactoring and cleanups when getting the changes into the mainline.
> >>
> >> That is basically what I am doing right now. I am looking at your
> >> prototype and trying to reimplement this functionality properly so that it
> >> can be merged into mainline. The first step on that way was to actually
> >> get your prototype running.
> >>
> >> Now I have managed to get a version of your prototype that only includes
> >> the minimal set of changes needed to support the static launcher, and that
> >> works on mac and linux/gcc. Since your prototype is based on
> >> 586396cbb55a31 from March, I am trying to merge the patch with the latest
> >> master. This worked fine for macOS, but I hit an unexpected snag on
> >> Linux which I'm currently investigating.
> >>
> >> We have only briefly touched on the spec change topic (for the naming of
> >> native libraries) during the zoom meetings. I also agree that we should
> >> get that part started now. It's unclear to me if there's any existing
> >> blocker for that.
> >>
> >> I don't think there is. It's just that someone needs to step up and do it.
> >>
> >> /Magnus
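
[Illustration] The LIBS de-duplication mentioned in the quoted text, where an entry such as "-framework Application" spans two words, could look roughly like this. This is only a sketch under stated assumptions: the real logic would live in the JDK makefiles rather than in C, `dedup_libs` and its helpers are hypothetical names, "-framework" is the only two-word option handled, and the first occurrence of each entry is kept (link order can matter for static libraries, so that choice would need checking).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of words forming the logical entry that starts at words[i]:
 * "-framework <name>" counts as one two-word entry, anything else as one. */
static size_t entry_len(const char *const *words, size_t n, size_t i) {
    if (strcmp(words[i], "-framework") == 0 && i + 1 < n) {
        return 2;
    }
    return 1;
}

/* True if two entries have the same length and identical words. */
static int same_entry(const char *const *a, size_t alen,
                      const char *const *b, size_t blen) {
    if (alen != blen) {
        return 0;
    }
    for (size_t k = 0; k < alen; k++) {
        if (strcmp(a[k], b[k]) != 0) {
            return 0;
        }
    }
    return 1;
}

/* Copies words[0..n) into out, skipping any entry that was already
 * emitted. Keeps the first occurrence and preserves order. out must
 * have room for n pointers. Returns the number of words written. */
size_t dedup_libs(const char *const *words, size_t n, const char **out) {
    size_t out_n = 0;
    for (size_t i = 0; i < n; ) {
        size_t len = entry_len(words, n, i);
        int seen = 0;
        /* Walk the output entry by entry, never word by word, so that
         * "Application" alone is not mistaken for a separate entry. */
        for (size_t j = 0; j < out_n && !seen; ) {
            size_t olen = entry_len(out, out_n, j);
            seen = same_entry(words + i, len, out + j, olen);
            j += olen;
        }
        if (!seen) {
            for (size_t k = 0; k < len; k++) {
                out[out_n++] = words[i + k];
            }
        }
        i += len;
    }
    return out_n;
}
```

The scan is quadratic in the number of entries, which should be acceptable for the handful of libraries a launcher links against.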