Hi Magnus,

Thanks for looking into this from the build perspective.

On Wed, Feb 14, 2024 at 1:00 AM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
>
> First some background for build-dev: I have spent some time looking at
> the build implications of the Hermetic Java effort, which is part of
> Project Leyden. A high-level overview is available here:
> https://cr.openjdk.org/~jiangli/hermetic_java.pdf and the current source
> code is here: https://github.com/openjdk/leyden/tree/hermetic-java-runtime.

Some additional hermetic Java related references that are also useful:

- https://bugs.openjdk.org/browse/JDK-8303796 is an umbrella bug that
links to the issues filed so far for resolving static linking problems
- https://github.com/openjdk/jdk21/pull/26 is the enhancement for
building the complete set of static libraries in JDK/VM, particularly
including libjvm.a

>
> Hermetic Java faces several challenges, but the part that is relevant
> for the build system is the ability to create static libraries. We've
> had this functionality (in three different ways...) for some time, but
> it is rather badly implemented.
>
> As a result of my investigations, I have a bunch of questions. :-) I
> have gotten some answers in private discussion, but for the sake of
> transparency I will repeat them here, to foster an open dialogue.
>
> 1. Am I correct in understanding that the ultimate goal of this exercise
> is to be able to have jmods which include static libraries (*.a) of the
> native code which the module uses, and that the user can then run a
> special jlink command to have this linked into a single executable
> binary (which also bundles the *.class files and any additional
> resources needed)?
>
> 2. If so, is the idea to create special kinds of static jmods, like
> java.base-static.jmod, that contains *.a files instead of lib*.so files?
> Or is the idea that the normal jmod should contain both?
>
> 3. Linking .o and .a files into an executable is a formidable task. Is
> the intention to have jlink call a system-provided ld, or to bundle ld
> with jlink, or to reimplement this functionality in Java?

My view is similar to what Alan described in his response on your other
email thread. The general solution is still at an early stage.

In the https://github.com/openjdk/leyden/tree/hermetic-java-runtime
branch, when configuring JDK with --with-static-java=yes, the JDK
binary contains the following extra artifacts:

- static-libs/*.a: The complete set of JDK/VM static libraries
- jdk/bin/javastatic: A demo Java launcher fully statically linked
with the selected JDK .a libraries (e.g. it currently statically links
with the headless library) and libjvm.a. It's the standard Java launcher
without additional work for hermetic Java.

In our prototype for hermetic Java, we build the hermetic executable
image (a single image) from the following input (see the description of
the singlejar packaging tool in
https://cr.openjdk.org/~jiangli/hermetic_java.pdf):

- A customized launcher (with additional work for hermetic) executable
fully statically linked with JDK/VM static libraries (.a files),
application natives and dependencies (e.g. in .a static libraries)
- JDK lib/modules, JDK resource files
- Application classes and resource files

Including a JDK library .a into the corresponding .jmod would require
extracting the .a for linking with the executable. In some systems
that may cause memory overhead due to the extracted copy of the .a
files. I think we should consider the memory overhead issue.

One possibility (as Alan described in his response) is for jlink to
invoke ld on the build system. jlink could pass the needed JDK
static libraries and libjvm.a (provided as part of the JDK binary) to
ld based on the modules required for the application.
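
To make the idea concrete, here is a minimal sketch (in Python, purely
for illustration) of how the static libraries could be selected per
module and passed to ld. The module-to-library table, the function name
and the launcher object name are hypothetical; nothing like this exists
in jlink today, and a real link line would need more (C runtime startup
files, system libraries, and so on).

```python
# Hypothetical mapping from module name to the static archives it needs.
# In practice this information would have to come from the JDK build.
MODULE_STATIC_LIBS = {
    "java.base": ["libjava.a", "libnet.a", "libnio.a", "libzip.a"],
    "java.desktop": ["libawt.a", "libawt_headless.a"],
}


def linker_command(modules, static_libs_dir, launcher_obj, output):
    """Build an illustrative ld argument list for the given modules."""
    libs = []
    for module in modules:
        for lib in MODULE_STATIC_LIBS.get(module, []):
            path = f"{static_libs_dir}/{lib}"
            if path not in libs:  # keep each archive once, in first-seen order
                libs.append(path)
    # libjvm.a is always needed, whatever modules were selected.
    libs.append(f"{static_libs_dir}/libjvm.a")
    return ["ld", "-o", output, launcher_obj, *libs]


if __name__ == "__main__":
    print(" ".join(linker_command(["java.base"], "static-libs",
                                  "launcher.o", "hermetic-app")))
```

The point is only that the set of archives handed to ld follows from the
modules the application requires, with libjvm.a always included.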

>
> 4. Is the intention is to allow users to create their own jmods with
> static libraries, and have these linked in as well? This seems to be the
> case.

An alternative with less memory overhead could be to use the
application's modular JAR and separate .a files as the input to jlink.

> If that is so, then there will always be the risk for name
> collisions, and we can only minimize the risk by making sure any global
> names are as unique as possible.

Part of the current effort includes resolving the symbol collision
issues discovered with static linking. I will respond to your other
email on the symbol issue separately later.
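
As an illustration of how such collisions can be spotted, here is a
minimal sketch that scans nm-style symbol listings from several archives
and reports globally defined symbols that appear in more than one. It
assumes the text was produced by something like `nm` on each .a file;
the function name and the sample data are made up for this example and
are not part of the JDK build.

```python
from collections import defaultdict


def find_collisions(nm_output_by_archive):
    """Return {symbol: [archives]} for global symbols defined in several archives."""
    owners = defaultdict(set)
    for archive, text in nm_output_by_archive.items():
        for line in text.splitlines():
            parts = line.split()
            # Defined symbols look like "<address> <type> <name>".
            # Uppercase type letters are global; undefined (U) and weak
            # (W, V) symbols are skipped since they do not clash at link time.
            if len(parts) == 3 and parts[1].isupper() and parts[1] not in "UWV":
                owners[parts[2]].add(archive)
    return {sym: sorted(libs) for sym, libs in owners.items() if len(libs) > 1}


if __name__ == "__main__":
    sample = {
        "liba.a": "a.o:\n0000000000000000 T init\n0000000000000010 t helper\n",
        "libb.a": "b.o:\n0000000000000000 T init\n                 U malloc\n",
    }
    print(find_collisions(sample))
```

Local symbols (lowercase type letters) are ignored, since only global
definitions can clash when everything is linked into one executable.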

>
> 5. The original implementation of static builds in the JDK, created for
> the Mobile project, used a configure flag, --enable-static-builds, to
> change the entire behavior of the build system to only produce *.a files
> instead of lib*.so. In contrast, the current system is using a special
> target instead.

I think we would need both a configure flag and a special target for
the static builds.

> In my eyes, this is a much worse solution. Apart from
> the conceptual principle (if the build should generate static or dynamic
> libraries is definitely a property of what a "configuration" means),
> this makes it much harder to implement efficiently, since we cannot make
> changes in NativeCompilation.gmk, where they are needed.

For the potential objcopy work to resolve symbol issues, we can add
that conditionally in NativeCompilation.gmk if STATIC_LIBS is true. We
have an internal prototype (not included in
https://github.com/openjdk/leyden/tree/hermetic-java-runtime yet), done
by one of our colleagues, for localizing symbols in libfreetype using
objcopy.
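
For readers unfamiliar with the approach, here is a minimal sketch of
how an objcopy invocation for localizing symbols could be assembled. It
is not our internal prototype; the function name and the symbol names in
the example are illustrative only. It assumes the library's objects have
already been combined into a single relocatable object (for example via
partial linking), because a localized symbol is no longer visible to the
other object files in an archive.

```python
def localize_command(global_symbols, exported, infile, outfile):
    """Build an objcopy argument list that hides every non-exported global."""
    to_hide = sorted(set(global_symbols) - set(exported))
    args = ["objcopy"]
    # --localize-symbol turns a global symbol into a file-local one.
    args += [f"--localize-symbol={name}" for name in to_hide]
    return args + [infile, outfile]


if __name__ == "__main__":
    # Illustrative names: one public entry point, one internal helper.
    print(" ".join(localize_command(
        ["FT_Init_FreeType", "ft_mem_alloc"],
        ["FT_Init_FreeType"],
        "libfreetype_combined.o",
        "libfreetype_localized.o",
    )))
```

For long symbol lists, objcopy also accepts a file of names via
--localize-symbols, which keeps the command line short.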

>
> That was not as much a question as a statement. 🙂 But here is the
> question: Do you think it would be reasonable to restore the old
> behavior but with the new methods, so that we don't use special targets,
> but instead tells configure to generate static libraries? I'm thinking
> we should have a flag like "--with-library-type=" that can have values
> "dynamic" (which is default), "static" or "both".

If we want to also build a fully statically linked launcher, maybe
--with-static-java? Being able to configure either dynamic, static or
both as you suggested also seems to be a good idea.

> I am not sure if "both" are needed, but if we want to bundle both
> lib*.so and *.a files into a single jmod file (see question 2 above),
> then it definitely is.
> In general, the cost of producing two kinds of libraries are quite
> small, compared to the cost of compiling the source code to object files.

Completely agree. It would be good to avoid recompiling the .o files
for static and dynamic builds. As proposed in
https://bugs.openjdk.org/browse/JDK-8303796:

It's beneficial to be able to build both .so and .a from the same set
of .o files. That would involve some changes to handle the dynamic JDK
and static JDK difference at runtime, instead of relying on the
STATIC_BUILD macro.

>
> Finally, I have looked at how to manipulate symbol visibility. There
> seems many ways forward, so I feel confident that we can find a good
> solution.
>
> One way forward is to use objcopy to manipulate symbol status
> (global/local). There is an option --localize-symbol in objcopy, that
> has been available in objcopy since at least 2.15, which was released
> 2004, so it should be safe to use. But ideally we should avoid using
> objcopy and do this as part of the linking process. This should be
> possible to do, given that we make changes in NativeCompilation.gmk --
> see question 5 above.
>
> As a fallback, it is also possible to rename symbols, either piecewise
> or wholesale, using objcopy. There are many ways to do this, using
> --prefix-symbols, --redefine-sym or --redefine-syms (note the -s, this
> takes a file with a list of symbols). Thus we can always introduce a
> "post factum namespace" by renaming symbols.

Renaming or redefining symbols at build time could cause confusion
when debugging. That's a concern raised in the
https://github.com/openjdk/jdk/pull/17456 discussions.

Additionally, redefining symbols using tools like objcopy may not
handle member names referenced in string literals. For example, in
https://github.com/openjdk/jdk/pull/17456 additional changes are
needed in the assembly code and SA to reflect the symbol change.

>
> So in the end, I think it will be fully possible to produce .a files
> that only has global symbols for the functions that are part of the API
> exposed by that library, and have all other symbols local, and make this
> is in a way that is consistent with the rest of the build system.
>
> Finally, a note on Hotspot. Due to debugging reasons, we export
> basically all symbols in hotspot as global. This is not reasonable to do
> for a static build. The effect of not exporting those symbols will be
> that SA will not function to 100%. On the other hand, I have no idea if
> SA works at all with a static build. Have you tested this? Is this part
> of the plan to support, or will it be officially dropped for Hermetic Java?

We have done some testing with the SA-related jtreg tests for the fully
statically linked `javastatic`.

If we use objcopy to localize symbols in hotspot, it's not yet clear
what the impact on SA would be. We could do some tests. The other
question I raised is which gcc versions are supported (for partial
linking) with that solution.

Best,
Jiangli

>
> /Magnus
>
