Magnus, thanks for the response. Please see comments inlined below.
On Fri, Apr 12, 2024 at 4:52 AM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
On 2024-04-02 21:16, Jiangli Zhou wrote:
Hi Magnus,
In today's zoom meeting with Alan, Ron, Liam and Chuck, we (briefly) discussed
how to move forward contributing the static Java related changes (additional
runtime fixes/enhancements on top of the existing static support in JDK) from
https://github.com/openjdk/leyden/tree/hermetic-java-runtime to JDK mainline.
Just a few more details and some context below, which may be useful for others reading
this thread.
The https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch
currently contains the following for supporting hermetic Java (without the launcher
work for runtime support):
1. Build change for linking the Java launcher (as bin/javastatic) with
JDK/hotspot static libraries (.a), mainly in
https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk.
The part for creating the complete sets of static libraries (including
libjvm.a) has already been included in the mainline since last year.
https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk
is in a very raw state and is intended to demonstrate the capability of
building a static Java launcher.
Indeed. It is nowhere near being able to be integrated.
The main purpose of StaticLink.gmk is to support the static-java-image
make target, which can be used to perform the actual static linking
step using libjvm.a and JDK static libraries. That currently doesn't
exist in the JDK mainline. Creating a "fully" statically linked Java
launcher is the first step (out of many) towards supporting
static/hermetic Java.
As part of cleaning/refactoring/integrating for the static linking
step, we want to agree on and decide the following:
- Support the "fully" statically linked java launcher for testing and
demoing the capability of static JDK support, e.g.
- Support running jtreg testing using the "fully" statically linked
Java launcher
- Set up tests in github workflow to help detect any breaking
changes for static support, e.g. new symbol issues introduced by any
changes. There were some earlier discussions on this with Ron and Alan
during the zoom meetings.
- Which JDK native libraries should be statically linked with the new
launcher target? E.g. StaticLink.gmk currently excludes libjsound.a,
libawt_xawt.a, etc. from being statically linked with the launcher.
- Do we want more than one statically linked launcher target, based on
the set of linked native libraries?
Based on the decisions of the above, the launcher static linking part
would mostly be in a different shape when it's integrated into the
mainline. That's why I referred to StaticLink.gmk as in a "very raw"
state.
Here is a high-level view of the state of things for static support:
(I) What we already have in the JDK mainline:
- Able to build a complete set of JDK/VM static libraries using
`static-libs-image` make target (necessary for supporting static JDK)
- Compilation of .o files is done separately for the static
libraries and dynamic libraries (ok for now)
(II) What is missing:
- Static linking step as mentioned above
(III) What needs to be improved (requires cleanups and refactoring, and
you mentioned some of those in your response as well):
- Support building both the static libraries and dynamic libraries
using the same set of .o files, instead of separately compiled .o
files. That helps improve build speed and reduce memory overhead for
building JDK. Your current refactoring work aims to help that.
- Clean up the usages of STATIC_BUILD macro. Most of the usages are in
test code.
- Other runtime fixes/enhancements in the leyden
https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch
I think most work mentioned in III has dependencies on II. We need a
workable base to be able to build the "fully" statically linked
launcher for building and testing the work mentioned in III, when
integrating any of those to the JDK mainline. The makefile refactoring
work can be done in parallel but does not need to be completed before
we add the static linking step in JDK mainline.
2. Additional runtime fixes/enhancements on top of the existing static support
in JDK, e.g. support falling back to a dynamic native library lookup if the
built-in native library cannot be found.
3. Some initial (prototype) work on supporting hermetic JDK resource files in
the jimage (JDK modules image).
To move forward, one of the earliest items needed is to add the capability of
building the fully statically linked Java launcher in JDK mainline. The other
static Java runtime changes can be followed up after the launcher linking part,
so they can be built and tested as individual PRs created for the JDK mainline.
Magnus, you have expressed interest in helping get the launcher linking part
(refactor from
https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk)
into JDK mainline. What are your thoughts on prioritizing the launcher static
linking part before other makefile clean ups for static libraries?
Trust me, my absolute top priority now is working on getting the proper build
support needed for Hermetic Java. I can't prioritize it any higher.
Thanks!
I am not sure what you are asking for. We can't just merge StaticLink.gmk from
your prototype. And even if we did, what good will it do you?
Please see my comments above.
The problem you are running into is that the build system has not been designed
to properly support static linking. There are already 3-4 hacks in place to get
something sort-of useful out, but they are prone to breaking. I assume that we
agree that for Hermetic Java to become a success, we need to have a stable
foundation for static builds.
The core problem of all static linking hacks is that they are not integrated in
the right place. They need to be a core part of what NativeCompilation
delivers, not something done in a separate file. To put it in other words,
StaticLink.gmk from your branch does not need cleanup -- it needs to go away, and
the functionality moved to the proper place.
My approach is that NativeCompilation should support doing either only dynamic
linking (as today), or static linking (as today with STATIC_LIBS or
STATIC_BUILD), or both. The assumption is that the latter will be default, or
at least should be tested by default in GHA. For this to work, we need to
compile the source code to .o files only once, and then link these .o files
either into a dynamic or a static library (or both).
As of today, the leyden
https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch
can build a "fully" statically linked Java launcher. The issue of
compiling the dynamic and static libraries .o files separately is not
a blocker. It would be good to have it resolved at some point.
This, in turn, requires several changes:
1) The linking code needs to be cleaned up, and all technical debt needs to be
resolved. This is what I have been doing since I started working on static
builds for Hermetic Java. JDK-8329704 (which was integrated yesterday) was the
first major milestone of this cleanup. Now, the path where to find a library
created by the JDK (static or dynamic) is encapsulated in ResolveLibPath. This
is currently a monster, but at least all knowledge is collected in a single
location, instead of spread over the code base. Getting this simplified is the
next step.
2) We need to stop passing the STATIC_BUILD define when compiling. This is
partially addressed in your PR, where you have replaced #ifdef STATIC_BUILD
with a dynamic lookup. But there is also the problem of JNI/JVMTI entry points.
I have been pondering how we can compile the code in a way so we support both
dynamic and static name resolution, and I think I have a solution.
This is unfortunately quite complex, and I have started a discussion with Alan about whether it is
possible to update the JNI spec so that both static and dynamic entry points can have the form
"JNI_OnLoad_<library-name>". Ideally, I'd like to see us push for this with as
much effort as possible. If we got this in place, static builds would be much easier, and the
changes required for Hermetic Java even smaller.
Thumbs up! That seems to be a good direction. Currently in the leyden
branch, it first looks up the unique
JNI_OnLoad<_lib_name>|Agent_OnLoad<_lib_name> etc for built-in
libraries, then searches for the dynamic libraries using the
conventional naming when necessary. e.g.:
https://github.com/openjdk/leyden/commit/a5c886d2e85a0ff0c3712a5488ae61d8c9d7ba1a
https://github.com/openjdk/leyden/commit/1da8e3240e0bd27366d19f2e7dde386e46015135
When the spec supports JNI_OnLoad_<library-name> etc. for dynamic
libraries, we may still need to support the conventional naming
without the <_lib_name> part for existing libraries out there.
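The lookup order described above can be sketched roughly as follows. This is only an illustration, not the code from the leyden branch: the class and method names are hypothetical, and plain string sets stand in for the launcher's and a shared library's real symbol tables. It models the hook lookup only, not library loading itself.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: string sets stand in for real symbol tables.
final class OnLoadLookupSketch {

    // Name used for a library linked into the launcher, e.g. JNI_OnLoad_net.
    static String builtinName(String hook, String libName) {
        return hook + "_" + libName;
    }

    // Try the per-library name in the launcher first; fall back to the
    // conventional name in the dynamic library only when that fails.
    static Optional<String> resolve(String hook, String libName,
                                    Set<String> launcherSymbols,
                                    Set<String> dynamicLibSymbols) {
        String unique = builtinName(hook, libName);
        if (launcherSymbols.contains(unique)) {
            return Optional.of("built-in:" + unique);
        }
        if (dynamicLibSymbols.contains(hook)) {
            return Optional.of("dynamic:" + hook);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> launcher = Set.of("JNI_OnLoad_net", "Agent_OnLoad_instrument");
        Set<String> thirdParty = Set.of("JNI_OnLoad");
        // Built-in library: resolved by its unique name in the launcher.
        System.out.println(resolve("JNI_OnLoad", "net", launcher, Set.of()));
        // Existing third-party library: resolved by the conventional name.
        System.out.println(resolve("JNI_OnLoad", "foo", launcher, thirdParty));
    }
}
```

Keeping the conventional name as the second step is what lets existing libraries, which only export JNI_OnLoad, keep working unchanged.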
And finally, on top of all of this, is the question of widening the platform support. To
support linux/gcc with objcopy is trivial, but the question about Windows still remains. I
have two possible ways forward, one is to check if there is alternative tooling to use
(the prime candidate is the clang-ldd), and the other is to try to "fake" a
partial linking by concatenating all source code before compiling. This is not ideal,
though, for many reasons, and I am not keen on implementing it, not even for testing. And
at this point, I have not had time to investigate any of these options much further,
since I have been focusing on 1) above.
A third option is of course to just say that due to toolchain limitations,
static linking is not available on Windows.
Thank you for taking this on! Potentially we could consider using
objcopy to localize hotspot symbols on unix-like platforms, based on
https://github.com/openjdk/jdk/pull/17456 discussions. Additional
testing is still needed to verify the solution.
My recommendation is that you keep on working to resolve the (much more thorny)
issues of resource access in Hermetic Java in your branch, where you have a
prototype static build that works for you. In the meantime, I will make sure
that there will be a functioning, stable and robust way of creating static
builds in the mainline, that can be regularly tested and not bit-rot, like the
static build hacks that have gone in before.
Most of the JDK resources are now supported as hermetic jimage
(lib/modules) bundled in the
https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch.
The remaining sound.properties, ct.sym and .jfc files can be handled
later. Overall, that part of the work has confirmed the hermetic
jimage bundled solution is robust and helps resolve some of the
difficult start-up sequence issues observed when the hermetic resource
was implemented using a JAR file based solution.
It might be a good idea to follow up on the static linking discussion
in tomorrow's zoom meeting (hope you'll be able to join tomorrow).
Thanks!
Jiangli
/Magnus
Thanks!
Jiangli
On Thu, Feb 15, 2024 at 12:01 PM Jiangli Zhou <jiangliz...@google.com> wrote:
On Wed, Feb 14, 2024 at 5:07 PM Jiangli Zhou <jiangliz...@google.com> wrote:
Hi Magnus,
Thanks for looking into this from the build perspective.
On Wed, Feb 14, 2024 at 1:00 AM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
First some background for build-dev: I have spent some time looking at
the build implications of the Hermetic Java effort, which is part of
Project Leyden. A high-level overview is available here:
https://cr.openjdk.org/~jiangli/hermetic_java.pdf and the current source
code is here: https://github.com/openjdk/leyden/tree/hermetic-java-runtime.
Some additional hermetic Java related references that are also useful:
- https://bugs.openjdk.org/browse/JDK-8303796 is an umbrella bug that
links to the issues for resolving static linking issues so far
- https://github.com/openjdk/jdk21/pull/26 is the enhancement for
building the complete set of static libraries in JDK/VM, particularly
including libjvm.a
Hermetic Java faces several challenges, but the part that is relevant
for the build system is the ability to create static libraries. We've
had this functionality (in three different ways...) for some time, but
it is rather badly implemented.
As a result of my investigations, I have a bunch of questions. :-) I
have gotten some answers in private discussion, but for the sake of
transparency I will repeat them here, to foster an open dialogue.
1. Am I correct in understanding that the ultimate goal of this exercise
is to be able to have jmods which include static libraries (*.a) of the
native code which the module uses, and that the user can then run a
special jlink command to have this linked into a single executable
binary (which also bundles the *.class files and any additional
resources needed)?
2. If so, is the idea to create special kinds of static jmods, like
java.base-static.jmod, that contains *.a files instead of lib*.so files?
Or is the idea that the normal jmod should contain both?
3. Linking .o and .a files into an executable is a formidable task. Is
the intention to have jlink call a system-provided ld, or to bundle ld
with jlink, or to reimplement this functionality in Java?
I have a similar view as Alan responded in your other email thread.
Things are still in the early stage for the general solution.
In the https://github.com/openjdk/leyden/tree/hermetic-java-runtime
branch, when configuring JDK with --with-static-java=yes, the JDK
binary contains the following extra artifacts:
- static-libs/*.a: The complete set of JDK/VM static libraries
- jdk/bin/javastatic: A demo Java launcher fully statically linked
with the selected JDK .a libraries (e.g. it currently statically links
with the headless) and libjvm.a. It's the standard Java launcher
without additional work for hermetic Java.
In our prototype for hermetic Java, we build the hermetic executable
image (a single image) from the following input (see description on
singlejar packaging tool in
https://cr.openjdk.org/~jiangli/hermetic_java.pdf):
- A customized launcher (with additional work for hermetic) executable
fully statically linked with JDK/VM static libraries (.a files),
application natives and dependencies (e.g. in .a static libraries)
- JDK lib/modules, JDK resource files
- Application classes and resource files
Including a JDK library .a into the corresponding .jmod would require
extracting the .a for linking with the executable. In some systems
that may cause memory overhead due to the extracted copy of the .a
files. I think we should consider the memory overhead issue.
One possibility (as Alan described in his response) is for jlink to
invoke the ld on the build system. jlink could pass the needed JDK
static libraries and libjvm.a (provided as part of the JDK binary) to
ld based on the modules required for the application.
I gave this one a bit more thought. For jlink to trigger ld, it
would need to know the complete linker options and inputs. Those
include options and inputs related to the application part as well. In
some usages, it might be easier to handle native linking separately
and pass the linker output (the executable) to jlink directly. Maybe we
could consider supporting different modes for various usage
requirements, from the static libraries and native linking point of view:
Mode #1
Support .jmod packaged native static libraries, for both JDK/VM .a
and application natives and dependencies. If the inputs to jlink
include .jmods, jlink can extract the .a libraries and pass the
information to ld to link the executable.
Mode #2
Support separate .a as jlink input. Jlink could pass the path
information to the .a libraries and other linker options to ld to
create the executable.
For both mode #1 and #2, jlink would then use the linker output
executable to create the final hermetic image.
Mode #3
Support a fully linked executable as a jlink input. When a linked
executable is given to jlink, it can process it directly with other
JDK data/files to create the final image, without a native linking step.
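A rough sketch of how the three modes might be told apart from the inputs. Everything here is hypothetical: jlink has no such API today, the names are made up for illustration, and the precedence used when several kinds of input are present is an assumption the thread does not settle.

```java
import java.util.List;

// Hypothetical sketch: none of these names exist in jlink.
final class HermeticLinkModeSketch {

    enum Mode { JMOD_PACKAGED_STATIC_LIBS, SEPARATE_STATIC_LIBS, PRELINKED_EXECUTABLE }

    // Assumed precedence: a pre-linked executable selects mode #3,
    // otherwise separate .a inputs select mode #2, otherwise .jmod
    // inputs select mode #1.
    static Mode select(List<String> inputs, boolean hasLinkedExecutable) {
        if (hasLinkedExecutable) {
            return Mode.PRELINKED_EXECUTABLE;
        }
        if (inputs.stream().anyMatch(p -> p.endsWith(".a"))) {
            return Mode.SEPARATE_STATIC_LIBS;
        }
        if (inputs.stream().anyMatch(p -> p.endsWith(".jmod"))) {
            return Mode.JMOD_PACKAGED_STATIC_LIBS;
        }
        throw new IllegalArgumentException("no static libraries or executable given");
    }

    // Modes #1 and #2 end in an ld invocation; mode #3 skips it.
    static boolean needsNativeLink(Mode mode) {
        return mode != Mode.PRELINKED_EXECUTABLE;
    }

    public static void main(String[] args) {
        Mode mode = select(List.of("java.base.jmod", "app.jmod"), false);
        System.out.println(mode + " needs native link: " + needsNativeLink(mode));
    }
}
```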
Any other thoughts and considerations?
Best,
Jiangli
4. Is the intention to allow users to create their own jmods with
static libraries, and have these linked in as well? This seems to be the
case.
An alternative with less memory overhead could be using application
modular JAR and separate .a as the input for jlink.
If that is so, then there will always be the risk of name
collisions, and we can only minimize the risk by making sure any global
names are as unique as possible.
Part of the current effort includes resolving the discovered symbol
collision issues with static linking. Will respond to your other email
on the symbol issue separately later.
5. The original implementation of static builds in the JDK, created for
the Mobile project, used a configure flag, --enable-static-builds, to
change the entire behavior of the build system to only produce *.a files
instead of lib*.so. In contrast, the current system is using a special
target instead.
I think we would need both a configure flag and a special target for the
static builds.
In my eyes, this is a much worse solution. Apart from
the conceptual principle (if the build should generate static or dynamic
libraries is definitely a property of what a "configuration" means),
this makes it much harder to implement efficiently, since we cannot make
changes in NativeCompilation.gmk, where they are needed.
For the potential objcopy work to resolve symbol issues, we can add
that conditionally in NativeCompilation.gmk if STATIC_LIBS is true. We
have an internal prototype (not included in
https://github.com/openjdk/leyden/tree/hermetic-java-runtime yet) done
by one of our colleagues for localizing symbols in libfreetype using
objcopy.
That was not as much a question as a statement. 🙂 But here is the
question: Do you think it would be reasonable to restore the old
behavior but with the new methods, so that we don't use special targets,
but instead tell configure to generate static libraries? I'm thinking
we should have a flag like "--with-library-type=" that can have values
"dynamic" (which is default), "static" or "both".
If we want to also build a fully statically linked launcher, maybe
--with-static-java? Being able to configure either dynamic, static or
both as you suggested also seems to be a good idea.
I am not sure if "both" is needed, but if we want to bundle both lib*.so and
*.a files into a single jmod file (see question 2 above), then it definitely is.
In general, the cost of producing two kinds of libraries is quite
small, compared to the cost of compiling the source code to object files.
Completely agree. It would be good to avoid recompiling the .o files
for static and dynamic builds. As proposed in
https://bugs.openjdk.org/browse/JDK-8303796:
It's beneficial to be able to build both .so and .a from the same set
of .o files. That would involve some changes to handle the dynamic JDK
and static JDK difference at runtime, instead of relying on the
STATIC_BUILD macro.
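To make the proposed "dynamic", "static" and "both" values concrete, here is a small sketch of how a build might map them to library files produced from one shared set of .o files. The flag does not exist yet, and the class, the method names and the Linux-style file names are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the proposed library-type setting.
final class LibraryTypeSketch {

    enum LibraryType { DYNAMIC, STATIC, BOTH }

    // An absent value maps to the proposed default, "dynamic";
    // an unknown value raises IllegalArgumentException.
    static LibraryType parse(String value) {
        if (value == null || value.isEmpty()) {
            return LibraryType.DYNAMIC;
        }
        return LibraryType.valueOf(value.toUpperCase(Locale.ROOT));
    }

    // Library files linked from the same object files (Linux naming assumed).
    static List<String> outputs(String name, LibraryType type) {
        switch (type) {
            case DYNAMIC: return List.of("lib" + name + ".so");
            case STATIC:  return List.of("lib" + name + ".a");
            default:      return List.of("lib" + name + ".so", "lib" + name + ".a");
        }
    }

    public static void main(String[] args) {
        System.out.println(outputs("net", parse("both")));
    }
}
```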
Finally, I have looked at how to manipulate symbol visibility. There
seem to be many ways forward, so I feel confident that we can find a good
solution.
One way forward is to use objcopy to manipulate symbol status
(global/local). There is an option --localize-symbol in objcopy, that
has been available in objcopy since at least 2.15, which was released
in 2004, so it should be safe to use. But ideally we should avoid using
objcopy and do this as part of the linking process. This should be
possible to do, given that we make changes in NativeCompilation.gmk --
see question 5 above.
As a fallback, it is also possible to rename symbols, either piecewise
or wholesale, using objcopy. There are many ways to do this, using
--prefix-symbols, --redefine-sym or --redefine-syms (note the -s, this
takes a file with a list of symbols). Thus we can always introduce a
"post factum namespace" by renaming symbols.
Renaming or redefining the symbol at build time could cause confusion
when debugging. That's a concern raised in
https://github.com/openjdk/jdk/pull/17456 discussions.
Additionally, redefining symbols using tools like objcopy may not
handle member names referenced in string literals. For example, in
https://github.com/openjdk/jdk/pull/17456 additional changes are
needed in assembling and SA to reflect the symbol change.
So in the end, I think it will be fully possible to produce .a files
that only have global symbols for the functions that are part of the API
exposed by that library, and have all other symbols local, and to do this
in a way that is consistent with the rest of the build system.
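As a minimal sketch of that idea, assuming the objcopy route: given the global symbols a library defines and the subset that forms its exported API, everything else is passed to objcopy's --localize-symbol option. The class, method and file names below are hypothetical, and the code only assembles the command line without running it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: assembles an objcopy invocation, does not run it.
final class LocalizeSymbolsSketch {

    // Every defined global symbol that is not part of the exported API,
    // in sorted order so the resulting command line is reproducible.
    static Set<String> symbolsToLocalize(Set<String> definedGlobals,
                                         Set<String> exportedApi) {
        Set<String> result = new TreeSet<>(definedGlobals);
        result.removeAll(exportedApi);
        return result;
    }

    // One --localize-symbol option per symbol. With a single file
    // argument, objcopy rewrites that file in place.
    static List<String> objcopyCommand(String objectFile, Set<String> toLocalize) {
        List<String> cmd = new ArrayList<>();
        cmd.add("objcopy");
        for (String symbol : new TreeSet<>(toLocalize)) {
            cmd.add("--localize-symbol=" + symbol);
        }
        cmd.add(objectFile);
        return cmd;
    }

    public static void main(String[] args) {
        Set<String> defined = Set.of("JNI_OnLoad_foo", "Java_Foo_bar",
                                     "helper_init", "helper_table");
        Set<String> api = Set.of("JNI_OnLoad_foo", "Java_Foo_bar");
        System.out.println(String.join(" ",
                objcopyCommand("libfoo_relocatable.o", symbolsToLocalize(defined, api))));
    }
}
```

For long symbol lists, a real build would more likely hand objcopy a file of names than one option per symbol.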
Finally, a note on Hotspot. Due to debugging reasons, we export
basically all symbols in hotspot as global. This is not reasonable to do
for a static build. The effect of not exporting those symbols will be
that SA will not function to 100%. On the other hand, I have no idea if
SA works at all with a static build. Have you tested this? Is this part
of the plan to support, or will it be officially dropped for Hermetic Java?
We have done some testing with jtreg SA related tests for the fully
statically linked `javastatic`.
If we use objcopy to localize symbols in hotspot, it's not yet clear
what the impact on SA is. We could do some tests. The other question
that I raised is the supported gcc versions (for partial linking)
related to the solution.
Best,
Jiangli
/Magnus