I have good (and bad) news for you!

The good news is that from memory, ld and gcc (but I assume you are
only concerned about ld) can work entirely on cl compiled object files
with no hitch whatsoever, and partial linking is fully functional with
ld on Windows! Symbol hiding also works, but with a gotcha.

Now, the bad news. --localize-hidden, to my knowledge, only works on
ELF visibilities. This means that for the following file:

exports.c
__declspec(dllexport) void exportedMethod() {}
void unexportedMethod() {}

When objcopy is run over exports.o with --localize-hidden, it will
work on the cl compiled object file just fine, but it will not
recognise unexportedMethod() as a hidden method, because dllexport and
hidden visibility are entirely separate concepts to gcc. The resulting
exports-objcopy.o file will still have unexportedMethod as public.

However, not all is lost just yet. --localize-symbol=<symbol> still
works perfectly fine, and can be used as a workaround if we can find a
way to check for non-dllexport methods in cl object files and then
feed that through the command line. I should note that clang also
suffers from the same issue, as --localize-hidden on clang's objcopy
also leaves unexportedMethod as public.
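
To make the --localize-symbol workaround a bit more concrete, here is a
rough sketch in Python (not something I have run against the JDK build).
It assumes that cl records dllexport'ed symbols as /EXPORT:<name>
directives in the object's .drectve section, and that the caller has
already obtained the directive text and the list of global symbols
somehow (for example from dumpbin /DIRECTIVES and a symbol dump). The
function names are made up for illustration, and the name matching
assumes x64, where C symbol names are not decorated.

```python
import re

# Assumption: dllexport shows up in the .drectve section as "/EXPORT:name"
# (MinGW-style objects use "-export:name"); "name,DATA" marks a data export.
EXPORT_RE = re.compile(r'[/-]export:"?([^\s,"]+)', re.IGNORECASE)


def exported_symbols(drectve_text):
    """Return the set of symbol names listed in EXPORT directives."""
    return {m.group(1) for m in EXPORT_RE.finditer(drectve_text)}


def localize_args(global_symbols, drectve_text):
    """Build one --localize-symbol argument per non-exported global symbol."""
    exported = exported_symbols(drectve_text)
    return ["--localize-symbol=" + name
            for name in sorted(global_symbols)
            if name not in exported]


def objcopy_command(obj_in, obj_out, global_symbols, drectve_text):
    """Assemble the objcopy command line (hypothetical helper, not run here)."""
    return (["objcopy"]
            + localize_args(global_symbols, drectve_text)
            + [obj_in, obj_out])
```

For exports.o above, the idea would be that the resulting command line
contains --localize-symbol=unexportedMethod and nothing for
exportedMethod. For objects with many symbols the argument list could
get long, so a real implementation would need to take command line
length limits into account.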

Happy to help!

best regards,
Julian

On Thu, Apr 18, 2024 at 6:28 PM Magnus Ihse Bursie
<magnus.ihse.bur...@oracle.com> wrote:
>
> On 2024-04-17 14:06, Julian Waters wrote:
>
> > Hi Magnus,
> >
> > Yes, I'm talking about the MSYS2 objcopy, but to a lesser extent also
> > standalone Windows objcopy builds too. objcopy should be able to
> > handle .o files from cl.exe (I assume that's what everyone here is
> > after considering all the talk about .o files?), but to answer that
> > question properly, I'd need a bit more detail. What kind of usage of
> > objcopy do you have in mind? A general-ish command line example could
> > be helpful
>
> To make a static library behave somewhat like a dynamic library, and not
> expose all its internal symbols to the rest of the world, there are
> basically two operations that need to be done:
>
> 1) Combine all .o files into a single .o file, to make it look like it
> was compiled by a single source code. That way, symbols that were
> created in one source file and referenced in another will now behave as
> if they are internal to the "compilation unit", i.e. like they were
> declared static.
>
> 2) Modify the symbol status so that symbols that are not exported are
> changed so they look like they were actually declared "static" in the
> source code.
>
> These operations are based on concepts that exist in the gcc and clang
> toolchain, about symbol visibility. I am not sure how well they
> translate to a Microsoft setting. But, if the dll_export marker
> corresponds to visible symbols, then I guess we should be able to
> achieve something similar.
>
> What needs to be done then is:
>
> 1) Combine multiple .obj (COFF object files) into one.
>
> 2) Change the visibility of symbols that are not marked as dll_export:ed
> so they appear like they were declared static.
>
> In the clang/gcc world, the first step is done by "partial linking" by
> ld. That is our first blocker -- link.exe cannot do that. So the first
> question is really, is there a Windows build of ld that can work on COFF
> files to achieve this?
>
> The second step is done by objcopy using the "--localize-hidden"
> argument. The second question is, could this work on a COFF object file?
>
> /Magnus
>
>
> >
> > best regards,
> > Julian
> >
> > On Wed, Apr 17, 2024 at 5:55 PM Magnus Ihse Bursie
> > <magnus.ihse.bur...@oracle.com> wrote:
> >> On 2024-04-16 07:23, Julian Waters wrote:
> >>
> >>>> And finally, on top of all of this, is the question of widening the 
> >>>> platform support. To support linux/gcc with objcopy is trivial, but the 
> >>>> question about Windows still remains.
> >>> objcopy is also available on Windows, if the question about
> >>> alternative tooling is still unanswered :)
> >> At this point, I think support for static build on Windows is to either
> >> require additional tooling on top of the Microsoft Visual Studio
> >> toolchain, or to drop it completely, so I am definitely interested in
> >> researching alternatives.
> >>
> >> Can objcopy (I assume this is from msys?) deal with COFF files generated
> >> by cl?
> >>
> >> Switching the entire toolchain is not relevant at this point (but if a
> >> non-Microsoft toolchain build for Windows is ever integrated, it might
> >> get static builds with no extra work as a bonus), but I could certainly
> >> accept the idea of having one or a few additional tools required to get
> >> the normal Microsoft toolchain to produce static builds.
> >>
> >> /Magnus
> >>
> >>> best regards,
> >>> Julian
> >>>
> >>> On Fri, Apr 12, 2024 at 7:52 PM Magnus Ihse Bursie
> >>> <magnus.ihse.bur...@oracle.com> wrote:
> >>>> On 2024-04-02 21:16, Jiangli Zhou wrote:
> >>>>
> >>>> Hi Magnus,
> >>>>
> >>>> In today's zoom meeting with Alan, Ron, Liam and Chuck, we (briefly) 
> >>>> discussed how to move forward contributing the static Java related 
> >>>> changes (additional runtime fixes/enhancements on top of the existing 
> >>>> static support in JDK) from 
> >>>> https://github.com/openjdk/leyden/tree/hermetic-java-runtime to JDK 
> >>>> mainline.
> >>>>
> >>>> Just a bit more details/context below, which may be useful for others 
> >>>> reading this thread.
> >>>>
> >>>> The https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch 
> >>>> currently contains the following for supporting hermetic Java (without the 
> >>>> launcher work for runtime support):
> >>>>
> >>>> 1. Build change for linking the Java launcher (as bin/javastatic) with 
> >>>> JDK/hotspot static libraries (.a), mainly in 
> >>>> https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk.
> >>>>  The part for creating the complete sets of static libraries (including 
> >>>> libjvm.a) has already been included in the mainline since last year. 
> >>>> https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk
> >>>>  is in a very raw state and is intended to demonstrate the capability of 
> >>>> building a static Java launcher.
> >>>>
> >>>> Indeed. It is nowhere near being able to be integrated.
> >>>>
> >>>>
> >>>> 2. Additional runtime fixes/enhancements on top of the existing static 
> >>>> support in JDK, e.g. support for further lookup of a dynamic native library if 
> >>>> the built-in native library cannot be found.
> >>>>
> >>>> 3. Some initial (prototype) work on supporting hermetic JDK resource 
> >>>> files in the jimage (JDK modules image).
> >>>>
> >>>> To move forward, one of the earliest items needed is to add the 
> >>>> capability of building the fully statically linked Java launcher in JDK 
> >>>> mainline. The other static Java runtime changes can be followed up after 
> >>>> the launcher linking part, so they can be built and tested as individual 
> >>>> PRs created for the JDK mainline. Magnus, you have expressed interest in 
> >>>> helping get the launcher linking part (refactor from 
> >>>> https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk)
> >>>>  into JDK mainline. What's your thought on prioritizing the launcher 
> >>>> static linking part before other makefile clean ups for static libraries?
> >>>>
> >>>> Trust me, my absolute top priority now is working on getting the proper 
> >>>> build support needed for Hermetic Java. I can't prioritize it any higher.
> >>>>
> >>>> I am not sure what you are asking for. We can't just merge 
> >>>> StaticLink.gmk from your prototype. And even if we did, what good will 
> >>>> it do you?
> >>>>
> >>>> The problem you are running into is that the build system has not been 
> >>>> designed to properly support static linking. There are already 3-4 hacks 
> >>>> in place to get something sort-of useful out, but they are prone to 
> >>>> breaking. I assume that we agree that for Hermetic Java to become a 
> >>>> success, we need to have a stable foundation for static builds.
> >>>>
> >>>> The core problem of all static linking hacks is that they are not 
> >>>> integrated in the right place. They need to be a core part of what 
> >>>> NativeCompilation delivers, not something done in a separate file. To 
> >>>> put it in other words, StaticLink.gmk from your branch does not need 
> >>>> cleanup -- it needs to go away, and the functionality moved to the 
> >>>> proper place.
> >>>>
> >>>> My approach is that NativeCompilation should support doing either only 
> >>>> dynamic linking (as today), or static linking (as today with STATIC_LIBS 
> >>>> or STATIC_BUILD), or both. The assumption is that the latter will be 
> >>>> default, or at least should be tested by default in GHA. For this to 
> >>>> work, we need to compile the source code to .o files only once, and then 
> >>>> link these .o files either into a dynamic or a static library (or both).
> >>>>
> >>>> This, in turn, requires several changes:
> >>>>
> >>>> 1) The linking code needs to be cleaned up, and all technical debt needs 
> >>>> to be resolved. This is what I have been doing since I started working 
> >>>> on static builds for Hermetic Java. JDK-8329704 (which was integrated 
> >>>> yesterday) was the first major milestone of this cleanup. Now, the path 
> >>>> where to find a library created by the JDK (static or dynamic) is 
> >>>> encapsulated in ResolveLibPath. This is currently a monster, but at 
> >>>> least all knowledge is collected in a single location, instead of spread 
> >>>> over the code base. Getting this simplified is the next step.
> >>>>
> >>>> 2) We need to stop passing the STATIC_BUILD define when compiling. This 
> >>>> is partially addressed in your PR, where you have replaced #ifdef 
> >>>> STATIC_BUILD with a dynamic lookup. But there is also the problem of 
> >>>> JNI/JVMTI entry points. I have been pondering how we can compile the 
> >>>> code in a way so we support both dynamic and static name resolution, and 
> >>>> I think I have a solution.
> >>>>
> >>>> This is unfortunately quite complex, and I have started a discussion 
> >>>> with Alan if it is possible to update the JNI spec so that both static 
> >>>> and dynamic entry points can have the form "JNI_OnLoad_<library-name>". 
> >>>> Ideally, I'd like to see us push for this with as much effort as 
> >>>> possible. If we got this in place, static builds would be much easier, 
> >>>> and the changes required for Hermetic Java even smaller.
> >>>>
> >>>> And finally, on top of all of this, is the question of widening the 
> >>>> platform support. To support linux/gcc with objcopy is trivial, but the 
> >>>> question about Windows still remains. I have two possible ways forward, 
> >>>> one is to check if there is alternative tooling to use (the prime 
> >>>> candidate is the clang-ldd), and the other is to try to "fake" a partial 
> >>>> linking by concatenating all source code before compiling. This is not 
> >>>> ideal, though, for many reasons, and I am not keen on implementing it, 
> >>>> not even for testing. And at this point, I have not had time to 
> >>>> investigate any of these options much further, since I have been 
> >>>> focusing on 1) above.
> >>>>
> >>>> A third option is of course to just say that due to toolchain 
> >>>> limitations, static linking is not available on Windows.
> >>>>
> >>>> My recommendation is that you keep on working to resolve the (much more 
> >>>> thorny) issues of resource access in Hermetic Java in your branch, where 
> >>>> you have a prototype static build that works for you. In the meantime, I 
> >>>> will make sure that there will be a functioning, stable and robust way 
> >>>> of creating static builds in the mainline, that can be regularly tested 
> >>>> and not bit-rot, like the static build hacks that have gone in before.
> >>>>
> >>>> /Magnus
> >>>>
> >>>>
> >>>>
> >>>> Thanks!
> >>>> Jiangli
> >>>>
> >>>> On Thu, Feb 15, 2024 at 12:01 PM Jiangli Zhou <jiangliz...@google.com> 
> >>>> wrote:
> >>>>> On Wed, Feb 14, 2024 at 5:07 PM Jiangli Zhou <jiangliz...@google.com> 
> >>>>> wrote:
> >>>>>> Hi Magnus,
> >>>>>>
> >>>>>> Thanks for looking into this from the build perspective.
> >>>>>>
> >>>>>> On Wed, Feb 14, 2024 at 1:00 AM Magnus Ihse Bursie
> >>>>>> <magnus.ihse.bur...@oracle.com> wrote:
> >>>>>>> First some background for build-dev: I have spent some time looking at
> >>>>>>> the build implications of the Hermetic Java effort, which is part of
> >>>>>>> Project Leyden. A high-level overview is available here:
> >>>>>>> https://cr.openjdk.org/~jiangli/hermetic_java.pdf and the current 
> >>>>>>> source
> >>>>>>> code is here: 
> >>>>>>> https://github.com/openjdk/leyden/tree/hermetic-java-runtime.
> >>>>>> Some additional hermetic Java related references that are also useful:
> >>>>>>
> >>>>>> - https://bugs.openjdk.org/browse/JDK-8303796 is an umbrella bug that
> >>>>>> links to the issues for resolving static linking issues so far
> >>>>>> - https://github.com/openjdk/jdk21/pull/26 is the enhancement for
> >>>>>> building the complete set of static libraries in JDK/VM, particularly
> >>>>>> including libjvm.a
> >>>>>>
> >>>>>>> Hermetic Java faces several challenges, but the part that is relevant
> >>>>>>> for the build system is the ability to create static libraries. We've
> >>>>>>> had this functionality (in three different ways...) for some time, but
> >>>>>>> it is rather badly implemented.
> >>>>>>>
> >>>>>>> As a result of my investigations, I have a bunch of questions. :-) I
> >>>>>>> have gotten some answers in private discussion, but for the sake of
> >>>>>>> transparency I will repeat them here, to foster an open dialogue.
> >>>>>>>
> >>>>>>> 1. Am I correct in understanding that the ultimate goal of this 
> >>>>>>> exercise
> >>>>>>> is to be able to have jmods which include static libraries (*.a) of 
> >>>>>>> the
> >>>>>>> native code which the module uses, and that the user can then run a
> >>>>>>> special jlink command to have this linked into a single executable
> >>>>>>> binary (which also bundles the *.class files and any additional
> >>>>>>> resources needed)?
> >>>>>>>
> >>>>>>> 2. If so, is the idea to create special kinds of static jmods, like
> >>>>>>> java.base-static.jmod, that contains *.a files instead of lib*.so 
> >>>>>>> files?
> >>>>>>> Or is the idea that the normal jmod should contain both?
> >>>>>>>
> >>>>>>> 3. Linking .o and .a files into an executable is a formidable task. Is
> >>>>>>> the intention to have jlink call a system-provided ld, or to bundle ld
> >>>>>>> with jlink, or to reimplement this functionality in Java?
> >>>>>> I have a similar view as Alan responded in your other email thread.
> >>>>>> Things are still in the early stage for the general solution.
> >>>>>>
> >>>>>> In the https://github.com/openjdk/leyden/tree/hermetic-java-runtime
> >>>>>> branch, when configuring JDK with --with-static-java=yes, the JDK
> >>>>>> binary contains the following extra artifacts:
> >>>>>>
> >>>>>> - static-libs/*.a: The complete set of JDK/VM static libraries
> >>>>>> - jdk/bin/javastatic: A demo Java launcher fully statically linked
> >>>>>> with the selected JDK .a libraries (e.g. it currently statically links
> >>>>>> with the headless) and libjvm.a. It's the standard Java launcher
> >>>>>> without additional work for hermetic Java.
> >>>>>>
> >>>>>> In our prototype for hermetic Java, we build the hermetic executable
> >>>>>> image (a single image) from the following input (see description on
> >>>>>> singlejar packaging tool in
> >>>>>> https://cr.openjdk.org/~jiangli/hermetic_java.pdf):
> >>>>>>
> >>>>>> - A customized launcher (with additional work for hermetic) executable
> >>>>>> fully statically linked with JDK/VM static libraries (.a files),
> >>>>>> application natives and dependencies (e.g. in .a static libraries)
> >>>>>> - JDK lib/modules, JDK resource files
> >>>>>> - Application classes and resource files
> >>>>>>
> >>>>>> Including a JDK library .a into the corresponding .jmod would require
> >>>>>> extracting the .a for linking with the executable. In some systems
> >>>>>> that may cause memory overhead due to the extracted copy of the .a
> >>>>>> files. I think we should consider the memory overhead issue.
> >>>>>>
> >>>>>> One possibility (as Alan described in his response) is for jlink to
> >>>>>> invoke the ld on the build system. jlink could pass the needed JDK
> >>>>>> static libraries and libjvm.a (provided as part of the JDK binary) to
> >>>>>> ld based on the modules required for the application.
> >>>>>>
> >>>>> I gave a bit more thoughts on this one. For jlink to trigger ld, it
> >>>>> would need to know the complete linker options and inputs. Those
> >>>>> include options and inputs related to the application part as well. In
> >>>>> some usages, it might be easier to handle native linking separately
> >>>>> and pass the linker output, the executable to jlink directly. Maybe we
> >>>>> could consider supporting different modes for various usages
> >>>>> requirements, from static libraries and native linking point of view:
> >>>>>
> >>>>> Mode #1
> >>>>> Support .jmod packaged natives static libraries, for both JDK/VM .a
> >>>>> and application natives and dependencies. If the inputs to jlink
> >>>>> include .jmods, jlink can extract the .a libraries and pass the
> >>>>> information to ld to link the executable.
> >>>>>
> >>>>> Mode #2
> >>>>> Support separate .a as jlink input. Jlink could pass the path
> >>>>> information to the .a libraries and other linker options to ld to
> >>>>> create the executable.
> >>>>>
> >>>>> For both mode #1 and #2, jlink would then use the linker output
> >>>>> executable to create the final hermetic image.
> >>>>>
> >>>>> Mode #3
> >>>>> Support a fully linked executable as a jlink input. When a linked
> >>>>> executable is given to jlink, it can process it directly with other
> >>>>> JDK data/files to create the final image, without native linking step.
> >>>>>
> >>>>> Any other thoughts and considerations?
> >>>>>
> >>>>> Best,
> >>>>> Jiangli
> >>>>>
> >>>>>>> 4. Is the intention to allow users to create their own jmods with
> >>>>>>> static libraries, and have these linked in as well? This seems to be 
> >>>>>>> the
> >>>>>>> case.
> >>>>>> An alternative with less memory overhead could be using application
> >>>>>> modular JAR and separate .a as the input for jlink.
> >>>>>>
> >>>>>>> If that is so, then there will always be the risk for name
> >>>>>>> collisions, and we can only minimize the risk by making sure any 
> >>>>>>> global
> >>>>>>> names are as unique as possible.
> >>>>>> Part of the current effort includes resolving the discovered symbol
> >>>>>> collision issues with static linking. Will respond to your other email
> >>>>>> on the symbol issue separately later.
> >>>>>>
> >>>>>>> 5. The original implementation of static builds in the JDK, created 
> >>>>>>> for
> >>>>>>> the Mobile project, used a configure flag, --enable-static-builds, to
> >>>>>>> change the entire behavior of the build system to only produce *.a 
> >>>>>>> files
> >>>>>>> instead of lib*.so. In contrast, the current system is using a special
> >>>>>>> target instead.
> >>>>>> I think we would need both configure flag and special target for the
> >>>>>> static builds.
> >>>>>>
> >>>>>>> In my eyes, this is a much worse solution. Apart from
> >>>>>>> the conceptual principle (if the build should generate static or 
> >>>>>>> dynamic
> >>>>>>> libraries is definitely a property of what a "configuration" means),
> >>>>>>> this makes it much harder to implement efficiently, since we cannot 
> >>>>>>> make
> >>>>>>> changes in NativeCompilation.gmk, where they are needed.
> >>>>>> For the potential objcopy work to resolve symbol issues, we can add
> >>>>>> that conditionally in NativeCompilation.gmk if STATIC_LIBS is true. We
> >>>>>> have an internal prototype (not included in
> >>>>>> https://github.com/openjdk/leyden/tree/hermetic-java-runtime yet) done
> >>>>>> by one of our colleagues for localizing symbols in libfreetype using
> >>>>>> objcopy.
> >>>>>>
> >>>>>>> That was not as much a question as a statement. 🙂 But here is the
> >>>>>>> question: Do you think it would be reasonable to restore the old
> >>>>>>> behavior but with the new methods, so that we don't use special 
> >>>>>>> targets,
> >>>>>>> but instead tell configure to generate static libraries? I'm thinking
> >>>>>>> we should have a flag like "--with-library-type=" that can have values
> >>>>>>> "dynamic" (which is default), "static" or "both".
> >>>>>> If we want to also build a fully statically linked launcher, maybe
> >>>>>> --with-static-java? Being able to configure either dynamic, static or
> >>>>>> both as you suggested also seems to be a good idea.
> >>>>>>
> >>>>>>> I am not sure if "both" are needed, but if we want to bundle both 
> >>>>>>> lib*.so and *.a files
> >>>>>>> into a single jmod file (see question 2 above), then it definitely is.
> >>>>>>> In general, the cost of producing two kinds of libraries is quite
> >>>>>>> small, compared to the cost of compiling the source code to object 
> >>>>>>> files.
> >>>>>> Completely agree. It would be good to avoid recompiling the .o file
> >>>>>> for static and dynamic builds. As proposed in
> >>>>>> https://bugs.openjdk.org/browse/JDK-8303796:
> >>>>>>
> >>>>>> It's beneficial to be able to build both .so and .a from the same set
> >>>>>> of .o files. That would involve some changes to handle the dynamic JDK
> >>>>>> and static JDK difference at runtime, instead of relying on the
> >>>>>> STATIC_BUILD macro.
> >>>>>>
> >>>>>>> Finally, I have looked at how to manipulate symbol visibility. There
> >>>>>>> seem to be many ways forward, so I feel confident that we can find a good
> >>>>>>> solution.
> >>>>>>>
> >>>>>>> One way forward is to use objcopy to manipulate symbol status
> >>>>>>> (global/local). There is an option --localize-symbol in objcopy, that
> >>>>>>> has been available in objcopy since at least 2.15, which was released
> >>>>>>> 2004, so it should be safe to use. But ideally we should avoid using
> >>>>>>> objcopy and do this as part of the linking process. This should be
> >>>>>>> possible to do, given that we make changes in NativeCompilation.gmk --
> >>>>>>> see question 5 above.
> >>>>>>>
> >>>>>>> As a fallback, it is also possible to rename symbols, either piecewise
> >>>>>>> or wholesale, using objcopy. There are many ways to do this, using
> >>>>>>> --prefix-symbols, --redefine-sym or --redefine-syms (note the -s, this
> >>>>>>> takes a file with a list of symbols). Thus we can always introduce a
> >>>>>>> "post factum namespace" by renaming symbols.
> >>>>>> Renaming or redefining the symbol at build time could cause confusions
> >>>>>> with debugging. That's a concern raised in
> >>>>>> https://github.com/openjdk/jdk/pull/17456 discussions.
> >>>>>>
> >>>>>> Additionally, redefining symbols using tools like objcopy may not
> >>>>>> handle member names referenced in string literals. For example, in
> >>>>>> https://github.com/openjdk/jdk/pull/17456 additional changes are
> >>>>>> needed in assembling and SA to reflect the symbol change.
> >>>>>>
> >>>>>>> So in the end, I think it will be fully possible to produce .a files
> >>>>>>> that only have global symbols for the functions that are part of the
> >>>>>>> API exposed by that library, and have all other symbols local, and do
> >>>>>>> this in a way that is consistent with the rest of the build system.
> >>>>>>>
> >>>>>>> Finally, a note on Hotspot. Due to debugging reasons, we export
> >>>>>>> basically all symbols in hotspot as global. This is not reasonable to 
> >>>>>>> do
> >>>>>>> for a static build. The effect of not exporting those symbols will be
> >>>>>>> that SA will not function to 100%. On the other hand, I have no idea 
> >>>>>>> if
> >>>>>>> SA works at all with a static build. Have you tested this? Is this 
> >>>>>>> part
> >>>>>>> of the plan to support, or will it be officially dropped for Hermetic 
> >>>>>>> Java?
> >>>>>> We have done some testing with jtreg SA related tests for the fully
> >>>>>> statically linked `javastatic`.
> >>>>>>
> >>>>>> If we use objcopy to localize symbols in hotspot, it's not yet clear
> >>>>>> what's the impact on SA. We could do some tests. The other question
> >>>>>> that I raised is the supported gcc versions (for partial linking)
> >>>>>> related to the solution.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jiangli
> >>>>>>
> >>>>>>> /Magnus
> >>>>>>>
