Hi,
I’ve made some major progress on this work in this PR:
https://github.com/apache/arrow/pull/38876
* The maven plugin for compiling module-info.java files using JDK 8 is
working correctly.
* arrow-format, arrow-memory-core, arrow-memory-netty, arrow-memory-unsafe,
and arrow-vector have been modularized successfully.
* Tests pass locally for all of these modules.
* They fail in CI, most likely because I have not updated a profile
somewhere.
Similar to David’s PR from below, arrow-memory and modules needed to be
refactored fairly significantly and split into two modules: a public-facing
JPMS module and a separate module which adds to Netty’s packages
(memory-netty-buffer-patch). What’s more problematic is that, because we are
using named modules now, users need to add more arguments to their Java command
line to use Arrow. Anyone using arrow-memory-netty would need to add the
following:
--add-opens java.base/jdk.internal.misc=io.netty.common
--patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar
--add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED
The exact arguments depend on where the memory-netty-buffer-patch JAR is
located and on its version, so users cannot simply copy a fixed command line.
This seems like it’d be really inconvenient.
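To illustrate how the arguments vary, here is a minimal sketch that assembles the three options quoted above from a JAR directory and a version. The class name, the method name, and the sample path and version are hypothetical and not part of the PR; only the option strings come from this email.

```java
import java.util.ArrayList;
import java.util.List;

public class ArrowNettyJvmArgs {
    // Builds the JVM options listed in this email for a given patch JAR location.
    // patchJarDir and arrowVersion are hypothetical inputs used for illustration.
    static List<String> build(String patchJarDir, String arrowVersion) {
        String patchJar = patchJarDir
            + "/arrow-memory-netty-buffer-patch-" + arrowVersion + ".jar";
        List<String> args = new ArrayList<>();
        args.add("--add-opens=java.base/jdk.internal.misc=io.netty.common");
        // This is the option that changes with the JAR location and version.
        args.add("--patch-module=io.netty.buffer=" + patchJar);
        args.add("--add-opens=java.base/java.nio="
            + "org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED");
        return args;
    }

    public static void main(String[] ignored) {
        // "/opt/libs" and "1.2.3" are placeholder values, not real release data.
        for (String arg : build("/opt/libs", "1.2.3")) {
            System.out.println(arg);
        }
    }
}
```

Only the --patch-module entry depends on the user's environment, which is the part that makes a copy-and-paste command line hard to document.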
Do we want to proceed with modularizing existing memory modules? Both netty and
unsafe? Or wait until the new memory module from Java 21 is available?
The module-info.java files are written fairly naively. I haven’t inspected
thoroughly to determine what packages users will need.
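As a rough picture of what such a descriptor declares, the sketch below uses the JDK's java.lang.module.ModuleDescriptor builder (Java 9+) to model one module. The module and package names appear elsewhere in this thread, but the exact requires/exports set is an assumption and is not copied from the PR.

```java
import java.lang.module.ModuleDescriptor;

public class DescriptorSketch {
    // Models a descriptor roughly equivalent to:
    //   module org.apache.arrow.memory.core {
    //     requires jdk.unsupported;          // assumed: home of sun.misc.Unsafe
    //     exports org.apache.arrow.memory;   // assumed public package
    //   }
    static ModuleDescriptor memoryCore() {
        return ModuleDescriptor.newModule("org.apache.arrow.memory.core")
            .requires("java.base")
            .requires("jdk.unsupported")
            .exports("org.apache.arrow.memory")
            .build();
    }

    public static void main(String[] args) {
        ModuleDescriptor d = memoryCore();
        d.requires().forEach(r -> System.out.println("requires " + r.name()));
        d.exports().forEach(e -> System.out.println("exports " + e.source()));
    }
}
```

Deciding which packages belong in the exports list is exactly the part that still needs a careful review.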
We can continue modularizing more components in a separate PR. Ideally all the
user breakage (class movement, new command-line argument requirements) happens
within one major Arrow version.
From: James Duong <[email protected]>
Date: Tuesday, November 21, 2023 at 1:16 PM
To: [email protected] <[email protected]>
Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module
System
I’m following up on this topic.
David has a PR from last year that’s done much of the heavy lifting for
refactoring the codebase to be package-friendly.
https://github.com/apache/arrow/pull/13072
What’s changed since and what’s left:
* New components have been added (Flight SQL for example) that will need to
be updated for modules.
* There wasn’t a clear solution on how to do this without breaking JDK 8
support. Compiling module-info.java files requires JDK 9, but using JDK 9
breaks the JDK 8 methods of accessing sun.misc.Unsafe.
* There is a Gradle plugin that can compile module-info.java files
purely syntactically that we can adapt to Maven. It has limitations (the one I
see is that it can’t iterate through classloaders to handle annotations), but
using this might be a good stopgap until JDK 8 support is deprecated.
* Some plugins need to be updated:
* maven-dependency-plugin 3.0.1 can’t parse module-info.class files.
* checkstyle 3.1.0 can’t parse module-info.java files. Our existing
checkstyle rules file can’t be loaded with newer versions. We can exclude
module-info.java for now and have a separate Issue for updating checkstyle
itself and the rules file.
* grpc-java could not be modularized when the PR above was written.
* gRPC 1.57 can now be modularized
(grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>)
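For context on the sun.misc.Unsafe point above, this is a minimal sketch of the common reflective way to obtain the Unsafe instance. It is not taken from the Arrow codebase. On JDK 9+ the class lives in the jdk.unsupported module, so code compiled as a named module would additionally need `requires jdk.unsupported`; the sketch assumes it runs from the class path.

```java
import java.lang.reflect.Field;

public class UnsafeProbe {
    // Looks up the singleton through the private static field "theUnsafe".
    // Uses reflection only, so it compiles without referencing sun.misc directly.
    static Object loadUnsafe() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadUnsafe().getClass().getName());
    }
}
```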
From: David Dali Susanibar Arce <[email protected]>
Date: Wednesday, May 25, 2022 at 5:02 AM
To: [email protected] <[email protected]>
Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module System
Hi All,
The purpose of this email is to request comments on migrating Arrow Java to
JPMS, the Java Platform Module System
<https://openjdk.java.net/projects/jigsaw/spec/>, available in JSE 9+ (1).
Current status:
- Arrow Java uses the JSE 1.8 specification
- Arrow Java works with JSE 1.8/9/11/17
- This is possible because Java offers “legacy mode”
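A small probe can make the “legacy mode” point concrete. The assumption here is that the term refers to running from the class path, where application classes land in the unnamed module while JDK classes sit in named modules. The class name is hypothetical, and the code needs Java 9 or later.

```java
public class LegacyModeProbe {
    // Reports whether a class belongs to a named module or to the unnamed module.
    static String describe(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed()
            ? "named module " + module.getName()
            : "unnamed module";
    }

    public static void main(String[] args) {
        // JDK classes are always in named modules on Java 9+.
        System.out.println("String: " + describe(String.class));
        // When launched from the class path, this class is in the unnamed module.
        System.out.println("LegacyModeProbe: " + describe(LegacyModeProbe.class));
    }
}
```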
Proposal:
Migrate to JPMS Java Platform Module System. This Draft PR
<https://github.com/apache/arrow/pull/13072> (2)
contains an initial port of
the modules: Format / Memory Core / Memory Netty / Memory Unsafe / Vector
for evaluation.
Main Reason to migrate:
- JPMS offers strong encapsulation, well-defined interfaces, and explicit
dependencies. See <https://nipafx.dev/java-modules-reflection-vs-encapsulation/>
(3) and <https://github.com/nipafx/demo-jigsaw-reflection> (4).
- JPMS offers reliable configuration, and security by hiding platform
internals.
- JPMS offers a partial solution to the problems of reading (80%) versus
writing (20%) code.
- JPMS improves readability, given a read/write ratio of about 90/10,
through module-info.java.
- Consistency logs, JPMS implement consistency logs to really use that to
solve the current problem.
- We would be able to build a custom JRE containing only the modules needed
(leaving out java.desktop and others, for example) through jlink.
- Modules have also been implemented by other languages and platforms, such
as JavaScript (ES2015), C++ (C++20), and .NET (NuGet/.NET Core).
- Consider taking a look at this discussion about pros/cons
<https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/>
(5).
- Eventual migration to JPMS is a practical necessity as more projects
migrate.
Effort:
- First of all, we need to decide whether to move from JSE 1.8 to JSE 9+ or
to support both by shipping JAR components for JSE 1.8 and JSE 9+.
- Go bottom up for JPMS.
- Packages need to be unique (i.e. org.apache.arrow.memory /
io.netty.buffer). Review Draft PR with initial proposal.
- Dependencies also need to be modularized. If some of our current
dependencies are not able to be used as a module this will be a blocker for
our modules (we could patch that but this is an extra effort).
Killers:
- FIXME! I need your support to identify killer reasons to be able to push
this implementation.
Please let us know whether migrating Arrow Java to the JPMS Java Platform
Module System is needed and should be implemented.
Please use this file for any comments
https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing
Resources used:
(1): https://openjdk.java.net/projects/jigsaw/spec/
(2): https://github.com/apache/arrow/pull/13072
(3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
(4): https://github.com/nipafx/demo-jigsaw-reflection
(5): https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
Best regards,
--
David