[ 
https://issues.apache.org/jira/browse/FLINK-38284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18020284#comment-18020284
 ] 

Jaehyun Kim commented on FLINK-38284:
-------------------------------------

[~tomncooper] Yes, I'm planning to work on this.

I've tested upgrading `fs.hadoopshaded.version` to 3.4.2 locally.  
The `flink-azure-fs-hadoop` module builds and passes tests,
but the `flink-s3-fs-hadoop` module fails due to upstream S3A API changes in 
Hadoop 3.4.2.

To keep progress moving, I'm preparing an initial PR that upgrades the shared
version and temporarily excludes the S3 module. I will then follow up with a
dedicated JIRA and PR to refactor the S3 connector.

Let me know if you'd prefer a different direction.

> Prepare to upgrade hadoop version to 3.4.2 across Flink's Hadoop-based FS 
> connectors for OpenSSL 3 and Java 17 compatibility
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38284
>                 URL: https://issues.apache.org/jira/browse/FLINK-38284
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem, FileSystems
>            Reporter: Jaehyun Kim
>            Priority: Major
>
> h3. *Description*
> Apache Hadoop has merged [PR 
> #7032|https://github.com/apache/hadoop/pull/7032] and HADOOP-19262, upgrading 
> wildfly-openssl to 2.1.6.Final for compatibility with Java 17 and OpenSSL 3. 
> This fix is planned to be included in the upcoming Hadoop 3.4.2 release.
> Currently, Flink sets the following in {{flink-filesystems/pom.xml}}:
> {code:java}
> <fs.hadoopshaded.version>3.3.4</fs.hadoopshaded.version> {code}
> which means modules like {{flink-azure-fs-hadoop.jar}} transitively include 
> {{wildfly-openssl:1.0.7.Final}} via {{{}hadoop-azure:3.3.4{}}}. This version 
> is not compatible with OpenSSL 3 and causes runtime issues on modern 
> platforms.
> h3. *Impact and Scope*
> This issue originates in Apache Hadoop's {{hadoop-azure}} module, which 
> transitively includes an outdated version of {{{}wildfly-openssl{}}}. As a 
> result, all Flink modules depending on this (e.g., 
> {{{}flink-azure-fs-hadoop{}}}) are affected.
> Furthermore, other Flink filesystem connectors that rely on Hadoop (directly 
> or via {{{}flink-shaded-hadoop{}}}) may also benefit from this upgrade:
>  * {{flink-azure-fs-hadoop}}
>  * {{flink-gs-fs-hadoop}}
>  * {{flink-oss-fs-hadoop}}
>  * {{flink-s3-fs-hadoop}}
> This change is particularly relevant for users running Flink on:
>  * {*}Java 17{*}, where {{X509V1CertImpl}} was removed from the JDK
>  * *OpenSSL 3.x systems* (e.g., RHEL 9), where older {{wildfly-openssl}} 
> versions fail to load
> h3. *Motivation*
> Upgrading to {{hadoop-azure:3.4.2}} will:
>  * Ensure compatibility with Java 17+ and OpenSSL 3
>  * Resolve {{ClassNotFoundException: 
> com.sun.security.cert.internal.x509.X509V1CertImpl}} errors on OpenSSL 
> 1.1-based systems (e.g., RHEL 8.10)
>  * Align with Hadoop upstream fixes
>  * Avoid performance-impacting workarounds like forcing 
> {{fs.azure.ssl.channel.mode=Default_JSSE}}
>  * Even when JSSE fallback avoids the crash, {*}it is not ideal for 
> performance and stability{*}.
> Using native OpenSSL via JNI (as intended by {{{}wildfly-openssl{}}}) is 
> preferred in high-throughput or secure production environments.
> h3. *Proposed Plan*
> Once Hadoop 3.4.2 is officially released:
>  # Update {{fs.hadoopshaded.version}} to {{3.4.2}} in 
> {{flink-filesystems/pom.xml}} 
>  # Verify and update NOTICE/LICENSE files as required
>  # Rebuild {{flink-azure-fs-hadoop}} to confirm correct shading of the 
> updated dependencies
>  # Ensure that native SSL initialization works in both OpenSSL 1.1 and 3 
> environments
>  # Optionally, update test coverage for ABFS + SSL
> This ticket serves to track the upgrade preparation and corresponding work 
> once the upstream Hadoop release is available.
> h3. *Environment Affected*
>  * Flink 1.19.0 - 2.1.0
>  * Java 17 (OracleJDK, OpenJDK, Amazon Corretto)
>  * RHEL 8.10 (OpenSSL 1.1.1) → native loads, causes error
> {code:java}
> [ERROR] org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Fatal error 
> occurred in the cluster entrypoint.
> java.util.concurrent.CompletionException: 
> java.lang.RuntimeException: java.lang.IllegalStateException: 
> javax.security.cert.CertificateException: Could not find class: 
> java.lang.ClassNotFoundException: 
> com/sun/security/cert/internal/x509/X509V1CertImpl{code}
>  * RHEL 9.3 (OpenSSL 3.x) → native fails, JSSE fallback
> {code:java}
> [DEBUG] org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory   [] - 
> Failed to load OpenSSL. Falling back to the JSSE{code}
>  * ABFS with HA enabled ({{{}abfss://{}}})
> h3. *Workarounds Today*
>  * Set {{fs.azure.ssl.channel.mode:Default_JSSE}} in {{config.yaml}} to 
> disable native OpenSSL
>  * Avoid OpenSSL 1.1 platforms
>  * Remove the {{wildfly-openssl}} JAR from the opt plugin (not ideal)
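
A minimal diagnostic sketch, not part of Flink or Hadoop: it checks whether the
running JVM can still load the class named in the `ClassNotFoundException`
quoted in the issue description. This may help tell the Java 17 failure mode
apart from the OpenSSL 3 fallback. The names `LegacyCertClassCheck` and
`isLoadable` are hypothetical and chosen only for illustration.

```java
// Diagnostic sketch (assumption: run with the same JVM that launches Flink).
final class LegacyCertClassCheck {

    // Class name taken from the ClassNotFoundException in the issue description.
    static final String LEGACY_CERT_CLASS =
            "com.sun.security.cert.internal.x509.X509V1CertImpl";

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, LegacyCertClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing or cannot be linked on this JVM.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println(LEGACY_CERT_CLASS + " loadable: " + isLoadable(LEGACY_CERT_CLASS));
    }
}
```

It can be run directly with `java LegacyCertClassCheck.java`. If the class is
reported as not loadable, the error shown in the issue is likely to occur
whenever the bundled `wildfly-openssl` takes the native OpenSSL path, which is
the case the `Default_JSSE` workaround avoids.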



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
