[
https://issues.apache.org/jira/browse/SPARK-47172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18050862#comment-18050862
]
Gabor Roczei commented on SPARK-47172:
--------------------------------------
According to this jira, the fix is included in the following versions: 3.5.2,
3.4.4, 4.0.0. I downloaded the following Spark releases from the official archive
([https://archive.apache.org/dist/spark|https://archive.apache.org/dist/spark/]):
* spark-3.4.4-bin-hadoop3
* spark-3.5.2-bin-hadoop3
* spark-3.5.7-bin-hadoop3
* spark-4.1.1-bin-hadoop3
I tested the "AES/GCM/NoPadding" cipher in both YARN and local-cluster modes,
but it did not work with any of these releases. My repro steps:
I updated the conf/log4j2.properties file to use TRACE level logging:
{code:java}
rootLogger.level = trace
logger.repl.level = trace{code}
Test 1: authEngineVersion=2 (Fails)
{code:java}
bin/spark-shell \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=secret \
  --conf spark.network.crypto.enabled=true \
  --conf spark.network.crypto.cipher="AES/GCM/NoPadding" \
  --conf spark.network.crypto.authEngineVersion=2 \
  --master "local-cluster[2,2,3000]"{code}
Log file: [^spark-4.1.1-AES-GCM-NoPadding-authEngineVersion-2-bad.txt]
Test 2: authEngineVersion=1 (Fails)
{code:java}
bin/spark-shell \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=secret \
  --conf spark.network.crypto.enabled=true \
  --conf spark.network.crypto.cipher="AES/GCM/NoPadding" \
  --conf spark.network.crypto.authEngineVersion=1 \
  --master "local-cluster[2,2,3000]"{code}
Log file: [^spark-4.1.1-AES-GCM-NoPadding-authEngineVersion-1-bad.txt]
Test 3: Default cipher (Works)
The same setup works on all of these releases when the default cipher is used:
{code:java}
bin/spark-shell \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=secret \
  --conf spark.network.crypto.enabled=true \
  --master "local-cluster[2,2,3000]"{code}
Log file: [^spark-4.1.1-AES-CTR-NoPadding-good.txt]
The logs show that with "AES/GCM/NoPadding", no further messages are processed
after the RpcEndpointVerifier.CheckExistence message. I see a similar failure
when running Spark with the YARN scheduler.
[~sweisdb],
Do you have any insight into what I might be doing wrong? Thanks a lot!
> Upgrade Transport block cipher mode to GCM
> ------------------------------------------
>
> Key: SPARK-47172
> URL: https://issues.apache.org/jira/browse/SPARK-47172
> Project: Spark
> Issue Type: Improvement
> Components: Security
> Affects Versions: 3.4.2, 3.5.0
> Reporter: Steve Weis
> Assignee: Steve Weis
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.5.2, 3.4.4, 4.0.0
>
> Attachments: spark-4.1.1-AES-CTR-NoPadding-good.txt,
> spark-4.1.1-AES-GCM-NoPadding-authEngineVersion-1-bad.txt,
> spark-4.1.1-AES-GCM-NoPadding-authEngineVersion-2-bad.txt
>
>
> The cipher transformation currently used for encrypting RPC calls is an
> unauthenticated mode (AES/CTR/NoPadding). This needs to be upgraded to an
> authenticated mode (AES/GCM/NoPadding) to prevent ciphertext from being
> modified in transit.
> The relevant line is here:
> [https://github.com/apache/spark/blob/a939a7d0fd9c6b23c879cbee05275c6fbc939e38/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java#L220]
> GCM is relatively more computationally expensive than CTR and adds a 16-byte
> block of authentication tag data to each payload.
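
For readers unfamiliar with the two modes named in the description, the difference can be illustrated with a minimal standalone sketch that uses only the JDK's javax.crypto API. It does not use Spark's transport code and is not a reproduction of the failure reported above. The class and method names (CipherModeDemo, overhead, detectsTampering) are hypothetical, and the 128-bit tag and 12-byte nonce are the usual GCM choices, assumed here for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class: plain JCE only, unrelated to Spark's own implementation.
public class CipherModeDemo {
    private static final SecureRandom RNG = new SecureRandom();

    static byte[] randomBytes(int n) {
        byte[] b = new byte[n];
        RNG.nextBytes(b);
        return b;
    }

    // Assumption: GCM uses a 12-byte nonce and a 128-bit tag; CTR uses a 16-byte IV.
    static AlgorithmParameterSpec paramsFor(String transformation) {
        return transformation.contains("/GCM/")
                ? new GCMParameterSpec(128, randomBytes(12))
                : new IvParameterSpec(randomBytes(16));
    }

    // Ciphertext length minus plaintext length for the given transformation.
    static int overhead(String transformation, int plaintextLen)
            throws GeneralSecurityException {
        SecretKeySpec key = new SecretKeySpec(randomBytes(16), "AES");
        Cipher enc = Cipher.getInstance(transformation);
        enc.init(Cipher.ENCRYPT_MODE, key, paramsFor(transformation));
        return enc.doFinal(new byte[plaintextLen]).length - plaintextLen;
    }

    // True if decryption rejects a ciphertext in which one bit was flipped.
    static boolean detectsTampering(String transformation)
            throws GeneralSecurityException {
        SecretKeySpec key = new SecretKeySpec(randomBytes(16), "AES");
        AlgorithmParameterSpec params = paramsFor(transformation);

        Cipher enc = Cipher.getInstance(transformation);
        enc.init(Cipher.ENCRYPT_MODE, key, params);
        byte[] ct = enc.doFinal("hello spark rpc".getBytes(StandardCharsets.UTF_8));

        ct[0] ^= 0x01; // simulate modification in transit

        Cipher dec = Cipher.getInstance(transformation);
        dec.init(Cipher.DECRYPT_MODE, key, params);
        try {
            dec.doFinal(ct);
            return false; // unauthenticated mode: silently yields altered plaintext
        } catch (AEADBadTagException e) {
            return true;  // authenticated mode: tag check fails
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        for (String t : new String[] {"AES/CTR/NoPadding", "AES/GCM/NoPadding"}) {
            System.out.println(t + ": overhead=" + overhead(t, 100)
                    + " bytes, detects tampering=" + detectsTampering(t));
        }
    }
}
```

Run with a stock JDK, this should report an overhead of 0 bytes and no tamper detection for AES/CTR/NoPadding, and an overhead of 16 bytes with tamper detection for AES/GCM/NoPadding, which matches the 16-byte tag mentioned in the description.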