Re: [EXTERNAL] Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

2024-04-03 Thread Oxlade, Dan
I don't really understand how Iceberg and the Hadoop libraries can coexist in a deployment. The latest Spark (3.5.1) base image contains hadoop-client*-3.3.4.jar. The AWS v2 SDK is only supported in hadoop*-3.4.0.jar and onward. Iceberg AWS integration states AWS v2 SDK is

Re: [EXTERNAL] Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

2024-04-03 Thread Oxlade, Dan
Swapping out the iceberg-aws-bundle for the very latest AWS-provided SDK ('software.amazon.awssdk:bundle:2.25.23') produces an incompatibility from a slightly different code path: java.lang.NoSuchMethodError: 'void

Participate in the ASF 25th Anniversary Campaign

2024-04-03 Thread Brian Proffitt
Hi everyone, As part of The ASF’s 25th anniversary campaign[1], we will be celebrating projects and communities in multiple ways. We invite all projects and contributors to participate in the following ways: * Individuals - submit your first contribution:

Re: [EXTERNAL] Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

2024-04-03 Thread Oxlade, Dan
[sorry; replying all this time] With hadoop-*-3.3.6 in place of the 3.4.0 below, I get java.lang.NoClassDefFoundError: com/amazonaws/AmazonClientException. I think that the iceberg-aws-bundle version below supplies the v2 SDK. Dan From: Aaron Grubb Sent: 03

Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

2024-04-03 Thread Aaron Grubb
Downgrade to hadoop-*:3.3.x. Hadoop 3.4.x is based on the AWS SDK v2 and should probably be considered a breaking change for tools that build on < 3.4.0 while using AWS. From: Oxlade, Dan Sent: Wednesday, April 3, 2024 2:41:11 PM To: user@spark.apache.org Subject:
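
The advice above can be sketched as a small helper that keeps hadoop-aws pinned to the Hadoop version already on the Spark classpath and reports which AWS SDK generation that Hadoop line expects. This is only an illustration: the function names are hypothetical, the Iceberg version (1.5.0) and the Spark/Scala suffix (3.5_2.12) are assumptions, and the resulting combination has not been verified against the poster's environment.

```python
def aws_sdk_generation(hadoop_version: str) -> str:
    """Return the AWS SDK generation used by S3A for this Hadoop version.

    Per the thread: Hadoop < 3.4.0 builds on AWS SDK v1, 3.4.0+ on v2.
    """
    parts = hadoop_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return "v2" if (major, minor) >= (3, 4) else "v1"


def spark_packages(hadoop_version: str, iceberg_version: str,
                   spark_minor: str = "3.5", scala: str = "2.12") -> str:
    """Build a comma-separated coordinate list for `spark-submit --packages`.

    hadoop-aws is pinned to the same version as the hadoop-client jars
    shipped with Spark, so the two cannot drift apart.
    """
    coords = [
        f"org.apache.hadoop:hadoop-aws:{hadoop_version}",
        f"org.apache.iceberg:iceberg-spark-runtime-{spark_minor}_{scala}:{iceberg_version}",
        f"org.apache.iceberg:iceberg-aws-bundle:{iceberg_version}",
    ]
    return ",".join(coords)


if __name__ == "__main__":
    # Assumed versions: Hadoop 3.3.4 (from the Spark 3.5.1 image), Iceberg 1.5.0.
    print(aws_sdk_generation("3.3.4"))
    print(spark_packages("3.3.4", "1.5.0"))
```

The string returned by `spark_packages` is meant to be passed to `--packages` or to `spark.jars.packages`; Maven resolution should then pull in whatever AWS SDK bundle that hadoop-aws release declares as a dependency.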

[Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

2024-04-03 Thread Oxlade, Dan
Hi all, I've struggled with this for quite some time. My requirement is to read a Parquet file from S3 into a DataFrame and then append it to an existing Iceberg table. In order to read the Parquet file I need the hadoop-aws dependency for s3a://. In order to write to Iceberg I need the Iceberg dependency.
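
For reference, the requirement described here might look roughly like the following in PySpark. This is a minimal sketch, not the poster's code: the function name, bucket path, and table name are hypothetical, and it assumes a SparkSession that already has an Iceberg catalog configured and compatible hadoop-aws and Iceberg jars on the classpath.

```python
def append_parquet_to_iceberg(spark, source_path: str, table_name: str) -> None:
    """Read a Parquet file through s3a:// and append it to an existing Iceberg table.

    `spark` is expected to be a pyspark.sql.SparkSession (Spark 3.1+ for writeTo).
    """
    # Reading via s3a:// is what requires hadoop-aws on the classpath.
    df = spark.read.parquet(source_path)
    # DataFrameWriterV2.append() adds rows to an existing table and fails
    # if the table does not exist.
    df.writeTo(table_name).append()


# Example call (hypothetical path and table; needs a configured SparkSession):
# append_parquet_to_iceberg(spark, "s3a://example-bucket/input/data.parquet",
#                           "my_catalog.db.events")
```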