Hello all,

We have a Hadoop cluster (running on YARN) that uses S3 as its filesystem, with S3Guard enabled. We are on Hadoop 3.2.1 with Spark 2.4.5.
When I try to save a DataFrame in Parquet format, the job fails with the following exception:

  java.lang.ClassNotFoundException: com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol

The relevant Spark configuration is:

  "hadoop.mapreduce.outputcommitter.factory.scheme.s3a": "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
  "fs.s3a.committer.name": "magic",
  "fs.s3a.committer.magic.enabled": true,
  "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"

While the Spark streaming job fails with the exception above, Apache Beam succeeds in writing Parquet files. What might be the problem? (A rough sketch of the failing write is at the end of this mail.)

Thanks in advance

--
"Talkers aren’t good doers. Rest assured that we’re going there to use our hands, not our tongues." W. Shakespeare
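P.S. In case it helps, here is a minimal sketch of the kind of write that fails. It is not our exact job: the bucket, paths, and sample data are made up, and it assumes the committer settings above are passed through the SparkSession config with the spark.hadoop. prefix (in our setup they may equally come from core-site.xml).

  import org.apache.spark.sql.SparkSession

  object ParquetWriteRepro {
    def main(args: Array[String]): Unit = {
      // Committer settings listed above, prefixed with "spark.hadoop." so Spark
      // forwards them to the Hadoop/S3A configuration (assumption: they could
      // also be set in core-site.xml or spark-defaults.conf instead).
      val spark = SparkSession.builder()
        .appName("s3a-parquet-write-repro")
        .config("spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a",
                "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory")
        .config("spark.hadoop.fs.s3a.committer.name", "magic")
        .config("spark.hadoop.fs.s3a.committer.magic.enabled", "true")
        .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
        .getOrCreate()

      import spark.implicits._
      val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

      // In our environment, a plain Parquet save like this is what throws
      // java.lang.ClassNotFoundException: com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
      // (the bucket name below is illustrative).
      df.write.mode("overwrite").parquet("s3a://my-bucket/tmp/parquet-repro/")
    }
  }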