Hi,

I am not sure my problem is related to Spark, but perhaps someone else has seen the same error. When I try to write files that need multipart upload to S3 from a job on EMR, I always get this error:
com.amazonaws.services.s3.model.AmazonS3Exception: The Content-MD5 you specified did not match what we received.

If I disable multipart upload by setting fs.s3n.multipart.uploads.enabled to false (or output smaller files that don't require multipart upload), then everything works fine. I've seen an old thread on the mailing list where someone had the same error, but in my case I don't have any other errors on the worker nodes.

I am using Spark 1.2.1 and Hadoop 2.4.0.

Thanks,
Eugen
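
P.S. For anyone unfamiliar with the header named in the error: as far as I understand, Content-MD5 is the base64 encoding of the raw MD5 digest of the request body (RFC 1864), and S3 recomputes it on receipt and rejects the request on a mismatch. Below is a minimal Python sketch of that computation, only to illustrate what is being compared. The helper names content_md5 and part_checksums and the part_size parameter are my own for illustration, and this is not how the EMR S3 client actually implements it.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 header value for a request body:
    the base64 encoding of the raw 16-byte MD5 digest (RFC 1864)."""
    digest = hashlib.md5(body).digest()  # raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")


def part_checksums(data: bytes, part_size: int) -> list:
    """Split data into fixed-size parts (part_size is a hypothetical
    value chosen by the caller) and compute the Content-MD5 value
    that each part would be sent with."""
    return [
        content_md5(data[i:i + part_size])
        for i in range(0, len(data), part_size)
    ]
```

If the bytes of a part change between the moment its digest is computed and the moment it is sent, the value computed on the receiving side will differ, which is the kind of mismatch the exception reports.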