RE: Understanding Spark S3 Read Performance

2023-05-16 Thread info
: Understanding Spark S3 Read Performance Hi, I'm trying to set up a Spark pipeline which reads data from S3 and writes it into Google Big Query. Environment Details: --- Java 8 AWS EMR-6.10.0 Spark v3.3.1 2 m5.xlarge executor nodes S3 Directory structure: --- bucket-name: |---folder1

Understanding Spark S3 Read Performance

2023-05-16 Thread Shashank Rao
Hi,

I'm trying to set up a Spark pipeline which reads data from S3 and writes it into Google Big Query.

Environment Details:
---
Java 8
AWS EMR-6.10.0
Spark v3.3.1
2 m5.xlarge executor nodes

S3 Directory structure:
---
bucket-name: |---folder1: |---folder2: