I am trying to read a bunch of Avro files from an S3 folder using Spark 2.0.
No matter how many executors I use or what configuration changes I make,
the cluster doesn't seem to use all of the executors. I am using the
com.databricks.spark.avro library from Databricks to read the Avro files.
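
For reference, the read is roughly the following (the bucket/path and the
package version are just placeholders for my actual setup):

    // Spark 2.0, spark-avro on the classpath,
    // e.g. --packages com.databricks:spark-avro_2.11:3.2.0
    val df = spark.read
      .format("com.databricks.spark.avro")
      .load("s3a://my-bucket/path/to/avro-folder/")   // placeholder path

    df.count()   // only a few executors ever pick up tasks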

However, if I try the same on CSV files (same S3 folder, same configuration
and cluster), it does use all executors.
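
The CSV read is essentially the same call, just with the built-in CSV source
(again, placeholder path):

    val csvDf = spark.read
      .csv("s3a://my-bucket/path/to/csv-folder/")   // placeholder path

    csvDf.count()   // this one spreads work across all executors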

Is there something I need to do to enable parallelism when reading with the
Databricks Avro library?

Thanks for your help.
