The Avro files were 500-600 KB each, and that folder contained around 1200
files. The total folder size was around 600 MB. I will try repartition. Thank you.
On Oct 28, 2016 at 2:24 AM, mich...@databricks.com wrote:
How big are your Avro files? We collapse many small files into a single
partition to eliminate scheduler overhead. If you need explicit
parallelism, you can also repartition.
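
To make the small-file collapsing concrete, here is a rough sketch in plain Python (no Spark needed) of how the packing is understood to work in Spark 2.0. It assumes the default values of 128 MB for spark.sql.files.maxPartitionBytes and 4 MB for spark.sql.files.openCostInBytes, and it ignores splitting of large files. The function name estimate_partitions and the core counts are hypothetical, chosen only for illustration; treat the result as an estimate, not as what any particular cluster will do.

```python
# Rough model of how small files get packed into read partitions.
# Assumed defaults (not stated in this thread):
#   spark.sql.files.maxPartitionBytes = 128 MB
#   spark.sql.files.openCostInBytes   = 4 MB
MB = 1024 * 1024


def estimate_partitions(file_sizes, default_parallelism,
                        max_partition_bytes=128 * MB,
                        open_cost_bytes=4 * MB):
    """Estimate how many read partitions a set of small files yields."""
    # Every file is charged its length plus a fixed "open cost".
    total_bytes = sum(size + open_cost_bytes for size in file_sizes)
    bytes_per_core = total_bytes // default_parallelism
    max_split_bytes = min(max_partition_bytes,
                          max(open_cost_bytes, bytes_per_core))

    partitions = 0
    current_size = 0
    # Largest files first; start a new partition when the next file
    # would push the current one past the split size.
    for size in sorted(file_sizes, reverse=True):
        if current_size > 0 and current_size + size > max_split_bytes:
            partitions += 1
            current_size = 0
        current_size += size + open_cost_bytes
    if current_size > 0:
        partitions += 1
    return partitions


# Numbers from this thread: about 1200 files of roughly 500 KB each.
files = [500 * 1024] * 1200

# Hypothetical core counts, for illustration only.
for cores in (16, 64):
    print(cores, "cores ->", estimate_partitions(files, cores), "partitions")
```

If the estimate comes out lower than the number of cores available, some cores will have nothing to read. In that case the explicit route is to call repartition(n) on the DataFrame after the read, or to lower the two settings above.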
On Thu, Oct 27, 2016 at 5:19 AM, Prithish wrote:
I am trying to read a bunch of Avro files from an S3 folder using Spark 2.0.
No matter how many executors I use or what configuration changes I make,
the cluster doesn't seem to use all the executors. I am using the
com.databricks.spark.avro library from Databricks to read the Avro files.
However, if I