Hello,
I'm trying to load an Avro file and write it out as Parquet. I would
like to have enough partitions to parallelize the work properly. When I do
a simple load and save, I get only 1 partition out. I thought I would be
able to use repartition like the following:
val avroFile =
"user@spark.apache.org<mailto:user@spark.apache.org>"
Subject: DataFrame repartition not repartitioning