sc.textFile takes a minimum number of partitions (minPartitions) to use.

Is there a way to get sc.newAPIHadoopFile to do the same?

I know I can call repartition() and pay for a shuffle. I'm wondering if
there's a way to tell the underlying InputFormat (AvroParquet, in my case)
how many partitions to use at the outset.

What I'd really prefer is to get the partitions automatically defined based
on the number of blocks.
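
For context, here is a minimal sketch of the split-size rule used by Hadoop's
new-API FileInputFormat, max(minSize, min(maxSize, blockSize)), which is what
ties the partition count to the block count by default. The function names
below are hypothetical, and the estimate ignores the slop Hadoop allows on
the last split. It is also an assumption that the Parquet input format
follows the same rule; it may instead split on row-group boundaries.

```python
# Hypothetical helpers; they only model the arithmetic, they do not call Hadoop.
MB = 1024 * 1024
LONG_MAX = 2**63 - 1


def compute_split_size(block_size, min_size=1, max_size=LONG_MAX):
    # Rule from FileInputFormat: max(minSize, min(maxSize, blockSize)).
    return max(min_size, min(max_size, block_size))


def estimate_splits(file_size, block_size, min_size=1, max_size=LONG_MAX):
    # Ceiling division; a rough estimate that ignores Hadoop's last-split slop.
    split_size = compute_split_size(block_size, min_size, max_size)
    return (file_size + split_size - 1) // split_size


# With default min/max sizes, a file gets about one split per block.
per_block = estimate_splits(1024 * MB, 128 * MB)

# Lowering the max split size below the block size yields more splits.
smaller = estimate_splits(1024 * MB, 128 * MB, max_size=32 * MB)

print(per_block, smaller)
```

If the input format honors them, the min and max sizes correspond to the
Hadoop settings mapreduce.input.fileinputformat.split.minsize and
mapreduce.input.fileinputformat.split.maxsize, which can be set on the
Configuration passed to sc.newAPIHadoopFile.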
