sc.textFile takes a minimum number of partitions to use. Is there a way to get sc.newAPIHadoopFile to do the same?
I know I can call repartition(), but that incurs a shuffle. I'm wondering if there's a way to tell the underlying InputFormat (AvroParquet, in my case) how many partitions to use at the outset. What I'd really prefer is for the partitions to be defined automatically based on the number of blocks.
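For context on what "based on the number of blocks" would mean, here is a rough sketch of the split-size rule that, as far as I know, Hadoop's generic FileInputFormat applies: the split size is the block size clamped between a configured minimum and maximum (the `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize` settings). The class and method names below are hypothetical and exist only for illustration, and whether the AvroParquet input format follows the same rule is an assumption I have not verified.

```java
// Illustrative sketch only: SplitEstimate and its methods are hypothetical names,
// not part of Spark, Hadoop, or Parquet.
public class SplitEstimate {

    // Block size clamped between the configured minimum and maximum split size.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Approximate number of splits for one file (ceiling division).
    // Hadoop also lets the last split run slightly over; that slop is ignored here.
    static long estimateSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long fileSize = 1024 * mib;   // assumed example: a 1 GiB file
        long blockSize = 128 * mib;   // assumed example: 128 MiB blocks

        // With no effective min/max, the split size equals the block size.
        long defaultSplit = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        System.out.println(estimateSplits(fileSize, defaultSplit)); // prints 8

        // Capping the maximum split size at 64 MiB doubles the split count.
        long cappedSplit = computeSplitSize(blockSize, 1L, 64 * mib);
        System.out.println(estimateSplits(fileSize, cappedSplit)); // prints 16
    }
}
```

If the input format honors these settings, lowering the maximum split size on the Hadoop Configuration passed to sc.newAPIHadoopFile would raise the partition count without a shuffle, which is the behavior I'm after.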