I am running a Spark Streaming job that receives a batch of data every n
seconds. I use repartition to scale the application. Because the number of
partitions passed to repartition is fixed, we get lots of small files when a
batch is very small. Is there any way to change the partitioning logic based
on the input batch size, so that we avoid producing lots of small files?
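
To make the question concrete, here is a minimal sketch of the kind of batch-size-dependent logic I have in mind. The helper name choose_num_partitions, the target of 100,000 records per partition and the cap of 200 partitions are all assumptions for illustration, not values from our job.

```python
import math


def choose_num_partitions(record_count,
                          target_records_per_partition=100_000,
                          max_partitions=200):
    """Pick a partition count proportional to the batch size.

    Hypothetical helper: the defaults are illustrative assumptions
    and would need tuning for a real workload.
    """
    if record_count <= 0:
        # Empty batch: keep a single partition so at most one file is written.
        return 1
    wanted = math.ceil(record_count / target_records_per_partition)
    # Clamp to the range [1, max_partitions].
    return max(1, min(wanted, max_partitions))


# Possible use with a DStream (not executed here; `stream` and
# `output_path` are placeholders):
#
# def write_batch(rdd):
#     n = choose_num_partitions(rdd.count())  # count() runs an extra job
#     rdd.repartition(n).saveAsTextFile(output_path)
#
# stream.foreachRDD(write_batch)
```

The idea would be that a small batch collapses to one or a few partitions, and so to one or a few output files, while a large batch still spreads out up to the cap. Counting the batch first costs an extra pass over the data, so I am not sure this is the right approach.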
