How do we process/scale variable size batches in Apache Spark Streaming

Rachana Srivastava Tue, 23 Aug 2016 15:21:33 -0700

I am running a Spark Streaming job that receives a batch of data every n 
seconds. I use repartition to scale the application. Because the number of 
partitions is fixed, we end up with many small output files when a batch is 
very small. Is there any way to change the partitioning logic based on the 
input batch size, so that we avoid producing lots of small files?
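
To make the question concrete, here is a minimal sketch of the kind of logic 
I have in mind: choose the partition count per batch from the record count 
instead of using a fixed number. The helper names (choose_num_partitions, 
write_batch), the target of 100,000 records per partition, the cap of 200 
partitions, and the output path are all assumptions for illustration only. 
It relies on the standard RDD methods count, getNumPartitions, coalesce, 
repartition and saveAsTextFile.

```python
import math
import time


def choose_num_partitions(record_count,
                          target_records_per_partition=100_000,
                          max_partitions=200):
    """Return a partition count proportional to the batch size.

    Returns 0 for an empty batch, otherwise a value in [1, max_partitions].
    The defaults are illustrative assumptions, not tuned values.
    """
    if record_count <= 0:
        return 0
    needed = math.ceil(record_count / target_records_per_partition)
    return max(1, min(needed, max_partitions))


def write_batch(rdd, output_prefix="/tmp/stream-output"):
    """Hypothetical per-batch writer, meant to be passed to foreachRDD."""
    record_count = rdd.count()  # runs a job; cache the RDD if it is expensive
    num_partitions = choose_num_partitions(record_count)
    if num_partitions == 0:
        return  # skip empty batches so no empty files are written
    if num_partitions < rdd.getNumPartitions():
        sized = rdd.coalesce(num_partitions)      # shrink without a full shuffle
    else:
        sized = rdd.repartition(num_partitions)   # grow, with a shuffle
    # One directory per batch, since saveAsTextFile fails on an existing path.
    sized.saveAsTextFile(f"{output_prefix}/batch-{int(time.time() * 1000)}")


# Usage, assuming `stream` is an existing DStream:
#   stream.foreachRDD(write_batch)
```

With this, a small batch would be written as a single file while a large 
batch still spreads across many partitions. The extra count() adds one pass 
over each batch, which may or may not be acceptable depending on the source.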
