How to increase the parallelism of a Spark Streaming application?

JF Chen Tue, 06 Nov 2018 23:28:08 -0800

I have a Spark Streaming application that reads data from Kafka and saves
the transformed results to HDFS.
The Kafka topic originally has 8 partitions, and I repartition the data
to 100 partitions to increase the parallelism of the Spark job.
Now I am wondering: if I increase the number of Kafka partitions to 100
instead of repartitioning to 100, will performance improve? (I know the
repartition operation costs a lot of CPU.)
If I set the number of Kafka partitions to 100, will that have any negative
effect on efficiency?
I have only one production environment, so it is not convenient for me to
test this.
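
To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python of how many tasks each option gives per micro-batch stage. It assumes the direct (receiver-less) Kafka integration, where each Kafka partition maps to one Spark partition, and that the transformation itself needs no shuffle. TOTAL_CORES is a hypothetical value, not a figure from my cluster, and the model ignores shuffle cost, data skew, and any broker-side overhead of having more partitions.

```python
import math

# Hypothetical number of executor cores available to the job (an assumption).
TOTAL_CORES = 50


def stage_profile(num_partitions, total_cores=TOTAL_CORES):
    """Return (tasks, busy_cores, waves) for one stage of a micro-batch.

    Spark schedules one task per partition, so a stage cannot keep more
    cores busy than it has partitions.
    """
    tasks = num_partitions
    busy_cores = min(tasks, total_cores)
    waves = math.ceil(tasks / total_cores)
    return tasks, busy_cores, waves


# Option A: 8 Kafka partitions, then repartition(100).
# The read stage has only 8 tasks; the post-shuffle stage has 100.
option_a = {"read": stage_profile(8), "process": stage_profile(100)}

# Option B: 100 Kafka partitions, no repartition.
# Reading and processing share one stage of 100 tasks, with no shuffle.
option_b = {"read_and_process": stage_profile(100)}

for name, stages in (("A", option_a), ("B", option_b)):
    for stage, (tasks, busy, waves) in stages.items():
        print(f"option {name} {stage}: {tasks} tasks, "
              f"{busy} busy cores, {waves} wave(s)")
```

Under these assumptions, option A reads from Kafka with at most 8 concurrent tasks and pays for a shuffle, while option B reads with up to 100 tasks and avoids the shuffle. Whether that is faster in practice is exactly what I cannot measure.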


Thanks!

Regards,
Junfeng Chen
