I have a Spark Streaming application which reads data from kafka and save
the the transformation result to hdfs.
My original partition number of kafka topic is 8, and repartition the data
to 100 to increase the parallelism of spark job.
Now I am wondering if I increase the kafka partition number to 100 instead
of setting repartition to 100, will the performance be enhanced? (I know
repartition action cost a lot cpu resource)
If I set the kafka partition number to 100, does it have any negative
efficiency?
I just have one production environment so it's not convenient for me to do
the test....

Thanks!

Regard,
Junfeng Chen

Reply via email to