GitHub user lonelytrooper opened a pull request: https://github.com/apache/spark/pull/19274
[SPARK-22056] Add subconcurrency for KafkaRDDPartition

JIRA Issue: https://issues.apache.org/jira/browse/SPARK-22056

When Spark Streaming consumes data from Kafka using the direct approach, Kafka partitions and KafkaRDDPartitions in Spark Streaming currently map one-to-one (a bijection). To increase Spark Streaming's computing capacity, we usually increase the number of partitions in Kafka, but adding too many can cause problems in Kafka, such as with leader election. So we introduce a new mechanism that changes the bijection to a one-to-many mapping, controlled by a new parameter named "topic.partition.subconcurrency". This mechanism divides one KafkaRDDPartition into several according to the parameter, which lets Spark Streaming use computing resources more efficiently and avoids the problems caused by increasing the number of Kafka partitions.
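To illustrate the one-to-many idea described above, here is a minimal sketch of how the offset range of a single Kafka partition could be divided into several contiguous sub-ranges, each of which could back its own RDD partition. This is not the code from the patch: `OffsetRange`, `split_offset_range`, and the even-split policy are hypothetical names and assumptions chosen for illustration, and the `subconcurrency` argument only stands in for the value of "topic.partition.subconcurrency".

```python
from typing import List, NamedTuple


class OffsetRange(NamedTuple):
    """Hypothetical stand-in for the offsets one RDD partition reads."""
    topic: str
    partition: int
    from_offset: int   # inclusive
    until_offset: int  # exclusive


def split_offset_range(r: OffsetRange, subconcurrency: int) -> List[OffsetRange]:
    """Split one range into up to `subconcurrency` contiguous sub-ranges.

    Assumption: an even split by offset count, with the remainder
    spread over the first sub-ranges.
    """
    if subconcurrency < 1:
        raise ValueError("subconcurrency must be >= 1")

    total = r.until_offset - r.from_offset
    # Never create more sub-ranges than there are messages,
    # but always return at least one (possibly empty) range.
    parts = max(1, min(subconcurrency, total))
    base, extra = divmod(total, parts)

    result = []
    start = r.from_offset
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        result.append(OffsetRange(r.topic, r.partition, start, start + size))
        start += size
    return result


if __name__ == "__main__":
    batch = OffsetRange("events", 0, 100, 110)
    for sub in split_offset_range(batch, 3):
        print(sub)
```

Because the sub-ranges are contiguous and non-overlapping, reading them in order still covers every offset of the original range exactly once. Any ordering guarantee across the resulting tasks within a Kafka partition is a separate question that this sketch does not address.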
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lonelytrooper/spark add_partition_concurrency

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19274

----
commit a89663411e568f265103f0b695168d4db68a2b36
Author: bjyfhanfei <yfhan...@jd.com>
Date:   2017-09-04T09:00:25Z

    add partition subconcurrency

commit d1132195d6b2087be4f18ad25614836c46512fe7
Author: bjyfhanfei <yfhan...@jd.com>
Date:   2017-09-19T06:12:29Z

    add topic.partition.subconcurrency

----