GitHub user lonelytrooper opened a pull request:

    https://github.com/apache/spark/pull/19274

    [SPARK-22056] Add subconcurrency for KafkaRDDPartition

    JIRA Issue: https://issues.apache.org/jira/browse/SPARK-22056
    
    When Spark Streaming consumes data from Kafka using the direct approach, 
Kafka partitions and KafkaRDDPartitions in Spark Streaming currently map 
one-to-one. To increase the computing capacity of Spark Streaming, we usually 
have to increase the number of partitions in Kafka, but adding too many 
partitions can cause problems in Kafka, such as with leader election. 
    So we introduce a new mechanism that changes the one-to-one mapping to 
one-to-many, controlled by a new parameter named 
"topic.partition.subconcurrency". This mechanism divides one KafkaRDDPartition 
into several according to the parameter, which lets Spark Streaming use 
computing resources more efficiently and avoids the problems caused by 
increasing the number of Kafka partitions.
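
As a rough illustration of the one-to-many idea, the sketch below splits the 
offset range of a single Kafka partition into several contiguous sub-ranges, 
each of which could back its own RDD partition. It is not the code from this 
patch: the function name `split_offset_range` and its arguments are 
hypothetical, and it assumes the range is half-open (`from_offset` inclusive, 
`until_offset` exclusive).

```python
from typing import List, Tuple


def split_offset_range(
    from_offset: int, until_offset: int, subconcurrency: int
) -> List[Tuple[int, int]]:
    """Split [from_offset, until_offset) into contiguous sub-ranges.

    Hypothetical helper: `subconcurrency` plays the role of the
    "topic.partition.subconcurrency" parameter described above.
    """
    if subconcurrency < 1:
        raise ValueError("subconcurrency must be >= 1")

    total = until_offset - from_offset
    if total <= 0:
        # Nothing to read: keep a single (empty) range.
        return [(from_offset, until_offset)]

    # Never create more sub-ranges than there are records.
    parts = min(subconcurrency, total)
    base, extra = divmod(total, parts)

    ranges = []
    start = from_offset
    for i in range(parts):
        # The first `extra` sub-ranges take one additional record each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # One Kafka partition holding offsets 100..109, split three ways.
    print(split_offset_range(100, 110, 3))
```

Because the sub-ranges are contiguous and non-overlapping, every record is 
still read exactly once, but only ordering within each sub-range is preserved 
across tasks.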
    
    
    
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lonelytrooper/spark add_partition_concurrency

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19274
    
----
commit a89663411e568f265103f0b695168d4db68a2b36
Author: bjyfhanfei <yfhan...@jd.com>
Date:   2017-09-04T09:00:25Z

    add partition subconcurrency

commit d1132195d6b2087be4f18ad25614836c46512fe7
Author: bjyfhanfei <yfhan...@jd.com>
Date:   2017-09-19T06:12:29Z

    add topic.partition.subconcurrency

----

