[ https://issues.apache.org/jira/browse/SPARK-29799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-29799.
----------------------------------
    Resolution: Duplicate

> Split a kafka partition into multiple KafkaRDD partitions in the kafka 
> external plugin for Spark Streaming
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-29799
>                 URL: https://issues.apache.org/jira/browse/SPARK-29799
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: zengrui
>            Priority: Major
>         Attachments: 0001-add-implementation-for-issue-SPARK-29799.patch
>
>
> When Spark Streaming consumes records from Kafka, the generated KafkaRDD has 
> exactly one partition per Kafka topic partition, so the streaming task cannot 
> use more CPU cores unless the topic's partition count is increased, and that 
> count cannot be increased indefinitely.
> I propose splitting a single Kafka partition into multiple KafkaRDD partitions, 
> controlled by a configuration option, so that more CPU cores can be used to 
> execute the streaming task. A sketch of today's workaround follows.
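> For illustration, here is a minimal sketch of the current workaround with the 
> spark-streaming-kafka-0-10 direct stream (the broker address, topic name, group 
> id, and target partition count below are made up): the only way to spread a 
> batch across more cores today is an explicit repartition(), which shuffles every 
> record, whereas splitting offset ranges when the KafkaRDD is created would give 
> the same parallelism without that shuffle.
> {code:scala}
> import org.apache.kafka.common.serialization.StringDeserializer
> import org.apache.spark.SparkConf
> import org.apache.spark.streaming.{Seconds, StreamingContext}
> import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
> import org.apache.spark.streaming.kafka010.KafkaUtils
> import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
>
> object KafkaRepartitionWorkaround {
>   def main(args: Array[String]): Unit = {
>     val conf = new SparkConf().setAppName("KafkaRepartitionWorkaround")
>     val ssc = new StreamingContext(conf, Seconds(10))
>
>     val kafkaParams = Map[String, Object](
>       "bootstrap.servers" -> "localhost:9092",   // made-up broker address
>       "key.deserializer" -> classOf[StringDeserializer],
>       "value.deserializer" -> classOf[StringDeserializer],
>       "group.id" -> "example-group",             // made-up consumer group
>       "auto.offset.reset" -> "latest",
>       "enable.auto.commit" -> (false: java.lang.Boolean)
>     )
>
>     // One KafkaRDD partition is created per Kafka topic partition.
>     val stream = KafkaUtils.createDirectStream[String, String](
>       ssc, PreferConsistent,
>       Subscribe[String, String](Seq("example-topic"), kafkaParams))
>
>     // Workaround: shuffle the records onto more partitions so more cores can
>     // work on each batch. ConsumerRecord is not serializable, so extract the
>     // key/value pair before repartitioning.
>     val widened = stream.map(r => (r.key, r.value)).repartition(64)
>
>     widened.foreachRDD { rdd =>
>       rdd.foreach { case (_, value) => println(value) }
>     }
>
>     ssc.start()
>     ssc.awaitTermination()
>   }
> }
> {code}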



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
