zengrui created SPARK-29799:
-------------------------------

             Summary: Split a kafka partition into multiple KafkaRDD partitions in the kafka external plugin for Spark Streaming
                 Key: SPARK-29799
                 URL: https://issues.apache.org/jira/browse/SPARK-29799
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.4.3, 2.1.0
            Reporter: zengrui

When we use Spark Streaming to consume records from Kafka, the number of partitions in the generated KafkaRDD is equal to the number of partitions in the Kafka topic. As a result, we cannot use more CPU cores to execute the streaming task unless we increase the topic's partition count, and that count cannot be increased indefinitely.

I think we can split a Kafka partition into multiple KafkaRDD partitions and make the split configurable, so that more CPU cores can be used to execute the streaming task.
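
To make the proposal concrete, here is a minimal sketch of the splitting step only, written in plain Python rather than against Spark's API. It assumes a batch's work for one Kafka partition is described by a half-open offset range. The names `OffsetRange`, `split_offset_range`, and `num_splits` are hypothetical and chosen for illustration; they are not an existing Spark configuration or class.

```python
from typing import List, NamedTuple


class OffsetRange(NamedTuple):
    """Hypothetical stand-in for one Kafka partition's work in a batch.

    Covers offsets in the half-open interval [from_offset, until_offset).
    """
    topic: str
    partition: int
    from_offset: int
    until_offset: int


def split_offset_range(rng: OffsetRange, num_splits: int) -> List[OffsetRange]:
    """Split one offset range into at most `num_splits` contiguous sub-ranges.

    Each sub-range could then back its own RDD partition, so a single
    Kafka partition is read by several tasks instead of one.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be >= 1")

    total = rng.until_offset - rng.from_offset
    if total <= 0:
        # Nothing to read: keep the (empty) range as a single partition.
        return [rng]

    # Never create more sub-ranges than there are offsets to read.
    parts = min(num_splits, total)
    base, extra = divmod(total, parts)

    result = []
    start = rng.from_offset
    for i in range(parts):
        # The first `extra` sub-ranges take one additional offset each.
        size = base + (1 if i < extra else 0)
        result.append(OffsetRange(rng.topic, rng.partition, start, start + size))
        start += size
    return result


if __name__ == "__main__":
    for sub in split_offset_range(OffsetRange("events", 0, 100, 110), 3):
        print(sub)
```

One trade-off to keep in mind: records within a Kafka partition are ordered, so reading sub-ranges in parallel gives up per-partition ordering within a batch. The sketch also assumes offsets are dense; on compacted topics a sub-range may contain fewer records than its offset span suggests.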