spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 2a7f5dae5 -> 704b249b6


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

## What changes were proposed in this pull request?

I checked the logs of https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/ and found that it took several seconds to create the Kafka internal topic `__consumer_offsets`. As Kafka creates this topic lazily, the topic creation happens in the first test, `deserialization of initial offset with Spark 2.1.0`, and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 to make creating `__consumer_offsets` (50 partitions -> 1 partition) much faster.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu

Closes #17863 from zsxwing/fix-kafka-flaky-test.

(cherry picked from commit bd5788287957d8610a6d19c273b75bd4cdd2d166)
Signed-off-by: Shixiong Zhu


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/704b249b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/704b249b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/704b249b

Branch: refs/heads/branch-2.1
Commit: 704b249b6a3ea956086d6c6ef50da18e8228eeb4
Parents: 2a7f5da
Author: Shixiong Zhu
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu
Committed: Fri May 5 11:08:43 2017 -0700

----------------------------------------------------------------------
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/704b249b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
----------------------------------------------------------------------
diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index c2cbd86..4345f88 100644
--- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -281,6 +281,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = Map.empty) extends L
     props.put("log.flush.interval.messages", "1")
     props.put("replica.socket.timeout.ms", "1500")
     props.put("delete.topic.enable", "true")
+    props.put("offsets.topic.num.partitions", "1")
     props.putAll(withBrokerProps.asJava)
     props
   }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 f71aea6a0 -> 24fffacad


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

## What changes were proposed in this pull request?

I checked the logs of https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/ and found that it took several seconds to create the Kafka internal topic `__consumer_offsets`. As Kafka creates this topic lazily, the topic creation happens in the first test, `deserialization of initial offset with Spark 2.1.0`, and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 to make creating `__consumer_offsets` (50 partitions -> 1 partition) much faster.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu

Closes #17863 from zsxwing/fix-kafka-flaky-test.

(cherry picked from commit bd5788287957d8610a6d19c273b75bd4cdd2d166)
Signed-off-by: Shixiong Zhu


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/24fffaca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/24fffaca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/24fffaca

Branch: refs/heads/branch-2.2
Commit: 24fffacad709c553e0f24ae12a8cca3ab980af3c
Parents: f71aea6
Author: Shixiong Zhu
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu
Committed: Fri May 5 11:08:32 2017 -0700

----------------------------------------------------------------------
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/24fffaca/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
----------------------------------------------------------------------
diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index 2ce2760..f86b8f5 100644
--- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -292,6 +292,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = Map.empty) extends L
     props.put("log.flush.interval.messages", "1")
     props.put("replica.socket.timeout.ms", "1500")
     props.put("delete.topic.enable", "true")
+    props.put("offsets.topic.num.partitions", "1")
     props.putAll(withBrokerProps.asJava)
     props
   }
spark git commit: [SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load
Repository: spark
Updated Branches:
  refs/heads/master 41439fd52 -> bd5788287


[SPARK-20603][SS][TEST] Set default number of topic partitions to 1 to reduce the load

## What changes were proposed in this pull request?

I checked the logs of https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.2-test-maven-hadoop-2.7/47/ and found that it took several seconds to create the Kafka internal topic `__consumer_offsets`. As Kafka creates this topic lazily, the topic creation happens in the first test, `deserialization of initial offset with Spark 2.1.0`, and causes it to time out.

This PR changes `offsets.topic.num.partitions` from the default value 50 to 1 to make creating `__consumer_offsets` (50 partitions -> 1 partition) much faster.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu

Closes #17863 from zsxwing/fix-kafka-flaky-test.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bd578828
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bd578828
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bd578828

Branch: refs/heads/master
Commit: bd5788287957d8610a6d19c273b75bd4cdd2d166
Parents: 41439fd
Author: Shixiong Zhu
Authored: Fri May 5 11:08:26 2017 -0700
Committer: Shixiong Zhu
Committed: Fri May 5 11:08:26 2017 -0700

----------------------------------------------------------------------
 .../test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/bd578828/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
----------------------------------------------------------------------
diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index 2ce2760..f86b8f5 100644
--- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -292,6 +292,7 @@ class KafkaTestUtils(withBrokerProps: Map[String, Object] = Map.empty) extends L
     props.put("log.flush.interval.messages", "1")
     props.put("replica.socket.timeout.ms", "1500")
     props.put("delete.topic.enable", "true")
+    props.put("offsets.topic.num.partitions", "1")
     props.putAll(withBrokerProps.asJava)
     props
   }
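
Note on ordering: in the patched method, the new default is written before `withBrokerProps` is applied, so a caller-supplied value for `offsets.topic.num.partitions` should still take precedence over the new default of 1. A minimal standalone sketch of that pattern follows. The names `BrokerPropsSketch` and `brokerProps` are hypothetical and are not part of `KafkaTestUtils`; only two of the property keys from the diff are reproduced, and the overrides are copied with `foreach` instead of the patch's `putAll(withBrokerProps.asJava)` so the snippet needs no collection converters.

```scala
import java.util.Properties

// Hypothetical stand-in for the broker-configuration method touched by the
// patch. This is an illustration of the ordering, not the real KafkaTestUtils.
object BrokerPropsSketch {

  def brokerProps(overrides: Map[String, Object] = Map.empty): Properties = {
    val props = new Properties()
    // Test defaults (keys taken from the diff context).
    props.put("delete.topic.enable", "true")
    // New default from the patch: a single partition for __consumer_offsets.
    props.put("offsets.topic.num.partitions", "1")
    // Overrides are applied last, so they replace any default set above.
    overrides.foreach { case (key, value) => props.put(key, value) }
    props
  }

  def main(args: Array[String]): Unit = {
    val defaults = brokerProps()
    val overridden = brokerProps(Map("offsets.topic.num.partitions" -> "50"))
    println(defaults.get("offsets.topic.num.partitions"))
    println(overridden.get("offsets.topic.num.partitions"))
  }
}
```

A suite that needs the broker's stock partition count could therefore pass it explicitly through the override map, assuming the real method keeps the same default-then-override order shown in the diff.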