[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-05-29 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 It's been a while. What can I do to draw some attention to this request? Is this issue not relevant enough? Thanks for reconsidering @felixcheung @brkyvz @zsxwing

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-05-09 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 @felixcheung will this be merged? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-05-02 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 Sorry for being inactive. All good with this?

[GitHub] spark pull request #17774: [SPARK-18371][Streaming] Spark Streaming backpres...

2017-04-28 Thread arzt
Github user arzt commented on a diff in the pull request: https://github.com/apache/spark/pull/17774#discussion_r113871876 --- Diff: external/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/DirectKafkaStreamSuite.scala --- @@ -617,6 +617,94 @@ class

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-04-27 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 I changed the max messages per partition to be at least 1. Agreed?

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-04-27 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 @koeninger I agree that assuming a long batch size is wrong; I'm not sure whether it even matters. But what if for one partition there is no lag in the current batch? Then fetching 1 message for
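The two points debated above (clamp the per-partition cap to at least 1, but what about a partition with no lag) can be sketched as follows. This is a hedged illustration, not the actual Spark implementation; `maxMessagesPerPartition`, `estimatedRateLimit`, and `lagPerPartition` are hypothetical names. The sketch distributes the per-batch budget proportionally to each partition's lag and applies the "at least 1" clamp only to partitions that actually have messages waiting, so an empty partition fetches nothing:

```scala
// Hedged sketch (illustrative names, not Spark internals): turn a global
// backpressure rate estimate into per-partition caps for one batch.
def maxMessagesPerPartition(
    estimatedRateLimit: Long,        // records/second, summed over all partitions
    batchIntervalMs: Long,           // batch duration in milliseconds
    lagPerPartition: Map[Int, Long]  // partition -> unconsumed records
): Map[Int, Long] = {
  val totalLag = lagPerPartition.values.sum
  // Total records this batch may contain: rate (rec/s) * batch length (s).
  val totalBudget = estimatedRateLimit * batchIntervalMs / 1000L
  lagPerPartition.map { case (partition, lag) =>
    // Distribute the budget proportionally to each partition's share of the lag.
    val share = if (totalLag == 0L) 0L else totalBudget * lag / totalLag
    // Clamp only non-empty partitions up to 1, so the "at least 1" rule
    // never forces a fetch from a partition with no lag.
    partition -> (if (lag == 0L) 0L else share.max(1L))
  }
}
```

With this shape, a small-lag partition still advances by at least one record per batch instead of being rounded down to zero, while an idle partition stays untouched.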

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-04-27 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 To run tests or debug using IntelliJ: `mvn test -DforkMode=never -pl external/kafka-0-8 "-Dsuites=org.apache.spark.streaming.kafka.DirectKafkaStreamSuite maxMessagesPerPartition"`

[GitHub] spark issue #17774: [SPARK-18371][Streaming] Spark Streaming backpressure ge...

2017-04-27 Thread arzt
Github user arzt commented on the issue: https://github.com/apache/spark/pull/17774 Thanks for your valuable feedback. I added tests as suggested by @JasonMWhite. @koeninger the estimated rate is per second, summed over all partitions, isn't it? The batch time is usually longer. So even

[GitHub] spark pull request #17774: [SPARK-18371][Streaming] Spark Streaming backpres...

2017-04-26 Thread arzt
GitHub user arzt opened a pull request: https://github.com/apache/spark/pull/17774 [SPARK-18371][Streaming] Spark Streaming backpressure generates batch with large number of records ## What changes were proposed in this pull request? Omit rounding of backpressure rate
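A hedged sketch of what "omit rounding" buys (illustrative arithmetic only, not the actual Spark code): if the estimated rate is rounded to whole records per second per partition *before* it is multiplied by the batch interval, small rates collapse to zero, and a zero cap may then be treated as "no limit", which would produce exactly the oversized batch the JIRA title describes. Keeping the arithmetic exact until the final per-batch cap avoids this:

```scala
// Hedged sketch of the failure mode (illustrative numbers and names).
val ratePerSecond = 12L     // backpressure estimate, records/second overall
val numPartitions = 32L
val batchIntervalMs = 5000L

// Rounding first: 12 / 32 = 0 records/second/partition, so the cap is 0,
// which a caller might interpret as "unlimited" and read the whole backlog.
val roundedPerPartition = ratePerSecond / numPartitions
val cappedRounded = roundedPerPartition * batchIntervalMs / 1000L

// Rounding last: keep the product exact and divide only once at the end,
// yielding a small but nonzero per-partition cap for the batch.
val cappedExact = ratePerSecond * batchIntervalMs / (1000L * numPartitions)
```

Here `cappedRounded` ends up 0 while `cappedExact` ends up 1, which is the difference between an unbounded fetch and a correctly throttled one.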