[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-31 Thread sidhavratha
Github user sidhavratha commented on the issue: https://github.com/apache/spark/pull/21685 Our Kafka team has resolved the issue of the 40-second poll delay, which was caused by faulty hardware. However, these changes still make sense for getting better throughput per batch. As you know Kafka

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-02 Thread sidhavratha
Github user sidhavratha commented on the issue: https://github.com/apache/spark/pull/21685 And yes, both applications were tested on the same dataset, with only the additional buffer logic applied and the consumer group-id changed. In the before case, the scheduling delay is increasing because

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-02 Thread sidhavratha
Github user sidhavratha commented on the issue: https://github.com/apache/spark/pull/21685 If the batch duration is 10 seconds, a new batch will start every 10 seconds, irrespective of whether the last batch has completed. If a particular batch (10-second duration - which is supposed

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-02 Thread sidhavratha
Github user sidhavratha commented on the issue: https://github.com/apache/spark/pull/21685 Thanks a lot for looking into this. Please find my comments in [] below each point. - You're trying to commit something into 2.4, but in the test results I see version 2.1.0. Have

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-01 Thread sidhavratha
Github user sidhavratha commented on the issue: https://github.com/apache/spark/pull/21685 @gaborgsomogyi Can you please review this PR and approve for test.

[GitHub] spark pull request #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-stream...

2018-06-30 Thread sidhavratha
GitHub user sidhavratha opened a pull request: https://github.com/apache/spark/pull/21685 [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to maintain min … …buffer using async thread to avoid blocking kafka poll ## What changes were proposed in this pull request