Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21685
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user sidhavratha commented on the issue:
https://github.com/apache/spark/pull/21685
Our Kafka team has resolved the issue with the 40-second poll delay, which was
caused by some faulty hardware.
However, these changes still make sense for getting better throughput per batch.
As you know kafka
Github user gaborgsomogyi commented on the issue:
https://github.com/apache/spark/pull/21685
In the meantime, the most obvious question came to mind.
What is the size of the Kafka events being processed? Large events could
result in long polling times.
Maybe some
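To make the event-size concern concrete, here is a rough back-of-the-envelope sketch. It is not taken from the PR: the function name `estimated_fetch_seconds`, the record count, the event sizes, and the bandwidth figure are all hypothetical, and it only models raw transfer time, ignoring broker-side latency and consumer fetch limits.

```python
def estimated_fetch_seconds(records, avg_record_bytes, bandwidth_bytes_per_sec):
    """Rough lower bound on the time to transfer one poll's worth of data."""
    total_bytes = records * avg_record_bytes
    return total_bytes / bandwidth_bytes_per_sec


# Hypothetical numbers: 500 records per poll over a 100 MB/s link.
small = estimated_fetch_seconds(500, 1_000, 100_000_000)       # 1 KB events
large = estimated_fetch_seconds(500, 10_000_000, 100_000_000)  # 10 MB events
print(small, large)
```

Under these assumed numbers the transfer time scales linearly with event size, so a poll that is negligible for small events can take tens of seconds for very large ones.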
Github user sidhavratha commented on the issue:
https://github.com/apache/spark/pull/21685
And yes, both applications were tested on the same dataset; the only
differences are the additional buffer logic and a changed consumer group-id.
In the "before" case, scheduling delay keeps increasing because of
Github user sidhavratha commented on the issue:
https://github.com/apache/spark/pull/21685
If the batch duration is 10 seconds, a new batch starts every 10 seconds,
irrespective of whether the last batch has completed.
If a particular batch (10 second duration - which is supposed to
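The scheduling behaviour described above can be sketched with a small simulation. This is an illustrative model only, not Spark's scheduler: the function name `scheduling_delays` and the processing times are hypothetical, and it assumes batches are processed one at a time in the order they are created.

```python
def scheduling_delays(batch_interval, processing_times):
    """Return the scheduling delay of each batch, in seconds.

    Batch i is created at i * batch_interval, whether or not the
    previous batch has finished. It starts processing only once the
    previous batch completes (assumption: one batch runs at a time).
    """
    delays = []
    free_at = 0  # time at which the previous batch finishes
    for i, proc in enumerate(processing_times):
        created = i * batch_interval
        start = max(created, free_at)    # wait if the previous batch is still running
        delays.append(start - created)   # time spent queued before processing
        free_at = start + proc
    return delays


# Hypothetical numbers: 10 s batches that each take 12 s to process.
print(scheduling_delays(10, [12, 12, 12, 12]))
```

With these assumed numbers each batch overruns its interval by 2 seconds, so the delay grows by 2 seconds per batch (0, 2, 4, 6). If processing time stays below the batch interval, the delay stays at zero.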
Github user gaborgsomogyi commented on the issue:
https://github.com/apache/spark/pull/21685
What I can't really understand is why the `Scheduler Delay` is so different.
`
Scheduler delay includes time to ship the task from the scheduler to the
executor, and time to send
Github user sidhavratha commented on the issue:
https://github.com/apache/spark/pull/21685
Thanks a lot for looking into this. Please find my comments in [] below each
point.
- You're trying to commit something into 2.4 but in the test result I see
with 2.1.0 version. Have
Github user gaborgsomogyi commented on the issue:
https://github.com/apache/spark/pull/21685
In general `KafkaConsumer.poll` should take a couple of seconds, but 10+ is
extremely high. The question of why it takes so long has to be answered first. In
the processing time chart I see a
Github user sidhavratha commented on the issue:
https://github.com/apache/spark/pull/21685
@gaborgsomogyi Can you please review this PR and approve it for testing?