I'm really not sure what you're asking.

On Wed, Mar 9, 2016 at 12:43 PM, Sachin Aggarwal <different.sac...@gmail.com> wrote:
> Where are we capturing this delay?
> I am aware of the scheduling delay, which is defined as processing
> time minus submission time, not the batch create time.
>
> On Wed, Mar 9, 2016 at 10:46 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>
>> Spark Streaming by default will not start processing a batch until the
>> current batch is finished. So if your processing time is larger than
>> your batch time, delays will build up.
>>
>> On Wed, Mar 9, 2016 at 11:09 AM, Sachin Aggarwal
>> <different.sac...@gmail.com> wrote:
>> > Hi All,
>> >
>> > We have batchTime and submissionTime:
>> >
>> > @param batchTime Time of the batch
>> > @param submissionTime Clock time of when jobs of this batch were
>> > submitted to the streaming scheduler queue
>> >
>> > 1) We are seeing a difference between batchTime and submissionTime,
>> > sometimes on the order of minutes, even for small batches (300 ms)
>> > with direct Kafka. We see this only when the processing time is more
>> > than the batch interval. How can we explain this delay?
>> >
>> > 2) In one case the batch processing time is more than the batch
>> > interval. Will Spark fetch the next batch's data from Kafka in
>> > parallel while processing the current batch, or will it wait for the
>> > current batch to finish first?
>> >
>> > I would be thankful if you could give me some pointers.
>> >
>> > Thanks!
>> > --
>> > Thanks & Regards
>> > Sachin Aggarwal
>> > 7760502772
>
> --
> Thanks & Regards
> Sachin Aggarwal
> 7760502772
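[To illustrate Cody's point, here is a toy simulation, not actual Spark code: batches are created on a fixed interval but processed strictly one at a time, so when per-batch processing time exceeds the batch interval, the gap between a batch's create time (batchTime) and the time it can actually be picked up grows linearly, which is how a 300 ms batch can end up minutes behind. The function name and its parameters are made up for this sketch.]

```python
# Toy model of Spark Streaming's default behavior: one batch processed
# at a time, new batches created every batch_interval_ms regardless.
def scheduling_delays(batch_interval_ms, processing_time_ms, num_batches):
    delays = []
    worker_free_at = 0  # clock time when the single processing slot frees up
    for i in range(num_batches):
        batch_time = i * batch_interval_ms            # when the batch is created
        start_time = max(batch_time, worker_free_at)  # when it can actually start
        delays.append(start_time - batch_time)        # the delay in question
        worker_free_at = start_time + processing_time_ms
    return delays

# 300 ms batches that each take 500 ms to process:
# the delay grows by 200 ms with every batch and never recovers.
print(scheduling_delays(300, 500, 5))  # [0, 200, 400, 600, 800]
```

With processing time below the batch interval (e.g. 200 ms), every delay stays 0, which matches the observation that the gap only appears when processing time exceeds the interval.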