and post the code (if possible). In a nutshell, your processing time > batch interval, resulting in an ever-increasing delay that will eventually end in a crash. 3 seconds to process 14 messages looks like a lot. Curious what the job logic is.
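To illustrate the effect (a toy queue model in plain Python, not Spark itself; the 2 s interval and ~4 s processing time mirror the numbers reported in this thread): when every batch takes longer to process than the batch interval, each batch's scheduling delay grows linearly and never recovers.

```python
# Toy model of a single-threaded streaming scheduler (not Spark):
# batches are generated every `interval` seconds, but each one takes
# `proc` seconds to process. With proc > interval, the backlog grows
# by (proc - interval) seconds per batch, without bound.

def scheduling_delay(n, interval=2.0, proc=4.0):
    """Seconds batch n waits between being generated and starting to process."""
    generated_at = n * interval
    # Batch n cannot start until the previous n batches have finished.
    starts_at = max(generated_at, n * proc)
    return starts_at - generated_at

for n in (1, 10, 1000):
    print(n, scheduling_delay(n))  # grows ~2 s per batch: 2.0, 20.0, 2000.0
```

With `proc <= interval` the same function returns 0 for every batch, which is why keeping processing time under the batch interval is the stability condition.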
-kr, Gerard.

On Thu, Jan 22, 2015 at 12:15 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
> This is not normal. It's a huge scheduling delay!! Can you tell me more
> about the application?
> - cluster setup, number of receivers, what the computation is, etc.
>
> On Thu, Jan 22, 2015 at 3:11 AM, Ashic Mahtab <as...@live.com> wrote:
>
>> Hate to do this... but... erm... bump? Would really appreciate input from
>> others using Streaming. Or at least some docs that would tell me whether these
>> numbers are expected or not.
>>
>> ------------------------------
>> From: as...@live.com
>> To: user@spark.apache.org
>> Subject: Are these numbers abnormal for spark streaming?
>> Date: Wed, 21 Jan 2015 11:26:31 +0000
>>
>> Hi Guys,
>> I've got Spark Streaming set up for a low-data-rate system (using Spark's
>> features for analysis rather than for high throughput). Messages come in
>> throughout the day at around 1-20 per second (finger-in-the-air
>> estimate; not analysed yet). In the Spark Streaming UI for the
>> application, I'm seeing the following after 17 hours.
>> Streaming
>>
>> - Started at: Tue Jan 20 16:58:43 GMT 2015
>> - Time since start: 18 hours 24 minutes 34 seconds
>> - Network receivers: 2
>> - Batch interval: 2 seconds
>> - Processed batches: 16482
>> - Waiting batches: 1
>>
>> Receiver Statistics (over last 100 processed batches):
>>
>> Receiver       Status  Location  Records in last batch   Min rate       Median rate    Max rate       Last Error
>>                                  [2015/01/21 11:23:18]   [records/sec]  [records/sec]  [records/sec]
>> RmqReceiver-0  ACTIVE  FOOOO     14                      4              7              27             -
>> RmqReceiver-1  ACTIVE  BAAAAR    12                      4              7              26             -
>>
>> Batch Processing Statistics:
>>
>> Metric            Last batch       Minimum           25th percentile   Median            75th percentile   Maximum
>> Processing Time   3 sec 994 ms     157 ms            4 sec 16 ms       4 sec 961 ms      5 sec 3 ms        5 sec 171 ms
>> Scheduling Delay  9 hr 15 min 4 s  9 hr 10 min 54 s  9 hr 11 min 56 s  9 hr 12 min 57 s  9 hr 14 min 5 s   9 hr 15 min 4 s
>> Total Delay       9 hr 15 min 8 s  9 hr 10 min 58 s  9 hr 12 min 0 s   9 hr 13 min 2 s   9 hr 14 min 10 s  9 hr 15 min 8 s
>>
>> Are these "normal"? I was wondering what the scheduling delay and total
>> delay terms mean, and whether it's normal for them to be 9 hours.
>>
>> I've got a standalone Spark master and 4 Spark worker nodes. The streaming app
>> has been given 4 cores, and it's using 1 core per worker node. The
>> streaming app is submitted from a 5th machine, and that machine runs nothing
>> but the driver. The worker nodes run alongside Cassandra
>> (and read from and write to it).
>>
>> Any insights would be appreciated.
>>
>> Regards,
>> Ashic.
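A quick back-of-the-envelope check (my own arithmetic from the numbers reported above, not part of the original UI output) suggests the 9-hour delay is exactly what the other stats predict: roughly 4 s of processing per batch against a 2 s interval means each batch adds about 2 s of backlog.

```python
# Sanity check of the reported ~9 hr 15 min scheduling delay.
# Numbers are taken from the Streaming UI stats quoted above.
batch_interval = 2         # seconds
processing_time = 4        # seconds per batch on average
                           # (16482 batches over ~18.4 h wall time ≈ 4 s each)
processed_batches = 16482

# Each batch adds (processing_time - batch_interval) seconds of backlog.
backlog_s = processed_batches * (processing_time - batch_interval)
print(backlog_s / 3600)  # ~9.16 hours, in line with the reported 9 hr 15 min
```

That the estimate lands so close to the observed scheduling delay is consistent with the diagnosis upthread: the pipeline has been saturated since startup, and the delay will keep growing by about a second per second of wall time.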