Re: Out of memory with Spark Streaming

2014-10-31 Thread Aniket Bhatnagar
Thanks, Chris, for looking at this. I was putting data at roughly the same rate, 50 records per batch max. This issue was purely because of a bug in my persistence logic that was leaking memory. Overall, I haven't seen a lot of lag with the Kinesis + Spark setup and I am able to process records at roughly

Re: Out of memory with Spark Streaming

2014-10-30 Thread Chris Fregly
Curious about why you're only seeing 50 records max per batch. How many receivers are you running? What is the rate at which you're putting data onto the stream? Per the default AWS Kinesis configuration, the producer can do 1000 PUTs per second with max 50 KB per PUT and max 1 MB per second per

Re: Out of memory with Spark Streaming

2014-09-12 Thread Aniket Bhatnagar
Hi all, Sorry, but this was totally my mistake. In my persistence logic, I was creating an async HTTP client instance in RDD foreach but was never closing it, leading to memory leaks. Apologies for wasting everyone's time. Thanks, Aniket On 12 September 2014 02:20, Tathagata Das
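
The leak described above follows a common pattern: a client object is created inside the function applied to each record or partition and is never closed. Below is a minimal sketch of the fix in plain Python, with no Spark dependency. `AsyncHttpClient` here is a hypothetical stand-in, not the actual client used in the thread, and `persist_partition_leaky` / `persist_partition` are illustrative names for the kind of function one might pass to a foreach-style call.

```python
class AsyncHttpClient:
    """Hypothetical stand-in for an HTTP client that holds resources."""

    open_count = 0  # instances created but not yet closed

    def __init__(self):
        AsyncHttpClient.open_count += 1
        self.closed = False
        self.sent = []

    def post(self, record):
        # A real client would issue a network request here.
        self.sent.append(record)

    def close(self):
        # Idempotent: closing twice does not decrement the counter twice.
        if not self.closed:
            self.closed = True
            AsyncHttpClient.open_count -= 1


def persist_partition_leaky(records):
    """Creates a client per call and never closes it (the bug pattern)."""
    client = AsyncHttpClient()
    for record in records:
        client.post(record)


def persist_partition(records):
    """Creates a client per call and always closes it, even on error."""
    client = AsyncHttpClient()
    try:
        for record in records:
            client.post(record)
    finally:
        client.close()


if __name__ == "__main__":
    for _ in range(3):
        persist_partition_leaky(["a", "b"])
    print("open after leaky runs:", AsyncHttpClient.open_count)

    before = AsyncHttpClient.open_count
    for _ in range(3):
        persist_partition(["a", "b"])
    print("newly leaked by fixed runs:", AsyncHttpClient.open_count - before)
```

With the leaky variant, one client stays open for every batch processed, so the number of retained objects grows with uptime rather than with data volume. The `try`/`finally` form is one way to bound that; reusing a single long-lived client would be another.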

Out of memory with Spark Streaming

2014-09-11 Thread Aniket Bhatnagar
I am running a simple Spark Streaming program that pulls in data from Kinesis at a batch interval of 10 seconds, windows it for 10 seconds, maps data and persists to a store. The program is running in local mode right now and runs out of memory after a while. I am yet to investigate heap dumps

Re: Out of memory with Spark Streaming

2014-09-11 Thread Aniket Bhatnagar
I did change it to be 1 GB. It still ran out of memory, but a little later. The streaming job isn't handling a lot of data. Every 2 seconds, it doesn't get more than 50 records. Each record is no larger than 500 bytes. On Sep 11, 2014 10:54 PM, Bharat Venkat bvenkat.sp...@gmail.com wrote:
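
For scale, the figures in this message put an upper bound on the ingest rate. Here is a back-of-the-envelope calculation in Python, using only the numbers stated above (at most 50 records every 2 seconds, at most 500 bytes each). The variable names are illustrative.

```python
# Upper bounds taken from the message above.
records_per_batch = 50
batch_interval_s = 2
max_record_bytes = 500

bytes_per_batch = records_per_batch * max_record_bytes   # 25000
bytes_per_second = bytes_per_batch / batch_interval_s    # 12500.0
bytes_per_hour = bytes_per_second * 3600                 # 45000000.0

print(f"per batch:  {bytes_per_batch} bytes")
print(f"per second: {bytes_per_second:.0f} bytes")
print(f"per hour:   {bytes_per_hour / 1e6:.0f} MB")
```

At roughly 12.5 KB per second at most, the raw input is small next to a 1 GB heap, which suggests looking at retained objects rather than data volume.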

Re: Out of memory with Spark Streaming

2014-09-11 Thread Tathagata Das
Which version of Spark are you running? If you are running the latest one, could you try running not a window but a simple event count on every 2-second batch, and see if you are still running out of memory? TD On Thu, Sep 11, 2014 at 10:34 AM, Aniket Bhatnagar aniket.bhatna...@gmail.com

help me: Out of memory when spark streaming

2014-05-16 Thread Francis . Hu
Hi all, I encountered OOM when streaming. I send data to Spark Streaming through ZeroMQ at a rate of 600 records per second, but Spark Streaming only handles 10 records per 5 seconds (set in the streaming program). My two workers have 4-core CPUs and 1 GB of RAM. These workers always occur Out