Re: updateStateByKey performance API

2015-03-23 Thread Andre Schumacher
Message
Subject: Re: updateStateByKey performance API
Date: Wed, 18 Mar 2015 13:06:15 +0200
From: Nikos Viorres nvior...@gmail.com
To: Akhil Das ak...@sigmoidanalytics.com
CC: user@spark.apache.org
Hi Akhil, Yes, that's what we are planning on doing at the end

Re: updateStateByKey performance API

2015-03-18 Thread Akhil Das
You can always throw more machines at this and see whether the performance improves, since you haven't mentioned anything regarding your number of cores etc.
Thanks
Best Regards
On Wed, Mar 18, 2015 at 11:42 AM, nvrs nvior...@gmail.com wrote:
Hi all, We are having a few issues with the performance

Re: updateStateByKey performance API

2015-03-18 Thread Nikos Viorres
Hi Akhil, Yes, that's what we are planning on doing at the end of the day. At the moment I am doing performance testing before the job hits production, testing on 4 cores to get baseline figures, and have deduced that in order to grow to 10 - 15 million keys we'll need a batch interval of ~20 secs
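For readers following the scaling argument above, here is a rough illustration of why the number of keys drives the batch interval. It is a toy model in plain Python, not Spark code: `update_state_by_key` and `running_count` are hypothetical names invented for this sketch. It assumes the documented behaviour of `updateStateByKey`, where the update function is applied to every key already held in state on each batch, whether or not new data arrived for that key.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy stand-in for one micro-batch of a keyed state update.
# Hypothetical helper for illustration only; this is not the Spark API.
def update_state_by_key(
    state: Dict[str, int],
    batch: Dict[str, List[int]],
    update_func: Callable[[List[int], Optional[int]], Optional[int]],
) -> Tuple[Dict[str, int], int]:
    new_state: Dict[str, int] = {}
    calls = 0
    # Every key in the existing state is visited, not only keys with new data.
    for key in set(state) | set(batch):
        calls += 1
        result = update_func(batch.get(key, []), state.get(key))
        if result is not None:  # returning None drops the key from state
            new_state[key] = result
    return new_state, calls

def running_count(new_values: List[int], old: Optional[int]) -> Optional[int]:
    # Add the new values for this key to its previous total (0 if unseen).
    return (old or 0) + sum(new_values)

# 1000 keys already in state, but only 2 keys receive data in this batch.
state = {f"key{i}": 1 for i in range(1000)}
batch = {"key0": [1], "key1": [1, 1]}
state, calls = update_state_by_key(state, batch, running_count)
print(calls)
```

In this model the per-batch work is proportional to the total number of tracked keys rather than to the size of the incoming batch, which is consistent with the batch interval having to grow as the key count grows.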