Re: Storing spark processed output to Database asynchronously.

2015-05-22 Thread Gautam Bajaj
This is just a friendly ping to remind you of my query. Also, is there an explanation or example of how to use AsyncRDDActions in Java? On Thu, May 21, 2015 at 7:18 PM, Gautam Bajaj gautam1...@gmail.com wrote: I am receiving data at UDP port 8060 and doing processing on it using

RE: Storing spark processed output to Database asynchronously.

2015-05-22 Thread Evo Eftimov
to resort to the disk your performance will get hit From: Tathagata Das [mailto:t...@databricks.com] Sent: Friday, May 22, 2015 8:55 PM To: Gautam Bajaj Cc: user Subject: Re: Storing spark processed output to Database asynchronously. Something does not make sense. Receivers (currently) do

RE: Storing spark processed output to Database asynchronously.

2015-05-22 Thread Evo Eftimov
performance in the name of the reliability/integrity of your system, i.e. not losing messages) From: Evo Eftimov [mailto:evo.efti...@isecc.com] Sent: Friday, May 22, 2015 9:39 PM To: 'Tathagata Das'; 'Gautam Bajaj' Cc: 'user' Subject: RE: Storing spark processed output to Database asynchronously

Re: Storing spark processed output to Database asynchronously.

2015-05-22 Thread Tathagata Das
Something does not make sense. Receivers (currently) do not get blocked due to processing load (unless a rate limit has been set). The receiver will continue to receive data and store it in memory until it is processed. So I am still not sure how the data loss is happening. Unless you are

Re: Storing spark processed output to Database asynchronously.

2015-05-21 Thread Gautam Bajaj
That is completely alright, as the system will make sure the work gets done. My major concern is the data drop. Will using async stop data loss? On Thu, May 21, 2015 at 4:55 PM, Tathagata Das t...@databricks.com wrote: If you cannot push data as fast as you are generating it, then async isn't

Re: Storing spark processed output to Database asynchronously.

2015-05-21 Thread Tathagata Das
If you cannot push data as fast as you are generating it, then async isn't going to help either. The work is just going to keep piling up as many, many async jobs, even though your batch processing times will be low, as that processing time is not going to reflect how much of the overall work is pending
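
The point above (async submission hides a backlog but does not shrink it) can be illustrated with a small toy model. This is a plain-Java sketch, not Spark code: the class name `AsyncBacklogSketch`, the method `backlogPerBatch`, and the numbers used are all hypothetical, and it assumes a fixed number of records arriving per batch and a sink that can write a fixed number of records per batch interval.

```java
public class AsyncBacklogSketch {

    /**
     * Toy model (an assumption, not Spark's scheduler): every batch hands
     * recordsPerBatch records to an asynchronous writer and returns at once,
     * while the sink can persist at most sinkCapacityPerBatch records per
     * batch interval. Returns the number of records still waiting to be
     * written after each batch.
     */
    public static long[] backlogPerBatch(int batches,
                                         long recordsPerBatch,
                                         long sinkCapacityPerBatch) {
        long[] backlog = new long[batches];
        long pending = 0;
        for (int i = 0; i < batches; i++) {
            // The batch "finishes" immediately; its output is only queued.
            pending += recordsPerBatch;
            // The sink drains what it can during this interval.
            long written = Math.min(pending, sinkCapacityPerBatch);
            pending -= written;
            backlog[i] = pending;
        }
        return backlog;
    }

    public static void main(String[] args) {
        // Hypothetical rates: 1000 records arrive per batch, the sink writes 600.
        long[] backlog = backlogPerBatch(5, 1000, 600);
        for (int i = 0; i < backlog.length; i++) {
            System.out.println("after batch " + (i + 1) + ": "
                    + backlog[i] + " records pending");
        }
    }
}
```

In this model the pending count grows by the difference between the arrival rate and the sink rate on every batch, no matter how quickly each batch reports completion. The backlog only stays bounded when the sink capacity is at least the arrival rate.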

Storing spark processed output to Database asynchronously.

2015-05-20 Thread Gautam Bajaj
Hi, From my understanding of Spark Streaming, I created a Spark entry point for continuous UDP data, using: SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount"); JavaStreamingContext jssc = new JavaStreamingContext(conf, new