This is just a friendly ping to remind you of my query.
Also, is there an explanation or example of how to use
AsyncRDDActions in Java?
On Thu, May 21, 2015 at 7:18 PM, Gautam Bajaj gautam1...@gmail.com wrote:
I am receiving data at UDP port 8060 and doing processing on it using
to resort to the disk your performance will get hit
From: Tathagata Das [mailto:t...@databricks.com]
Sent: Friday, May 22, 2015 8:55 PM
To: Gautam Bajaj
Cc: user
Subject: Re: Storing spark processed output to Database asynchronously.
Something does not make sense. Receivers (currently) do
performance in the
name of the reliability/integrity of your system, i.e. not losing messages)
From: Evo Eftimov [mailto:evo.efti...@isecc.com]
Sent: Friday, May 22, 2015 9:39 PM
To: 'Tathagata Das'; 'Gautam Bajaj'
Cc: 'user'
Subject: RE: Storing spark processed output to Database asynchronously
Something does not make sense. Receivers (currently) do not get blocked
(unless a rate limit has been set) due to processing load. The receiver will
continue to receive data and store it in memory until it is processed.
So I am still not sure how the data loss is happening. Unless you are
That is completely alright, as the system will make sure the work gets done.
My major concern is the data drop. Will using async stop data loss?
On Thu, May 21, 2015 at 4:55 PM, Tathagata Das t...@databricks.com wrote:
If you cannot push data as fast as you are generating it, then async isn't
going to help either. The work is just going to keep piling up as many,
many async jobs, even though your batch processing times will be low, because
that processing time does not reflect how much of the overall work is pending
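
To make the piling-up point concrete, here is a minimal sketch in plain Java (no Spark involved). It is only a toy model under stated assumptions: the class name, the method name and the rates (1000 records produced and 600 records written per batch interval) are hypothetical and do not come from this thread. It tracks how many records are still waiting to be written after each batch when submission is asynchronous.

```java
import java.util.ArrayList;
import java.util.List;

public class AsyncBacklogSketch {

    /**
     * Toy model, not Spark code. Returns the number of records still waiting
     * to be written after each batch interval.
     *
     * recordsPerBatch: records produced per batch interval (hypothetical rate)
     * writesPerBatch:  records the database can absorb per interval (hypothetical rate)
     */
    public static List<Long> pendingAfterEachBatch(long recordsPerBatch,
                                                   long writesPerBatch,
                                                   int batches) {
        List<Long> pending = new ArrayList<>();
        long backlog = 0;
        for (int i = 0; i < batches; i++) {
            // Async submission returns immediately, so the batch itself looks
            // fast, but its records join the queue of unwritten work.
            backlog += recordsPerBatch;
            // In the same interval the sink drains at most writesPerBatch records.
            backlog -= Math.min(backlog, writesPerBatch);
            pending.add(backlog);
        }
        return pending;
    }

    public static void main(String[] args) {
        // Producer faster than the sink: the backlog grows every batch.
        System.out.println(pendingAfterEachBatch(1000, 600, 5));
        // Sink keeps up: the backlog stays at zero.
        System.out.println(pendingAfterEachBatch(500, 600, 5));
    }
}
```

In this model the first call prints a backlog that grows by 400 records every interval, while the second stays at zero. Going async only changes where the waiting happens, not whether the sink can keep up.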
Hi,
From my understanding of Spark Streaming, I created a Spark entry point,
for continuous UDP data, using:
SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount");
JavaStreamingContext jssc = new JavaStreamingContext(conf, new