Re: Using GraphX with Spark Streaming?

Tobias Pfeiffer Sun, 05 Oct 2014 18:46:10 -0700

Arko,

On Sat, Oct 4, 2014 at 1:40 AM, Arko Provo Mukherjee <
arkoprovomukher...@gmail.com> wrote:
>
> Apologies if this is a stupid question, but I am trying to understand
> why this can or cannot be done. As far as I understand, streaming
> algorithms need to be different from batch algorithms because streaming
> algorithms are generally incremental. Hence my question: can the
> RDD transformations be extended to streaming or not?
>


I don't think that streaming algorithms are "generally incremental" in
Spark Streaming. Rather, data is collected, and every N seconds
(or minutes, etc.) the data collected during that interval is processed
with the same operations as a normal batch job. If anything, using data
obtained from the stream in previous intervals is a bit more complicated
than plain batch processing. If the graph you want to create only uses
data from one interval/batch, that should be dead simple. You might want
to have a look at
https://spark.apache.org/docs/latest/streaming-programming-guide.html#discretized-streams-dstreams
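
To make the interval model concrete, here is a minimal sketch in plain
Python. It is not Spark or GraphX code: the names micro_batches and
build_graph, the (timestamp, src, dst) event format, and the sample data
are all made up for illustration. It only shows the idea that each
interval's data is handed to an ordinary batch step that knows nothing
about earlier intervals.

```python
from collections import defaultdict


def micro_batches(events, interval):
    """Group (timestamp, src, dst) events into fixed-length intervals.

    Hypothetical helper: it mimics how a stream is cut into one batch per
    interval. Unlike a real stream, intervals with no events are skipped.
    """
    batches = defaultdict(list)
    for ts, src, dst in events:
        batches[int(ts // interval)].append((src, dst))
    # Return the batches in time order.
    return [batches[key] for key in sorted(batches)]


def build_graph(edges):
    """Ordinary batch step: adjacency map built from one interval's edges."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency.setdefault(dst, set())  # make sure sink vertices appear too
    return dict(adjacency)


# Made-up sample data: edges arriving at the given times (in seconds).
events = [(0.5, 1, 2), (1.2, 2, 3), (2.1, 1, 3), (3.7, 3, 4)]

# One independent graph per 2-second interval; no state is carried over.
graphs = [build_graph(batch) for batch in micro_batches(events, interval=2.0)]

for index, graph in enumerate(graphs):
    print(index, graph)
```

Combining edges across intervals would mean carrying state from one
batch to the next, which is the more complicated case mentioned above.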

Tobias
