Re: Create DStream consisting of HDFS and (then) Kafka data

rektide Wed, 07 Jan 2015 23:03:07 -0800

On Thu, Jan 08, 2015 at 02:33:30PM +0900, Tobias Pfeiffer wrote:
> Hi,
> 
> On Thu, Jan 8, 2015 at 2:19 PM, <rekt...@voodoowarez.com> wrote:
> 
> > dstream processing bulk HDFS data- is something I don't feel is super
> > well socialized yet, & fingers crossed that base gets built up a little
> > more.
> 
> 
> Just out of interest (and hoping not to hijack my own thread), why are you
> not doing plain RDD processing when you are only processing HDFS data?
> What's the advantage of doing DStream?
> 
> Thanks
> Tobias


Like you, in our old Storm use case we were doing a lot of windowing
functions, &c.

We want a consistent discretization process for all our intake data, whether
it's realtime or not, and we want to use the same discretized stream tech
whether we're discretizing data as it arrives or replaying historical data.
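
To make that concrete, here is a rough sketch in plain Python (not the Spark
API) of what "one discretization process for everything" means. The record
shape, the function names, and the sample data are all made up for
illustration; the only point is that historical and live records go through
the same batching and the same windowing code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (event time in seconds, payload).
Record = Tuple[float, str]


def discretize(records: Iterable[Record], batch_seconds: float) -> Dict[int, List[str]]:
    """Group records into fixed-width batches keyed by batch index."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for event_time, payload in records:
        batches[int(event_time // batch_seconds)].append(payload)
    return dict(batches)


def windowed_counts(batches: Dict[int, List[str]], window_batches: int) -> Dict[int, int]:
    """For each batch index, count records in the window ending at that batch."""
    if not batches:
        return {}
    first, last = min(batches), max(batches)
    return {
        end: sum(len(batches.get(i, [])) for i in range(end - window_batches + 1, end + 1))
        for end in range(first, last + 1)
    }


# Made-up sample data: the first list stands in for records replayed from
# HDFS, the second for records arriving from Kafka afterwards.
historical = [(0.5, "a"), (1.2, "b"), (4.9, "c")]
live = [(5.1, "d"), (9.7, "e"), (10.0, "f")]

# Same discretization and same windowing, regardless of where a record came from.
batches = discretize(historical + live, batch_seconds=5.0)
counts = windowed_counts(batches, window_batches=2)
```

The window ending at batch 1 spans the hand-off from historical to live
records, which is exactly the case where two separate code paths would
disagree.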

Only then is Lambda-beast anywhere near slain.  To the single-system. o7
-rektide

