Hi,
On Thu, Jan 8, 2015 at 2:19 PM, rekt...@voodoowarez.com wrote:
DStream processing of bulk HDFS data is something I don't feel is super
well socialized yet; fingers crossed that base gets built up a little
more.
Just out of interest (and hoping not to hijack my own thread), why are you
Hi,
I have a setup where data from an external stream is piped into Kafka and
also written to HDFS periodically for long-term storage. Now I am trying to
build an application that will first process the HDFS files and then switch
to Kafka, continuing with the first item that was not yet in HDFS.
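To make the handover concrete, here is a minimal sketch in plain Python (no Spark or Kafka involved). It assumes every record carries a monotonically increasing position that is the same in the HDFS files and in Kafka; that field, the `Record` type, and the `replay_then_follow` helper are all hypothetical names made up for illustration, not part of any library.

```python
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    offset: int   # hypothetical position, assumed identical in HDFS and Kafka
    payload: str


def replay_then_follow(archived: Iterable[Record],
                       live: Iterable[Record]) -> Iterator[Record]:
    """Yield all archived records, then only the live records that come after them."""
    last_offset = -1
    # Phase 1: process everything already stored in the archive (the HDFS files).
    for rec in archived:
        last_offset = max(last_offset, rec.offset)
        yield rec
    # Phase 2: switch to the live source (Kafka), skipping what was already seen.
    # Updating last_offset here also drops repeated deliveries from the live source.
    for rec in live:
        if rec.offset > last_offset:
            last_offset = rec.offset
            yield rec


# Toy data: the live source overlaps with the archive on offsets 1 and 2.
archived = [Record(0, "a"), Record(1, "b"), Record(2, "c")]
live = [Record(1, "b"), Record(2, "c"), Record(3, "d"), Record(4, "e")]
result = [r.offset for r in replay_then_follow(archived, live)]
```

With the toy data above, each offset from 0 to 4 is yielded exactly once. If the records have no such shared position, some other handover marker (for example a timestamp) would be needed, and the overlap handling would be less exact.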
I've started one or two emails to ask more broadly: what are good practices
for doing DStream computations in a non-realtime fashion? I'd love to have a
good intro article to pass around to people, and some resources for
others chasing this problem.
Back when I was working with Storm, managing
On Thu, Jan 08, 2015 at 02:33:30PM +0900, Tobias Pfeiffer wrote: