> not being able to read from Kafka using multiple nodes

Kafka is fully capable of this: you cluster multiple consumer instances into a consumer group. If your topic has enough partitions, the consumer group consumes it in parallel, with each partition read by one consumer in the group. If it doesn't, you still get the fault tolerance that comes with clustering the consumers.
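To make the consumer-group point concrete, here is a toy model in plain Python of how partitions get divided among the members of a group. It is only a sketch and not the Kafka client API: the function name assign_partitions and the consumer ids c0, c1, c2 are made up for illustration, and real Kafka performs the assignment itself using its own strategies.

```python
def assign_partitions(partitions, consumers):
    """Toy round-robin split of partition ids across a consumer group.

    Hypothetical helper for illustration only; Kafka does this itself.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        # Each partition goes to exactly one consumer in the group.
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


# Six partitions, three consumers: every consumer reads in parallel.
balanced = assign_partitions(range(6), ["c0", "c1", "c2"])

# One partition, three consumers: only one consumer is active,
# the other two sit idle as standbys.
single = assign_partitions(range(1), ["c0", "c1", "c2"])

# If c0 dies, the partition is handed to a surviving member.
after_failure = assign_partitions(range(1), ["c1", "c2"])

print(balanced)
print(single)
print(after_failure)
```

The second and third cases are the fault-tolerance scenario: with a single partition there is no parallelism, but the remaining consumers can take over the partition when the active one fails.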
OK JRP

On Jun 17, 2015 1:27 AM, "Enno Shioji" <eshi...@gmail.com> wrote:

> We've evaluated Spark Streaming vs. Storm and ended up sticking with Storm.
>
> Some of the important drawbacks are:
> Spark has no back pressure (receiver rate limit can alleviate this to a
> certain point, but it's far from ideal)
> There are also no exactly-once semantics. (updateStateByKey can achieve
> these semantics, but is not practical if you have any significant amount of
> state because it does so by dumping the entire state on every checkpoint)
>
> There are also some minor drawbacks that I'm sure will be fixed quickly,
> like no task timeout, not being able to read from Kafka using multiple
> nodes, data loss hazard with Kafka.
>
> It's also not possible to attain very low latency in Spark, if that's what
> you need.
>
> The plus for Spark is the concise and IMO more intuitive syntax, especially
> if you compare it with Storm's Java API.
>
> I admit I might be a bit biased towards Storm though, as I'm more familiar
> with it.
>
> Also, you can do some processing with Kinesis. If all you need to do is
> straightforward transformation and you are reading from Kinesis to begin
> with, it might be an easier option to just do the transformation in Kinesis.
>
> On Wed, Jun 17, 2015 at 7:15 AM, Sabarish Sasidharan <sabarish.sasidha...@manthan.com> wrote:
>
>> Whatever you write in bolts would be the logic you want to apply on your
>> events. In Spark, that logic would be coded in map() or similar such
>> transformations and/or actions. Spark doesn't enforce a structure for
>> capturing your processing logic like Storm does.
>>
>> Regards
>> Sab
>>
>> Probably overloading the question a bit.
>>
>> In Storm, Bolts have the functionality of getting triggered on events. Is
>> that kind of functionality possible with Spark Streaming? During each phase
>> of the data processing, the transformed data is stored to the database and
>> this transformed data should then be sent to a new pipeline for further
>> processing.
>>
>> How can this be achieved using Spark?
>>
>> On Wed, Jun 17, 2015 at 10:10 AM, Spark Enthusiast <sparkenthusi...@yahoo.in> wrote:
>>
>>> I have a use case where a stream of incoming events has to be
>>> aggregated and joined to create complex events. The aggregation will have
>>> to happen at an interval of 1 minute (or less).
>>>
>>> The pipeline is:
>>>
>>>                       send events            enrich event
>>> Upstream services -------------------> KAFKA ---------> Event Stream
>>> Processor ------------> Complex Event Processor ------------> Elastic
>>> Search.
>>>
>>> From what I understand, Storm will make a very good ESP and Spark
>>> Streaming will make a good CEP.
>>>
>>> But, we are also evaluating Storm with Trident.
>>>
>>> How does Spark Streaming compare with Storm with Trident?
>>>
>>> Sridhar Chellappa
>>>
>>> On Wednesday, 17 June 2015 10:02 AM, ayan guha <guha.a...@gmail.com> wrote:
>>>
>>> I have a similar scenario where we need to bring data from Kinesis to
>>> HBase. Data velocity is 20k per 10 mins. Little manipulation of data will
>>> be required but that's regardless of the tool so we will be writing that
>>> piece in Java POJO.
>>> All env is on AWS. HBase is on a long-running EMR and Kinesis on a
>>> separate cluster.
>>> TIA.
>>> Best
>>> Ayan
>>>
>>> On 17 Jun 2015 12:13, "Will Briggs" <wrbri...@gmail.com> wrote:
>>>
>>> The programming models for the two frameworks are conceptually rather
>>> different; I haven't worked with Storm for quite some time, but based on my
>>> old experience with it, I would equate Spark Streaming more with Storm's
>>> Trident API, rather than with the raw Bolt API. Even then, there are
>>> significant differences, but it's a bit closer.
>>>
>>> If you can share your use case, we might be able to provide better
>>> guidance.
>>>
>>> Regards,
>>> Will
>>>
>>> On June 16, 2015, at 9:46 PM, asoni.le...@gmail.com wrote:
>>>
>>> Hi All,
>>>
>>> I am evaluating Spark vs. Storm (Spark Streaming) and I am not able to
>>> see what the equivalent of a Storm Bolt is in Spark.
>>>
>>> Any help on this will be appreciated.
>>>
>>> Thanks,
>>> Ashish
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org