@David, I am going through the articles you have shared. I will message you
if I need any help. Thanks.
@Ayan, Yes, it looks like I can get everything done with Spark Streaming.
In fact, we already have Storm in the architecture, sanitizing the data and
dumping it into Cassandra. Now, I got some new
The batch approach I had implemented takes about 10 minutes to complete all
the pre-computation tasks for one hour's worth of data. When I went
through my code, I figured out that the most time-consuming tasks are
the ones that read data from Cassandra and the places where I perform
I think you need to make up your mind about Storm vs Spark. Using both in
this context does not make much sense to me.
On 15 Sep 2015 22:54, "David Morales" wrote:
> Hi there,
>
> This is exactly our goal in Stratio Sparkta, a real-time aggregation
> engine fully developed
Hi there,
This is exactly our goal in Stratio Sparkta, a real-time aggregation engine
fully developed with Spark Streaming (and fully open source).
Take a look at:
- the docs: http://docs.stratio.com/modules/sparkta/development/
- the repository: https://github.com/Stratio/sparkta
-
Why did you not stay with the batch approach? To me, the architecture looks
very complex for the simple thing you want to achieve. Why don't you process
the data in Storm, since it is already there?
On Tue, 15 Sep 2015 at 6:20, srungarapu vamsi wrote:
> I am pretty new to spark.