, 2015 at 2:05 PM
To: Adrian Tanase
Subject: Re: Using Spark for portfolio manager app
Hi Adrian,
Thanks, Cassandra seems to be a good candidate too. I will give it a try.
Do you know of any stable connector that helps Spark work with Cassandra, or
should I write one myself?
Regarding my second question, i
Sent: Friday, September 25, 2015 at 10:31 AM
To: ALEX K
Cc: "user@spark.apache.org<mailto:user@spark.apache.org>"
Subject: Re: Using Spark for portfolio manager app
Thanks all for the feedback so far.
I haven't decided which external storage will be used yet.
HBase is cool, but it requires Hadoop in production. I only have 3-4 servers
for the whole thing (I am thinking of a relational database for this; it can
be MariaDB, MemSQL or MySQL), but they are hard to scale.
>> 3. collecting the metrics is a bit hairy in a streaming app - we have
>> experimented with both accumulators and RDDs specific for metrics -
>> chose the RDDs that write to OpenTSDB using foreachRDD
>>
>> -adrian
>>
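Adrian's pattern above - computing per-batch metrics inside foreachRDD and pushing them to OpenTSDB - can be sketched in plain Python. This is a simulation of the mechanic, not Spark code: the `batch` list stands in for an RDD's contents, and the metric name and tags are invented for illustration.

```python
import json
import time

def batch_metrics_payload(batch, metric="app.trades.count", tags=None):
    """Build an OpenTSDB /api/put payload for one micro-batch.

    In Spark Streaming this logic would run inside foreachRDD (or
    foreachPartition on the executors); here `batch` is just a list
    standing in for the RDD's contents.
    """
    tags = tags or {"app": "portfolio"}  # hypothetical tag set
    return json.dumps([{
        "metric": metric,
        "timestamp": int(time.time()),
        "value": len(batch),        # the per-batch metric: record count
        "tags": tags,
    }])

# Each micro-batch would POST this JSON to OpenTSDB's /api/put endpoint.
payload = batch_metrics_payload([{"sym": "AAPL"}, {"sym": "GOOG"}])
```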
___
From: Thúy Hằng Lê
Sent: Sunday, September 20, 2015 7:26 AM
To: Jörn Franke
Cc: user@spark.apache.org
Subject: Re: Using Spark for portfolio manager app
Thanks Adrian and Jörn for the answers.
Yes, you're right, there are a lot of things I need to consider if I want to
use Spark for my app.
Thanks all,
Using external storage seems to be the best solution for now.
Btw, has anyone heard of the following Spark Streaming module from Intel?
https://github.com/Intel-bigdata/spark-streamingsql
It seems to allow us to query a Spark stream on the fly; however, it hasn't
been updated for 9 months,
I think generally the way forward would be to put aggregate statistics into
an external storage (e.g. HBase) - it should not have that much influence on
latency. You will probably need it anyway if you need to store historical
information. Wrt deltas - always a tricky topic. You may want to work
with
Hi Thuy,
You can check RDD.lookup(). It requires that the RDD is partitioned and, of
course, cached in memory. Or you may consider a distributed cache like
Ehcache or AWS ElastiCache.
I think an external storage is an option, too. Especially NoSQL databases:
they can handle updates at high speed, at c
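The requirement that the RDD be partitioned is what makes lookup() cheap: Spark only scans the one partition the key hashes to instead of all of them. A pure-Python sketch of that mechanic (not Spark code - Spark's HashPartitioner uses the key's hashCode, emulated here with Python's hash; keys and values are made up):

```python
def partition_of(key, num_partitions):
    # Stand-in for Spark's HashPartitioner: hash the key, mod the count.
    return hash(key) % num_partitions

def build_partitions(pairs, num_partitions=4):
    # Shuffle each (key, value) pair into its hash-chosen partition.
    parts = [[] for _ in range(num_partitions)]
    for k, v in pairs:
        parts[partition_of(k, num_partitions)].append((k, v))
    return parts

def lookup(parts, key):
    # Scan only the single partition the key hashes to.
    target = parts[partition_of(key, len(parts))]
    return [v for k, v in target if k == key]

parts = build_partitions([("u1", 100), ("u2", 250), ("u1", 40)])
lookup(parts, "u1")  # [100, 40]
```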
I still have a few concerns/questions about your information:
1/ I need to combine the trading stream with the tick stream; I am planning
to use Kafka for that.
If I am usi
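For combining the two streams, Spark Streaming's cogroup on key-value DStreams groups both sides' records by key within each batch. A minimal pure-Python sketch of those semantics (not Spark code; the symbols, quantities and prices are invented):

```python
from collections import defaultdict

def cogroup(trades, ticks):
    """Group one batch of each stream by symbol, like Spark's cogroup
    on two key-value streams: each key maps to (left values, right values)."""
    grouped = defaultdict(lambda: ([], []))
    for symbol, qty in trades:
        grouped[symbol][0].append(qty)    # left side: trade quantities
    for symbol, price in ticks:
        grouped[symbol][1].append(price)  # right side: tick prices
    return dict(grouped)

trades = [("AAPL", 10), ("AAPL", -3), ("GOOG", 2)]
ticks = [("AAPL", 110.0), ("MSFT", 41.5)]
result = cogroup(trades, ticks)
# result["AAPL"] == ([10, -3], [110.0])
```

A key with records on only one side still appears, with an empty list for the other side, which is how cogroup differs from an inner join.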
If you want to be able to let your users query their portfolios, then you
may want to think about storing the current state of the portfolios in
HBase/Phoenix; alternatively, a cluster of relational databases can make
sense. For the rest you may use Spark.
On Sat, Sep 19, 2015 at 4:43 AM, Thúy Hằng Lê
Cool use case! You should definitely be able to model it with Spark.
For the first question it's pretty easy - you probably need to keep the user
portfolios as state using updateStateByKey.
You need to consume 2 event sources - user trades and stock changes. You
probably want to cogroup the stoc
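The updateStateByKey suggestion can be sketched in plain Python: per micro-batch, Spark calls an update function with a key's new values and its previous state and keeps the returned value as the new state. This simulates those semantics only (it is not Spark code), and the users, symbols and quantities are invented:

```python
from collections import defaultdict

def update_position(new_trades, old_position):
    """updateStateByKey-style update function: fold this batch's trades
    for one user into the running position (shares held per symbol)."""
    position = dict(old_position or {})
    for symbol, qty in new_trades:
        position[symbol] = position.get(symbol, 0) + qty
    return position

def run_batch(state, trades):
    """Group the batch by user key, then apply the update function per
    key, mimicking what Spark does on each micro-batch."""
    by_user = defaultdict(list)
    for user, symbol, qty in trades:
        by_user[user].append((symbol, qty))
    for user, new_trades in by_user.items():
        state[user] = update_position(new_trades, state.get(user))
    return state

state = {}
state = run_batch(state, [("alice", "AAPL", 10), ("bob", "MSFT", 5)])
state = run_batch(state, [("alice", "AAPL", -4), ("alice", "GOOG", 2)])
# state["alice"] -> {"AAPL": 6, "GOOG": 2}
```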