I use Spark with Cassandra, and you don't need DSE.

I see a lot of people ask this same question below (how do I get a lot of data
out of Cassandra?), and my question is always: why aren't you updating both
places at once?

For example, we use Hadoop and Cassandra in conjunction with each other. We use
a message bus to store every event in both and aggregate in both, but keep only
current data in Cassandra (Cassandra makes a very poor data warehouse or
long-term time series store). Services then process queries by merging data
from Hadoop and Cassandra.
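
To make the idea concrete, here is a rough sketch of the pattern, not our actual
code. Everything in it is a made-up stand-in: Store is an in-memory dict playing
the role of either Cassandra (hot) or Hadoop (cold), Bus is a toy message bus,
and Event, query_total and hot_from_day are hypothetical names. It only shows
the shape: every event goes to both stores, both aggregate, the hot store drops
old data, and a query service merges the two.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str    # e.g. a sensor or account id (hypothetical)
    day: int    # day number, standing in for a real timestamp
    value: int


class Store:
    """In-memory stand-in for a store that aggregates events per (key, day)."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, event):
        # Aggregate on write: keep a running total per key and day.
        self.totals[(event.key, event.day)] += event.value

    def expire_before(self, day):
        # Drop aggregates older than `day` (what the hot store does).
        for k in [k for k in self.totals if k[1] < day]:
            del self.totals[k]


class Bus:
    """Toy message bus: every published event reaches every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, event):
        for handler in self.subscribers:
            handler(event)


def query_total(key, first_day, last_day, hot, cold, hot_from_day):
    """Sum daily totals, reading recent days from hot and older days from cold."""
    total = 0
    for day in range(first_day, last_day + 1):
        store = hot if day >= hot_from_day else cold
        total += store.totals.get((key, day), 0)
    return total


hot, cold = Store(), Store()      # hot ~ Cassandra, cold ~ Hadoop
bus = Bus()
bus.subscribe(hot.apply)          # both stores see every event
bus.subscribe(cold.apply)

for day, value in [(1, 10), (2, 20), (3, 30)]:
    bus.publish(Event("sensor-a", day, value))

hot.expire_before(3)              # keep only current data in the hot store
result = query_total("sensor-a", 1, 3, hot, cold, hot_from_day=3)
print(result)
```

The point of the split is that nothing ever has to be bulk-exported from the hot
store: the cold store already received the same events when they were published.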

Also, Spark on HDFS gives more flexibility in terms of large datasets and
performance. The way Cassandra distributes data, compared with how data is
partitioned on Hadoop, makes Spark on HDFS actually faster than Spark on
Cassandra.



--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

> On Feb 11, 2015, at 4:49 AM, Jens Rantil <jens.ran...@tink.se> wrote:
> 
> 
>> On Wed, Feb 11, 2015 at 11:40 AM, Marcelo Valle (BLOOMBERG/ LONDON) 
>> <mvallemil...@bloomberg.net> wrote:
>> If you use Cassandra enterprise, you can use hive, AFAIK.
> 
> Even better, you can use Spark/Shark with DSE.
> 
> Cheers,
> Jens
> 
> 
> -- 
> Jens Rantil
> Backend engineer
> Tink AB
> 
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
> 
