Thanks everyone for your responses. I'll definitely think carefully about
the data models, querying patterns and fragmentation side-effects.
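
As a first pass at the data-model side, here is a minimal sketch in plain
Python (no Cassandra driver involved) of the append-only layout suggested in
this thread: readings are bucketed by sensor and day, so writes mostly add
new keys rather than update existing ones. The table name, column names and
helper functions are all hypothetical, and the day-sized bucket is an
assumption, not a recommendation.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical CQL schema (table and column names are illustrative only).
# Partition key: (sensor_id, day); clustering column: ts.
SCHEMA = """
CREATE TABLE readings (
    sensor_id text,
    day       text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((sensor_id, day), ts)
)
"""


def partition_key(sensor_id, ts):
    """Bucket a reading by sensor and UTC calendar day."""
    return (sensor_id, ts.astimezone(timezone.utc).strftime("%Y-%m-%d"))


def append_readings(readings):
    """Group (sensor_id, ts, value) tuples the way the schema above would.

    Each reading lands under its own (partition, ts) key. As long as
    timestamps are unique per sensor, every write adds a new key and no
    existing key is changed.
    """
    partitions = defaultdict(dict)
    for sensor_id, ts, value in readings:
        partitions[partition_key(sensor_id, ts)][ts] = value
    return partitions


def day_report(partitions, sensor_id, day):
    """Read one partition in timestamp order, as a per-day report would."""
    rows = partitions.get((sensor_id, day), {})
    return sorted(rows.items())


if __name__ == "__main__":
    utc = timezone.utc
    stored = append_readings([
        ("s1", datetime(2015, 2, 11, 9, 0, tzinfo=utc), 1.0),
        ("s1", datetime(2015, 2, 11, 8, 0, tzinfo=utc), 2.0),
    ])
    for ts, value in day_report(stored, "s1", "2015-02-11"):
        print(ts.isoformat(), value)
```

The point of the sketch is only that a report for one sensor and one day
touches a single partition, which is the kind of query pattern I need to
check against our reporting requirements.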

Cheers, Mike.

On Wed, Feb 11, 2015 at 1:14 AM, Franc Carter <franc.car...@rozettatech.com>
wrote:

>
> I forgot to mention that if you do decide to use Cassandra, I'd highly
> recommend joining the Cassandra mailing list. If we had taken in some of
> the advice on that list, things would have been considerably smoother.
>
> cheers
>
> On Wed, Feb 11, 2015 at 8:12 PM, Christian Betz
> <christian.b...@performance-media.de> wrote:
>
>>   Hi
>>
>>  Regarding the Cassandra data model, there's an excellent post on the
>> eBay tech blog:
>> http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/.
>> There's also a slideshare for this somewhere.
>>
>>  Happy hacking
>>
>>  Chris
>>
>>   From: Franc Carter <franc.car...@rozettatech.com>
>> Date: Wednesday, 11 February 2015 10:03
>> To: Paolo Platter <paolo.plat...@agilelab.it>
>> Cc: Mike Trienis <mike.trie...@orcsol.com>, "user@spark.apache.org"
>> <user@spark.apache.org>
>> Subject: Re: Datastore HDFS vs Cassandra
>>
>>
>> One additional comment I would make is that you should be careful with
>> updates in Cassandra. It does support them, but large numbers of updates
>> (i.e. changing existing keys) tend to cause fragmentation. If you are
>> (mostly) adding new keys (e.g. new records in the time series), then
>> Cassandra can be excellent.
>>
>>  cheers
>>
>>
>> On Wed, Feb 11, 2015 at 6:13 PM, Paolo Platter
>> <paolo.plat...@agilelab.it> wrote:
>>
>>>   Hi Mike,
>>>
>>> I developed a solution with Cassandra and Spark, using DSE.
>>> The main difficulty is Cassandra itself: you need to understand its
>>> data model and its query patterns very well.
>>> Cassandra has better performance than HDFS, and it has DR and stronger
>>> availability.
>>> HDFS is a filesystem; Cassandra is a DBMS.
>>> Cassandra supports full CRUD, but without ACID.
>>> HDFS is more flexible than Cassandra.
>>>
>>> In my opinion, if you have a real time series, go with Cassandra, paying
>>> attention to your reporting data access patterns.
>>>
>>> Paolo
>>>
>>> Sent from my Windows Phone
>>>  ------------------------------
>>> From: Mike Trienis <mike.trie...@orcsol.com>
>>> Sent: 11/02/2015 05:59
>>> To: user@spark.apache.org
>>> Subject: Datastore HDFS vs Cassandra
>>>
>>>   Hi,
>>>
>>> I am considering implementing Apache Spark on top of a Cassandra database
>>> after listening to a related talk and reading through the slides from
>>> DataStax. It seems to fit well with our time-series data and reporting
>>> requirements.
>>>
>>>
>>> http://www.slideshare.net/patrickmcfadin/apache-cassandra-apache-spark-for-time-series-data
>>>
>>> Does anyone have experience using Apache Spark with Cassandra, including
>>> limitations and/or technical difficulties? How does Cassandra compare
>>> with HDFS, and what use cases would make HDFS more suitable?
>>>
>>> Thanks, Mike.
>>>
>>
>>
>>  --
>>
>> *Franc Carter* | Systems Architect | Rozetta Technology
>>
>> franc.car...@rozettatech.com  <franc.car...@rozettatech.com>|
>> www.rozettatechnology.com
>>
>> Tel: +61 2 8355 2515
>>
>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>
>> PO Box H58, Australia Square, Sydney NSW 1215
>>
>> AUSTRALIA
>>
>>
>
>
