Thanks Jonathan, I believe we have to reconsider the way analytics have to be performed.

On Fri, Jan 4, 2019 at 1:46 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> If you absolutely have to use Cassandra as the source of your data, I
> agree with Dor.
>
> That being said, if you're going to be doing a lot of analytics, I
> recommend using something other than Cassandra with Spark. The performance
> isn't particularly wonderful and you'll likely get anywhere from 10-50x
> improvement from putting the data in an analytics friendly format (parquet)
> and on a block / blob store (DFS or S3) instead.
>
> On Fri, Jan 4, 2019 at 1:43 PM Goutham reddy <goutham.chiru...@gmail.com>
> wrote:
>
>> Thank you very much Dor for the detailed information, yes that should be
>> the primary reason why we have to isolate from Cassandra.
>>
>> Thanks and Regards,
>> Goutham Reddy
>>
>> On Fri, Jan 4, 2019 at 1:29 PM Dor Laor <d...@scylladb.com> wrote:
>>
>>> I strongly recommend option B, separate clusters. Reasons:
>>> - Networking of node-node is negligible compared to networking within
>>>   the node
>>> - Different scaling considerations
>>>   Your workload may require 10 Spark nodes and 20 database nodes, so
>>>   why bundle them?
>>>   This ratio may also change over time as your application evolves and
>>>   amount of data changes.
>>> - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't
>>>   want it to affect Cassandra and the opposite.
>>>   If you isolate it with cgroups, you may have too much idle time when
>>>   the above doesn't happen.
>>>
>>> On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy
>>> <goutham.chiru...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> We have requirement of heavy data lifting and analytics requirement and
>>>> decided to go with Apache Spark. In the process we have come up with two
>>>> patterns
>>>> a. Apache Spark and Apache Cassandra co-located and shared on same
>>>>    nodes.
>>>> b. Apache Spark on one independent cluster and Apache Cassandra as one
>>>>    independent cluster.
>>>>
>>>> Need good pattern how to use the analytic engine for Cassandra. Thanks
>>>> in advance.
>>>>
>>>> Regards
>>>> Goutham.
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade

--
Regards
Goutham Reddy
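
For anyone reading this thread later: Jon's suggestion of putting the data in Parquet on a blob store and running analytics against that copy might look roughly like the PySpark sketch below. It is only an illustration, not something anyone in the thread posted. It assumes the DataStax Spark Cassandra Connector is on the Spark classpath, and the helper names (`cassandra_read_options`, `parquet_path`, `export_table`) are made up for this example.

```python
"""Sketch: copy one Cassandra table to Parquet so analytics jobs read the copy."""


def cassandra_read_options(keyspace: str, table: str) -> dict:
    """Options passed to the Spark Cassandra Connector's DataFrame source."""
    return {"keyspace": keyspace, "table": table}


def parquet_path(base_uri: str, keyspace: str, table: str) -> str:
    """Build the output location as <base_uri>/<keyspace>/<table>."""
    return "/".join([base_uri.rstrip("/"), keyspace, table])


def export_table(cassandra_host: str, keyspace: str, table: str, base_uri: str) -> str:
    """Read a whole Cassandra table through Spark and write it out as Parquet."""
    # Imported here so the two helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("cassandra-to-parquet")
        # Spark runs on its own cluster and reaches Cassandra over the network.
        .config("spark.cassandra.connection.host", cassandra_host)
        .getOrCreate()
    )
    df = (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**cassandra_read_options(keyspace, table))
        .load()
    )
    target = parquet_path(base_uri, keyspace, table)
    # Overwrite replaces any previous export at the same location.
    df.write.mode("overwrite").parquet(target)
    return target
```

A call such as `export_table("cassandra-1.example.internal", "shop", "orders", "s3a://analytics-bucket/exports")` would use placeholder host, keyspace, table, and bucket names. Later analytics jobs would then read with `spark.read.parquet(...)` from that path and leave the Cassandra cluster alone, which also fits Dor's point about keeping the two workloads isolated.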