I'm confused about the difference between deleting with prepared statements and deleting through Spark. To the best of my knowledge it's the same thing either way - a normal deletion, with tombstones replicated. Or is the point that you're doing the deletes in the analytics DC instead of your real-time one?
On Fri, Mar 23, 2018 at 11:38 AM Charulata Sharma (charshar) <chars...@cisco.com> wrote:

> Hi Rahul,
>
> Thanks for your answer. Why do you say that deleting from Spark is not
> elegant? This is exactly the feedback I want. Basically, why is it not
> elegant?
>
> I can delete either using prepared DELETE statements or through Spark. The
> TTL approach doesn't work for us because, first of all, TTL applies at the
> column level, and there are business rules for purging that make the TTL
> solution not very clean in our case.
>
> Thanks,
> Charu
>
> *From:* Rahul Singh <rahul.xavier.si...@gmail.com>
> *Reply-To:* "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date:* Thursday, March 22, 2018 at 5:08 PM
> *To:* "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject:* Re: Using Spark to delete from Transactional Cluster
>
> Short answer: it works. You can even run "delete" statements from within
> Spark once you know which keys to delete. Not elegant, but it works.
>
> It will create a bunch of tombstones, and you may need to spread your
> deletes over days. Another thing to consider: instead of deleting, set a
> TTL, which will eventually get cleaned up.
>
> --
> Rahul Singh
> rahul.si...@anant.us
> Anant Corporation
>
> On Mar 22, 2018, 2:19 PM -0500, Charulata Sharma (charshar)
> <chars...@cisco.com>, wrote:
>
> Hi,
>
> Wanted to know the community's experiences and feedback on using Apache
> Spark to delete data from a C* transactional cluster.
>
> We have Spark installed in our analytical C* cluster, and so far we have
> been using Spark only for analytics purposes.
> However, now with the advanced features of Spark 2.0, I am considering
> using the spark-cassandra-connector for deletes instead of a series of
> prepared DELETE statements.
>
> So essentially the deletes will happen on the analytical cluster, and they
> will be replicated over to the transactional cluster by means of our
> keyspace replication strategies.
>
> Are there any risks involved in this?
>
> Thanks,
> Charu
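[Editor's note] The plan above relies on keyspace replication carrying the deletes from the analytics DC to the transactional DC. A minimal sketch of that setup, with hypothetical keyspace and datacenter names:

```sql
-- Hypothetical keyspace replicated across both datacenters.
CREATE KEYSPACE my_keyspace
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc_transactional': 3,  -- real-time DC
    'dc_analytics': 2       -- Spark/analytics DC
  };
```

With this in place, a DELETE executed against any node in the analytics DC writes tombstones that replicate to the replicas in both DCs, the same as a delete issued in the transactional DC; the usual caveats about repairing within gc_grace_seconds still apply.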
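[Editor's note] A minimal sketch of the "series of prepared DELETE statements, spread over days" approach discussed above. The keyspace/table name, the `id` key column, and the per-day batch size are all hypothetical; the actual driver calls are isolated in `purge_day` (standard `cassandra-driver` `prepare`/`execute`), since they need a live cluster.

```python
# Sketch only: purge keys with a prepared DELETE, throttled into daily
# batches to limit tombstone pressure. Table and column names are made up.

DELETE_CQL = "DELETE FROM my_keyspace.my_table WHERE id = ?"  # hypothetical table


def daily_batches(partition_keys, per_day):
    """Split the keys to purge into fixed-size chunks, one chunk per day."""
    return [partition_keys[i:i + per_day]
            for i in range(0, len(partition_keys), per_day)]


def purge_day(session, keys):
    """Run one day's worth of deletes through a prepared statement."""
    prepared = session.prepare(DELETE_CQL)  # cassandra-driver Session API
    for key in keys:
        session.execute(prepared, (key,))


if __name__ == "__main__":
    keys = list(range(10))
    batches = daily_batches(keys, per_day=4)
    print(len(batches), batches[0])  # -> 3 [0, 1, 2, 3]
```

Each day's run would call `purge_day(session, batch)` for one batch, leaving time for compaction to work through the tombstones between runs.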