I'm confused about the difference between deleting with prepared statements and deleting through Spark. To the best of my knowledge it's the same thing either way - a normal deletion, with tombstones replicated. Or is the point that you're doing the deletes in the analytics DC instead of your real-time one?
On Fri, Mar 23, 2018 at 11:38 AM Charulata Sharma (charshar) <chars...@cisco.com> wrote:

> Hi Rahul,
>
> Thanks for your answer. Why do you say that deleting from Spark is not
> elegant? This is exactly the feedback I want. Basically, why is it not
> elegant?
>
> I can delete either using prepared DELETE statements or through Spark. The
> TTL approach doesn't work for us because, first of all, TTL applies at the
> column level, and there are business rules for purging that make the TTL
> solution not very clean in our case.
>
> Thanks,
> Charu
>
> *From:* Rahul Singh <rahul.xavier.si...@gmail.com>
> *Reply-To:* "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date:* Thursday, March 22, 2018 at 5:08 PM
> *To:* "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject:* Re: Using Spark to delete from Transactional Cluster
>
> Short answer: it works. You can even run "delete" statements from within
> Spark once you know which keys to delete. Not elegant, but it works.
>
> It will create a bunch of tombstones, and you may need to spread your
> deletes over days. Another thing to consider: instead of deleting, set a
> TTL, which will eventually get cleaned up.
>
> --
> Rahul Singh
> rahul.si...@anant.us
> Anant Corporation
>
> On Mar 22, 2018, 2:19 PM -0500, Charulata Sharma (charshar)
> <chars...@cisco.com>, wrote:
>
> Hi,
>
> Wanted to know the community's experiences and feedback on using Apache
> Spark to delete data from a C* transactional cluster.
>
> We have Spark installed in our analytical C* cluster, and so far we have
> been using Spark only for analytics purposes.
> However, now with the advanced features of Spark 2.0, I am considering
> using the spark-cassandra-connector for deletes instead of a series of
> prepared DELETE statements.
>
> So essentially the deletes will happen on the analytical cluster, and they
> will be replicated over to the transactional cluster by means of our
> keyspace replication strategies.
>
> Are there any risks involved in this?
>
> Thanks,
> Charu
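[Editor's note] The plan above relies on keyspace replication carrying the deletes from the analytics DC to the transactional DC. A minimal sketch of that setup, with hypothetical keyspace and datacenter names:

```sql
-- Hypothetical keyspace replicated across both datacenters.
CREATE KEYSPACE my_keyspace
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc_transactional': 3,  -- real-time DC
    'dc_analytics': 2       -- Spark/analytics DC
  };
```

With this in place, a DELETE executed against any node in the analytics DC writes tombstones that replicate to the replicas in both DCs, the same as a delete issued in the transactional DC; the usual caveats about repairing within gc_grace_seconds still apply.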
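[Editor's note] A minimal sketch of the "series of prepared DELETE statements, spread over days" approach discussed above. The keyspace/table name, the `id` key column, and the per-day batch size are all hypothetical; the actual driver calls are isolated in `purge_day` (standard `cassandra-driver` `prepare`/`execute`), since they need a live cluster.

```python
# Sketch only: purge keys with a prepared DELETE, throttled into daily
# batches to limit tombstone pressure. Table and column names are made up.

DELETE_CQL = "DELETE FROM my_keyspace.my_table WHERE id = ?"  # hypothetical table


def daily_batches(partition_keys, per_day):
    """Split the keys to purge into fixed-size chunks, one chunk per day."""
    return [partition_keys[i:i + per_day]
            for i in range(0, len(partition_keys), per_day)]


def purge_day(session, keys):
    """Run one day's worth of deletes through a prepared statement."""
    prepared = session.prepare(DELETE_CQL)  # cassandra-driver Session API
    for key in keys:
        session.execute(prepared, (key,))


if __name__ == "__main__":
    keys = list(range(10))
    batches = daily_batches(keys, per_day=4)
    print(len(batches), batches[0])  # -> 3 [0, 1, 2, 3]
```

Each day's run would call `purge_day(session, batch)` for one batch, leaving time for compaction to work through the tombstones between runs.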