Re: clearing tombstones?

2014-05-16 Thread Ruchir Jha
I tried to do this, however the doubling in disk space is not "temporary" as you state in your note. What am I missing? On Fri, Apr 11, 2014 at 10:44 AM, William Oberman wrote: > So, if I was impatient and just "wanted to make this happen now", I could: > > 1.) Change GCGraceSeconds of the CF to

Re: clearing tombstones?

2014-05-11 Thread William Oberman
Not an expert, just a user of cassandra. For me, "before" was a cf with a set of files (I forget the official naming system, so I'll make up my own): A0 A1 ... AN "During": A0 A1 ... AN B0 Where B0 is the union of Ai. Due to tombstones, mutations, etc. B0 is "at most" 2x, but also probably close
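The "before/during" picture above can be turned into a rough estimate. This is a minimal sketch, not a Cassandra API: the function name and the `live_fraction` parameter are made up for illustration, and it assumes the compacted output B0 is simply some fraction of the combined input size.

```python
def peak_disk_usage_gb(input_sizes_gb, live_fraction):
    """Estimate disk used while a major compaction is still running.

    input_sizes_gb: sizes of the existing SSTables A0..AN, in GB.
    live_fraction:  assumed share of the input that survives into B0
                    (1.0 = nothing purged, the worst case of 2x).
    """
    if not 0.0 <= live_fraction <= 1.0:
        raise ValueError("live_fraction must be between 0 and 1")
    before = sum(input_sizes_gb)          # A0 + A1 + ... + AN
    output = before * live_fraction       # B0, written alongside the inputs
    return before + output                # inputs are only deleted afterwards


# Worst case: B0 is as large as all inputs together, so usage doubles.
worst_case = peak_disk_usage_gb([100, 200, 150], 1.0)
# Half of the data is purged: the peak is 1.5x the starting size.
half_purged = peak_disk_usage_gb([100, 200, 150], 0.5)
```

The peak only drops back once the compaction finishes and the input files are removed, so free space has to be checked against the peak, not the final size.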

Re: clearing tombstones?

2014-04-14 Thread William Oberman
I'm still somewhat in the middle of the process, but it's far enough along to report back. 1.) I changed GCGraceSeconds of the CF to 0 using cassandra-cli 2.) I ran nodetool compact on a single node of the nine (I'll call it "1"). It took 5-7 hours, and reduced the CF from ~450 to ~75GB (*). 3.)

Re: clearing tombstones?

2014-04-11 Thread Laing, Michael
I've never noticed that setting tombstone_threshold has any effect... at least in 2.0.6. What gets written to the log? On Fri, Apr 11, 2014 at 3:31 PM, DuyHai Doan wrote: > I was wondering, to remove the tombstones from Sstables created by LCS, > why don't we just set the tombstone_thresh

Re: clearing tombstones?

2014-04-11 Thread DuyHai Doan
I was wondering, to remove the tombstones from Sstables created by LCS, why don't we just set the tombstone_threshold table property to a very small value (say 0.01)..? As the doc said ( www.datastax.com/documentation/cql/3.0/cql/cql_reference/compactSubprop.html) this will force compaction on the
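As a rough illustration of what the tombstone_threshold property is meant to control, the check can be sketched as a ratio test. This is a simplified sketch, not Cassandra's code: the function name and arguments are hypothetical, 0.2 is the documented default, and the real implementation applies further conditions before it compacts an SSTable on its own.

```python
def exceeds_tombstone_threshold(droppable_tombstones, total_columns,
                                tombstone_threshold=0.2):
    """Return True if an SSTable's share of droppable tombstones is above
    the threshold, which makes it a candidate for a single-SSTable
    compaction. Counts are per SSTable; the names are illustrative only."""
    if total_columns <= 0:
        return False
    ratio = droppable_tombstones / total_columns
    return ratio > tombstone_threshold


# An SSTable with 5% droppable tombstones is ignored at the default 0.2 ...
ignored_by_default = exceeds_tombstone_threshold(5, 100)
# ... but qualifies once the threshold is lowered to 0.01, as proposed above.
qualifies_at_low_threshold = exceeds_tombstone_threshold(5, 100, 0.01)
```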

Re: clearing tombstones?

2014-04-11 Thread Robert Coli
On Fri, Apr 11, 2014 at 10:33 AM, Paulo Ricardo Motta Gomes < paulo.mo...@chaordicsystems.com> wrote: > My question is: Is there a way to force tombstones to be cleared with LCS? > Does scrub help in any case? > 1) Switch to size tiered compaction, compact, and switch back. Not only "with LCS", b

Re: clearing tombstones?

2014-04-11 Thread Robert Coli
(probably should have read downthread before writing my reply.. briefly, +1 most of the thread's commentary regarding major compaction, but don't listen to the FUD about major compaction, unless you have a really large amount of data you'll probably be fine..) On Fri, Apr 11, 2014 at 7:05 AM, Will

Re: clearing tombstones?

2014-04-11 Thread Laing, Michael
At the cost of really quite a lot of compaction, you can temporarily switch to SizeTiered, and when that is completely done (check each node), switch back to Leveled. it's like doing the laundry twice :) I've done this on CFs that were about 5GB but I don't see why it wouldn't work on larger ones
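The switch-and-switch-back approach can be written down as two schema changes. This is a minimal sketch that only builds the statements; it assumes the table is managed through CQL 3, and the keyspace and table names passed in are placeholders. Waiting for compaction to finish on every node between the two statements is left to the operator.

```python
def compaction_switch_statements(keyspace, table):
    """Build the CQL statements for temporarily moving a table from
    Leveled to SizeTiered compaction and back again.

    Nothing is executed here; the caller runs the first statement, waits
    until compaction has completed on each node, then runs the second."""
    target = f"{keyspace}.{table}"

    def alter(strategy):
        return f"ALTER TABLE {target} WITH compaction = {{'class': '{strategy}'}};"

    return [
        alter("SizeTieredCompactionStrategy"),  # step 1: switch away from LCS
        alter("LeveledCompactionStrategy"),     # step 2: switch back when done
    ]


# "ks" and "events" are hypothetical names used only for illustration.
for statement in compaction_switch_statements("ks", "events"):
    print(statement)
```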

Re: clearing tombstones?

2014-04-11 Thread Paulo Ricardo Motta Gomes
This thread is really informative, thanks for the good feedback. My question is: Is there a way to force tombstones to be cleared with LCS? Does scrub help in any case? Or would the only solution be to create a new CF and migrate all the data if you intend to do a large CF cleanup? Cheers, On F

Re: clearing tombstones?

2014-04-11 Thread Mark Reddy
That's great, Will. If you could update the thread with the actions you decide to take and the results, that would be great. Mark On Fri, Apr 11, 2014 at 5:53 PM, William Oberman wrote: > I've learned a *lot* from this thread. My thanks to all of the > contributors! > > Paulo: Good luck with LCS

Re: clearing tombstones?

2014-04-11 Thread William Oberman
I've learned a *lot* from this thread. My thanks to all of the contributors! Paulo: Good luck with LCS. I wish I could help there, but all of my CFs are SizeTiered (mostly as I'm on the same schema/same settings since 0.7...) will On Fri, Apr 11, 2014 at 12:14 PM, Mina Naguib wrote: > > Lev

Re: clearing tombstones?

2014-04-11 Thread Mina Naguib
Levelled Compaction is a wholly different beast when it comes to tombstones. The tombstones are inserted, like any other write really, at the lower levels in the leveldb hierarchy. They are only removed after they have had the chance to "naturally" migrate upwards in the leveldb hierarchy to t

Re: clearing tombstones?

2014-04-11 Thread Mark Reddy
To clarify, you would want to manage compactions only if you were concerned about read latency. If you update rows, those rows may become spread across an increasing number of SSTables leading to increased read latency. Thanks for providing some insight into your use case as it does differ from th

Re: clearing tombstones?

2014-04-11 Thread Laing, Michael
I have played with this quite a bit and recommend you set gc_grace_seconds to 0 and use 'nodetool compact [keyspace] [cfname]' on your table. A caveat I have is that we use C* 2.0.6 - but the space we expect to recover is in fact recovered. Actually, since we never delete explicitly (just ttl) we

Re: clearing tombstones?

2014-04-11 Thread William Oberman
Yes, I'm using SizeTiered. I totally understand the "mess up the heuristics" issue. But, I don't understand "You will incur the operational overhead of having to manage compactions if you wish to compact these smaller SSTables". My understanding is the small tables will still compact. The probl

Re: clearing tombstones?

2014-04-11 Thread Paulo Ricardo Motta Gomes
I have a similar problem here: I deleted about 30% of a very large CF using LCS (about 80GB per node), but my data still hasn't shrunk, even though I used 1 day for gc_grace_seconds. Would nodetool scrub help? Does nodetool scrub force a minor compaction? Cheers, Paulo On Fri, Apr 11, 2014 at 12

Re: clearing tombstones?

2014-04-11 Thread William Oberman
Answered my own question. Good writeup here of the pros/cons of compact: http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_about_config_compact_c.html And I was thinking of bad information that used to float in this forum about major compactions (with respect to the imp

Re: clearing tombstones?

2014-04-11 Thread Mark Reddy
Yes, running nodetool compact (major compaction) creates one large SSTable. This will mess up the heuristics of the SizeTiered strategy (is this the compaction strategy you are using?) leading to multiple 'small' SSTables alongside the single large SSTable, which results in increased read latency.
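Why one very large SSTable upsets the size-tiered heuristics can be seen in a toy version of the bucketing step. This is a simplified sketch rather than Cassandra's implementation: the function names are invented, and the 0.5/1.5 size bounds and the minimum of 4 SSTables per compaction are assumed to match the strategy's usual defaults.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of 'similar' size.

    An SSTable joins the first bucket whose average size it falls within
    (bucket_low * avg, bucket_high * avg); otherwise it starts a new bucket."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if bucket_low * average < size < bucket_high * average:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Only buckets holding at least min_threshold SSTables get compacted."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]


# Sizes in MB: one huge SSTable left by a major compaction plus new small ones.
sizes_mb = [75000, 10, 11, 9, 12]
buckets = size_tiered_buckets(sizes_mb)
candidates = compaction_candidates(sizes_mb)
```

In this example the small SSTables still compact with each other, but the large one sits alone in its bucket and is not picked up again until similarly sized peers exist.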

Re: clearing tombstones?

2014-04-11 Thread William Oberman
So, if I was impatient and just "wanted to make this happen now", I could: 1.) Change GCGraceSeconds of the CF to 0 2.) run nodetool compact (*) 3.) Change GCGraceSeconds of the CF back to 10 days Since I have ~900M tombstones, even if I miss a few due to impatience, I don't care *that* much as I
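The three steps above can be laid out as an ordered plan. This is a sketch under stated assumptions: it only builds the commands and runs nothing, it uses CQL `ALTER TABLE` where the thread used cassandra-cli, and the keyspace and table names are placeholders. The `nodetool compact [keyspace] [cfname]` form is the one quoted elsewhere in this thread.

```python
TEN_DAYS_IN_SECONDS = 10 * 24 * 60 * 60  # 864000, the default gc_grace_seconds


def tombstone_purge_plan(keyspace, table, original_gc_grace_seconds):
    """Return the ordered steps for a forced tombstone purge as
    (where_to_run, command) pairs. Nothing is executed."""
    target = f"{keyspace}.{table}"
    return [
        # 1. Stop protecting tombstones so compaction is allowed to drop them.
        ("cql", f"ALTER TABLE {target} WITH gc_grace_seconds = 0;"),
        # 2. Major compaction, to be run on each node in turn.
        ("shell", f"nodetool compact {keyspace} {table}"),
        # 3. Restore the original grace period afterwards.
        ("cql", f"ALTER TABLE {target} WITH gc_grace_seconds = {original_gc_grace_seconds};"),
    ]


# "ks" and "sessions" are hypothetical names used only for illustration.
for where, command in tombstone_purge_plan("ks", "sessions", TEN_DAYS_IN_SECONDS):
    print(f"[{where}] {command}")
```

Keeping the original value as an explicit argument makes it harder to forget step 3, which matters because a grace period of 0 leaves deletes unprotected on a multi-node cluster.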

Re: clearing tombstones?

2014-04-11 Thread tommaso barbugli
In my experience, even after the gc_grace period tombstones remain stored on disk (at least using Cassandra 2.0.5); only a full compaction clears them. Perhaps that is because my application never reads tombstones? 2014-04-11 16:31 GMT+02:00 Mark Reddy : > Correct, a tombstone will only be remo

Re: clearing tombstones?

2014-04-11 Thread Mark Reddy
Correct, a tombstone will only be removed after gc_grace period has elapsed. The default value is set to 10 days which allows a great deal of time for consistency to be achieved prior to deletion. If you are operationally confident that you can achieve consistency via anti-entropy repairs within a
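The timing rule described above can be sketched as a simple comparison. This is an illustration, not Cassandra's code: the function name is hypothetical, and it covers only the time condition. A tombstone that passes this test is still only dropped when a compaction actually rewrites the SSTable holding it.

```python
from datetime import datetime, timedelta


def tombstone_past_gc_grace(deleted_at, now, gc_grace_seconds=864000):
    """Return True once gc_grace_seconds have fully elapsed since the delete.

    864000 seconds is the 10-day default mentioned above."""
    return now - deleted_at > timedelta(seconds=gc_grace_seconds)


deleted_at = datetime(2014, 4, 1, 12, 0, 0)
# Nine days later the tombstone is still protected by the default grace period.
after_nine_days = tombstone_past_gc_grace(deleted_at, datetime(2014, 4, 10, 12, 0, 0))
# Eleven days later it is eligible for removal by the next compaction.
after_eleven_days = tombstone_past_gc_grace(deleted_at, datetime(2014, 4, 12, 12, 0, 0))
# With gc_grace_seconds set to 0 it is eligible almost immediately.
with_zero_grace = tombstone_past_gc_grace(
    deleted_at, datetime(2014, 4, 1, 12, 0, 1), gc_grace_seconds=0)
```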

Re: clearing tombstones?

2014-04-11 Thread William Oberman
I'm seeing a lot of articles about a dependency between removing tombstones and GCGraceSeconds, which might be my problem (I just checked, and this CF has GCGraceSeconds of 10 days). On Fri, Apr 11, 2014 at 10:10 AM, tommaso barbugli wrote: > compaction should take care of it; for me it never wo

Re: clearing tombstones?

2014-04-11 Thread tommaso barbugli
Compaction should take care of it; for me it never worked, so I run nodetool compact on every node; that does it. 2014-04-11 16:05 GMT+02:00 William Oberman : > I'm wondering what will clear tombstoned rows? nodetool cleanup, nodetool > repair, or time (as in just wait)? > > I had a CF that w

clearing tombstones?

2014-04-11 Thread William Oberman
I'm wondering what will clear tombstoned rows? nodetool cleanup, nodetool repair, or time (as in just wait)? I had a CF that was more or less storing session information. After some time, we decided that one piece of this information was pointless to track (and was 90%+ of the columns, and in 99

Re: Clearing tombstones

2013-03-28 Thread Joel Samuelsson
Yeah, I didn't mean "normal" as in "what most people use". I meant that they are not "strange" like Tyler mentions. 2013/3/28 aaron morton > The cleanup operation took several minutes though. This doesn't seem > normal then > > It read all the data and made sure the node was a replica for it. S

Re: Clearing tombstones

2013-03-27 Thread aaron morton
> The cleanup operation took several minutes though. This doesn't seem normal > then It read all the data and made sure the node was a replica for it. Since a single node cluster replicates all data, there was not a lot to throw away. > My replication settings should be very normal (simple strate

Re: Clearing tombstones

2013-03-27 Thread Joel Samuelsson
I see. The cleanup operation took several minutes though. This doesn't seem normal then? My replication settings should be very normal (simple strategy and replication factor 1). 2013/3/26 Tyler Hobbs > > On Tue, Mar 26, 2013 at 5:39 AM, Joel Samuelsson < > samuelsson.j...@gmail.com> wrote: > >

Re: Clearing tombstones

2013-03-26 Thread Tyler Hobbs
On Tue, Mar 26, 2013 at 5:39 AM, Joel Samuelsson wrote: > Sorry. I failed to mention that all my CFs had a gc_grace_seconds of 0 > since it's a 1 node cluster. I managed to accomplish what I wanted by first > running cleanup and then compact. Is there any logic to this or should my > tombstones b

Re: Clearing tombstones

2013-03-26 Thread Joel Samuelsson
Sorry. I failed to mention that all my CFs had a gc_grace_seconds of 0 since it's a 1 node cluster. I managed to accomplish what I wanted by first running cleanup and then compact. Is there any logic to this or should my tombstones be cleared by just running compact? 2013/3/25 Tyler Hobbs > Yo

Re: Clearing tombstones

2013-03-25 Thread Tyler Hobbs
You'll need to temporarily lower gc_grace_seconds for that column family, run compaction, and then restore gc_grace_seconds to its original value. See http://wiki.apache.org/cassandra/DistributedDeletes for more info. On Mon, Mar 25, 2013 at 7:40 AM, Joel Samuelsson wrote: > Hi, > > I've deleted

Clearing tombstones

2013-03-25 Thread Joel Samuelsson
Hi, I've deleted a range of keys in my one node test-cluster and want to re-add them with an older creation time. How can I make sure all tombstones are gone so that they can be re-added properly? I've tried nodetool compact but it seems some tombstones remain. Best regards, Joel Samuelsson