In a way, yes. A tombstone will only be removed after gc_grace, and only if
the compaction is sure it covers all rows that the tombstone might shadow.
When two conflicting non-tombstone rows are compacted, resolution is always
just last-write-wins (LWW).
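
For the LWW part, a rough illustration (keyspace, table, and timestamps are
made up, and this assumes an already-open Java driver Session):

    // Two conflicting writes to the same primary key. The cell with the
    // higher write timestamp wins, both at read time and when the SSTables
    // holding the two versions are eventually compacted together.
    session.execute("INSERT INTO ks.t (id, val) VALUES (1, 'old') USING TIMESTAMP 1000");
    session.execute("INSERT INTO ks.t (id, val) VALUES (1, 'new') USING TIMESTAMP 2000");
    // SELECT val FROM ks.t WHERE id = 1  -->  'new'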

On Wed, Apr 29, 2015 at 2:42 PM, Eric Stevens <migh...@gmail.com> wrote:

> But we're talking about a single tombstone on each of a finite (small) set
> of values, right?  We're not talking about INSERTs which are 99% nulls (at
> least I don't think that's what Matthew was suggesting).  Unless you're
> engaging in the antipattern of repeated overwrite, I'm still struggling to
> see why this is worse than an equivalent number of non-tombstoned writes.
> In fact, from the description, I don't think we're talking about these
> tombstones occluding any value at all.
>
> > imagine a multi-TB SSTable with 99% tombstones
>
> Let's play with this hypothetical, which doesn't seem like a probable
> consequence of the original question.  You'd have to have taken enough
> writes *inside* gc grace period to have even produced a multi-TB sstable
> to come anywhere near this, and even then this either exceeds or comes
> really close to the recommended maximum total data size per node (let alone
> in a single sstable).  If you did have such an sstable, it doesn't seem
> very likely to compact again inside gc grace period short of a manually
> triggered major compaction.
>
> But let's assume you do that: you run cassandra-stress inserting nothing
> but tombstones and kick off major compactions periodically.  If it
> compacted inside gc grace period, is this worse for compaction than the
> same number of non-tombstoned values (i.e. a multi-TB sstable is costly to
> compact no matter what the contents)?  If it compacted outside gc grace
> period, then 99% of the work is just dropping tombstones; it seems like it
> would run really fast (for being an absurdly large sstable), as there would
> be just 1% of the contents to actually copy over to the new sstable.
>
> I'm still not clear on what I'm missing.  Is a tombstone more expensive to
> compact than a non-tombstone?
>
> On Wed, Apr 29, 2015 at 10:06 AM, Jonathan Haddad <j...@jonhaddad.com>
> wrote:
>
>> Enough tombstones can inflate the size of an SSTable, causing issues
>> during compaction (imagine a multi-TB SSTable with 99% tombstones), even
>> if there's no clustering key defined.
>>
>> Perhaps an edge case, but worth considering.
>>
>> On Wed, Apr 29, 2015 at 9:17 AM Eric Stevens <migh...@gmail.com> wrote:
>>
>>> Correct me if I'm wrong, but tombstones are only really problematic if
>>> you have them going into clustering keys and then perform a range select
>>> on that column, right (assuming it's not a symptom of the antipattern of
>>> indefinitely overwriting the same value)?  I.e. you're deleting clusters
>>> off of a partition.  A tombstone isn't any more costly than a normal
>>> column, and in some ways it's less costly (it's smaller at rest than,
>>> say, inserting an empty string or other default value, as someone
>>> suggested).
>>>
>>> Tombstones stay around a little longer post-compaction than other
>>> values, so that's a downside, but on the next compaction after the gc
>>> grace period they drop off the record as if the value had never been set.
>>>
>>> Tombstones aren't intrinsically bad, but they can have some bad
>>> properties in certain situations.  This doesn't strike me as one of them.
>>> If you have a way to avoid inserting null when you know you aren't
>>> occluding an underlying value, that would be ideal.  But because the
>>> tombstone would sit adjacent on disk to other values from the same insert,
>>> even if you were on platters, the drive head is *already positioned* over
>>> the tombstone location when it's read, because it read the prior value and
>>> subsequent value which were written during the same insert.
>>>
>>> In the end, inserting a tombstone into a non-clustered column shouldn't
>>> be appreciably worse (if it is at all) than inserting a value instead.  Or
>>> am I missing something here?
>>>
>>> On Wed, Apr 29, 2015 at 7:53 AM, Matthew Johnson <
>>> matt.john...@algomi.com> wrote:
>>>
>>>> Thank you all for the advice!
>>>>
>>>>
>>>>
>>>> I have decided to use the Insert query builder (
>>>> *com.datastax.driver.core.querybuilder.Insert*), which allows me to
>>>> dynamically insert as many or as few columns as I need and doesn’t require
>>>> multiple prepared statements. Then I will look at Ali’s suggestion – I will
>>>> create a small helper method like ‘addToInsertIfNotNull’ and pump all my
>>>> values into that, which will then filter out the ones that are null. That
>>>> should keep the code nice and neat – I will feed back if I find any
>>>> problems with this approach (but please jump in if you have already
>>>> spotted any :)).
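>>>>
>>>> Roughly what I have in mind – an untested sketch against the driver's
>>>> QueryBuilder, with made-up keyspace/table/column names and an
>>>> already-open Session:
>>>>
>>>>     import com.datastax.driver.core.Session;
>>>>     import com.datastax.driver.core.querybuilder.Insert;
>>>>     import com.datastax.driver.core.querybuilder.QueryBuilder;
>>>>
>>>>     // Only add the column when the value is non-null, so no tombstone is
>>>>     // written for columns we simply don't have a value for.
>>>>     static Insert addToInsertIfNotNull(Insert insert, String column, Object value) {
>>>>         return value == null ? insert : insert.value(column, value);
>>>>     }
>>>>
>>>>     // Usage (id and comment are the values to store; comment may be null):
>>>>     Insert insert = QueryBuilder.insertInto("my_ks", "my_table");
>>>>     insert = addToInsertIfNotNull(insert, "id", id);           // always set
>>>>     insert = addToInsertIfNotNull(insert, "comment", comment); // may be null
>>>>     session.execute(insert);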
>>>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> Matt
>>>>
>>>>
>>>>
>>>> *From:* Robert Wille [mailto:rwi...@fold3.com]
>>>> *Sent:* 29 April 2015 15:16
>>>> *To:* user@cassandra.apache.org
>>>> *Subject:* Re: Inserting null values
>>>>
>>>>
>>>>
>>>> I’ve come across the same thing. I have a table with at least half a
>>>> dozen columns that could be null, in any combination. Having a prepared
>>>> statement for each permutation of null columns just isn’t going to happen.
>>>> I don’t want to build custom queries each time because I have a really cool
>>>> system of managing my queries that relies on them being prepared.
>>>>
>>>>
>>>>
>>>> Fortunately for me, I should have at most a handful of tombstones in
>>>> each partition, and most of my records are written exactly once. So, I just
>>>> let the tombstones get written and they’ll eventually get compacted out and
>>>> life will go on.
>>>>
>>>>
>>>>
>>>> It’s annoying and not ideal, but what can you do?
>>>>
>>>>
>>>>
>>>> On Apr 29, 2015, at 2:36 AM, Matthew Johnson <matt.john...@algomi.com>
>>>> wrote:
>>>>
>>>>
>>>>
>>>> Hi all,
>>>>
>>>>
>>>>
>>>> I have some fields that I am storing into Cassandra, but some of them
>>>> could be null at any given point. As there are quite a lot of them, it
>>>> makes the code much more readable if I don’t check each one for null before
>>>> adding it to the INSERT.
>>>>
>>>>
>>>>
>>>> I can see a few JIRAs around CQL 3 support for inserting nulls:
>>>>
>>>>
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-3783
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5648
>>>>
>>>>
>>>>
>>>> But I have tested inserting null and it seems to work fine (when
>>>> querying the table with cqlsh, it shows up as a red lowercase *null*).
>>>>
>>>>
>>>>
>>>> Are there any obvious pitfalls to look out for that I have missed?
>>>> Could it be a performance concern to insert a row with some nulls, as
>>>> opposed to checking the values first and omitting the null columns from
>>>> the insert?
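>>>>
>>>> Concretely, the two alternatives I'm comparing look roughly like this
>>>> (made-up keyspace, table, and column names, assuming an open Session):
>>>>
>>>>     // Column included with a null value: Cassandra writes a tombstone
>>>>     // cell for "comment", even when there is no older value to shadow.
>>>>     session.execute("INSERT INTO my_ks.my_table (id, comment) VALUES (1, null)");
>>>>
>>>>     // Column omitted entirely: no tombstone is written, and any existing
>>>>     // value of "comment" for this key is left untouched.
>>>>     session.execute("INSERT INTO my_ks.my_table (id) VALUES (1)");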
>>>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> Matt
>>>>
>>>>
>>>>
>>>
>>>
>
