On Wed, Jan 19, 2011 at 12:59 AM, Zhu Han <schumi....@gmail.com> wrote:
>
>
> On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf <german.kond...@gmail.com>
> wrote:
>>
>> Yes, that's what I meant, but correct me if I'm wrong: when a deletion
>> comes after another deletion for the same row or column, the gc-before
>> will count against the last one, won't it?
>>
> IIRC, after compaction, even if the row key is not wiped, all the CFs are
> replaced by the youngest tombstone.  I do not clearly understand the
> benefit of wiping out the whole row as early as possible.
>

I think it is not a "benefit" but a potential issue: if you delete
columns or rows without checking them first, you could keep their
tombstones alive for as long as you keep issuing deletions. Maybe it's a
strange use case, but Cassandra certainly enables new, non-traditional
ways of processing high volumes of information.

As the original example depicted clearly:
day 1 -> insert Row1.Col1
day 2 -> delete Row1.Col1
day 11 (before gc-grace-seconds has elapsed) -> delete Row1.Col1

With the last command I've extended the life of the tombstone. Checking
before each deletion could hurt performance, so I think this might be
better handled server-side than client-side.
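
A rough sketch of the arithmetic behind that example, assuming
GCGraceSeconds = 10 days as in the original question and assuming the
newest tombstone supersedes the older ones at compaction. The function
name and the "keep" switch are made up for illustration; they are not
Cassandra APIs, and actual removal still depends on when compaction runs:

```python
GC_GRACE_DAYS = 10  # assumption: taken from the scenario in the original question


def purgeable_on(deletion_days, keep="latest", gc_grace_days=GC_GRACE_DAYS):
    """Return the day on which the merged tombstone becomes eligible for GC.

    deletion_days: days on which a delete was issued for the same row/column.
    keep="latest"   models the newest tombstone winning (the behaviour assumed here).
    keep="earliest" models the hypothetical server-side idea of counting
                    gc-grace from the first of a run of uninterrupted deletes.
    """
    if not deletion_days:
        raise ValueError("at least one deletion is required")
    if keep not in ("latest", "earliest"):
        raise ValueError("keep must be 'latest' or 'earliest'")
    anchor = max(deletion_days) if keep == "latest" else min(deletion_days)
    return anchor + gc_grace_days


print(purgeable_on([2]))                       # single delete on day 2 -> 12
print(purgeable_on([2, 11]))                   # re-delete on day 11 -> 21
print(purgeable_on([2, 11], keep="earliest"))  # proposed behaviour -> 12
```

With keep="latest" the second delete pushes eligibility from day 12 to
day 21; keep="earliest" shows what the server-side handling suggested
above would give instead.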

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

>>
>> Maybe, knowing that all the subsequent versions of a deletion are deletions
>> too, it could take the first timestamp against the gc-grace-seconds when it
>> is reducing & compacting.
>>
>> // Germán Kondolf
>> http://twitter.com/germanklf
>> http://code.google.com/p/seide/
>> // @i4
>>
>> On 19/01/2011, at 00:16, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> > If you mean that multiple tombstones for the same row or column should
>> > be merged into a single one at compaction time, then yes, that is what
>> > happens.
>> >
>> > On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
>> > <german.kond...@gmail.com> wrote:
>> >> Maybe it could be taken into account when the compaction is executed,
>> >> if I only have a consecutive list of uninterrupted tombstones it could
>> >> only care about the first. It sounds like the-way-it-should-be, maybe
>> >> as a part of the "row-reduce" process.
>> >>
>> >> Is it feasible? Looking into CASSANDRA-1074, it sounds like it should be.
>> >>
>> >> //GK
>> >> http://twitter.com/germanklf
>> >> http://code.google.com/p/seide/
>> >>
>> >> On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne
>> >> <sylv...@riptano.com> wrote:
>> >>> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <da...@lookin2.com>
>> >>> wrote:
>> >>>> Thanks, Aaron, but I'm not 100% clear.
>> >>>>
>> >>>> My situation is this: My use case spins off rows (not columns) that I no
>> >>>> longer need and want to delete. It is possible that these rows were never
>> >>>> created in the first place, or were already deleted. This is a very large
>> >>>> cleanup task that normally deletes a lot of rows, and the last thing that I
>> >>>> want to do is create tombstones for rows that didn't exist in the first
>> >>>> place, or lengthen the life on disk of tombstones of rows that are already
>> >>>> deleted.
>> >>>>
>> >>>> So the question is: before I delete, do I have to retrieve the row to see if
>> >>>> it exists in the first place?
>> >>>
>> >>> Yes, in your situation you do.
>> >>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton
>> >>>> <aa...@thelastpickle.com>
>> >>>> wrote:
>> >>>>>
>> >>>>> AFAIK that's not necessary, there is no need to worry about previous
>> >>>>> deletes. You can delete stuff that does not even exist; neither batch_mutate
>> >>>>> nor remove is going to throw an error.
>> >>>>> All the columns that were (roughly speaking) present at your first
>> >>>>> deletion will be available for GC at the end of the first tombstone's life.
>> >>>>> Same for the second.
>> >>>>> Say you were to write a col between the two deletes with the same name as
>> >>>>> one present at the start. The first version of the col is avail for GC after
>> >>>>> tombstone 1, and the second after tombstone 2.
>> >>>>> Hope that helps
>> >>>>> Aaron
>> >>>>> On 18/01/2011, at 9:37 PM, David Boxenhorn <da...@lookin2.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>> Thanks. In other words, before I delete something, I should check to see
>> >>>>> whether it exists as a live row in the first place.
>> >>>>>
>> >>>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <r...@twitter.com> wrote:
>> >>>>>>
>> >>>>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn
>> >>>>>> <da...@lookin2.com>
>> >>>>>> wrote:
>> >>>>>>> If I delete a row, and later on delete it again, before GCGraceSeconds has
>> >>>>>>> elapsed, does the tombstone live longer?
>> >>>>>>
>> >>>>>> Each delete is a new tombstone, which should answer your question.
>> >>>>>>
>> >>>>>> -ryan
>> >>>>>>
>> >>>>>>> In other words, if I have the following scenario:
>> >>>>>>>
>> >>>>>>> GCGraceSeconds = 10 days
>> >>>>>>> On day 1 I delete a row
>> >>>>>>> On day 5 I delete the row again
>> >>>>>>>
>> >>>>>>> Will the tombstone be removed on day 10 or day 15?
>> >>>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of Riptano, the source for professional Cassandra support
>> > http://riptano.com
>>
>
>

