Hi Jaydeep,

Yes, dealing with tombstones in Cassandra is very tricky.

Cassandra keeps tombstones to mark deleted columns and to distribute them
(via hinted handoff, full repair, read repair, ...) to the other nodes that
missed the initial delete request. But Cassandra can't afford to keep those
tombstones forever, so it has to wipe them eventually. The tradeoff is that
after a time, GCGraceSeconds, configured on each column family, the
tombstones are fully dropped during compactions and are no longer
distributed to the other nodes.
If a node never got the chance to receive the tombstone during that period
and still holds an old column value, then the deleted column will reappear.
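
To make the failure mode concrete, here is a toy last-write-wins
reconciliation sketch in Python (illustrative only, not Cassandra's actual
code; the names `reconcile` and `TOMBSTONE` are mine):

```python
# Toy model of last-write-wins read reconciliation (illustrative, not
# Cassandra internals). Each replica answers either None ("I have nothing
# for this key") or a (timestamp, value) pair, where the value may be a
# tombstone.
TOMBSTONE = object()

def reconcile(responses):
    """Merge replica responses: the highest timestamp wins."""
    answers = [r for r in responses if r is not None]
    if not answers:
        return None                      # no replica has the key
    ts, value = max(answers, key=lambda r: r[0])
    return None if value is TOMBSTONE else value

T1, T2 = 100, 200                        # T1 < T2

# Before GCGraceSeconds: the tombstone (T2) still exists and beats D1.
assert reconcile([(T2, TOMBSTONE), (T1, "D1")]) is None

# After GCGraceSeconds: N1/N2 compacted the tombstone away, so they
# answer None. Now the result depends on which quorum is contacted:
assert reconcile([None, None]) is None          # quorum N1 + N2: empty
assert reconcile([None, (T1, "D1")]) == "D1"    # quorum N1 + N3: D1 is back
```

That last line is exactly the resurrection you are seeing: once the
tombstone is gone from two replicas, the stale D1 on the third wins any
quorum that includes it.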

So I guess that in your case the deletion at time T2 is older than
GCGraceSeconds?

The best way to keep those phantom columns from coming back from the dead
is to run a full repair on your cluster at least once every GCGraceSeconds.
Did you try this?
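
As a rough illustration (the numbers here are assumptions, not taken from
your cluster), with the default gc_grace_seconds of 864000 (10 days) you
want each repair cycle to complete well inside that window:

```python
# Hedged sketch: checking that a repair cadence fits inside
# gc_grace_seconds. The default value (864000 s = 10 days) and the 50%
# safety margin are illustrative assumptions, not recommendations from
# the Cassandra docs.
gc_grace_seconds = 864_000               # Cassandra default, set per column family
repair_interval = gc_grace_seconds // 2  # margin for repair duration and retries

assert repair_interval < gc_grace_seconds
print(f"repair at least every {repair_interval / 86_400:.1f} days")
```

If you lower gc_grace_seconds to purge tombstones faster, the repair
interval has to shrink with it.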

--
Nicolas


On Sat, Sep 17, 2016 at 00:05, Jaydeep Chovatia <chovatia.jayd...@gmail.com>
wrote:

> Hi,
>
> We have three node (N1, N2, N3) cluster (RF=3) and data in SSTable as
> following:
>
> N1:
> SSTable: Partition key K1 is marked as tombstone at time T2
>
> N2:
> SSTable: Partition key K1 is marked as tombstone at time T2
>
> N3:
> SSTable: Partition key K1 is valid and has data D1 with lower time-stamp
> T1 (T1 < T2)
>
>
> Now when I read using quorum then sometimes it returns data D1 and
> sometimes it returns empty results. After tracing I found that when N1 and
> N2 are chosen then we get empty data, when (N1/N2) and N3 are chosen then
> D1 data is returned.
>
> My point is when we read with Quorum then our results have to be
> consistent; here the same query gives different results at different times.
>
> Isn't this a big problem with Cassandra @QUORUM (with tombstone)?
>
>
> Thanks,
> Jaydeep
>