On Tue, Oct 22, 2013 at 2:29 PM, java8964 java8964 <[email protected]> wrote:

> 1) In the full snapshot data, more than 10% of the records are
> duplicates. By duplicates I mean event_activities with the same
> (entity_1_id, entity_2_id, entity_3_id, entity_4_id, created_on_timestamp,
> column_timestamp). I am surprised to see such a high level of duplication,
> especially with column_timestamp included in the comparison. As I
> understand it, the column_timestamp is supplied by the client when
> Cassandra stores the column in the row. If there were only a small amount
> of duplication, I could explain it as an application bug or as duplication
> coming from replication. But more than 10% is too much to explain this way.
>

Have you run "repair"? Do you regularly have hinted handoff kicking in due
to down nodes or dropped messages, such that failed writes are re-delivered
as hints?
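
To compare the duplication rate before and after a repair, something like
the rough sketch below could be run over the snapshot export. It only
assumes the records can be loaded as dicts keyed by the field names from
your message; the function name and the input format are made up for
illustration and are not part of any Cassandra tool.

```python
from collections import Counter

# Fields that define a "duplicate" in the quoted message.
KEY_FIELDS = (
    "entity_1_id", "entity_2_id", "entity_3_id", "entity_4_id",
    "created_on_timestamp", "column_timestamp",
)

def duplicate_fraction(records):
    """Return the share of records that repeat an already-seen key tuple.

    `records` is assumed to be an iterable of dicts containing KEY_FIELDS.
    """
    counts = Counter(tuple(r[f] for f in KEY_FIELDS) for r in records)
    total = sum(counts.values())
    # Every occurrence beyond the first one of a key counts as a duplicate.
    extra = sum(n - 1 for n in counts.values())
    return extra / total if total else 0.0
```

If the fraction differs a lot between nodes, or between SSTables on the same
node, that would point at where the extra copies are coming from.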

=Rob
