This is expected, and it's due to tombstones, which this page explains pretty well: http://wiki.apache.org/cassandra/DistributedDeletes
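To make the cost difference concrete, here is a toy Python model of the read path. It is only a sketch under simplifying assumptions, not Cassandra's actual code: `SSTable`, `read`, and `TOMBSTONE` are made-up names, a plain set stands in for the bloom filter (so there are no false positives), and a whole row is treated as a single value.

```python
# Toy model only: hypothetical names, not Cassandra's real API or internals.
TOMBSTONE = object()  # stands in for a deletion marker


class SSTable:
    """An immutable table on disk, with a set standing in for its bloom filter."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = set(self.rows)  # assumption: exact membership, no false positives
        self.disk_reads = 0

    def get(self, key):
        if key not in self.bloom:
            return None  # filter says "not here": no disk access at all
        self.disk_reads += 1  # filter matched: the row has to be read from disk
        return self.rows[key]


def read(sstables, key):
    """Merge every version of a row (oldest first); a tombstone hides older data."""
    newest = None
    for table in sstables:
        value = table.get(key)
        if value is not None:
            newest = value
    return None if newest is TOMBSTONE else newest


def total_disk_reads(sstables):
    return sum(table.disk_reads for table in sstables)


# Old data in one SSTable, the later delete in a newer one.
tables = [SSTable({"row1": "old 1MB row"}), SSTable({"row1": TOMBSTONE})]

deleted = read(tables, "row1")       # touches both SSTables, then returns nothing
reads_after_deleted = total_disk_reads(tables)

missing = read(tables, "never")      # both filters reject the key
reads_after_missing = total_disk_reads(tables)
```

Both lookups come back empty, but in this model the deleted row costs one disk read per SSTable that holds a version of it, while the row that never existed costs none.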
If you don't have any tombstones for the row, the bloom filter will let Cassandra avoid doing any disk reads at all 99% of the time.

On Tue, Jul 10, 2012 at 10:50 AM, Thorsten von Eicken <t...@rightscale.com> wrote:

> We're finding that reading deleted columns can be very slow and I'm
> trying to get confirmation for our analysis of what happens. We wrote
> lots of data eons ago into fairly large rows (up to 1MB). We recently
> read those rows and then deleted them. After this, we ran a
> verification-type pass that attempts to re-read these rows and verifies
> that they are indeed deleted. The interval between the deletion and
> verification pass was far less than gc_grace. We noticed that the
> verification pass took as much time as the read&delete pass(!), while
> verifying the non-existence of rows that never existed is blindingly
> fast in comparison. So it seems that cassandra is reading the old data,
> reading the new tombstones, and then returning "there is no data".
> Functionally correct, but rather unexpected performance
> characteristics... Am I missing something or is this expected?
> Thanks!
> Thorsten

--
Tyler Hobbs
DataStax <http://datastax.com/>