You should make sure that your direction and interval endpoints are chosen consistently with each other. I recall the semantics of the call being like an old-school for loop, with the reversed flag acting as a step of +1 or -1.
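For example (untested and from memory; "aaa" and "mmm" are made-up column names, and this just reuses the SliceRange constructor from your code below):

  # Forward slice: start <= finish in comparator order, reversed = false --
  # roughly for (i = start; i <= finish; i++).
  forward = CassandraThrift::SliceRange.new(
    :start => "aaa", :finish => "mmm", :reversed => false, :count => 100)

  # Reversed slice: the endpoints swap roles. start must be the *later*
  # column and the slice walks down toward finish --
  # roughly for (i = start; i >= finish; i--).
  backward = CassandraThrift::SliceRange.new(
    :start => "mmm", :finish => "aaa", :reversed => true, :count => 100)

  # An empty string for either endpoint leaves that side unbounded.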
-- Spelling by mobile.

On Jul 15, 2010, at 20:19, Ilya Maykov <ivmay...@gmail.com> wrote:

> Hi all,
>
> I'm trying to debug some pretty weird behavior when paginating through
> a ColumnFamily with get_slice(). It basically looks like Cassandra
> does not respect the limit parameter in the SlicePredicate, sometimes
> returning more than limit columns. It also sometimes silently drops
> columns. I'm reading using QUORUM, and all data was written using
> QUORUM as well. The client is in Ruby.
>
> I am seeing basically non-deterministic behavior when paginating
> through a column family (which is not being written concurrently).
> Here is example output from my pagination method (code below) with
> some extra debug prints added:
>
> irb(main):005:0> blah = get_entire_column_family(@cassandra,
>   "some_key", "some_cf", 100)
> get_entire_column_family(@cass, "some_key", "some_cf", 100) ...
> 100/6648 ... 199/6648 ... 354/6648 ... 453/6648 ... 552/6648 ...
> 689/6648 ... 788/6648 ... 887/6648 ... 1048/6648 ... 1147/6648 ...
> 1246/6648 ... 1377/6648 ... 1476/6648 ... 1575/6648 ... 1674/6648 ...
> 1773/6648 ... 1908/6648 ... 2051/6648 ... 2150/6648 ... 2249/6648 ...
> 2348/6648 ... <snip> ... 6127/6648 ... 6127/6648 ... 6127/6648 ...
> 6127/6648 ... 6127/6648 ...
>
> The N/6648 is just printing the retrieved columns / total columns at
> each step of the pagination loop. It should be going up by 99 on each
> iteration after the first (because the start of the next slice == the
> last value in the current slice). But sometimes it jumps, indicating
> that more than 100 values were returned from a single get_slice() call
> (i.e. ... 199/6648 ... 354/6648 ...). And when it gets to the end of
> the column family, we end up with fewer than 6648 columns on the
> client side and the code gets stuck in an infinite loop.
>
> I've tried this several times from an interactive Ruby session and got
> a different number of columns each time:
> 6536/6648
> 6127/6648
> 6514/6648
> However, once I set the limit to be > num_columns and read the entire
> row as a single page, everything worked. And follow-up paginated reads
> also return the entire row successfully. Not sure if that's because
> the entire row is now in cache, or because something was wrong and
> read-repair has fixed it. But since all of our reads and writes are
> done using QUORUM, read-repair shouldn't matter, right?
>
> Here is the pagination code:
>
> def get_entire_column_family(cassandra, row_key, column_family, limit_per_slice)
>   column_parent = CassandraThrift::ColumnParent.new(:column_family => column_family,
>                                                     :super_column => nil)
>   num_columns = cassandra.get_count(@keyspace, row_key, column_parent,
>                                     CassandraThrift::ConsistencyLevel::QUORUM)
>   predicate = CassandraThrift::SlicePredicate.new
>   predicate.slice_range = CassandraThrift::SliceRange.new(:start => "", :finish => "",
>                                                           :reversed => false,
>                                                           :count => limit_per_slice)
>   slice = cassandra.get_slice(@keyspace, row_key, column_parent, predicate,
>                               CassandraThrift::ConsistencyLevel::QUORUM)
>   result = slice
>   while result.size < num_columns
>     predicate = CassandraThrift::SlicePredicate.new
>     predicate.slice_range = CassandraThrift::SliceRange.new(:start => result.last.column.name,
>                                                             :finish => "", :reversed => false,
>                                                             :count => limit_per_slice)
>     slice = cassandra.get_slice(@keyspace, row_key, column_parent, predicate,
>                                 CassandraThrift::ConsistencyLevel::QUORUM)
>     # Because the start parameter to get_slice() is inclusive, we should already have
>     # the first column of the new slice in our result. We don't want to have 2 copies
>     # of it, so drop it from the slice before concatenating.
>     unless slice.nil? || slice.empty?
>       if result.last.column.name == slice.first.column.name
>         result.concat slice[1 .. slice.size-1]
>       else
>         result.concat slice
>       end
>     end
>   end # while
>   return result
> end
>
> I guess I have several questions:
> 1) Is this the proper way to paginate through a large column family
> for a single row key? If not, what is the proper way? Some of our rows
> are very big (hundreds of thousands of columns in the worst case), and
> pagination is a must.
> 2) Could this behavior be expected under some conditions (i.e. maybe
> presence of tombstones or hints from when a node was down or other
> weirdness)?
> 3) Is this a known bug? (maybe related to
> https://issues.apache.org/jira/browse/CASSANDRA-1145 and/or
> https://issues.apache.org/jira/browse/CASSANDRA-1042 ?)
> 4) If this is not a known bug, how should I proceed with investigating it?
>
> Thanks,
>
> -- Ilya
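For what it's worth, here is a sketch of how I would structure the loop -- untested, using the same Thrift calls as your code above; paginate_columns and its arguments are just placeholder names. It stops when a page comes back shorter than requested instead of comparing against get_count(), so it cannot spin forever if the count and the slices ever disagree, and it drops the inclusive start column on every page after the first:

  def paginate_columns(cassandra, keyspace, row_key, column_parent, page_size)
    results = []
    start = ""
    loop do
      predicate = CassandraThrift::SlicePredicate.new
      predicate.slice_range = CassandraThrift::SliceRange.new(
        :start => start, :finish => "", :reversed => false, :count => page_size)
      slice = cassandra.get_slice(keyspace, row_key, column_parent, predicate,
                                  CassandraThrift::ConsistencyLevel::QUORUM)
      break if slice.nil? || slice.empty?
      page_was_full = (slice.size == page_size)
      # :start is inclusive, so every page after the first begins with the
      # last column of the previous page; drop the duplicate.
      slice.shift unless start.empty?
      results.concat(slice)
      # A page shorter than requested means we've walked off the end of the row.
      break unless page_was_full
      start = results.last.column.name
    end
    results
  end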