[jira] [Updated] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

Mck SembWever (JIRA) Wed, 07 Sep 2011 11:29:34 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mck SembWever updated CASSANDRA-3150:
-------------------------------------

    Attachment: CASSANDRA-3150.patch

If the split's end token does not match any of the row key tokens, the 
RowIterator will never stop (see RowIterator:243).

This patch 1) presumes this is the problem, 2) compares each row token with the 
split end token and exits when need be (which only works on order-preserving 
partitioners), and 3) stops iterating once totalRowCount rows have been read.

Only (3) has been tested, and it works.
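
To make the two stop conditions concrete, here is a minimal, self-contained sketch of the idea. It is not the code in the attached patch or in the actual record reader: the class SplitReaderSketch, the methods readSplit and nextBatch, and the use of plain strings as tokens are all hypothetical, and the "ring" is just a sorted in-memory list with no wrap-around.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the two guards; not the actual Cassandra code. */
class SplitReaderSketch {

    /**
     * Pages through sortedRowTokens in batches, starting after startToken.
     * Stops on whichever comes first:
     *   guard (2): a row token sorts after endToken (only meaningful when
     *              token order equals key order, i.e. an order-preserving
     *              partitioner);
     *   guard (3): totalRowCount rows have been returned;
     *   or the list is exhausted (this sketch does not model ring wrap-around).
     */
    static List<String> readSplit(List<String> sortedRowTokens, String startToken,
                                  String endToken, int batchSize, long totalRowCount) {
        List<String> read = new ArrayList<>();
        String cursor = startToken;
        while (read.size() < totalRowCount) {
            List<String> batch = nextBatch(sortedRowTokens, cursor, batchSize);
            if (batch.isEmpty()) {
                break; // nothing left to page through
            }
            for (String token : batch) {
                if (token.compareTo(endToken) > 0) {
                    return read; // guard (2): passed the end of the split
                }
                read.add(token);
                if (read.size() >= totalRowCount) {
                    return read; // guard (3): row budget reached
                }
            }
            cursor = batch.get(batch.size() - 1); // resume after the last row seen
        }
        return read;
    }

    /** Returns up to batchSize tokens that sort strictly after the given token. */
    static List<String> nextBatch(List<String> sorted, String after, int batchSize) {
        List<String> batch = new ArrayList<>();
        for (String t : sorted) {
            if (t.compareTo(after) > 0) {
                batch.add(t);
                if (batch.size() == batchSize) {
                    break;
                }
            }
        }
        return batch;
    }
}
```

With row tokens a, c, e, g, i and an end token of f (which matches no row), guard (2) stops the read after e. An exit test that waits for a row token equal to f would never fire on the same data, which is the failure mode presumed in (1).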

> ColumnFormatRecordReader loops forever
> --------------------------------------
>
>                 Key: CASSANDRA-3150
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.8.4
>            Reporter: Mck SembWever
>            Assignee: Mck SembWever
>            Priority: Critical
>         Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (of 4013) are still running after reading 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}

