[ https://issues.apache.org/jira/browse/CASSANDRA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125980#comment-13125980 ]
Hudson commented on CASSANDRA-2855:
-----------------------------------

Integrated in Cassandra-0.8 #368 (See [https://builds.apache.org/job/Cassandra-0.8/368/])
    Skip empty rows when slicing the entire row.
Patch by Jeremy Hanna and brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-2855

brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1182463
Files :
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java

> Skip rows with empty columns when slicing entire row
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2855
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Jeremy Hanna
>            Assignee: Jeremy Hanna
>            Priority: Minor
>              Labels: hadoop
>             Fix For: 0.8.8
>
>         Attachments: 2855-v2.txt, 2855-v3.txt, 2855-v4.txt, 2855-v5.txt
>
>
> We have been finding that range ghosts appear in results from Hadoop via Pig.
> This could also happen if rows don't have data for the slice predicate that
> is given. This leads to having to do a painful amount of defensive checking
> on the Pig side, especially in the case of range ghosts.
> We would like to add an option to skip rows that have no column values.
> That functionality existed before in core Cassandra but was removed because
> of the performance penalty of that checking. However, the Hadoop support in
> the RecordReader is batch oriented anyway, so individual row reading
> performance isn't as much of an issue. Also, we would make it an optional
> config parameter for each job, so people wouldn't have to incur that
> penalty if they are confident that there won't be those empty rows or they
> don't care.
> It could be a parameter, cassandra.skip.empty.rows, set to true or false.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
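
For readers of this notification, the option proposed in the issue description can be sketched roughly as follows. This is only an illustration of the idea, not the committed change to ColumnFamilyRecordReader: the class and method names are hypothetical, rows are modeled as a plain map from row key to column list, and the default of false is an assumption based on the description calling the option optional. Only the property name cassandra.skip.empty.rows comes from the issue itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; not the actual ColumnFamilyRecordReader code.
final class SkipEmptyRowsSketch {
    // Property name suggested in the issue description.
    static final String SKIP_EMPTY_ROWS = "cassandra.skip.empty.rows";

    // Reads the per-job flag. Defaulting to false is an assumption, so that
    // jobs which do not ask for the check do not pay for it.
    static boolean skipEmptyRows(Properties jobConf) {
        return Boolean.parseBoolean(jobConf.getProperty(SKIP_EMPTY_ROWS, "false"));
    }

    // Filters one fetched batch of rows (row key -> columns). When skipEmpty
    // is true, rows with no columns (for example range ghosts, or rows with
    // no data for the slice predicate) are dropped; otherwise every row is kept.
    static Map<String, List<String>> filterBatch(Map<String, List<String>> batch,
                                                 boolean skipEmpty) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : batch.entrySet()) {
            if (skipEmpty && row.getValue().isEmpty()) {
                continue; // empty row: skip it instead of handing it to the job
            }
            kept.put(row.getKey(), row.getValue());
        }
        return kept;
    }
}
```

Because the filter runs once per fetched batch rather than once per individual read, the extra check fits the batch-oriented access pattern the description mentions.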