[ 
https://issues.apache.org/jira/browse/CASSANDRA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060002#comment-13060002
 ] 

Jonathan Ellis commented on CASSANDRA-2855:
-------------------------------------------

bq. is it more expensive/complicated to do it for an empty slice

An empty result for an entire-row slice means the row really will be gone when 
the tombstone expires, so returning the empty row and skipping it are 
semantically equivalent.  This is not the case for a smaller slice; an empty 
result there could mean "there is data in the row, just not in the slice you 
requested," so leaving that row out would be an error.

> Skip rows with empty columns when slicing entire row
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2855
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Jeremy Hanna
>            Priority: Minor
>              Labels: hadoop
>             Fix For: 0.8.2
>
>
> We have been finding that range ghosts appear in results from Hadoop via Pig. 
> Empty rows can also appear when a row has no data for the given slice 
> predicate.  This leads to a painful amount of defensive checking on the Pig 
> side, especially in the case of range ghosts.
> We would like to add an option to skip rows that have no column values in 
> them.  That functionality existed before in core Cassandra but was removed 
> because of the performance penalty of that checking.  However, the Hadoop 
> support in the RecordReader is batch oriented anyway, so individual row 
> reading performance isn't as much of an issue.  We would also make it an 
> optional config parameter for each job, so people wouldn't have to incur that 
> penalty if they are confident that there won't be any empty rows or they 
> don't care.
> It could be a parameter named cassandra.skip.empty.rows, set to true or false.
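
As a rough illustration of the proposed option, the sketch below filters 
empty rows out of a stream of (key, columns) pairs when the flag is set. It 
is a toy model, not the RecordReader implementation: the job configuration is 
a plain dict, the function names are hypothetical, and defaulting the flag to 
false is an assumption made so that jobs which do not set it behave as before.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Parameter name as proposed in the issue description.
SKIP_EMPTY_ROWS_KEY = "cassandra.skip.empty.rows"

Row = Tuple[str, Dict[str, str]]  # (row key, column name -> value)


def skip_empty_rows_enabled(job_conf: Dict[str, str]) -> bool:
    """Read the flag; treat a missing value as false (an assumption)."""
    return job_conf.get(SKIP_EMPTY_ROWS_KEY, "false").strip().lower() == "true"


def read_rows(rows: Iterable[Row], job_conf: Dict[str, str]) -> Iterator[Row]:
    """Yield rows in order, dropping those with no columns if the flag is set."""
    skip = skip_empty_rows_enabled(job_conf)
    for key, columns in rows:
        if skip and not columns:
            continue  # range ghost or row with no matching columns
        yield key, columns


rows = [("k1", {"a": "1"}), ("ghost", {}), ("k2", {"b": "2"})]

for key, columns in read_rows(rows, {SKIP_EMPTY_ROWS_KEY: "true"}):
    print(key, columns)
```

Note that this sketch drops every empty row regardless of the slice 
predicate.  Per the comment at the top of this message, that is only correct 
when the job slices the entire row, so a real implementation would need to 
check the predicate as well.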

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
