[ 
https://issues.apache.org/jira/browse/CASSANDRA-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868425#action_12868425
 ] 

Jeremy Hanna edited comment on CASSANDRA-1042 at 5/17/10 6:25 PM:
------------------------------------------------------------------

The describe_splits call it makes returns 3 sub-splits, the first of which 
is a wrapping split.  That sounds like a bug on the server side.  Will check 
with Jonathan.

      was (Author: jeromatron):
    The server is generating 3 splits on my data, the first of which is a 
wrapping split.  For the describe_splits call it makes, it returns 3 sub 
splits, the first of which is a wrapping split.  sounds like it's buggy on the 
server side.  Will check with Jonathan.
  
> ColumnFamilyRecordReader returns duplicate rows
> -----------------------------------------------
>
>                 Key: CASSANDRA-1042
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1042
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Joost Ouwerkerk
>            Assignee: Jeremy Hanna
>             Fix For: 0.6.2
>
>
> There's a bug in ColumnFamilyRecordReader that appears when processing a 
> single split (which happens in most tests that have a small number of rows), 
> and potentially in other cases.  When the start and end tokens of the split 
> are equal, duplicate rows can be returned.
> Example with 5 rows:
> token (start and end) = 53193025635115934196771903670925341736
> Tokens returned by first get_range_slices iteration (all 5 rows):
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
>  99079589977253916124855502156832923443
>  144992942750327304334463589818972416113
>  166860289390734216023086131251507064403
> Tokens returned by next iteration (first token is last token from
> previous, end token is unchanged)
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
> Tokens returned by final iteration  (first token is last token from
> previous, end token is unchanged)
>  [] (empty)
> In this example, the mapper has processed 7 rows in total, 2 of which
> were duplicates.
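
The iteration described in the report can be reproduced with a small model. This is only a sketch, not the actual ColumnFamilyRecordReader or server code: `get_range_slices` and `read_split` here are hypothetical stand-ins, and the model assumes that a range whose start is not below its end is treated as wrapping, and that the matching rows come back in ascending token order, as the token listing above suggests.

```python
# Hypothetical model of the paging loop from the report; not Cassandra's real code.
ROW_TOKENS = [
    16955237001963240173058271559858726497,
    40670782773005619916245995581909898190,
    99079589977253916124855502156832923443,
    144992942750327304334463589818972416113,
    166860289390734216023086131251507064403,
]
SPLIT_TOKEN = 53193025635115934196771903670925341736


def get_range_slices(start, end, count):
    """Return up to `count` row tokens in the range (start, end].

    Assumptions: start >= end means a wrapping range (start == end covers
    the whole ring), and results are returned in ascending token order.
    """
    if start < end:
        matches = [t for t in ROW_TOKENS if start < t <= end]
    else:
        matches = [t for t in ROW_TOKENS if t > start or t <= end]
    return sorted(matches)[:count]


def read_split(start, end, batch_size=4096):
    """Page through a split: each new start is the last token of the previous batch."""
    seen = []
    while True:
        batch = get_range_slices(start, end, batch_size)
        if not batch:
            return seen
        seen.extend(batch)
        start = batch[-1]  # end token is left unchanged, as in the report


rows = read_split(SPLIT_TOKEN, SPLIT_TOKEN)
duplicates = len(rows) - len(set(rows))
print(len(rows), duplicates)
```

Under these assumptions the second request runs from the highest token back around to the split token, so it wraps and picks up the two lowest tokens a second time, which matches the 7 rows and 2 duplicates reported above.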
