[ 
https://issues.apache.org/jira/browse/CASSANDRA-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189291#comment-13189291
 ] 

Dominic Williams commented on CASSANDRA-3748:
---------------------------------------------

Hey, anyone got any ideas on this bug yet?
                
> Range ghosts don't disappear as expected and accumulate
> -------------------------------------------------------
>
>                 Key: CASSANDRA-3748
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3748
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.3
>         Environment: Cassandra on Debian 
>            Reporter: Dominic Williams
>              Labels: compaction, ghost-row, range, remove
>             Fix For: 1.0.8
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> I have a problem where range ghosts are accumulating and cannot be removed by 
> reducing GCGraceSeconds and compacting.
> In our system, we have some cfs that represent "markets" where each row 
> represents an item. Once an item is sold, it is removed from the market by 
> passing its key to remove().
> The problem, which was hidden for some time by caching, is appearing on read. 
> Every few seconds our system collates a random sample from each cf/market by 
> choosing a random starting point:
> String startKey = RNG.nextUUID();
> and then loading a page range of rows, specifying the key range as:
> KeyRange keyRange = new KeyRange(pageSize);
> keyRange.setStart_key(startKey);
> keyRange.setEnd_key(maxKey);
> The returned rows are iterated over, and ghosts are ignored. If insufficient 
> rows are obtained, the process is repeated using the key of the last row as 
> the starting key (wrapping around to the start of the range if necessary).
> When performance was lagging, we did a test and found that constructing a 
> random sample of 40 items (rows) involved iterating over hundreds of 
> thousands of ghost rows. 
> Our first attempt to deal with this was to halve our GCGraceSeconds and then 
> perform major compactions. However, this had no effect on the number of ghost 
> rows being returned. Furthermore, on examination it seems clear that the 
> number of ghost rows being created within the GCGraceSeconds window must be 
> smaller than the number being returned. This looks like a bug.
> We are using Cassandra 1.0.3 with Sylvain's patch from CASSANDRA-3510.
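
For anyone trying to reproduce this, the sampling loop described in the issue
can be sketched roughly as below. This is only an illustration under stated
assumptions: it runs against an in-memory sorted map rather than a real
cluster, models a range ghost as a row key with no columns, and the class and
method names (RangeGhostSampler, getRangeSlice, sample) are made up for the
example and are not part of the Cassandra or Thrift API. It also assumes keys
sort lexicographically and that a page contains at least two rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one "market" column family.
class RangeGhostSampler {
    // Row key -> columns. An empty column map models a range ghost:
    // the key is still returned by a range scan but carries no data.
    private final NavigableMap<String, Map<String, String>> rows = new TreeMap<>();

    // Counts every row handed back by a range scan, ghosts included.
    int rowsScanned = 0;

    void put(String key, String column, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(column, value);
    }

    // Assumption: removal leaves the key visible to range scans (simplified tombstone).
    void remove(String key) {
        if (rows.containsKey(key)) {
            rows.put(key, new TreeMap<>());
        }
    }

    // Stand-in for a range query: up to pageSize rows with key >= startKey (inclusive).
    List<Map.Entry<String, Map<String, String>>> getRangeSlice(String startKey, int pageSize) {
        List<Map.Entry<String, Map<String, String>>> page = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : rows.tailMap(startKey, true).entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(e);
            rowsScanned++;
        }
        return page;
    }

    // Collects up to 'wanted' live row keys starting at startKey, skipping ghosts,
    // paging from the last key seen and wrapping around the key range once.
    List<String> sample(String startKey, int wanted, int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> live = new ArrayList<>();
        String cursor = startKey;
        boolean wrapped = false;
        boolean skipFirst = false; // the start key is inclusive, so later pages repeat one row
        while (live.size() < wanted) {
            List<Map.Entry<String, Map<String, String>>> page = getRangeSlice(cursor, pageSize);
            int from = (skipFirst && !page.isEmpty() && page.get(0).getKey().equals(cursor)) ? 1 : 0;
            for (int i = from; i < page.size() && live.size() < wanted; i++) {
                Map.Entry<String, Map<String, String>> e = page.get(i);
                if (wrapped && e.getKey().compareTo(startKey) >= 0) {
                    return live; // back at the starting point: every row has been seen
                }
                if (!e.getValue().isEmpty()) {
                    live.add(e.getKey()); // ghosts (no columns) are ignored
                }
            }
            if (page.size() < pageSize) { // short page: end of the key range
                if (wrapped) {
                    break;
                }
                wrapped = true;
                cursor = "";
                skipFirst = false;
            } else {
                cursor = page.get(page.size() - 1).getKey();
                skipFirst = true;
            }
        }
        return live;
    }
}
```

Comparing rowsScanned with the size of the returned sample gives a rough
measure of how many ghosts each sample has to wade through, which is the ratio
that looks wrong in the report above.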

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
