[ 
https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382505#comment-14382505
 ] 

Benedict commented on CASSANDRA-8993:
-------------------------------------

OK, so it makes a lot more sense now that I realise the downsampling 
granularity can be so small - I was thrown by the "BSL must be a power of two" 
without an equivalent statement for the sampling itself, and in my head I just 
assumed it was all dealing with powers of 2 (so nothing technically wrong with 
the comments, just my interpretation of them). This also explains why the 
effective index intervals were always the same - with powers of 2 they would 
be. I wonder if we couldn't get a lot of the benefit of downsampling by 
sticking to powers of 2, as it might simplify the code significantly? The 
original indices, indices to skip, and effective intervals could each be 
implemented with approximately one simple statement. Not pushing for it, mind, 
just airing the question.

Thanks for taking the time to explain, anyway, and with that clarification I am 
+1 the patch as stands.

On the topic of the zero index always being present: I can vouch that this 
assumption breaks somewhere, because I assumed this to be the case when 
modifying IndexSummaryBuilder, and without a 
setNextSamplePosition(-minIndexInterval) it doesn't pass its test cases (i.e. 
initiating the first sample index deterministically to zero caused unit test 
failures). So we should perhaps track down where the logical flaw is, however 
minor it may be.

> EffectiveIndexInterval calculation is incorrect
> -----------------------------------------------
>
>                 Key: CASSANDRA-8993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Blocker
>             Fix For: 2.1.4
>
>         Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt
>
>
> I'm not familiar enough with the calculation itself to understand why this is 
> happening, but see discussion on CASSANDRA-8851 for the background. I've 
> introduced a test case to look for this during downsampling, but it seems to 
> pass just fine, so it may be an artefact of upgrading.
> The problem was, unfortunately, not manifesting directly because it would 
> simply result in a failed lookup. This was only exposed when early opening 
> used firstKeyBeyond, which does not use the effective interval, and provided 
> the result to getPosition().
> I propose a simple fix that ensures a bug here cannot break correctness. 
> Perhaps [~thobbs] can follow up with an investigation as to how it actually 
> went wrong?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to