[ 
https://issues.apache.org/jira/browse/KAFKA-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367553#comment-16367553
 ] 

Xavier Léauté edited comment on KAFKA-5285 at 2/16/18 4:37 PM:
---------------------------------------------------------------

{quote}# For lower range, we just use the {{keyFrom bytes}}, instead of 
{{keyFrom + minSuffix (0000s)}}. Since we know there would be no data in 
between these two keys at all, we can save some overhead of bytes alloc and 
puts.{quote}

This assumption only holds if {{minSuffix}} only consists of zeros. If that is 
not the case, then there exists a suffix {{S}} that is smaller in order than 
{{minSuffix}} and is not a prefix of {{minSuffix}}. In that case I can 
construct a new key by appending this suffix to {{keyFrom}}, i.e. my new key is 
{{keyFrom + S}}. The smallest byte sequence for that key will be {{keyFrom + S 
+ minSuffix}} which is greater than {{keyFrom}}, but still smaller than 
{{keyFrom + minSuffix}} 

 


was (Author: xvrl):
{quote}# For lower range, we just use the {{keyFrom bytes}}, instead of 
{{keyFrom + minSuffix (0000s)}}. Since we know there would be no data in 
between these two keys at all, we can save some overhead of bytes alloc and 
puts.{quote}

This assumption only holds if {{minSuffix}} only consists of zeros. If that is 
not the case, then there exists a suffix S that is smaller in order than 
minSuffix and is not a prefix of minSuffix. In that case I can construct a new 
key by appending this suffix to {{keyFrom}}, i.e. my new key is {{keyFrom + 
S}}. The smallest byte sequence for that key will be {{keyFrom + S + 
minSuffix}} which is greater than {{keyFrom}}, but still smaller than {{keyFrom 
+ minSuffix}} 

 

> Optimize upper / lower byte range for key range scan on windowed stores
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-5285
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5285
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Xavier Léauté
>            Assignee: Guozhang Wang
>            Priority: Major
>              Labels: performance
>
> The current implementation of {{WindowKeySchema}} / {{SessionKeySchema}} 
> {{upperRange}} and {{lowerRange}} does not make any assumptions with respect 
> to the other key bound (e.g. the upper byte bound does not depends on lower 
> key bound).
> It should be possible to optimize the byte range somewhat further using the 
> information provided by the lower bound.
> More specifically, by incorporating that information, we should be able to 
> eliminate the corresponding {{upperRangeFixedSize}} and 
> {{lowerRangeFixedSize}}, since the result should be the same if we implement 
> that optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to