[ 
https://issues.apache.org/jira/browse/PHOENIX-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010248#comment-15010248
 ] 

James Taylor commented on PHOENIX-2417:
---------------------------------------

The idea would be to use this encoding on PTableImpl for the in-memory and 
over-the-wire representation (and potentially the on-disk format as well). 
We've found that smaller guideposts prevent latency issues with concurrent 
point lookups, but we don't want to consume too much memory for them. Since 
the guideposts are traversed sequentially, the encoding shouldn't negatively 
impact performance.
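
As a rough illustration only, here is a minimal sketch of one possible
encoding. It assumes plain prefix (front) coding of the sorted keys rather
than Parquet's run length encoding, and the class and method names
(GuidePostPrefixCodec, encode, decode) are hypothetical, not existing Phoenix
APIs. Each key is stored as the number of leading bytes it shares with the
previous key, followed by the remaining suffix bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Phoenix: prefix-encodes sorted row keys.
public class GuidePostPrefixCodec {

    // Writes an int as 4 big-endian bytes (matches ByteBuffer's default order).
    private static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }

    // Layout: key count, then per key (sharedPrefixLen, suffixLen, suffix bytes),
    // where the shared prefix is measured against the previous key.
    public static byte[] encode(List<byte[]> sortedKeys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, sortedKeys.size());
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int max = Math.min(prev.length, key.length);
            int shared = 0;
            while (shared < max && prev[shared] == key[shared]) {
                shared++;
            }
            writeInt(out, shared);
            writeInt(out, key.length - shared);
            out.write(key, shared, key.length - shared);
            prev = key;
        }
        return out.toByteArray();
    }

    // Decoding is a single forward pass: each key is rebuilt from the one
    // before it, which fits the sequential traversal of guideposts.
    public static List<byte[]> decode(byte[] encoded) {
        ByteBuffer in = ByteBuffer.wrap(encoded);
        int count = in.getInt();
        List<byte[]> keys = new ArrayList<>(count);
        byte[] prev = new byte[0];
        for (int i = 0; i < count; i++) {
            int shared = in.getInt();
            int suffixLen = in.getInt();
            byte[] key = new byte[shared + suffixLen];
            System.arraycopy(prev, 0, key, 0, shared);
            in.get(key, shared, suffixLen);
            keys.add(key);
            prev = key;
        }
        return keys;
    }
}
```

The fixed 4-byte lengths keep the sketch short; a real implementation would
likely use variable-length ints so the per-key overhead stays small relative
to the bytes saved. Random access to a single guidepost would require
decoding from the start, which is acceptable only because the guideposts are
read in order.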

> Compress memory used by row key byte[] of guideposts
> ----------------------------------------------------
>
>                 Key: PHOENIX-2417
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2417
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> We've found that smaller guideposts are better in terms of minimizing any 
> increase in latency for point scans. However, this significantly increases 
> the memory used when caching the guideposts on the client. Guideposts are 
> equidistant row keys in the form of raw byte[] which are likely to have a 
> large percentage of their leading bytes in common (as they're stored in 
> sorted order). We should use a simple compression technique to mitigate this. 
> I noticed that Apache Parquet has a run length encoding - perhaps we can use 
> that.


