[
https://issues.apache.org/jira/browse/PHOENIX-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102454#comment-15102454
]
ASF GitHub Bot commented on PHOENIX-2417:
-----------------------------------------
Github user ankitsinghal commented on the pull request:
https://github.com/apache/phoenix/pull/147#issuecomment-172089737
Thanks @JamesRTaylor for the review.
I have made all the changes you suggested above except this one:
byte[] currentGuidePostBytes = SchemaUtil.copyKeyIfNecessary(currentGuidePost);
Because, as an optimization, PrefixByteDecoder reuses the previous buffer whenever
maxLength is passed:
public ImmutableBytesWritable decode(DataInput in) throws IOException {
    int prefixLen = WritableUtils.readVInt(in);
    int suffixLen = WritableUtils.readVInt(in);
    int length = prefixLen + suffixLen;
    byte[] b;
    if (maxLength == -1) { // Allocate new byte array each time
        b = new byte[length];
        System.arraycopy(previous.get(), previous.getOffset(), b, 0, prefixLen);
    } else { // Reuse same buffer each time
        b = previous.get();
    }
    in.readFully(b, prefixLen, suffixLen);
    previous.set(b, 0, length);
    return previous;
}
So I need to copy the bytes even when the length of the ImmutableBytesWritable
equals the length of the byte[] it wraps, because that backing array will be
overwritten by the next decode() call.
What do you think about this?
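To make the aliasing concrete, here is a small standalone sketch of that situation.
It is not Phoenix code (the class name and key values are made up); it only shows
that a reference obtained via get() on the reused buffer is clobbered once the
buffer is refilled for the next guidepost, while an explicit copy survives:

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of buffer reuse; not part of the Phoenix code base.
public class BufferReuseSketch {
    public static void main(String[] args) {
        byte[] reused = new byte[16];                      // stands in for the decoder's shared buffer
        ImmutableBytesWritable ptr = new ImmutableBytesWritable();

        byte[] first = "keyA001".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(first, 0, reused, 0, first.length);
        ptr.set(reused, 0, first.length);                  // first decoded guidepost
        byte[] aliased = ptr.get();                        // still points at the shared buffer
        byte[] copied = ptr.copyBytes();                   // defensive copy, as the patch does

        byte[] second = "keyA002".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(second, 0, reused, 0, second.length);
        ptr.set(reused, 0, second.length);                 // the next decode overwrites the buffer

        // The alias now reflects the second key; only the copy preserves the first one.
        System.out.println(new String(aliased, 0, first.length, StandardCharsets.UTF_8)); // keyA002
        System.out.println(new String(copied, StandardCharsets.UTF_8));                   // keyA001
    }
}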
I have also fixed the failing test cases; there was a logical-operator problem
while incrementing the guideposts up to the start key.
I have kicked off the complete test suite and will confirm once it has completed.
> Compress memory used by row key byte[] of guideposts
> ----------------------------------------------------
>
> Key: PHOENIX-2417
> URL: https://issues.apache.org/jira/browse/PHOENIX-2417
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Ankit Singhal
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2417.patch, PHOENIX-2417_encoder.diff,
> PHOENIX-2417_v2_wip.patch
>
>
> We've found that smaller guideposts are better in terms of minimizing any
> increase in latency for point scans. However, this increases the amount of
> memory significantly when caching the guideposts on the client. Guideposts are
> equidistant row keys in the form of raw byte[]s, which are likely to have a
> large percentage of their leading bytes in common (as they're stored in
> sorted order). We should use a simple compression technique to mitigate this.
> I noticed that Apache Parquet has a run-length encoding - perhaps we can use
> that.
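For illustration, here is a minimal sketch of the prefix-sharing idea the ticket
describes, written to produce the same stream layout that the decode() method
quoted above reads back (a vint shared-prefix length, a vint suffix length, then
the suffix bytes). The class name is a placeholder; this is not the actual
Phoenix encoder.

import org.apache.hadoop.io.WritableUtils;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical encoder sketch; not part of the Phoenix code base.
public class PrefixEncodeSketch {
    private byte[] previous = new byte[0];

    public void encode(DataOutput out, byte[] key) throws IOException {
        // Length of the leading bytes this key shares with the previously written key.
        int prefixLen = 0;
        int max = Math.min(previous.length, key.length);
        while (prefixLen < max && previous[prefixLen] == key[prefixLen]) {
            prefixLen++;
        }
        int suffixLen = key.length - prefixLen;
        WritableUtils.writeVInt(out, prefixLen);
        WritableUtils.writeVInt(out, suffixLen);
        out.write(key, prefixLen, suffixLen); // only the bytes not covered by the shared prefix
        previous = key;
    }
}

Because the guideposts are sorted, consecutive row keys typically share most of
their leading bytes, so the per-key cost drops to a couple of vints plus the
short suffix.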