[ https://issues.apache.org/jira/browse/HBASE-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798188#comment-13798188 ]
Lars Hofhansl commented on HBASE-9794:
--------------------------------------
This is one of my pet peeves :) and the reason why scanning with block
encoding is so much slower and more GC intensive than without.
> KeyValues / cells backed by buffer fragments
> --------------------------------------------
>
> Key: HBASE-9794
> URL: https://issues.apache.org/jira/browse/HBASE-9794
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Andrew Purtell
>
> There are various places in the code where we see comments to the effect
> "would be great if we had a scatter gather API for KV", appearing at places
> where we rewrite KVs on the server, for example in HRegion where we process
> appends and increments.
>
> KeyValues are stored in buffers of fixed length. This approach has
> performance advantages for the common case where KVs are not manipulated on
> their way from disk to RPC. The disadvantage of this approach is that any
> manipulation of tags requires the creation of a new buffer to hold the
> result, and a copy of the KV data into the new buffer. Appends and increments
> are typically a small percentage of overall workload so this has been fine up
> to now.
>
> KeyValues can now carry metadata known as tags. Tags are stored contiguously
> with the rest of the KeyValue. Applications wishing to use tags (like
> per-cell security) change the equation by wanting to rewrite KVs
> significantly more often.
>
> We should consider backing KeyValue with an alternative structure that can
> better support rewriting portions of its data, appends to existing buffers,
> scatter-gather copies, possibly even copy-on-write.
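
To make the proposal above concrete, here is a minimal Java sketch of a cell
backed by buffer fragments. It is an illustration only: FragmentedCell and its
methods are hypothetical names, not existing HBase classes, and the layout
(one fragment each for key, value and tags) is an assumption chosen for the
example. The idea is that rewriting the tags swaps one fragment and shares the
rest, and bytes are gathered into a contiguous array only when needed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not an HBase class): a cell whose bytes live in several fragments. */
final class FragmentedCell {
    private final List<ByteBuffer> fragments = new ArrayList<>();

    /** Appends a fragment as a read-only view over the caller's array; the bytes are not copied. */
    public FragmentedCell append(byte[] bytes) {
        fragments.add(ByteBuffer.wrap(bytes).asReadOnlyBuffer());
        return this;
    }

    /**
     * Copy-on-write style rewrite: returns a new cell in which one fragment
     * (for example the tags) is replaced, while the other fragments are shared
     * with this cell. Only buffer references are copied, not the data.
     */
    public FragmentedCell withFragment(int index, byte[] replacement) {
        FragmentedCell copy = new FragmentedCell();
        copy.fragments.addAll(fragments);
        copy.fragments.set(index, ByteBuffer.wrap(replacement).asReadOnlyBuffer());
        return copy;
    }

    /** Total number of bytes across all fragments. */
    public int length() {
        int n = 0;
        for (ByteBuffer f : fragments) {
            n += f.remaining();
        }
        return n;
    }

    /** Gather step: copies every fragment into one contiguous array. */
    public byte[] toContiguous() {
        ByteBuffer out = ByteBuffer.allocate(length());
        for (ByteBuffer f : fragments) {
            out.put(f.duplicate()); // duplicate() leaves the fragment's own position untouched
        }
        return out.array();
    }
}
```

With a cell built as key, value, tags, a call such as
cell.withFragment(2, newTags) would produce the rewritten cell without copying
the key or value bytes. A real design would also have to settle buffer
ownership and lifetime, which this sketch ignores.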