[
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074812#comment-14074812
]
Lars Hofhansl commented on HBASE-7336:
--------------------------------------
[~vrodionov], I'm curious how you will find good split points *inside* a
region. Regions can be assumed to be of roughly equal size (in bytes,
not rows), but inside a region the distribution of keys can be arbitrarily
skewed, so without additional state you cannot find good splits inside
a region.
(The region split points are, in effect, a very rough histogram of the data
distribution.)
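
To illustrate the skew argument, here is a minimal sketch. It is not HBase code: the class and method names are hypothetical, keys are modeled as plain longs, and key counts stand in for bytes. It compares a split chosen from the region boundaries alone with one chosen from the keys themselves.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in HBase.
final class SplitPointSketch {

    /** Split chosen from the region boundaries alone (no extra state). */
    static long midpointSplit(long startKey, long endKey) {
        return startKey + (endKey - startKey) / 2;
    }

    /** Split chosen from the keys themselves; this is the "more state". */
    static long medianSplit(long[] keys) {
        long[] sorted = keys.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Fraction of keys that sort strictly below the split key. */
    static double fractionBelow(long[] keys, long splitKey) {
        int below = 0;
        for (long k : keys) {
            if (k < splitKey) {
                below++;
            }
        }
        return (double) below / keys.length;
    }

    /** 100 keys in an assumed region [0, 1100): 90 packed at the low end, 10 spread out. */
    static long[] skewedKeys() {
        long[] keys = new long[100];
        for (int i = 0; i < 90; i++) {
            keys[i] = i;                      // 0, 1, ..., 89
        }
        for (int i = 90; i < 100; i++) {
            keys[i] = 100L * (i - 89);        // 100, 200, ..., 1000
        }
        return keys;
    }
}
```

With these assumed keys, `midpointSplit(0, 1100)` returns 550 and leaves 95% of the keys on one side, while `medianSplit` returns 50 and divides them evenly. The median is only available because the sketch inspects every key, which is the extra state referred to above.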
> HFileBlock.readAtOffset does not work well with multiple threads
> ----------------------------------------------------------------
>
> Key: HBASE-7336
> URL: https://issues.apache.org/jira/browse/HBASE-7336
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Critical
> Fix For: 0.94.4, 0.95.0
>
> Attachments: 7336-0.94.txt, 7336-0.96.txt
>
>
> HBase grinds to a halt when many threads scan along the same set of blocks
> and neither short-circuit reads nor block caching is enabled for the DFS
> client ... disabling the block cache makes sense on very large scans.
> It turns out that synchronizing on istream in HFileBlock.readAtOffset is the
> culprit.
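
The contention described in the quoted issue can be illustrated with plain JDK file I/O. The sketch below is not the HBase implementation or the attached patch; the class and method names are hypothetical. It contrasts a seek-then-read on a shared stream, which has to be serialized behind a lock, with a positional read that carries its own offset.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration only; not taken from HFileBlock.
final class ReadAtOffsetSketch {

    /**
     * Seek + read on a shared stream. The file pointer is shared state,
     * so every caller must hold the lock for the whole seek and read.
     */
    static byte[] seekAndRead(RandomAccessFile file, long offset, int length)
            throws IOException {
        byte[] buf = new byte[length];
        synchronized (file) {
            file.seek(offset);
            file.readFully(buf);
        }
        return buf;
    }

    /**
     * Positional read. FileChannel.read(ByteBuffer, long) does not touch the
     * channel's own position, so callers need no shared lock.
     */
    static byte[] positionalRead(FileChannel channel, long offset, int length)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            int n = channel.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("reached end of file before reading " + length + " bytes");
            }
        }
        return buf.array();
    }
}
```

Both methods return the same bytes for the same offset and length. Whether positional reads actually run in parallel depends on the underlying implementation, so this is a sketch of the locking difference rather than a performance claim.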