[ https://issues.apache.org/jira/browse/HBASE-23066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941073#comment-16941073 ]

Jacob LeBlanc commented on HBASE-23066:
---------------------------------------

I've run some performance tests to demonstrate the effectiveness of the patch.

I have not patched our production cluster yet, as I'm waiting on confirmation 
from the AWS service team that I won't be overwriting AWS-specific changes in 
the HStore class, but I've done some sampling on a test cluster.

The basic setup is an EMR cluster running HBase 1.4.9 backed by S3, with 
Ganglia installed to capture the metrics. I have a stress tester executing 
about 1000 scans per second against a 1.5 GB region. Prefetching is enabled, 
and I have one region server that is unpatched (or has the new configuration 
setting disabled) and one region server that is patched with the new 
configuration option enabled. I then execute the following test:

1. Move the region to the desired region server (either patched or unpatched).
2. Wait for prefetching to complete and for mean scan times to normalize.
3. Execute a major compaction on the target region.
4. Check region server UI / logs to see when the compaction completes.
5. Collect data from ganglia.
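
For anyone reproducing this, steps 1 and 3 can be scripted against the HBase 
Java client API. The sketch below is only illustrative: the table, family, 
region, and server names are placeholders, and the new compaction 
cache-on-write option itself is enabled through the setting added by the 
attached patch (not shown here).

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class CompactionCacheTestDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      TableName table = TableName.valueOf("stress_table"); // placeholder table name

      // Make sure the stressed family is prefetched into the block cache on open.
      HColumnDescriptor family =
          admin.getTableDescriptor(table).getFamily(Bytes.toBytes("cf"));
      family.setPrefetchBlocksOnOpen(true);
      admin.modifyColumn(table, family);

      // Step 1: move the region to the desired (patched or unpatched) region server.
      byte[] encodedRegionName = Bytes.toBytes("<encoded-region-name>");
      byte[] destServer = Bytes.toBytes("<host>,16020,<startcode>");
      admin.move(encodedRegionName, destServer);

      // Step 3: once prefetching has finished and scan times have settled,
      // major-compact the target region.
      admin.majorCompactRegion(Bytes.toBytes("<full-region-name>"));
    }
  }
}
{code}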

One issue I identified with my test is that the scans aren't as random as they 
should be, so I believe that after compaction, data is getting cached on read 
more quickly on the unpatched server than it would be if the scans were truly 
random. I can improve the test, but the results still validate the patch.
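
One way to tighten that up is to have the stress tester draw each scan's start 
row uniformly at random over the whole key space, so post-compaction reads 
can't concentrate on a range that is already back in cache. A minimal sketch, 
assuming fixed-width zero-padded numeric row keys and a placeholder table name:

{code:java}
import java.util.concurrent.ThreadLocalRandom;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RandomScanStress {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Table table = connection.getTable(TableName.valueOf("stress_table"))) {
      long keySpace = 10_000_000L; // assumed number of rows in the region
      int rowsPerScan = 100;       // short scans so the random start keys drive the access pattern
      while (true) {
        // Pick a uniformly random start key so reads touch blocks across the whole
        // region instead of repeatedly hitting ranges that are already cached.
        long start = ThreadLocalRandom.current().nextLong(keySpace);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(String.format("%010d", start)));
        scan.setCaching(rowsPerScan);
        int seen = 0;
        try (ResultScanner scanner = table.getScanner(scan)) {
          for (Result result : scanner) {
            if (++seen >= rowsPerScan) {
              break;
            }
          }
        }
      }
    }
  }
}
{code}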

Baseline mean scan time was about 20 - 60 milliseconds. After compaction the 
results were:

Trial 1 (unpatched): mean scan time peaked at over 27000 milliseconds and 
stayed above 5000 milliseconds for 3 minutes
Trial 2 (unpatched): mean scan time peaked at over 27000 milliseconds and 
stayed above 5000 milliseconds for 3.5 minutes
Trial 3 (patched): mean scan time peaked at 282 milliseconds for a single time sample
Trial 4 (patched): mean scan time peaked at just over 1300 milliseconds and 
remained above 1000 milliseconds for 30 seconds
Trial 5 (patched): no noticeable spike in mean scan time

I've attached a picture of a graph of the results.

> Allow cache on write during compactions when prefetching is enabled
> -------------------------------------------------------------------
>
>                 Key: HBASE-23066
>                 URL: https://issues.apache.org/jira/browse/HBASE-23066
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction, regionserver
>    Affects Versions: 1.4.10
>            Reporter: Jacob LeBlanc
>            Assignee: Jacob LeBlanc
>            Priority: Minor
>             Fix For: 1.5.0, 2.3.0
>
>         Attachments: HBASE-23066.patch, performance_results.png, 
> prefetchCompactedBlocksOnWrite.patch
>
>
> In cases where users care a lot about read performance for tables that are 
> small enough to fit into a cache (or the cache is large enough), 
> prefetchOnOpen can be enabled to make the entire table available in cache 
> after the initial region opening is completed. Any new data can also be 
> guaranteed to be in cache with the cacheBlocksOnWrite setting.
>
> However, the missing piece is when all blocks are evicted after a compaction. 
> We found very poor performance after compactions for tables under heavy read 
> load and a slower backing filesystem (S3). After a compaction the prefetching 
> threads need to compete with threads servicing read requests and get 
> constantly blocked as a result. 
>
> This is a proposal to introduce a new cache configuration option that would 
> cache blocks on write during compaction for any column family that has 
> prefetch enabled. This would virtually guarantee all blocks are kept in cache 
> after the initial prefetch on open is completed allowing for guaranteed 
> steady read performance despite a slow backing file system.
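
To make the proposed behavior concrete, here is an illustrative sketch of the 
decision described above (this is not the attached patch, and the method and 
flag names are hypothetical): the normal write path is left alone, and 
compaction-written blocks are cached only when the new option is on and the 
family prefetches its blocks on open.

{code:java}
// Illustrative sketch only -- not the attached patch; names are hypothetical.
public final class CompactionCachePolicy {
  private CompactionCachePolicy() {}

  /** Decide whether a data block being written should also go into the block cache. */
  public static boolean shouldCacheBlockOnWrite(boolean isCompaction,
                                                boolean cacheDataOnWrite,
                                                boolean prefetchOnOpen,
                                                boolean cacheCompactedBlocksOnWrite) {
    if (!isCompaction) {
      // Normal flush/write path is unchanged: honor the existing cacheBlocksOnWrite setting.
      return cacheDataOnWrite;
    }
    // Compaction path: cache newly written blocks when the new option is enabled
    // and the column family is already configured to prefetch its blocks on open.
    return cacheCompactedBlocksOnWrite && prefetchOnOpen;
  }
}
{code}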



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
