[ https://issues.apache.org/jira/browse/HBASE-14708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Elliott Clark updated HBASE-14708:
----------------------------------
    Summary: Use copy on write Map for region location cache  (was: Use copy on write TreeMap for region location cache)

> Use copy on write Map for region location cache
> -----------------------------------------------
>
>                 Key: HBASE-14708
>                 URL: https://issues.apache.org/jira/browse/HBASE-14708
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 1.1.2
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-14708-v10.patch, HBASE-14708-v11.patch,
> HBASE-14708-v12.patch, HBASE-14708-v2.patch, HBASE-14708-v3.patch,
> HBASE-14708-v4.patch, HBASE-14708-v5.patch, HBASE-14708-v6.patch,
> HBASE-14708-v7.patch, HBASE-14708-v8.patch, HBASE-14708-v9.patch,
> HBASE-14708.patch, anotherbench.zip, location_cache_times.pdf, result.csv
>
>
> Internally, a co-worker profiled their application that was talking to HBase.
> More than 60% of the time was spent locating a region. This was while the
> cluster was stable and no regions were moving.
> To figure out whether there was a faster way to cache region locations, I wrote
> a benchmark here: https://github.com/elliottneilclark/benchmark-hbase-cache
> This tries to simulate a heavy load on the location cache.
> * 24 different threads.
> * 2 deleting location data
> * 2 adding location data
> * Using floor to get the result.
> To repeat my work, just run ./run.sh and it should produce a result.csv
> Results:
> ConcurrentSkipListMap is a good middle ground. It has equal speed for
> reading and writing.
> However, most operations will not need to remove or add a region location.
> There will potentially be several orders of magnitude more reads of cached
> locations than cache clears.
> So I propose a copy-on-write tree map.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
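
The copy-on-write idea proposed in the issue description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class name CopyOnWriteLocationCache and its methods are hypothetical, and region start keys and server names are simplified to String. Reads use floorEntry on an immutable snapshot without locking, while the rare writes copy the whole map and swap it in.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a copy-on-write location cache; not the class from the patch.
// Keys stand in for region start keys and values for server names (both simplified).
class CopyOnWriteLocationCache {
    // A map is never mutated after it has been published here,
    // so readers can use it without taking a lock.
    private final AtomicReference<TreeMap<String, String>> snapshot =
            new AtomicReference<>(new TreeMap<String, String>());

    // Read path: return the value for the greatest start key <= rowKey, or null.
    String locate(String rowKey) {
        Map.Entry<String, String> entry = snapshot.get().floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }

    // Write path: copy the current map, change the copy, then swap it in.
    // The compare-and-set fails and the loop retries if another writer published first.
    void put(String startKey, String server) {
        TreeMap<String, String> current;
        TreeMap<String, String> copy;
        do {
            current = snapshot.get();
            copy = new TreeMap<>(current);
            copy.put(startKey, server);
        } while (!snapshot.compareAndSet(current, copy));
    }

    // Removal follows the same copy, modify, swap pattern.
    void remove(String startKey) {
        TreeMap<String, String> current;
        TreeMap<String, String> copy;
        do {
            current = snapshot.get();
            copy = new TreeMap<>(current);
            copy.remove(startKey);
        } while (!snapshot.compareAndSet(current, copy));
    }
}
```

The trade-off is that every write costs a full copy of the map, which only pays off when reads outnumber writes by a wide margin, as the description expects for a stable cluster.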