Paul.

> [Commenting on-list, not sure if I should just sign up + put these
> comments direct on reviews?]

Yes. You're supposed to sign in and write comment directly to http://review.hbase.org

> So assuming that
> disabling/enabling prefetch for a region is rare (both in terms of
> frequency, and in terms of a small number of regions being disabled),
> the best performance will probably come from something like
> private final Set<byte[]> prefetchDisabledRegions
> = new CopyOnWriteArraySet<byte[]>()

Excellent suggestion. It's the exact use case. I will follow your suggestion and use CopyOnWriteArraySet instead of HashMap.

Thanks,
Mingjie

On 05/28/2010 08:21 PM, Paul Cowan wrote:
[Commenting on-list, not sure if I should just sign up + put these
comments direct on reviews?]

On 29/05/2010 7:59 AM, Ryan Rawson wrote:
 > Belay that order, I need to inspect this code review... unfortunately
a concurrent collection is not the answer to all multi threading woes.

Changing this class to use ConcurrentHashMap, and completely removing
synchronization, won't work, as it's written. The lazy-loading in
regionCachePreFetchEnabled() (get - if null, put) won't work correctly
with a ConcurrentHashMap without external synchronization.

As it is, the class is thread-safe in the way it accesses the HashMap.
To remove the synchronization (if desired), you could:

1) Replace with a ConcurrentHashMap, and change
regionCachePreFetchEnabled() to do something like:
prefetchRegionCacheEnabled.putIfAbsent(tableName, true);
return prefetchRegionCacheEnabled.get(tableName);
Alternatively, remove the 'true' -- that's the default, right? So there
isn't actually any reason to put true values in the map if there's no
value otherwise. Then just use:
Boolean result = this.prefetchRegionCacheEnabled.get(tableName);
return result == null ? Boolean.TRUE : result;
(and it's simpler, right?)


2) Again, given that 'true' is the default value, there's probably
actually no reason to use a Map at all. All you want to know is which
ones have specifically had fetching disabled, right? So assuming that
disabling/enabling prefetch for a region is rare (both in terms of
frequency, and in terms of a small number of regions being disabled),
the best performance will probably come from something like
private final Set<byte[]> prefetchDisabledRegions
= new CopyOnWriteArraySet<byte[]>()
then disable is just:
prefetchDisabledRegions.add(name);
enable is:
prefetchDisabledRegions.remove(name);
and isEnabled is just
prefetchDisabledRegions.contains(name);
which is probably going to be quicker if disabling is done only for a
few regions (so linear walking of the underlying array is quicker than
hashing + probing) -- and remember, scans of CopyOnWriteArraySet are
completely lock-free. Also, it's cleaner IMHO...


3) Use the solution from #2 with a good ConcurrentHashSet (there isn't
one in the JDK, unfortunately, but maybe you have one available), which
is more useful if large numbers of regions will be disabled or if
regions will be toggled on/off frequently.

Cheers,

Paul

PS: Minor suggestion: this class is inconsistent with variable/method
names as to whether 'prefetch' is one camel-cased word (prefetch) or two
(preFetch). Just thought I'd mention it!

Reply via email to