TrackerDistributedCacheManager takes a blocking lock for a loop that executes 10K times --------------------------------------------------------------------------------------
Key: MAPREDUCE-1909
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1909
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Dick King
Assignee: Dick King

In {{TrackerDistributedCacheManager.java}}, in the portion where the cache is cleaned up, the lock is taken on the main hash table and then all the entries are scanned to see whether they can be deleted. That is a long time to hold the lock, since the table is likely to have 10K entries. I would like to reduce the longest lock duration by maintaining the set of {{CacheStatus}} es to delete incrementally:

1: Let there be a new {{HashSet}}, {{deleteSet}}, that's protected under {{synchronized(cachedArchives)}}
2: When {{refcount}} is decreased to 0, move the {{CacheStatus}} from {{cachedArchives}} to {{deleteSet}}
3: When seeking an existing {{CacheStatus}}, look in {{deleteSet}} if it isn't in {{cachedArchives}}
4: When {{refcount}} is increased from 0 to 1 in a pre-existing {{CacheStatus}} [see 3:, above], move the {{CacheStatus}} from {{deleteSet}} to {{cachedArchives}}
5: When we clean the cache, under {{synchronized(cachedArchives)}}, move {{deleteSet}} to a local variable and create a new empty {{HashSet}}. This is constant time.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
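The five steps above could be sketched roughly as follows. This is a minimal, hypothetical illustration, not the actual {{TrackerDistributedCacheManager}} code: the class and method names ({{DeleteSetSketch}}, {{acquire}}, {{release}}, {{drainDeleteSet}}) and the {{CacheStatus}} stub are assumptions, and a map keyed by cache path is used in place of a raw {{HashSet}} so that step 3's lookup is cheap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the incremental delete-set scheme proposed above. */
public class DeleteSetSketch {

    /** Stub standing in for the real CacheStatus; only refcount matters here. */
    static class CacheStatus {
        int refcount;
        final String path;
        CacheStatus(String path) { this.path = path; }
    }

    // Step 1: both structures are guarded by synchronized (cachedArchives).
    private final Map<String, CacheStatus> cachedArchives = new HashMap<>();
    // Keyed by path (rather than a raw HashSet) so step 3 is an O(1) lookup.
    private Map<String, CacheStatus> deleteSet = new HashMap<>();

    /** Steps 3 and 4: find or create an entry, resurrecting it from deleteSet
     *  if its refcount had previously dropped to 0. */
    public CacheStatus acquire(String path) {
        synchronized (cachedArchives) {
            CacheStatus s = cachedArchives.get(path);
            if (s == null) {
                s = deleteSet.remove(path);     // step 3: check deleteSet
                if (s == null) {
                    s = new CacheStatus(path);  // genuinely new entry
                }
                cachedArchives.put(path, s);    // step 4: move it back
            }
            s.refcount++;
            return s;
        }
    }

    /** Step 2: when refcount drops to 0, move the entry into deleteSet. */
    public void release(String path) {
        synchronized (cachedArchives) {
            CacheStatus s = cachedArchives.get(path);
            if (s != null && --s.refcount == 0) {
                cachedArchives.remove(path);
                deleteSet.put(path, s);
            }
        }
    }

    /** Step 5: swap deleteSet out in constant time under the lock; the
     *  caller can then delete the returned entries without holding it. */
    public Map<String, CacheStatus> drainDeleteSet() {
        synchronized (cachedArchives) {
            Map<String, CacheStatus> toDelete = deleteSet;
            deleteSet = new HashMap<>();
            return toDelete;
        }
    }
}
```

The key point is that the lock is never held while iterating over all 10K entries: cleanup swaps a reference under the lock and does the slow deletion work outside it.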