I've recently encountered a situation where lots of regions need compaction at the same time. This leaves callers blocked waiting to acquire the monitor guarding 'reclaimMemStoreMemory'.
When the number of regions is large, the blocking can last a long time. In my specific case the callers are Reducer instances, and the reducer timeout (mapred.task.timeout) actually gets reached, so the Reducer gets restarted.

I was thinking about a possible enhancement: allow the client to specify on HTable that, in case of compaction, it wants to be pushed back immediately instead of being blocked. This could be done by adding a boolean to Put objects that would be set in HTable.put(), reflecting the HTable setting (thus avoiding having several different values of this flag within one list of Puts). Modifying this flag would trigger a flushCommits() so that all Put instances in the buffer carry the same value of the flag.

When a request is pushed back, the behavior would be to throw an IOException, leaving the write buffer exactly as it was prior to the call. In my case this would allow the Reducer to wait on its own side, updating its status so mapred.task.timeout is not reached. I've included a rough sketch of what this could look like below.

What do you think?

Mathias.
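To make the idea concrete, here is a minimal sketch of the caller side under this proposal. HTable.setImmediatePushback() is a hypothetical method name I'm using purely for illustration; it does not exist in the current API, and the retry loop just shows the contract (IOException on push-back, write buffer untouched):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PushbackSketch {
  public static void main(String[] args) throws IOException, InterruptedException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test_table");
    table.setAutoFlush(false);

    // Hypothetical new setting, not part of the current API: ask the
    // client to throw instead of blocking in reclaimMemStoreMemory.
    // Flipping it would first trigger flushCommits() so every buffered
    // Put carries the same value of the flag.
    // table.setImmediatePushback(true);

    Put put = new Put(Bytes.toBytes("row-1"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));

    boolean done = false;
    while (!done) {
      try {
        table.put(put);   // HTable.put() would copy the flag onto the Put
        done = true;
      } catch (IOException pushedBack) {
        // Proposed contract: the write buffer is left exactly as it was
        // before the call, so backing off and retrying is safe. A Reducer
        // would also report progress here (e.g. reporter.progress()) so
        // mapred.task.timeout is not reached while waiting.
        Thread.sleep(1000);
      }
    }

    table.flushCommits();
    table.close();
  }
}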
