Hi, I posted the same question to the dev list earlier but haven't heard back from anyone, so I figured someone here might be able to shed some light on it. I have recently been reading the Hadoop 1.1.0 source code to better understand the internals, and I have learned a lot from it so far. While looking at ReduceTask.java, I noticed some synchronization that seems redundant to me. Specifically, in MapOutputCopier's copyOutput method, we synchronize on mapOutputFilesOnDisk before calling addToMapOutputFilesOnDisk, but addToMapOutputFilesOnDisk itself synchronizes on mapOutputFilesOnDisk again. On top of that, the entire block in copyOutput is already synchronized on ReduceTask.this, which I find confusing. I'm not sure if this is the right place to ask, but I'd appreciate it if someone could point out why we need this sort of triple locking here.
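To make the pattern concrete, here is a simplified sketch of the nesting as I read it. This is not the actual Hadoop code: the class name TripleLockSketch, the String element type, the sawBothMonitorsHeld field and the size() and bothMonitorsWereHeld() helpers are my own stand-ins. Only the order of the three synchronized blocks is meant to mirror what I described above.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for ReduceTask; only the lock nesting is of interest.
class TripleLockSketch {
    // Simplified stand-in for mapOutputFilesOnDisk (element type is an assumption).
    private final SortedSet<String> mapOutputFilesOnDisk = new TreeSet<>();
    // Records whether both monitors were held in the innermost block.
    private boolean sawBothMonitorsHeld = false;

    // Stand-in for addToMapOutputFilesOnDisk: takes the collection's monitor itself.
    void addToMapOutputFilesOnDisk(String file) {
        // Lock 3: Java intrinsic locks are reentrant, so re-acquiring a monitor
        // the current thread already holds succeeds without blocking.
        synchronized (mapOutputFilesOnDisk) {
            sawBothMonitorsHeld =
                Thread.holdsLock(this) && Thread.holdsLock(mapOutputFilesOnDisk);
            mapOutputFilesOnDisk.add(file);
        }
    }

    // Stand-in for the relevant part of MapOutputCopier.copyOutput.
    void copyOutput(String file) {
        synchronized (this) {                      // Lock 1: ReduceTask.this in the original
            synchronized (mapOutputFilesOnDisk) {  // Lock 2: the collection
                addToMapOutputFilesOnDisk(file);   // acquires the collection's monitor again
            }
        }
    }

    int size() {
        synchronized (mapOutputFilesOnDisk) {
            return mapOutputFilesOnDisk.size();
        }
    }

    boolean bothMonitorsWereHeld() {
        return sawBothMonitorsHeld;
    }

    public static void main(String[] args) {
        TripleLockSketch sketch = new TripleLockSketch();
        sketch.copyOutput("map_0.out");
        System.out.println(sketch.size());                 // prints 1
        System.out.println(sketch.bothMonitorsWereHeld()); // prints true
    }
}
```

In this sketch, lock 2 is redundant for any caller that goes through copyOutput, since the callee takes the same monitor anyway. What the sketch cannot show is whether other code paths in ReduceTask rely on only one of the two monitors, which is essentially what I am asking.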
Regards, Jim