In the merge sort (1), the number of processors required decreases with each superstep. If I'm not mistaken, the current implementation of Hama keeps all the processor slots blocked until the job completes, even though it is known that some of them will do no further processing. Can we improve this? I am not aware of any other algorithms where the number of processors required keeps decreasing as the job progresses.
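To make the concern concrete, here is a rough sketch of how many slots would sit idle. It assumes a simple pairwise merge scheme in which every peer sorts locally in the first superstep and the number of busy peers roughly halves in each later superstep; that scheme and the function names are my own illustration, not taken from the paper in (1) or from the Hama API.

```python
def active_peers_per_superstep(num_peers):
    """Busy peers in each superstep, assuming a pairwise merge tree (hypothetical model)."""
    active = []
    busy = num_peers
    while busy >= 1:
        active.append(busy)
        if busy == 1:
            break
        # Each merge round combines peers in pairs, so about half stay busy.
        busy = (busy + 1) // 2
    return active


def idle_slot_supersteps(num_peers):
    """Total (slot, superstep) pairs that are held but do no work."""
    active = active_peers_per_superstep(num_peers)
    return sum(num_peers - busy for busy in active)


print(active_peers_per_superstep(8))  # [8, 4, 2, 1]
print(idle_slot_supersteps(8))        # 17
```

Under these assumptions, a job on 8 slots holds 32 slot-supersteps in total, of which 17 do no work.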
(1) - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.870

Praveen
