Hi all, A lot of material on the internet suggests setting dfs.block.size larger, e.g. from 64M to 256M, when the job is large, and claims that performance will improve. But I'm not clear on why increasing the block size would improve performance. I know that increasing the block size reduces the number of map tasks for the same input, but why would fewer map tasks improve overall performance?
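To make the part I do understand concrete, here is a rough sketch of the arithmetic. It assumes roughly one map task per block-sized split, which I believe is the usual default for file-based input, and a hypothetical 10 GB input. The function name and the sizes are made up for illustration only.

```python
MB = 1024 * 1024
GB = 1024 * MB


def num_map_tasks(input_bytes, block_bytes):
    """Estimate the map task count, assuming one map task per block-sized split."""
    # Ceiling division in pure integer arithmetic: a partial last block
    # still needs its own map task.
    return -(-input_bytes // block_bytes)


if __name__ == "__main__":
    input_size = 10 * GB  # hypothetical input size, not from a real job
    for block_mb in (64, 256):
        tasks = num_map_tasks(input_size, block_mb * MB)
        print(f"block size {block_mb}M -> about {tasks} map tasks")
```

So under these assumptions, going from 64M to 256M cuts the number of map tasks by a factor of four. What I don't see is why that alone makes the job faster.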
Any comments would be highly valued, and thanks in advance. Best Regards, Carp