Sagar Naik wrote:
> Hi Raghu,
> The periodic "du" and block report threads thrash the disk. (Block
> reports take about 21 minutes on average.)
> I think the datanode threads are not able to do much and freeze.

Yes, that is the known problem we talked about in the earlier mails in
this thread.
When you have millions of blocks, a one-hour interval for du and block
reports is too frequent for you. Maybe you could increase it to
something like 6 or 12 hours.
That still does not fix the block report problem, since the DataNode
does the scan in-line.
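
As a rough back-of-the-envelope illustration (not a measurement), the
sketch below estimates what fraction of each interval is spent on a
block report, using the ~21 minute figure quoted above. The function
name is made up for this example, and it assumes the report takes the
same time whatever the interval is; du scans come on top of this.

```python
def report_duty_cycle(report_minutes, interval_hours):
    """Fraction of each interval spent running one block report."""
    return report_minutes / (interval_hours * 60.0)


if __name__ == "__main__":
    # 21 minutes is the average report time quoted earlier in the thread.
    for hours in (1, 6, 12):
        share = report_duty_cycle(21, hours)
        print(f"{hours:>2} h interval: {share:.1%} of the time in block reports")
```

Under these assumptions the share drops from about 35% at one hour to
roughly 5.8% at 6 hours and 2.9% at 12 hours. Each individual report is
still just as disruptive while it runs.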
As I mentioned in earlier mails, we should really fix the block report
problem. A simple fix would be to scan the directories in the
background (very slowly, unlike DU).
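
A minimal sketch of that idea, in Python rather than the DataNode's
Java and not taken from any Hadoop code: walk the directory tree in a
background thread and pause regularly so the scan never saturates the
disk. The names slow_scan and scan_in_background and the
files_per_pause and pause_seconds knobs are hypothetical.

```python
import os
import threading
import time


def slow_scan(root, files_per_pause=100, pause_seconds=0.05, sleep=time.sleep):
    """Walk `root`, yielding (path, size) and pausing every few files."""
    seen = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file disappeared mid-scan, e.g. a deleted block
            yield path, size
            seen += 1
            if seen % files_per_pause == 0:
                sleep(pause_seconds)  # throttle: give the disk back to real work


def scan_in_background(root, result, **kwargs):
    """Fill the dict `result` with path -> size from a daemon thread."""
    def run():
        result.update(slow_scan(root, **kwargs))

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The caller would read `result` once the thread has finished (or take a
snapshot under a lock) instead of walking the disk in-line when a
report is due. Tuning the pause trades scan freshness against disk
load.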
Even after fixing block reports, you should be aware that an excessive
number of blocks does impact performance. No system can guarantee
performance when overloaded. What we want is for Hadoop to degrade
gracefully rather than have DNs get killed.
Raghu.