Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-11 Thread Ayush Saxena
Thanks, everyone! Just to conclude the thread: I have created HDFS-15162 to track this. -Ayush

Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-09 Thread Ayush Saxena
Hi Stephen, We are trying this on 3.1.1. We aren’t upgrading from 2.x; we are trying to increase the cluster size to go beyond 10K datanodes. In the process, we found that block reports from this many DNs are quite bothersome. There are plenty of reasons why block reports bother

Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-07 Thread Stephen O'Donnell
Are you seeing this problem on the 3.x branch, and if so, did the problem exist before you upgraded to 3.x? I am wondering if the situation is better or worse since moving to 3.x. Also, do you believe the issue is driven by the namenode holding its lock for too long while it processes each block

Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-06 Thread Surendra Singh Lilhore
Thanks Wei-Chiu, I feel IBR is now more stable in branch 3.x. If BR was added just to guard against bugs in IBR, I feel we should fix those bugs in IBR itself. Adding one piece of functionality to work around bugs in another is not good. I also think the DN should send a BR only in the failure and process start scenarios. -Surendra

Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-06 Thread Ayush Saxena
Hi Wei-Chiu, Thanks for the response. Yes, we are talking about the FBR only. Increasing the interval between reports limits the problem, but doesn’t seem to solve it. As the cluster size increases, the interval needs to keep growing, and we cannot increase it indefinitely, as in some cases an FBR is needed.

Re: Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-06 Thread Wei-Chiu Chuang
Hey Ayush, Thanks a lot for your proposal. Do you mean the Full Block Report that is sent out every 6 hours per DataNode? Someone told me they reduced the FBR frequency to once every 24 hours and it seems okay. One of the purposes of FBR was to prevent bugs in the incremental block report implementation.
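
For a rough sense of scale, the arithmetic behind the 6-hour and 24-hour intervals mentioned here can be sketched as follows. This is only a back-of-the-envelope estimate, not a measurement: it assumes each DataNode sends exactly one full block report per interval and that reports are spread evenly over time. The function name is hypothetical and the 10,000-node figure is the cluster size discussed in this thread. In HDFS the interval is set by `dfs.blockreport.intervalMsec`, whose default of 21600000 ms corresponds to the 6 hours above.

```python
def full_block_reports_per_hour(num_datanodes: int, interval_hours: float) -> float:
    """Average number of full block reports reaching the NameNode per hour.

    Assumption (illustrative only): every DataNode sends exactly one report
    per interval, and reports are evenly spread across that interval.
    """
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    return num_datanodes / interval_hours


if __name__ == "__main__":
    # Compare the 6-hour interval with the 24-hour interval for 10k DataNodes.
    for hours in (6, 24):
        rate = full_block_reports_per_hour(10_000, hours)
        seconds_between = 3600 / rate
        print(f"{hours:>2} h interval: {rate:.0f} reports/hour, "
              f"one every {seconds_between:.2f} s")
```

Under these assumptions, stretching the interval from 6 to 24 hours cuts the arrival rate by a factor of four, but the rate still grows linearly with the number of DataNodes.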

Restrict Frequency of BlockReport To Namenode startup and failover

2020-02-04 Thread Ayush Saxena
Hi All, Surendra and I have lately been trying to minimise the impact of Block Reports on the Namenode in huge clusters. We observed that in a huge cluster of about 10k datanodes, the periodic block reports adversely impact Namenode performance. We have been thinking to restrict the block reports to be