[ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481715#comment-14481715 ]
Charles Lamb commented on HDFS-7923:
------------------------------------

Here is a description of the heuristic that my patch implements for the NN to decide how to answer the "should I send a BR?" question. To keep it relatively simple, consider three parameters:

* The max # of FBR requests that the NN is willing to process at any given time (to be called 'dfs.namenode.max.concurrent.block.reports', with a default of Integer.MAX_VALUE).
* The DN's configured block report interval ('dfs.blockreport.intervalMsec'). This parameter already exists.
* The max time we ever want the NN to go without receiving an FBR from a given DN ('dfs.blockreport.max.deferMsec').

If the time since the last FBR received from the DN is less than dfs.blockreport.intervalMsec, then the NN returns false ("No, don't send an FBR"). In theory, this should never happen if the DN is obeying dfs.blockreport.intervalMsec.

If the number of block reports currently being processed by the NN is less than dfs.namenode.max.concurrent.block.reports, and the time since it last received an FBR from the DN sending the heartbeat is greater than dfs.blockreport.intervalMsec, then the NN automatically answers true ("Yes, send along an FBR").

If the number of BRs being processed by the NN is greater than dfs.namenode.max.concurrent.block.reports when it receives the heartbeat, then it checks the time since it last received an FBR from the DN sending the heartbeat. If that time is greater than dfs.blockreport.max.deferMsec, it returns true ("Yes, send along an FBR"). If it is less than dfs.blockreport.max.deferMsec, it returns false.
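The heuristic described above can be sketched in Java roughly as follows. This is only an illustration, not code from the patch: the class name BlockReportThrottle, the method name, and its parameters are hypothetical. The description does not say what happens when a value is exactly equal to a threshold, so the boundary handling here (an NN at the limit counts as busy, and an elapsed time equal to a threshold counts as having reached it) is an assumption.

```java
/**
 * Hypothetical sketch of the NN-side decision for "should I send a BR?".
 * Names are illustrative and do not come from the HDFS-7923 patch.
 */
final class BlockReportThrottle {
    private final int maxConcurrentReports; // dfs.namenode.max.concurrent.block.reports
    private final long intervalMs;          // dfs.blockreport.intervalMsec
    private final long maxDeferMs;          // dfs.blockreport.max.deferMsec

    BlockReportThrottle(int maxConcurrentReports, long intervalMs, long maxDeferMs) {
        this.maxConcurrentReports = maxConcurrentReports;
        this.intervalMs = intervalMs;
        this.maxDeferMs = maxDeferMs;
    }

    /**
     * @param msSinceLastFbr    time since the NN last received an FBR from this DN
     * @param reportsInProgress number of block reports the NN is processing now
     * @return true for "Yes, send along an FBR", false for "No, don't send an FBR"
     */
    boolean shouldSendFullBlockReport(long msSinceLastFbr, int reportsInProgress) {
        // Case 1: the DN reported within its configured interval, so decline.
        if (msSinceLastFbr < intervalMs) {
            return false;
        }
        // Case 2: the NN has spare capacity and the report is due, so accept.
        if (reportsInProgress < maxConcurrentReports) {
            return true;
        }
        // Case 3: the NN is busy. Accept only if the DN has already been
        // deferred for the maximum tolerated time (equality is an assumption).
        return msSinceLastFbr >= maxDeferMs;
    }
}
```

With the proposed default of Integer.MAX_VALUE for the concurrency limit, the third case is effectively never reached, so the NN would accept every FBR that is due.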
> The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7923
>                 URL: https://issues.apache.org/jira/browse/HDFS-7923
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Colin Patrick McCabe
>            Assignee: Charles Lamb
>         Attachments: HDFS-7923.000.patch
>
>
> The DataNodes should rate-limit their full block reports. They can do this
> by first sending a heartbeat message to the NN with an optional boolean set
> which requests permission to send a full block report. If the NN responds
> with another optional boolean set, the DN will send an FBR... if not, it will
> wait until later. This can be done compatibly with optional fields.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)