[ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7923:
-------------------------------
    Attachment: HDFS-7923.002.patch

Thanks for the review and comments [~cmccabe].

{code}
  public static final String DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_KEY =
      "dfs.namenode.max.concurrent.block.reports";
  public static final int DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_DEFAULT =
      Integer.MAX_VALUE;
{code}

bq. It seems like this should default to something less than the default number 
of RPC handler threads, not to MAX_INT. Given that dfs.namenode.handler.count = 
10, it seems like this should be no more than 5 or 6, right? The main point 
here is to avoid having the NN handler threads completely choked with block 
reports, and that is defeated if the value is MAX_INT. I realize that you 
probably intended this to be configured. But it seems like we should have a 
reasonable default that works for most people.

Actually, my intent was for this feature not to kick in unless it was 
explicitly configured, but since you've said you want it enabled by default, 
I've changed the default of the above setting to 6.
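For concreteness, the revised constants presumably read something like the 
sketch below; the key name comes from the excerpt above, and the value 6 
follows the suggestion relative to dfs.namenode.handler.count = 10 (this is 
not a verbatim patch excerpt):

{code}
// Sketch of the revised defaults, not copied from the patch.
class BlockReportDefaults {
  public static final String DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_KEY =
      "dfs.namenode.max.concurrent.block.reports";
  public static final int DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_DEFAULT = 6;
}
{code}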

{code}
+  /* Number of block reports currently being processed. */
+  private final AtomicInteger blockReportProcessingCount =
+      new AtomicInteger(0);
{code}

bq. I'm not sure an AtomicInteger makes sense here. We only modify this 
variable (write to it) when holding the FSN lock in write mode, right? And we 
only read from it when holding the FSN in read mode. So, there isn't any need 
to add atomic ops.

Actually, it is incremented outside the FSN lock; otherwise it could never be > 1.
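To illustrate why the atomic ops matter when the increment happens outside the 
lock, here's a minimal sketch of a capped try-acquire/release pattern on such 
a counter. The class and method names are hypothetical, not the patch's actual 
code:

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: because heartbeat handlers race to bump the counter
// outside the FSN write lock, a plain int could exceed the cap; a CAS loop
// reserves a slot atomically or fails cleanly.
class BlockReportGate {
  private final AtomicInteger inFlight = new AtomicInteger(0);
  private final int maxConcurrent;

  BlockReportGate(int maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
  }

  /** Returns true if the DN may send a full block report now. */
  boolean tryAcquire() {
    while (true) {
      int cur = inFlight.get();
      if (cur >= maxConcurrent) {
        return false;  // NN is busy; DN retries on a later heartbeat
      }
      if (inFlight.compareAndSet(cur, cur + 1)) {
        return true;   // slot reserved for this DN's report
      }
      // CAS lost a race with another heartbeat handler; retry
    }
  }

  /** Called when the report finishes (or the grant times out). */
  void release() {
    inFlight.decrementAndGet();
  }
}
{code}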

bq. I think we need to track which datanodes we gave the "green light" to, and 
not decrement the counter until they either send that report, or some timeout 
expires. (We need the timeout in case datanodes go away after requesting 
permission-to-send.) The timeout can probably be as short as a few minutes. If 
you can't manage to send an FBR in a few minutes, there's more problems going 
on.

I've added a map called 'pendingBlockReports' to BlockManager to track the 
datanodes that we've given the "ok" to as well as when we gave it to them. 
There's also a method to clean the table.
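Roughly, that bookkeeping looks like the following sketch, with illustrative 
names (the actual patch keeps this state in BlockManager):

{code}
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch of pendingBlockReports: datanode id -> time the
// "ok to send" was granted, plus a sweep that expires grants whose report
// never arrived within the timeout.
class PendingBlockReports {
  private final Map<String, Long> pending = new HashMap<>();
  private final long timeoutMs;

  PendingBlockReports(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  synchronized void grant(String datanodeUuid, long nowMs) {
    pending.put(datanodeUuid, nowMs);
  }

  /** Remove the entry when the FBR actually arrives. */
  synchronized boolean complete(String datanodeUuid) {
    return pending.remove(datanodeUuid) != null;
  }

  /** Expire grants older than the timeout; returns how many were dropped. */
  synchronized int cleanExpired(long nowMs) {
    int dropped = 0;
    for (Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
         it.hasNext();) {
      if (nowMs - it.next().getValue() > timeoutMs) {
        it.remove();
        dropped++;
      }
    }
    return dropped;
  }
}
{code}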

{code}
  public static final String DFS_BLOCKREPORT_MAX_DEFER_MSEC_KEY =
      "dfs.blockreport.max.deferMsec";
  public static final long DFS_BLOCKREPORT_MAX_DEFER_MSEC_DEFAULT =
      Long.MAX_VALUE;
{code}

bq. Do we really need this config key?

I've added a TreeBidiMap called lastBlockReportTime to track this. I would have 
used Guava instead of apache.commons.collections, but Guava doesn't have a 
sorted BidiMap.
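The idea is two views kept in sync: a sorted time -> datanode view so the NN 
can find the longest-waiting DN, and a datanode -> time view for updates. 
Here's a sketch using plain java.util maps in place of commons-collections' 
TreeBidiMap; one simplification is that it assumes no two datanodes share a 
timestamp (TreeBidiMap enforces uniqueness on both sides):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-in for the TreeBidiMap behind lastBlockReportTime;
// not the commons-collections API used in the patch.
class LastBlockReportTimes {
  private final TreeMap<Long, String> byTime = new TreeMap<>();
  private final Map<String, Long> byNode = new HashMap<>();

  synchronized void recordReport(String datanodeUuid, long timeMs) {
    Long old = byNode.put(datanodeUuid, timeMs);
    if (old != null) {
      byTime.remove(old);  // keep the two views consistent
    }
    byTime.put(timeMs, datanodeUuid);
  }

  /** The datanode whose last full block report is oldest, or null. */
  synchronized String oldestReporter() {
    return byTime.isEmpty() ? null : byTime.firstEntry().getValue();
  }
}
{code}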


> The DataNodes should rate-limit their full block reports by asking the NN on 
> heartbeat messages
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7923
>                 URL: https://issues.apache.org/jira/browse/HDFS-7923
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Colin Patrick McCabe
>            Assignee: Charles Lamb
>         Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
> HDFS-7923.002.patch
>
>
> The DataNodes should rate-limit their full block reports.  They can do this 
> by first sending a heartbeat message to the NN with an optional boolean set 
> which requests permission to send a full block report.  If the NN responds 
> with another optional boolean set, the DN will send an FBR... if not, it will 
> wait until later.  This can be done compatibly with optional fields.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
