[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-06-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed.  Let's fix the checkstyle issues in a follow-on.

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.8.0

 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, 
 HDFS-7923.006.patch, HDFS-7923.007.patch


 The DataNodes should rate-limit their full block reports.  They can do this 
 by first sending a heartbeat message to the NN with an optional boolean set 
 that requests permission to send a full block report.  If the NN responds 
 with another optional boolean set, the DN will send an FBR; if not, it will 
 wait until later.  This can be done compatibly by using optional fields.
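
The handshake described above can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patches: the class names, field names, and the simple in-flight counter on the NN side are all assumptions made for the example.

```java
// Hypothetical message types for illustration; not the actual HDFS protocol classes.
final class HeartbeatRequest {
  final String datanodeId;
  final boolean requestFullBlockReport; // optional flag; absent means false

  HeartbeatRequest(String datanodeId, boolean requestFullBlockReport) {
    this.datanodeId = datanodeId;
    this.requestFullBlockReport = requestFullBlockReport;
  }
}

final class HeartbeatResponse {
  final boolean fullBlockReportGranted; // optional flag; absent means false

  HeartbeatResponse(boolean fullBlockReportGranted) {
    this.fullBlockReportGranted = fullBlockReportGranted;
  }
}

// Hypothetical NameNode side: grants permission while fewer than
// maxConcurrentReports full block reports are outstanding.
final class SketchNameNode {
  private final int maxConcurrentReports;
  private int inFlight = 0;

  SketchNameNode(int maxConcurrentReports) {
    this.maxConcurrentReports = maxConcurrentReports;
  }

  synchronized HeartbeatResponse handleHeartbeat(HeartbeatRequest req) {
    if (req.requestFullBlockReport && inFlight < maxConcurrentReports) {
      inFlight++;
      return new HeartbeatResponse(true);
    }
    return new HeartbeatResponse(false);
  }

  synchronized void blockReportReceived() {
    if (inFlight > 0) {
      inFlight--;
    }
  }
}

// Hypothetical DataNode side: asks on each heartbeat until permission arrives.
final class SketchDataNode {
  private final String id;
  private boolean reportDue = true;  // a full block report is pending
  private boolean permitted = false; // the NN has granted permission

  SketchDataNode(String id) {
    this.id = id;
  }

  /** Sends one heartbeat; returns true if the DN now holds permission. */
  boolean heartbeat(SketchNameNode nn) {
    boolean ask = reportDue && !permitted;
    HeartbeatResponse resp = nn.handleHeartbeat(new HeartbeatRequest(id, ask));
    if (ask && resp.fullBlockReportGranted) {
      permitted = true;
    }
    return permitted;
  }

  /** Sends the FBR if permitted; otherwise does nothing and waits. */
  boolean sendFullBlockReportIfPermitted(SketchNameNode nn) {
    if (!permitted) {
      return false;
    }
    nn.blockReportReceived(); // stands in for the actual report RPC
    permitted = false;
    reportDue = false;
    return true;
  }
}
```

With a limit of one, a second DN that asks while the first still holds permission is simply told to wait and asks again on a later heartbeat.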



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-06-08 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: HDFS-7923.007.patch

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, 
 HDFS-7923.006.patch, HDFS-7923.007.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-06-05 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: HDFS-7923.006.patch

I need to make sure to treat the initial block report delay as being in 
seconds, not milliseconds.  Thanks to Andrew for pointing this out.  Updating 
with patch 6.
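
The unit mix-up mentioned above is easy to make when a delay configured in seconds is compared against a millisecond clock. A minimal sketch of an explicit conversion, assuming a hypothetical helper name (this is not the code from patch 6):

```java
import java.util.concurrent.TimeUnit;

final class InitialDelaySketch {
  /** Converts a delay configured in seconds to milliseconds. */
  static long initialDelayMillis(long configuredSeconds) {
    return TimeUnit.SECONDS.toMillis(configuredSeconds);
  }
}
```

Doing the conversion once, at the point where the setting is read, keeps the rest of the scheduling code in a single unit.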

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, 
 HDFS-7923.006.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-06-05 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: HDFS-7923.005.patch

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, 
 HDFS-7923.005.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-06-05 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: (was: HDFS-7923.005.patch)

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-05-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: HDFS-7923.004.patch

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-05-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
 Target Version/s: 2.8.0
Affects Version/s: 2.8.0
   Status: Patch Available  (was: In Progress)

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-05-22 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7923:
---
Attachment: HDFS-7923.003.patch

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Charles Lamb
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch, HDFS-7923.003.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-04-29 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7923:
---
Attachment: HDFS-7923.002.patch

Thanks for the review and comments [~cmccabe].

{code}
  public static final String DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_KEY =
      "dfs.namenode.max.concurrent.block.reports";
  public static final int DFS_NAMENODE_MAX_CONCURRENT_BLOCK_REPORTS_DEFAULT =
      Integer.MAX_VALUE;
{code}

bq. It seems like this should default to something less than the default number 
of RPC handler threads, not to MAX_INT. Given that dfs.namenode.handler.count = 
10, it seems like this should be no more than 5 or 6, right? The main point 
here is to avoid having the NN handler threads completely choked with block 
reports, and that is defeated if the value is MAX_INT. I realize that you 
probably intended this to be configured. But it seems like we should have a 
reasonable default that works for most people.

Actually, my intent was for this feature not to kick in unless it was 
configured, but you have said that you want it enabled by default. I've changed 
the default for the above setting to 6.

{code}
+  /* Number of block reports currently being processed. */
+  private final AtomicInteger blockReportProcessingCount =
+      new AtomicInteger(0);
{code}

bq. I'm not sure an AtomicInteger makes sense here. We only modify this 
variable (write to it) when holding the FSN lock in write mode, right? And we 
only read from it when holding the FSN lock in read mode. So, there isn't any need 
to add atomic ops.

Actually, it is incremented outside the FSN lock; otherwise it could never be > 1.
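
To illustrate why an atomic counter matters when the increment happens outside the lock, here is a minimal sketch of a capped counter using a compare-and-set loop. The class and method names are hypothetical, and this is not the logic from the patch:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical admission counter; callers do not hold any lock.
final class BlockReportAdmission {
  private final int maxConcurrent;
  private final AtomicInteger processing = new AtomicInteger(0);

  BlockReportAdmission(int maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
  }

  /** Tries to claim a slot; returns false when the cap is already reached. */
  boolean tryBegin() {
    while (true) {
      int current = processing.get();
      if (current >= maxConcurrent) {
        return false;
      }
      // Retry if another thread changed the count between get() and here.
      if (processing.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Releases a slot once a block report has been processed. */
  void end() {
    processing.decrementAndGet();
  }

  int inProgress() {
    return processing.get();
  }
}
```

Because several handler threads can call `tryBegin()` concurrently, the count can exceed 1 without any of them holding the FSN lock, and the compare-and-set keeps it from overshooting the cap.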

bq. I think we need to track which datanodes we gave the green light to, and 
not decrement the counter until they either send that report, or some timeout 
expires. (We need the timeout in case datanodes go away after requesting 
permission-to-send.) The timeout can probably be as short as a few minutes. If 
you can't manage to send an FBR in a few minutes, there are bigger problems 
going on.

I've added a map called 'pendingBlockReports' to BlockManager to track the 
datanodes that we've given the OK to, as well as when we gave it to them. 
There's also a method to clean the table.
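
A map of this kind might look roughly like the sketch below. Only the name 'pendingBlockReports' comes from the comment above; the class shape, the method names, and the choice to pass the current time in as a parameter are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical tracker of datanodes that were granted permission to send an FBR.
final class PendingBlockReports {
  private final Map<String, Long> grantedAtMillis = new HashMap<>();
  private final int maxPending;
  private final long timeoutMillis;

  PendingBlockReports(int maxPending, long timeoutMillis) {
    this.maxPending = maxPending;
    this.timeoutMillis = timeoutMillis;
  }

  /** Grants permission if a slot is free, after dropping expired grants. */
  synchronized boolean tryGrant(String datanodeId, long nowMillis) {
    removeExpired(nowMillis);
    if (grantedAtMillis.containsKey(datanodeId)) {
      return true; // already holds an unexpired grant
    }
    if (grantedAtMillis.size() >= maxPending) {
      return false;
    }
    grantedAtMillis.put(datanodeId, nowMillis);
    return true;
  }

  /** Frees the slot once the datanode's report has arrived. */
  synchronized void reportReceived(String datanodeId) {
    grantedAtMillis.remove(datanodeId);
  }

  /** Removes grants older than the timeout; returns how many were removed. */
  synchronized int removeExpired(long nowMillis) {
    int removed = 0;
    Iterator<Map.Entry<String, Long>> it = grantedAtMillis.entrySet().iterator();
    while (it.hasNext()) {
      if (nowMillis - it.next().getValue() >= timeoutMillis) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  synchronized int size() {
    return grantedAtMillis.size();
  }
}
```

The timeout covers the case raised in the review: a datanode that disappears after being granted permission no longer holds its slot indefinitely.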

{code}
  public static final String DFS_BLOCKREPORT_MAX_DEFER_MSEC_KEY =
      "dfs.blockreport.max.deferMsec";
  public static final long DFS_BLOCKREPORT_MAX_DEFER_MSEC_DEFAULT =
      Long.MAX_VALUE;
{code}

bq. Do we really need this config key?

I've added a TreeBidiMap called lastBlockReportTime to track this. I would have 
used Guava instead of Apache Commons Collections, but Guava doesn't have a 
sorted BidiMap.
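
For readers unfamiliar with the idea, a sorted bidirectional map can be approximated with two standard-library maps, as in the sketch below. This is not TreeBidiMap and not the patch's code: the class and method names are hypothetical, and it assumes, as a bidirectional map must, that no two datanodes share a timestamp.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted bidirectional map of node <-> report time.
final class LastReportTimes {
  private final Map<String, Long> timeByNode = new HashMap<>();
  private final TreeMap<Long, String> nodeByTime = new TreeMap<>();

  /** Records a report time; returns false if another node holds that timestamp. */
  boolean record(String node, long timeMillis) {
    String holder = nodeByTime.get(timeMillis);
    if (holder != null && !holder.equals(node)) {
      return false;
    }
    Long previous = timeByNode.put(node, timeMillis);
    if (previous != null) {
      nodeByTime.remove(previous); // drop the node's old reverse entry
    }
    nodeByTime.put(timeMillis, node);
    return true;
  }

  /** Looks up a node's last report time, or null if it has none. */
  Long lastReportTime(String node) {
    return timeByNode.get(node);
  }

  /** Returns the node with the oldest recorded report, or null if empty. */
  String leastRecentlyReported() {
    return nodeByTime.isEmpty() ? null : nodeByTime.firstEntry().getValue();
  }
}
```

The forward map answers "when did this node last report?", while the sorted reverse map answers "which node has waited longest?" without scanning every entry.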


 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Charles Lamb
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
 HDFS-7923.002.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-04-24 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7923:
---
Attachment: HDFS-7923.001.patch

[~cmccabe], attached is a patch that is rebased onto the trunk.

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Charles Lamb
 Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch




[jira] [Updated] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-04-06 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7923:
---
Attachment: HDFS-7923.000.patch

Attached is a patch that implements the behavior I described.

 The DataNodes should rate-limit their full block reports by asking the NN on 
 heartbeat messages
 ---

 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Charles Lamb
 Attachments: HDFS-7923.000.patch

