[jira] [Commented] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074126#comment-14074126 ] Ming Ma commented on HDFS-6742: --- Arpit, good point. Let me follow up. Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy because admins can find patterns more quickly, especially for big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
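Sorting by a chosen column amounts to applying a comparator keyed on that column. A minimal sketch of the idea (the "hostname ip:port" row format here is an illustrative assumption, not the actual webUI data model):

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortByHostname {
    // Sort "hostname ip:port" rows by the hostname column (the first
    // whitespace-separated field). Row layout is hypothetical.
    static String[] sorted(String[] rows) {
        String[] out = rows.clone();
        Arrays.sort(out, Comparator.comparing((String r) -> r.split(" ")[0]));
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {
            "dn-b.example 10.0.0.2:50010",
            "dn-a.example 10.0.0.1:50010"
        };
        System.out.println(Arrays.toString(sorted(rows)));
    }
}
```

In the real UI the same keyed comparison would run client-side over the table rows, so no NN changes are needed.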
[jira] [Created] (HDFS-6751) NN WebUI enhancements
Ming Ma created HDFS-6751: - Summary: NN WebUI enhancements Key: HDFS-6751 URL: https://issues.apache.org/jira/browse/HDFS-6751 Project: Hadoop HDFS Issue Type: Task Reporter: Ming Ma The umbrella jira for NN webUI enhancements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6742: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy because admins can find patterns more quickly, especially for big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6743: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI combines hostname and IP into one column in the datanode list. It would be more convenient for admins if the IP address were put in a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6744) Improve decommissioning nodes and dead nodes access on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6744: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Improve decommissioning nodes and dead nodes access on the new NN webUI --- Key: HDFS-6744 URL: https://issues.apache.org/jira/browse/HDFS-6744 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI lists live nodes at the top of the page, followed by dead nodes and decommissioning nodes. From the admins' point of view: 1. Decommissioning nodes and dead nodes are more interesting. It is better to move decommissioning nodes to the top of the page, followed by dead nodes and live nodes. 2. To find decommissioning nodes or dead nodes, the whole page that includes all nodes needs to be loaded. That could take some time for big clusters. The legacy web UI filters node types dynamically, which seems to work well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6745) Display the list of very-under-replicated blocks as well as the files on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6745: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Display the list of very-under-replicated blocks as well as the files on NN webUI --- Key: HDFS-6745 URL: https://issues.apache.org/jira/browse/HDFS-6745 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Sometimes admins want to know the list of very-under-replicated blocks before major actions such as decommission, as these blocks are more likely to turn into missing blocks. Very-under-replicated blocks are those blocks with a live replica count of 1 and a replication factor of >= 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
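The "very-under-replicated" predicate described above (one live replica, replication factor of 3 or more) can be sketched as a simple filter. The `BlockInfo` record here is a hypothetical stand-in, not the NameNode's actual data structures:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal block record for illustration only.
class BlockInfo {
    final long blockId;
    final int liveReplicas;
    final int replicationFactor;
    BlockInfo(long id, int live, int factor) {
        this.blockId = id;
        this.liveReplicas = live;
        this.replicationFactor = factor;
    }
}

public class VeryUnderReplicated {
    // "Very-under-replicated" per the issue description: exactly one live
    // replica while the requested replication factor is 3 or more.
    static List<BlockInfo> filter(List<BlockInfo> blocks) {
        List<BlockInfo> out = new ArrayList<>();
        for (BlockInfo b : blocks) {
            if (b.liveReplicas == 1 && b.replicationFactor >= 3) {
                out.add(b);
            }
        }
        return out;
    }
}
```

Making the threshold (here hard-coded as 1 and 3) a parameter would match the comment below about letting people specify their own definition.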
[jira] [Commented] (HDFS-6745) Display the list of very-under-replicated blocks as well as the files on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074139#comment-14074139 ] Ming Ma commented on HDFS-6745: --- CLI will be good. Perhaps people can also specify input parameters such as the definition of very-under-replicated. The count of very-under-replicated blocks will be good for the metrics system; how about the full text such as block IDs and file names? Display the list of very-under-replicated blocks as well as the files on NN webUI --- Key: HDFS-6745 URL: https://issues.apache.org/jira/browse/HDFS-6745 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Sometimes admins want to know the list of very-under-replicated blocks before major actions such as decommission, as these blocks are more likely to turn into missing blocks. Very-under-replicated blocks are those blocks with a live replica count of 1 and a replication factor of >= 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6746) Support datanode list pagination and filtering for big clusters on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6746: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Support datanode list pagination and filtering for big clusters on NN webUI --- Key: HDFS-6746 URL: https://issues.apache.org/jira/browse/HDFS-6746 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma This isn't a major issue yet. Still, it might be good to add support for pagination at some point, and maybe some filtering. For example, it would be useful to filter out live nodes that belong to the same rack. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6747) Display the most recent GC info on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma resolved HDFS-6747. --- Resolution: Won't Fix Display the most recent GC info on NN webUI --- Key: HDFS-6747 URL: https://issues.apache.org/jira/browse/HDFS-6747 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma It will be handy if the recent GC information is available on the NN webUI, so admins don't need to dig out GC logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6722) Display readable last contact time for dead nodes on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6722: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Display readable last contact time for dead nodes on NN webUI - Key: HDFS-6722 URL: https://issues.apache.org/jira/browse/HDFS-6722 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6722-2.patch, HDFS-6722.patch For dead node info on the NN webUI, admins want to know when the nodes became dead, to troubleshoot missing blocks, etc. Currently the webUI displays the last contact as the number of seconds since the last contact. It would be useful to display the info in Date format. -- This message was sent by Atlassian JIRA (v6.2#6252)
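The conversion the issue asks for is straightforward: subtract the "seconds since last contact" from the current time and format the result as a date. A minimal sketch, assuming the method name and the date pattern (the patch itself may differ):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class LastContact {
    // Convert "seconds since last contact" (as currently shown for a dead
    // node) into an absolute, human-readable timestamp.
    static String toReadable(long lastContactSecs, long nowMillis) {
        Date when = new Date(nowMillis - lastContactSecs * 1000L);
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(when);
    }

    public static void main(String[] args) {
        // A node last heard from 3600 seconds ago was alive one hour ago.
        System.out.println(toReadable(3600, System.currentTimeMillis()));
    }
}
```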
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074150#comment-14074150 ] Vinayakumar B commented on HDFS-5919: - Committed to trunk and branch-2. Thanks [~jingzhao] and [~umamaheswararao] for the reviews. FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-5919: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074154#comment-14074154 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-trunk-Commit #5965 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5965/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074156#comment-14074156 ] Hadoop QA commented on HDFS-5919: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657766/HDFS-5919.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeConfig org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.TestGenericRefresh org.apache.hadoop.TestRefreshCallQueue org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7463//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7463//console This message is automatically generated. 
FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
Vinayakumar B created HDFS-6752: --- Summary: Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
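The fix described above (configuring the test's HTTP port as ephemeral) works because binding to port 0 asks the OS to pick any free port, so two test JVMs can never collide on a hard-coded default like 50075. A minimal, Hadoop-free sketch of that mechanism:

```java
import java.net.ServerSocket;

public class EphemeralPortDemo {
    // Binding to port 0 lets the OS assign any free ephemeral port;
    // this is the idea behind setting the datanode HTTP port to 0 in
    // the test's Configuration instead of using the fixed default.
    static int grabFreePort() throws Exception {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bound ephemeral port " + grabFreePort());
    }
}
```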
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Attachment: HDFS-6752.patch Attaching the patch to configure datanode port as ephemeral Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Status: Patch Available (was: Open) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6747) Display the most recent GC info on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074178#comment-14074178 ] Haohui Mai commented on HDFS-6747: -- I think that it will be quite handy to show some GC metrics on the UI. Arguably this is not the best way to operate the system, but GC configuration / metrics are the first things that we look at whenever we encounter performance problems. It is quite helpful to have them on the UI to quickly diagnose performance issues. Display the most recent GC info on NN webUI --- Key: HDFS-6747 URL: https://issues.apache.org/jira/browse/HDFS-6747 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma It will be handy if the recent GC information is available on the NN webUI, so admins don't need to dig out GC logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074188#comment-14074188 ] Vinayakumar B commented on HDFS-5919: - The above failures are not related to this patch. bq. org.apache.hadoop.hdfs.TestDatanodeConfig This failure is due to an AddressBindException, and HDFS-6752 has been raised to fix it. {quote}org.apache.hadoop.TestGenericRefresh org.apache.hadoop.TestRefreshCallQueue{quote} The above tests are also failing due to AddressBindException. bq. org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover This failure is due to HDFS-6694. bq. org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport This failure is observed in many of the recent builds. Need to find out the detailed reason. FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6709) Implement off-heap data structures for NameNode and other HDFS memory optimization
[ https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074189#comment-14074189 ] Kai Zheng commented on HDFS-6709: - I repeated the test in the post and sadly found it's true that DirectByteBuffer doesn't perform well on writes. I'm communicating with Oracle about this, and hopefully they can explain it and address it in a future Java version. It's interesting. Thanks. Implement off-heap data structures for NameNode and other HDFS memory optimization -- Key: HDFS-6709 URL: https://issues.apache.org/jira/browse/HDFS-6709 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6709.001.patch We should investigate implementing off-heap data structures for NameNode and other HDFS memory optimization. These data structures could reduce latency by avoiding the long GC times that occur with large Java heaps. We could also avoid per-object memory overheads and control memory layout a little bit better. This would also allow us to use the JVM's compressed oops optimization even with really large namespaces, if we could get the Java heap below 32 GB for those cases. This would provide another performance and memory efficiency boost. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads
[ https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074250#comment-14074250 ] Rakesh R commented on HDFS-4265: Hi [~umamaheswararao] Could you please have a look at this when you get some time. Thanks! HDFS-4265 and HDFS-4266 are dependent patches. After committing the first one, the second issue needs to be rebased. I'll update the second patch after the first one goes in. BKJM doesn't take advantage of speculative reads Key: HDFS-4265 URL: https://issues.apache.org/jira/browse/HDFS-4265 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.2.0 Reporter: Ivan Kelly Assignee: Rakesh R Attachments: 001-HDFS-4265.patch, 002-HDFS-4265.patch, 003-HDFS-4265.patch, 004-HDFS-4265.patch BookKeeperEditLogInputStream reads one entry at a time, so it doesn't take advantage of the speculative read mechanism introduced by BOOKKEEPER-336. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074272#comment-14074272 ] Hadoop QA commented on HDFS-6752: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657790/HDFS-6752.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport org.apache.hadoop.hdfs.TestEncryptedTransfer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7464//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7464//console This message is automatically generated. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. 
{noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074293#comment-14074293 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074292#comment-14074292 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074295#comment-14074295 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074290#comment-14074290 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing that when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
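The root cause discussed in this issue is that WebHDFS treated the startup-mode IOException as fatal instead of retrying against the other HA namenode. A minimal, hypothetical sketch of the classification idea follows; the actual patch changes NamenodeWebHdfsMethods on the server side to signal a retriable condition, so the class and method names here are purely illustrative:

```java
import java.io.IOException;

// Hypothetical sketch: decide whether an IOException coming back from a
// namenode indicates a transient state (the NN is still starting up) that
// should trigger failover/retry rather than failing the whole job.
public class RetryDecision {
    // Message observed in the HA test logs quoted above.
    private static final String STARTUP_MSG = "Namenode is in startup mode";

    public static boolean shouldFailOver(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains(STARTUP_MSG);
    }

    public static void main(String[] args) {
        // Transient startup-mode error: retry on the other namenode.
        System.out.println(shouldFailOver(new IOException(STARTUP_MSG)));
        // A permanent error: propagate the failure.
        System.out.println(shouldFailOver(new IOException("Permission denied")));
    }
}
```

Message-string matching like this is fragile, which is exactly why the committed fix signals retriability with a dedicated exception type instead.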
[jira] [Created] (HDFS-6753) When one of the disks is full and all the configured volumes are unhealthy, the Datanode does not consider it a failure and the datanode process does not shut down
J.Andreina created HDFS-6753: Summary: When one of the disks is full and all the configured volumes are unhealthy, the Datanode does not consider it a failure and the datanode process does not shut down Key: HDFS-6753 URL: https://issues.apache.org/jira/browse/HDFS-6753 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Env details: Cluster has 3 Datanodes; cluster installed with the REX user. dfs.datanode.failed.volumes.tolerated = 3, dfs.blockreport.intervalMsec = 18000, dfs.datanode.directoryscan.interval = 120. DN_XX1.XX1.XX1.XX1 data dir = /mnt/tmp_Datanode,/home/REX/data/dfs1/data,/home/REX/data/dfs2/data,/opt/REX/dfs/data. For /home/REX/data/dfs1/data, /home/REX/data/dfs2/data and /opt/REX/dfs/data permission is denied (hence the DN considered these volumes as failed). Expected behavior is observed when the disk is not full: Step 1: Change the permissions of /mnt/tmp_Datanode to root. Step 2: Perform write operations (the DN detects that all configured volumes have failed and shuts down). Scenario 1: Step 1: Make the /mnt/tmp_Datanode disk full and change its permissions to root. Step 2: Perform client write operations (a disk-full exception is thrown, but the Datanode does not shut down, even though all configured volumes have failed). {noformat} 2014-07-21 14:10:52,814 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XX1.XX1.XX1.XX1:50010:DataXceiver error processing WRITE_BLOCK operation src: /XX2.XX2.XX2.XX2:10106 dst: /XX1.XX1.XX1.XX1:50010 org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=4096 B) is less than the block size (=134217728 B). at org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy.chooseVolume(RoundRobinVolumeChoosingPolicy.java:60) {noformat} Observations: 1. Write operations do not shut down the Datanode, even though all configured volumes have failed (one disk is full and permission is denied for all the others). 2. 
Directory scanning fails, but the DN still does not shut down. {noformat} 2014-07-21 14:13:00,180 WARN org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: Exception occured while compiling report: java.io.IOException: Invalid directory or I/O error occurred for dir: /mnt/tmp_Datanode/current/BP-1384489961-XX2.XX2.XX2.XX2-845784615183/current/finalized at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1164) at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.compileReport(DirectoryScanner.java:596) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
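The expected behavior the reporter describes can be sketched as a simple decision rule; the class and method names below are hypothetical, not the actual DataNode code, and the rule only models the intent of dfs.datanode.failed.volumes.tolerated:

```java
// Hypothetical sketch of the volume-failure check HDFS-6753 expects:
// the DataNode should shut down once failed volumes exceed the tolerated
// count, and always when every volume has failed, regardless of whether a
// volume failed because of a permission error or a full disk.
public class VolumeFailurePolicy {
    public static boolean shouldShutdown(int totalVolumes, int failedVolumes,
                                         int failedVolumesTolerated) {
        // Losing every volume is always fatal, even if the count is within
        // tolerance: the DN has nowhere left to write.
        if (failedVolumes >= totalVolumes) {
            return true;
        }
        return failedVolumes > failedVolumesTolerated;
    }

    public static void main(String[] args) {
        // The cluster in this report: 4 data dirs, tolerated = 3.
        System.out.println(shouldShutdown(4, 4, 3)); // all failed: shut down
        System.out.println(shouldShutdown(4, 2, 3)); // within tolerance: keep running
    }
}
```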
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074384#comment-14074384 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time. They should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074385#comment-14074385 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074387#comment-14074387 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074382#comment-14074382 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing: when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074414#comment-14074414 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074416#comment-14074416 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074413#comment-14074413 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time. They should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074411#comment-14074411 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing: when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074441#comment-14074441 ] Charles Lamb commented on HDFS-6247: 30 seconds would be fine, or maybe even some time based on the socket timeout. Presently the socket timeout is a constant, but I could see that perhaps being turned into a configuration parameter in the future. How about .1 * socketTimeout for the heartbeat interval? Does that make sense? Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while a replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6749: --- Attachment: HDFS-6749.002.patch [~cnauroth], Thanks for the review and that's a good point about adding a unit test. I've added calls to these methods in TestINodeFile and confirmed that each of them fails without the patch and passes with the patch. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Moved] (HDFS-6754) TestNamenodeCapacityReport.testXceiverCount may sometimes fail due to lack of retry
[ https://issues.apache.org/jira/browse/HDFS-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai moved YARN-2358 to HDFS-6754: --- Target Version/s: 2.6.0 (was: 2.6.0) Affects Version/s: (was: 2.6.0) 2.6.0 Key: HDFS-6754 (was: YARN-2358) Project: Hadoop HDFS (was: Hadoop YARN) TestNamenodeCapacityReport.testXceiverCount may sometimes fail due to lack of retry --- Key: HDFS-6754 URL: https://issues.apache.org/jira/browse/HDFS-6754 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai I have seen TestNamenodeCapacityReport.testXceiverCount fail intermittently in our nightly builds with the following error: {noformat} java.io.IOException: Unable to close file because the last block does not have enough number of replicas. at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2151) at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2119) at org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport.testXceiverCount(TestNamenodeCapacityReport.java:281) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6755) Make DFSOutputStream more efficient
Mit Desai created HDFS-6755: --- Summary: Make DFSOutputStream more efficient Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Issue Type: Improvement (was: Bug) Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Description: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down, after decrementing retries. was: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. 
{code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
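The proposed reordering can be sketched as a self-contained simulation. The real loop lives in DFSOutputStream's file-completion retry path; this sketch only models the sleep accounting to show that, with the sleep moved after the retries check and decrement, the final failing iteration pays no sleep at all:

```java
// Sketch of the fix proposed in HDFS-6755: sleep only after deciding that
// another attempt will actually happen, so the final (failing) iteration
// does not pay a doubled sleep just to throw an exception.
public class CompleteFileRetry {
    // Returns the total time (ms) the loop would sleep for a given retry
    // budget, assuming completion never succeeds (worst case).
    public static long totalSleep(int retries, long initialTimeout) {
        long slept = 0;
        long localTimeout = initialTimeout;
        while (true) {
            // pretend completeFile() did not succeed on this attempt
            if (retries == 0) {
                // real code: throw new IOException("Unable to close file ...");
                break;
            }
            retries--;
            // sleep now that we know we will retry
            slept += localTimeout;
            localTimeout *= 2;
        }
        return slept;
    }

    public static void main(String[] args) {
        // 2 retries, 400 ms initial timeout: sleeps 400 + 800 = 1200 ms total.
        System.out.println(totalSleep(2, 400));
        // 0 retries: fails immediately with no sleep at all.
        System.out.println(totalSleep(0, 400));
    }
}
```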
[jira] [Commented] (HDFS-6750) The DataNode should use its shared memory segment to mark short-circuit replicas that have been unlinked as stale
[ https://issues.apache.org/jira/browse/HDFS-6750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074587#comment-14074587 ] Colin Patrick McCabe commented on HDFS-6750: Test failures are unrelated. TestNamenodeCapacityReport failure is HDFS-6726. TestPipelinesFailover failure is HDFS-6694. TestBlockRecovery failure is a port in use exception (see HDFS-4744). The DataNode should use its shared memory segment to mark short-circuit replicas that have been unlinked as stale - Key: HDFS-6750 URL: https://issues.apache.org/jira/browse/HDFS-6750 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6750.001.patch The DataNode should mark short-circuit replicas that have been unlinked as stale. This would prevent replicas that had been deleted from lingering in the DFSClient cache. (At least for DFSClients that use shared memory; those without shared memory will still have to use the timeout method.) Note that when a replica is stale, any ongoing reads or mmaps can still complete. But stale replicas will be removed from the DFSClient cache once they're no longer in use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074651#comment-14074651 ] Colin Patrick McCabe commented on HDFS-6729: By default it takes 10 and a half minutes until the NameNode starts re-replicating anything. With the stale DN feature turned on, applications trying to read from the stale node will be re-directed, so the cluster won't experience lag (or at least, not because of applications trying to contact the node under maintenance). So I guess the question is, is it worth adding another state in case the maintenance on the datanode can't be finished in 10 minutes? On the upside, I suppose it probably wouldn't be a lot of code. It would be very similar to the stale datanode stuff we already implemented. Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Some maintenance work (e.g., upgrading RAM or adding disks) on a DataNode only takes a short amount of time (e.g., 10 minutes). In these cases, the users do not want to report missing blocks on this DN because the DN will be online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or the DN being marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074662#comment-14074662 ] Arpit Agarwal commented on HDFS-6742: - Thanks Ming! Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy so admins can find patterns more quickly, especially on big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074674#comment-14074674 ] Vinayakumar B commented on HDFS-6247: - I feel 30 seconds, or maybe 0.5 * socketTimeout, would be fine. Since this socketTimeout is based on the client-side configuration, we can only assume that the configuration is the same on both the client and the datanode. bq. How about .1 * socketTimeout for the heartbeat interval? Does that make sense? I feel making it the same as socketTimeout can create timeout problems at the boundary: the datanode might send the response, but just before receiving that response the client might time out. So I feel half of it would be better, i.e. *0.5 * socketTimeout*. Will that be fine? Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while a replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-583) HDFS should enforce a max block size
[ https://issues.apache.org/jira/browse/HDFS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074690#comment-14074690 ] Colin Patrick McCabe commented on HDFS-583: --- Most places where we refer to block size use long. I'm not sure where we are limiting this (it would be good to document this somehow, if it is indeed going on). In general, enormous blocks haven't really been all that useful in the past, since they make it harder for execution frameworks to divide up work in a reasonable manner. I can sort of see why you might want a limit in theory, but so far it hasn't really been a feature requested by anyone. With or without giant blocks, evil clients can still fill up the DataNode, up to their designated quota. Small blocks are probably more evil, but we limited those in HDFS-4305 when we introduced {{dfs.namenode.fs-limits.min-block-size}}. HDFS should enforce a max block size Key: HDFS-583 URL: https://issues.apache.org/jira/browse/HDFS-583 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Hairong Kuang When a DataNode creates a replica, it should enforce a max block size, so clients can't go crazy. One way of enforcing this is to make BlockWritesStreams filter streams that check the block size. -- This message was sent by Atlassian JIRA (v6.2#6252)
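The filter-stream idea from the description can be sketched as a small wrapper; the class below is hypothetical, not the actual DataNode code, and simply fails any write that would push a replica past a configured maximum size:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch of the enforcement approach suggested in HDFS-583: wrap the stream
// a replica is written through and reject writes once the total exceeds a
// configured maximum block size. Names are illustrative only.
public class MaxBlockSizeOutputStream extends FilterOutputStream {
    private final long maxBlockSize;
    private long written;

    public MaxBlockSizeOutputStream(OutputStream out, long maxBlockSize) {
        super(out);
        this.maxBlockSize = maxBlockSize;
    }

    private void check(long extra) throws IOException {
        if (written + extra > maxBlockSize) {
            throw new IOException("Block exceeds configured maximum size of "
                + maxBlockSize + " bytes");
        }
        written += extra;
    }

    @Override
    public void write(int b) throws IOException {
        check(1);
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        check(len);
        out.write(b, off, len);
    }

    public static void main(String[] args) throws IOException {
        MaxBlockSizeOutputStream s =
            new MaxBlockSizeOutputStream(new ByteArrayOutputStream(), 8);
        s.write(new byte[8], 0, 8);   // fills the block exactly: accepted
        try {
            s.write('x');             // one byte over the limit: rejected
        } catch (IOException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```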
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074693#comment-14074693 ] Charles Lamb commented on HDFS-6247: bq. I feel making it the same as socketTimeout can create timeout problems at the boundary. Yes, I agree completely. That would be a bad thing. bq. So I feel half of it would be better, i.e. 0.5 * socketTimeout. Will that be fine? Yes, that seems ok to me. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074694#comment-14074694 ] Arpit Agarwal commented on HDFS-6752: - +1 for the patch. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
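The fix described above relies on a standard trick: binding to port 0 asks the OS for a free ephemeral port, so the test no longer collides with a datanode already listening on the default 50075. The sketch below demonstrates the principle with a plain `ServerSocket`; the actual patch instead sets the datanode HTTP address (the `dfs.datanode.http.address` key) to a `:0` port in the test `Configuration`.

```java
// Sketch of why binding to port 0 avoids BindException: the OS picks a
// currently free ephemeral port instead of a fixed one that may be in use.
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortDemo {
    // Returns the OS-assigned port, or -1 if the bind itself failed.
    static int bindEphemeral() {
        try (ServerSocket s = new ServerSocket(0)) { // 0 = "any free port"
            return s.getLocalPort(); // the concrete port the OS chose
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Unlike a fixed port such as 50075, this bind cannot collide with
        // another process's listener.
        System.out.println(bindEphemeral() > 0); // prints "true"
    }
}
```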
[jira] [Updated] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6247: Attachment: HDFS-6247.patch Attached the updated patch. Uses {{0.5 * socketTimeout}} as the interval. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074719#comment-14074719 ] Vinayakumar B commented on HDFS-6752: - Thanks [~arpit99] for the review. Will commit soon Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074717#comment-14074717 ] Colin Patrick McCabe commented on HDFS-6755: The code here is using exponential backoff to wait for the {{NameNode}} to be available. Getting rid of the sleep won't make anything more efficient... it will just increase the number of cases where a temporary network issue between a client and a NameNode causes a file close to fail. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074722#comment-14074722 ] Vinayakumar B commented on HDFS-6752: - Oops!. Wrong tag. Thanks [~arpitagarwal] for the review. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074729#comment-14074729 ] Vinayakumar B commented on HDFS-6752: - Committed to trunk and branch-2 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074730#comment-14074730 ] Hudson commented on HDFS-6752: -- FAILURE: Integrated in Hadoop-trunk-Commit #5968 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5968/]) HDFS-6752. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613486) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeConfig.java Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. 
{noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074734#comment-14074734 ] Chen He commented on HDFS-6743: --- Hi [~aw], are you working on this issue? If so, please assign yourself as assignee. If not, I can work on this. Thanks! Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI combines hostname and IP into one column in the datanode list. It is more convenient for admins if the IP address can be put in a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Attachment: HDFS-6755.patch Hi [~cmccabe], I did not mean to get rid of the sleep. I have uploaded the patch to indicate the change I wanted to make. I wanted to throw the IOException when {{retries == 0}} before {{Thread.sleep(localTimeout)}} is called. Does that seem reasonable? Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
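The reordering being proposed can be sketched as below: check whether any retries remain *before* sleeping, so the final iteration throws immediately instead of first sleeping an exponentially grown timeout. This is a hypothetical stand-in, not the DFSClient code: `NameNodeCall`, `waitForComplete`, and `simulate` are made-up names that mirror the structure of the quoted snippet.

```java
// Hypothetical sketch of the HDFS-6755 fix: throw before the sleep, so the
// backoff delay is only paid when another attempt will actually follow.
import java.io.IOException;

public class CloseRetryLoop {
    interface NameNodeCall { boolean complete(); } // stand-in for the complete() RPC

    static void waitForComplete(NameNodeCall nn, int retries, long localTimeout)
            throws IOException {
        while (!nn.complete()) {
            if (retries == 0) { // give up immediately: no pointless final sleep
                throw new IOException("Unable to close file because the last block"
                    + " does not have enough number of replicas.");
            }
            retries--;
            try {
                Thread.sleep(localTimeout); // only sleeps when we will retry
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            localTimeout *= 2; // exponential backoff between attempts
        }
    }

    // Test helper: returns the number of complete() calls made, negated if
    // the loop gave up and threw.
    static int simulate(int succeedOnAttempt, int retries) {
        final int[] calls = {0};
        try {
            waitForComplete(() -> ++calls[0] >= succeedOnAttempt, retries, 1);
        } catch (IOException e) {
            return -calls[0];
        }
        return calls[0];
    }

    public static void main(String[] args) {
        // Succeeds on the third attempt: two short sleeps, none after success.
        System.out.println(simulate(3, 5));
    }
}
```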
[jira] [Assigned] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned HDFS-6742: - Assignee: Chen He Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Chen He The legacy webUI allows sorting the datanode list on a specific column such as hostname. It is handy, as admins can find patterns more quickly, especially on big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6724) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream
[ https://issues.apache.org/jira/browse/HDFS-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-6724. --- Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream - Key: HDFS-6724 URL: https://issues.apache.org/jira/browse/HDFS-6724 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Yi Liu Assignee: Andrew Wang Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: hdfs-6724.001.patch, hdfs-6724.002.patch, hdfs-6724.003.patch In DFSClient, we need to decrypt EDEK before creating CryptoInputStream/CryptoOutputStream, currently edek is used directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
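The bug fixed here is conceptual as much as mechanical: the per-file data encryption key stored with the file is itself encrypted (an EDEK), and it must be unwrapped with the key-provider's master key before it can key a cipher; using the still-wrapped bytes directly, as the description warns, cannot work. The sketch below illustrates the wrap/unwrap round trip with plain JCE key wrapping. It is not Hadoop's KeyProvider API: `unwrap` and `roundTrip` are illustrative names, and the real client delegates decryption to the configured key provider.

```java
// Hypothetical illustration (not Hadoop's KeyProvider API): an EDEK must be
// unwrapped with the master key before it can be used as a cipher key.
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EdekDemo {
    // Recover the data encryption key (DEK) from its encrypted form (EDEK).
    static SecretKey unwrap(byte[] edek, SecretKey masterKey) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap"); // RFC 3394 key wrapping
        c.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) c.unwrap(edek, "AES", Cipher.SECRET_KEY);
    }

    // Full round trip: wrap a fresh DEK under a master key, then unwrap it
    // and confirm the recovered key matches the original.
    static boolean roundTrip() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            SecretKey master = kg.generateKey();
            SecretKey dek = kg.generateKey();

            Cipher wrap = Cipher.getInstance("AESWrap");
            wrap.init(Cipher.WRAP_MODE, master);
            byte[] edek = wrap.wrap(dek); // what gets stored/handed out

            return Arrays.equals(unwrap(edek, master).getEncoded(),
                                 dek.getEncoded());
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints "true"
    }
}
```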
[jira] [Updated] (HDFS-6724) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream
[ https://issues.apache.org/jira/browse/HDFS-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6724: -- Attachment: hdfs-6724.003.patch Thanks again for the reviews guys, attaching a final patch removing the KPCE change. Committed this to the branch. Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream - Key: HDFS-6724 URL: https://issues.apache.org/jira/browse/HDFS-6724 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Yi Liu Assignee: Andrew Wang Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: hdfs-6724.001.patch, hdfs-6724.002.patch, hdfs-6724.003.patch In DFSClient, we need to decrypt EDEK before creating CryptoInputStream/CryptoOutputStream, currently edek is used directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074758#comment-14074758 ] Arpit Agarwal commented on HDFS-6752: - Thanks for fixing this [~vinayrpet]. Since it is a test-only fix I see no harm in merging it to branch-2.5 too. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074759#comment-14074759 ] Colin Patrick McCabe commented on HDFS-6755: Ah. I misinterpreted your description. I thought you wanted to get rid of the sleep completely. But you only wanted to get rid of it for the case that we are not going to retry the connection to the NameNode. +1 for the patch, pending jenkins. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Target Version/s: 2.5.0 Fix Version/s: (was: 2.6.0) 2.5.0 3.0.0 Release Note: Merged to branch-2.5 as r1613492. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Affects Version/s: 3.0.0 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074763#comment-14074763 ] Arpit Agarwal commented on HDFS-6752: - Merged to branch-2.5 as r1613492. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Description: DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. (was: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries.) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). 
It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
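The corrected control flow described above can be sketched as a simplified, stand-alone model of the retry loop (the method and parameter names here are illustrative, not the actual DFSOutputStream code): the backoff sleep happens only when another attempt will actually follow.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

public class CompleteRetry {
    // Simplified model of the fixed close() retry loop: check whether we
    // are out of retries *before* sleeping, so the thread never blocks
    // after it has already decided to give up and throw.
    public static void completeWithRetries(BooleanSupplier tryComplete,
                                           int retries,
                                           long timeoutMs)
            throws IOException, InterruptedException {
        while (!tryComplete.getAsBoolean()) {
            if (retries == 0) {
                // Give up immediately -- the buggy version slept here first.
                throw new IOException("Unable to close file because the last"
                        + " block does not have enough number of replicas.");
            }
            retries--;
            Thread.sleep(timeoutMs); // back off only before a real retry
            timeoutMs *= 2;          // exponential backoff
        }
    }
}
```

With this ordering, a caller that has exhausted its retries sees the IOException without paying the final (and largest) backoff interval.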
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Affects Version/s: 2.4.1 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
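The fix relies on a standard trick: binding to port 0 asks the OS for a currently free ephemeral port, so concurrent tests can never collide the way a hard-coded port like 50075 does. A plain-JDK sketch of the mechanism (the exact Configuration key the patch changes is not shown in this thread, so this only illustrates the principle):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class EphemeralBind {
    // Bind to port 0: the OS picks a free ephemeral port, which
    // getLocalPort() then reports. Two processes doing this concurrently
    // never see "Port in use", unlike a fixed port such as 50075.
    public static int bindEphemeral() throws Exception {
        try (ServerSocket s = new ServerSocket()) {
            s.bind(new InetSocketAddress("127.0.0.1", 0));
            return s.getLocalPort();
        }
    }
}
```

In the Hadoop test this would correspond to pointing the DataNode HTTP address at `127.0.0.1:0` in the test's Configuration before starting the DataNode (the specific key used by the patch is an assumption here).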
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Component/s: test Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Release Note: (was: Merged to branch-2.5 as r1613492.) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Status: Patch Available (was: Open) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Summary: There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode (was: Make DFSOutputStream more efficient) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep.
{code}
try {
  Thread.sleep(localTimeout);
  if (retries == 0) {
    throw new IOException("Unable to close file because the last block"
        + " does not have enough number of replicas.");
  }
  retries--;
  localTimeout *= 2;
  if (Time.now() - localstart > 5000) {
    DFSClient.LOG.info("Could not complete " + src + " retrying...");
  }
} catch (InterruptedException ie) {
  DFSClient.LOG.warn("Caught exception ", ie);
}
{code}
Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect when there is more than one iteration and the loop would sleep just to throw an exception. We need to move the sleep down, after the retries decrement. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074767#comment-14074767 ] Hadoop QA commented on HDFS-6749: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657840/HDFS-6749.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7465//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7465//console This message is automatically generated. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074770#comment-14074770 ] Charles Lamb commented on HDFS-6749: The test failure appears to be unrelated. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
Juan Yu created HDFS-6756: - Summary: Default ipc.maximum.data.length should be increased to 128MB from 64MB Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4449) When a decommission is awaiting closure of live blocks, show the block IDs on the NameNode's UI report
[ https://issues.apache.org/jira/browse/HDFS-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-4449: -- Assignee: Yongjun Zhang (was: Harsh J) When a decommission is awaiting closure of live blocks, show the block IDs on the NameNode's UI report -- Key: HDFS-4449 URL: https://issues.apache.org/jira/browse/HDFS-4449 Project: Hadoop HDFS Issue Type: Improvement Reporter: Harsh J Assignee: Yongjun Zhang It is rather common for people to complain about 'DN decommission' hangs because of live blocks waiting to be completed by some app (certain HBase specifics in particular cause a file to stay open for a longer time, compared with MR/etc.). While they can see a count of the blocks that are live, we should add more details to that view. In particular, add the list of live blocks waiting to be closed, so that a user may better understand why it's hung and also be able to trace the blocks back to files manually if needed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-964) hdfs-default.xml shouldn't use hadoop.tmp.dir for dfs.data.dir (0.20 and lower) / dfs.datanode.dir (0.21 and up)
[ https://issues.apache.org/jira/browse/HDFS-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074787#comment-14074787 ] Juan Yu commented on HDFS-964: -- Hi [~aw], this seems like a good idea; why was it marked as Won't Fix? Was there a reason, such as an incompatibility issue? hdfs-default.xml shouldn't use hadoop.tmp.dir for dfs.data.dir (0.20 and lower) / dfs.datanode.dir (0.21 and up) Key: HDFS-964 URL: https://issues.apache.org/jira/browse/HDFS-964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HDFS-964.txt This question/problem pops up all the time. Can we *please* eliminate hadoop.tmp.dir's usage from the default in dfs.data.dir. It is confusing to new people and results in all sorts of weird accidents. If we want the same value, fine, but there are a lot of things implied by the variable re-use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-223) Asynchronous IO Handling in Hadoop and HDFS
[ https://issues.apache.org/jira/browse/HDFS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074837#comment-14074837 ] Colin Patrick McCabe commented on HDFS-223: --- I think the existing thread-pool model kind of makes sense for the Datanode. The DN has to compute checksums, which inevitably chews the CPU. You can't really chew the CPU in a non-blocking way. Realistically, if you have 10 disks and 4096 DN threads chugging along at once (the current {{dfs.datanode.max.transfer.threads}}), you're going to have about 400 simultaneous operations per disk. It seems like the CPU consumption for CRC32 or hard disk bandwidth would become a bottleneck long before the number of I/O threads was an issue. Some of the scalability issues here were related to spending too much time creating and tearing down TCP sockets, I think, and were solved by the socket cache. Hedged reads also help with some of the DFSClient latency spikes described here. I think eventually we'll need to re-evaluate this in light of new technology. But for right now, it's hard to see how we'd use non-blocking I/O to get better throughput on the DN (as far as I can see). Asynchronous IO Handling in Hadoop and HDFS --- Key: HDFS-223 URL: https://issues.apache.org/jira/browse/HDFS-223 Project: Hadoop HDFS Issue Type: New Feature Reporter: Raghu Angadi Attachments: GrizzlyEchoServer.patch, MinaEchoServer.patch I think Hadoop needs utilities or a framework to make it simpler to deal with generic asynchronous IO in Hadoop. Example use case: It's been a long-standing problem that the DataNode takes too many threads for data transfers. Each write operation takes up 2 threads at each of the datanodes and each read operation takes one, irrespective of how much activity is on the sockets. The kinds of load that HDFS serves have been expanding quite fast and HDFS should handle these varied loads better. 
If there is a framework for non-blocking IO, read and write pipeline state machines could be implemented with async events on a fixed number of threads. A generic utility is better since it could be used in other places like DFSClient. DFSClient currently creates 2 extra threads for each file it has open for writing. Initially I started writing a primitive selector, then tried to see if such a facility already exists. [Apache MINA|http://mina.apache.org] seemed to do exactly this. My impression after looking at the interface and examples is that it does not give the kind of control we might prefer or need. The first use case I was thinking of implementing using MINA was to replace the response handlers in DataNode. The response handlers are simpler since they don't involve disk I/O. I [asked on the MINA user list|http://www.nabble.com/Async-events-with-existing-NIO-sockets.-td18640767.html], but it looks like it cannot be done, I think mainly because the sockets are already created. Essentially what I have in mind is similar to MINA, except that reading and writing the sockets is done by the event handlers. The lowest layer essentially invokes selectors and invokes event handlers on a single thread or on multiple threads. Each event handler is expected to do some non-blocking work. We would of course have utility handler implementations that do read, write, accept, etc., that are useful for simple processing. Sam Pullara mentioned that [xSockets|http://xsocket.sourceforge.net/] is more flexible. It is under GPL. Are there other such implementations we should look at? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074853#comment-14074853 ] Arpit Agarwal commented on HDFS-6756: - Hi Juan, are you seeing any specific instance where the 64MB limit is a problem? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-223) Asynchronous IO Handling in Hadoop and HDFS
[ https://issues.apache.org/jira/browse/HDFS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074856#comment-14074856 ] Andrew Purtell commented on HDFS-223: - Agreed. CPU becomes more of an issue if the block devices are (increasingly) solid state. Reducing threading overheads would give you more headroom for work like checksumming, and therefore throughput. I'm not sure how much improvement is possible, but it could be worth investigation. There are other options besides rewriting the DataNode; you could look at something like Parallel Universe's lightweight threading library. Asynchronous IO Handling in Hadoop and HDFS --- Key: HDFS-223 URL: https://issues.apache.org/jira/browse/HDFS-223 Project: Hadoop HDFS Issue Type: New Feature Reporter: Raghu Angadi Attachments: GrizzlyEchoServer.patch, MinaEchoServer.patch I think Hadoop needs utilities or a framework to make it simpler to deal with generic asynchronous IO in Hadoop. Example use case: It's been a long-standing problem that the DataNode takes too many threads for data transfers. Each write operation takes up 2 threads at each of the datanodes and each read operation takes one, irrespective of how much activity is on the sockets. The kinds of load that HDFS serves have been expanding quite fast and HDFS should handle these varied loads better. If there is a framework for non-blocking IO, read and write pipeline state machines could be implemented with async events on a fixed number of threads. A generic utility is better since it could be used in other places like DFSClient. DFSClient currently creates 2 extra threads for each file it has open for writing. Initially I started writing a primitive selector, then tried to see if such a facility already exists. [Apache MINA|http://mina.apache.org] seemed to do exactly this. My impression after looking at the interface and examples is that it does not give the kind of control we might prefer or need. 
The first use case I was thinking of implementing using MINA was to replace the response handlers in DataNode. The response handlers are simpler since they don't involve disk I/O. I [asked on the MINA user list|http://www.nabble.com/Async-events-with-existing-NIO-sockets.-td18640767.html], but it looks like it cannot be done, I think mainly because the sockets are already created. Essentially what I have in mind is similar to MINA, except that reading and writing the sockets is done by the event handlers. The lowest layer essentially invokes selectors and invokes event handlers on a single thread or on multiple threads. Each event handler is expected to do some non-blocking work. We would of course have utility handler implementations that do read, write, accept, etc., that are useful for simple processing. Sam Pullara mentioned that [xSockets|http://xsocket.sourceforge.net/] is more flexible. It is under GPL. Are there other such implementations we should look at? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074909#comment-14074909 ] Hadoop QA commented on HDFS-6247: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657871/HDFS-6247.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7466//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7466//console This message is automatically generated. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from target Datanode to Balancer for the replaceBlock() calls. 
Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, the Datanode can send IN_PROGRESS status messages to the Balancer while a replaceBlock() call is in progress, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6740) FSDataset adds data volumes dynamically
[ https://issues.apache.org/jira/browse/HDFS-6740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6740: Attachment: HDFS-6740.000.patch Uploaded a patch that supports adding volumes to {{FsDatasetAsyncDiskService}}, {{FsVolumeList}} and {{FsDatasetImpl}}. FSDataset adds data volumes dynamically --- Key: HDFS-6740 URL: https://issues.apache.org/jira/browse/HDFS-6740 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6740.000.patch Supporting volume management in the DN (HDFS-1362) requires FSDatasetImpl to be able to add volumes dynamically at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074937#comment-14074937 ] Juan Yu commented on HDFS-6756: --- After upgrading, we got lots of messages like "Requested data length 72293417 is longer than maximum configured RPC length 67108864", and fsck showed thousands of under-replicated blocks. After increasing the RPC length, the remaining messages cleared out. Though the default block size is 64M, 128M seems a more common setting; wouldn't 128M make more sense? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074969#comment-14074969 ] Arpit Agarwal commented on HDFS-6756: - Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 million. We fixed it in Apache Hadoop 2.4 by splitting block reports per storage. This error is likely a symptom of an underlying problem that needs to be fixed. A large protocol message can take seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
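For reference, the cluster-specific override mentioned above would be a property in core-site.xml on the affected daemons; the value below (128MB) is only illustrative:

```xml
<!-- core-site.xml: raise the maximum RPC message size from the 64MB
     default (67108864 bytes) to 128MB on this cluster only. Treat this
     as a workaround; the oversized message itself usually points at an
     underlying problem such as giant block reports. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```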
[jira] [Comment Edited] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074969#comment-14074969 ] Arpit Agarwal edited comment on HDFS-6756 at 7/25/14 9:37 PM: -- Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 million. We fixed it in v2.4 by splitting block reports per storage. A large protocol message can take seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. was (Author: arpitagarwal): Did you figure out which specific RPC call? Was it a block report? Also what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 Million. We fixed it in Apache Hadoop 2.4 by splitting block reports per storage. This error is likely a symptom of an underlying problem that needs to be fixed. A arge protocol message take seconds to process and can 'freeze' the callee if there is a lock held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Status: Open (was: Patch Available) add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces possibilities for inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api that provides functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
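The access(2)-style semantics the description asks for can be modeled with plain JDK calls; this sketch only illustrates the idea (the class, enum, and method names here are hypothetical, not the API the attached patches add): the filesystem answers the permission question once, instead of each client re-implementing the owner/group/other and ACL logic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AccessCheck {
    public enum Action { READ, WRITE, EXECUTE }

    // access(2)-like check: ask the filesystem whether the current user
    // may perform the action on the path. Throws if the path is missing,
    // mirroring how access(2) reports ENOENT.
    public static boolean access(Path p, Action a) throws IOException {
        if (!Files.exists(p)) {
            throw new IOException("No such file or directory: " + p);
        }
        switch (a) {
            case READ:  return Files.isReadable(p);
            case WRITE: return Files.isWritable(p);
            default:    return Files.isExecutable(p);
        }
    }
}
```

A Hive server using such a call would ask the filesystem directly whether the user may write the table directory, rather than duplicating the permission-model interpretation.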
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074996#comment-14074996 ] Hadoop QA commented on HDFS-6755: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657875/HDFS-6755.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7467//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7467//console This message is automatically generated. 
There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
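The shape of the fix can be sketched generically: back off exponentially between retries, but skip the sleep entirely once the loop has decided to give up. The names below are illustrative, not the actual DFSOutputStream#close code.

```java
import java.util.concurrent.Callable;

public class GiveUpWithoutSleeping {
    /**
     * Generic retry loop illustrating the idea behind the fix: sleep (with
     * exponential backoff) only between attempts, never after the final
     * failure. Illustrative sketch, not the actual DFSOutputStream code.
     */
    public static <T> T retry(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts - 1) {
                    break;  // giving up: do NOT sleep, just propagate the error
                }
                Thread.sleep(baseSleepMs << attempt);  // exponential backoff
            }
        }
        throw last;
    }
}
```

The bug the patch removes corresponds to placing the sleep before the give-up check, so the caller pays one full backoff delay for nothing.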
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Attachment: HDFS-6570.4.patch Thanks for the reviews, Chris. The updated patch addresses the comments. add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces the possibility of inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api with functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Status: Patch Available (was: Open) add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces the possibility of inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api with functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075005#comment-14075005 ] Lei (Eddy) Xu commented on HDFS-6729: - [~aw] and [~cmccabe] Thanks for looking into this issue! We have customers encountering significant lag time between each decommissioned node (e.g., pulling data away from each other node), as described by [~cmccabe]. This significant lag time _blew users' maintenance windows_. So, I am wondering whether it is possible to allow users to set a maintenance mode for a DN for a given time (e.g., the user specifies the maintenance time as 1 hour); after that, if the DN does not come back, the NN starts the normal re-replication process? Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Some maintenance work (e.g., upgrading RAM or adding disks) on a DataNode takes only a short amount of time (e.g., 10 minutes). In these cases, the users do not want to report missing blocks on this DN because the DN will be online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or have it marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
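The timed-maintenance idea in the comment above can be sketched as follows: the NN records a deadline per DN in maintenance, and only treats the DN's blocks as under-replicated once the deadline passes without the DN re-registering. All names here are hypothetical; no such API existed in HDFS at the time of this discussion.

```java
import java.util.HashMap;
import java.util.Map;

public class MaintenanceTracker {
    // Hypothetical sketch of the HDFS-6729 proposal: per-DN maintenance
    // deadlines. None of these names exist in HDFS; they illustrate the idea.
    private final Map<String, Long> deadlines = new HashMap<>();

    /** Admin puts a DN into maintenance until the given time (millis). */
    public void startMaintenance(String dnId, long untilMillis) {
        deadlines.put(dnId, untilMillis);
    }

    /** DN re-registered with the NN; clear its maintenance state. */
    public void datanodeReturned(String dnId) {
        deadlines.remove(dnId);
    }

    /**
     * Should the NN start re-replicating this DN's blocks now? Only once the
     * maintenance window has expired without the DN coming back.
     */
    public boolean shouldRereplicate(String dnId, long nowMillis) {
        Long until = deadlines.get(dnId);
        return until != null && nowMillis > until;
    }
}
```

Taking the clock as an explicit argument keeps the deadline logic deterministic and testable, rather than reading System.currentTimeMillis() internally.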
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075019#comment-14075019 ] Juan Yu commented on HDFS-6756: --- [~arpitagarwal] You're right, this is in the block report; this is v2.0. The DN has a large number of replicas. Could you point out the JIRA that fixes this issue? Thanks. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075023#comment-14075023 ] Arpit Agarwal commented on HDFS-6756: - HDFS-5153 Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075029#comment-14075029 ] Colin Patrick McCabe commented on HDFS-6755: No new tests are needed, since this is just a one-line change moving a Thread.sleep in an error case. Committing. Thanks, Mit. There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075037#comment-14075037 ] Juan Yu commented on HDFS-6756: --- Thx a lot. One more question: the split is per storage directory / disk volume. Isn't there a chance that a storage dir could still contain more than 10 million blocks in the future? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-3607. Resolution: Fixed Target Version/s: (was: ) log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075050#comment-14075050 ] Hudson commented on HDFS-6755: -- FAILURE: Integrated in Hadoop-trunk-Commit #5971 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5971/]) HDFS-6755. There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode (mitdesai21 via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613522) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075044#comment-14075044 ] Colin Patrick McCabe commented on HDFS-3607: We now log a message when FUSE isn't built, from hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/CMakeLists.txt:
{code}
# Find Linux FUSE
IF (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
    find_package(PkgConfig REQUIRED)
    pkg_check_modules(FUSE fuse)
    IF(FUSE_FOUND)
        ...
    ELSE(FUSE_FOUND)
        MESSAGE(STATUS "Failed to find Linux FUSE libraries or include files. Will not build FUSE client.")
    ENDIF(FUSE_FOUND)
ELSE (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
    MESSAGE(STATUS "Non-Linux system detected. Will not build FUSE client.")
ENDIF (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
{code}
log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075051#comment-14075051 ] Arpit Agarwal commented on HDFS-6756: - Possible but unlikely in the near future at least. Even if you assume a conservative 32MB average block size you would need 300TB disks. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
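The back-of-envelope estimate above checks out: 10 million blocks per storage directory at a conservative 32 MB average works out to roughly 300 TB in one directory. A quick sketch of the arithmetic:

```java
public class BlockCapacityEstimate {
    public static void main(String[] args) {
        // Hypothetical future storage directory, per the discussion above.
        long blocksPerDir = 10_000_000L;
        long avgBlockBytes = 32L * 1024 * 1024;  // conservative 32 MiB average
        long totalBytes = blocksPerDir * avgBlockBytes;
        long tib = totalBytes / (1L << 40);      // bytes -> TiB
        System.out.println(tib + " TiB per storage directory"); // ~305 TiB
    }
}
```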
[jira] [Resolved] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu resolved HDFS-6756. --- Resolution: Invalid Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3607: --- Target Version/s: 2.0.2-alpha Fix Version/s: 2.0.2-alpha log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.2-alpha We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075060#comment-14075060 ] Colin Patrick McCabe commented on HDFS-3607: This was fixed by HADOOP-8368. log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.2-alpha We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)