[jira] [Commented] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074126#comment-14074126 ] Ming Ma commented on HDFS-6742: --- Arpit, good point. Let me follow up. Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy because admins can find patterns more quickly, especially for big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
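Sorting by a chosen column amounts to applying a comparator keyed on that column. A minimal sketch of the idea (the "hostname ip:port" row format here is an illustrative assumption, not the actual webUI data model):

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortByHostname {
    // Sort "hostname ip:port" rows by the hostname column (the first
    // whitespace-separated field). Row layout is hypothetical.
    static String[] sorted(String[] rows) {
        String[] out = rows.clone();
        Arrays.sort(out, Comparator.comparing((String r) -> r.split(" ")[0]));
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {
            "dn-b.example 10.0.0.2:50010",
            "dn-a.example 10.0.0.1:50010"
        };
        System.out.println(Arrays.toString(sorted(rows)));
    }
}
```

In the real UI the same keyed comparison would run client-side over the table rows, so no NN changes are needed.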
[jira] [Created] (HDFS-6751) NN WebUI enhancements
Ming Ma created HDFS-6751: - Summary: NN WebUI enhancements Key: HDFS-6751 URL: https://issues.apache.org/jira/browse/HDFS-6751 Project: Hadoop HDFS Issue Type: Task Reporter: Ming Ma The umbrella jira for NN webUI enhancements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6742: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy because admins can find patterns more quickly, especially for big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6743: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI combines hostname and IP into one column in the datanode list. It would be more convenient for admins if the IP address were put in a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6744) Improve decommissioning nodes and dead nodes access on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6744: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Improve decommissioning nodes and dead nodes access on the new NN webUI --- Key: HDFS-6744 URL: https://issues.apache.org/jira/browse/HDFS-6744 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI lists live nodes at the top of the page, followed by dead nodes and decommissioning nodes. From the admins' point of view: 1. Decommissioning nodes and dead nodes are more interesting. It is better to move decommissioning nodes to the top of the page, followed by dead nodes and live nodes. 2. To find decommissioning nodes or dead nodes, the whole page that includes all nodes needs to be loaded. That could take some time for big clusters. The legacy web UI filters node types dynamically, which seems to work well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6745) Display the list of very-under-replicated blocks as well as the files on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6745: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Display the list of very-under-replicated blocks as well as the files on NN webUI --- Key: HDFS-6745 URL: https://issues.apache.org/jira/browse/HDFS-6745 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Sometimes admins want to know the list of very-under-replicated blocks before major actions such as decommission, as these blocks are more likely to turn into missing blocks. Very-under-replicated blocks are those blocks with a live replica count of 1 and a replication factor of >= 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
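The "very-under-replicated" predicate described above (one live replica, replication factor of 3 or more) can be sketched as a simple filter. The `BlockInfo` record here is a hypothetical stand-in, not the NameNode's actual data structures:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal block record for illustration only.
class BlockInfo {
    final long blockId;
    final int liveReplicas;
    final int replicationFactor;
    BlockInfo(long id, int live, int factor) {
        this.blockId = id;
        this.liveReplicas = live;
        this.replicationFactor = factor;
    }
}

public class VeryUnderReplicated {
    // "Very-under-replicated" per the issue description: exactly one live
    // replica while the requested replication factor is 3 or more.
    static List<BlockInfo> filter(List<BlockInfo> blocks) {
        List<BlockInfo> out = new ArrayList<>();
        for (BlockInfo b : blocks) {
            if (b.liveReplicas == 1 && b.replicationFactor >= 3) {
                out.add(b);
            }
        }
        return out;
    }
}
```

Making the threshold (here hard-coded as 1 and 3) a parameter would match the comment below about letting people specify their own definition.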
[jira] [Commented] (HDFS-6745) Display the list of very-under-replicated blocks as well as the files on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074139#comment-14074139 ] Ming Ma commented on HDFS-6745: --- CLI will be good. Perhaps people can also specify input parameters such as the definition of very-under-replicated. The count of very-under-replicated blocks will be good for the metrics system; how about the full text such as block IDs and file names? Display the list of very-under-replicated blocks as well as the files on NN webUI --- Key: HDFS-6745 URL: https://issues.apache.org/jira/browse/HDFS-6745 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Sometimes admins want to know the list of very-under-replicated blocks before major actions such as decommission, as these blocks are more likely to turn into missing blocks. Very-under-replicated blocks are those blocks with a live replica count of 1 and a replication factor of >= 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6746) Support datanode list pagination and filtering for big clusters on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6746: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Support datanode list pagination and filtering for big clusters on NN webUI --- Key: HDFS-6746 URL: https://issues.apache.org/jira/browse/HDFS-6746 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma This isn't a major issue yet. Still, it might be good to add support for pagination at some point, and maybe some filtering. For example, it would be useful to filter out live nodes that belong to the same rack. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6747) Display the most recent GC info on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma resolved HDFS-6747. --- Resolution: Won't Fix Display the most recent GC info on NN webUI --- Key: HDFS-6747 URL: https://issues.apache.org/jira/browse/HDFS-6747 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma It will be handy if the recent GC information is available on the NN webUI, so admins don't need to dig out GC logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6722) Display readable last contact time for dead nodes on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6722: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6751 Display readable last contact time for dead nodes on NN webUI - Key: HDFS-6722 URL: https://issues.apache.org/jira/browse/HDFS-6722 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6722-2.patch, HDFS-6722.patch For dead node info on the NN webUI, admins want to know when the nodes became dead, to troubleshoot missing blocks, etc. Currently the webUI displays the last contact as the number of seconds since the last contact. It would be useful to display the info in Date format. -- This message was sent by Atlassian JIRA (v6.2#6252)
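The conversion the issue asks for is straightforward: subtract the "seconds since last contact" from the current time and format the result as a date. A minimal sketch, assuming the method name and the date pattern (the patch itself may differ):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class LastContact {
    // Convert "seconds since last contact" (as currently shown for a dead
    // node) into an absolute, human-readable timestamp.
    static String toReadable(long lastContactSecs, long nowMillis) {
        Date when = new Date(nowMillis - lastContactSecs * 1000L);
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(when);
    }

    public static void main(String[] args) {
        // A node last heard from 3600 seconds ago was alive one hour ago.
        System.out.println(toReadable(3600, System.currentTimeMillis()));
    }
}
```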
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074150#comment-14074150 ] Vinayakumar B commented on HDFS-5919: - Committed to trunk and branch-2. Thanks [~jingzhao] and [~umamaheswararao] for the reviews. FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-5919: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074154#comment-14074154 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-trunk-Commit #5965 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5965/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074156#comment-14074156 ] Hadoop QA commented on HDFS-5919: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657766/HDFS-5919.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeConfig org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.TestGenericRefresh org.apache.hadoop.TestRefreshCallQueue org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7463//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7463//console This message is automatically generated. 
FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
Vinayakumar B created HDFS-6752: --- Summary: Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
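The fix described above (configuring the test's HTTP port as ephemeral) works because binding to port 0 asks the OS to pick any free port, so two test JVMs can never collide on a hard-coded default like 50075. A minimal, Hadoop-free sketch of that mechanism:

```java
import java.net.ServerSocket;

public class EphemeralPortDemo {
    // Binding to port 0 lets the OS assign any free ephemeral port;
    // this is the idea behind setting the datanode HTTP port to 0 in
    // the test's Configuration instead of using the fixed default.
    static int grabFreePort() throws Exception {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bound ephemeral port " + grabFreePort());
    }
}
```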
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Attachment: HDFS-6752.patch Attaching the patch to configure datanode port as ephemeral Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Status: Patch Available (was: Open) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6747) Display the most recent GC info on NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074178#comment-14074178 ] Haohui Mai commented on HDFS-6747: -- I think that it will be quite handy to show some GC metrics on the UI. Arguably this is not the best way to operate the system, but GC configuration / metrics are the first things that we look at whenever we encounter performance problems. It is quite helpful to have them on the UI to quickly diagnose performance issues. Display the most recent GC info on NN webUI --- Key: HDFS-6747 URL: https://issues.apache.org/jira/browse/HDFS-6747 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma It will be handy if the recent GC information is available on the NN webUI, so admins don't need to dig out GC logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074188#comment-14074188 ] Vinayakumar B commented on HDFS-5919: - The above failures are not related to this patch. bq. org.apache.hadoop.hdfs.TestDatanodeConfig This failure is due to an AddressBindException, and HDFS-6752 has been raised to fix it. {quote}org.apache.hadoop.TestGenericRefresh org.apache.hadoop.TestRefreshCallQueue{quote} The above tests are also failing due to AddressBindException. bq. org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover This failure is due to HDFS-6694. bq. org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport This failure is observed in many of the recent builds. Need to find out the detailed reason. FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6709) Implement off-heap data structures for NameNode and other HDFS memory optimization
[ https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074189#comment-14074189 ] Kai Zheng commented on HDFS-6709: - I repeated the test in the post and sadly found it's true that DirectByteBuffer doesn't perform well on writes. I'm communicating with Oracle about this, and hopefully they can explain it and address it in a future Java version. It's interesting. Thanks. Implement off-heap data structures for NameNode and other HDFS memory optimization -- Key: HDFS-6709 URL: https://issues.apache.org/jira/browse/HDFS-6709 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6709.001.patch We should investigate implementing off-heap data structures for NameNode and other HDFS memory optimization. These data structures could reduce latency by avoiding the long GC times that occur with large Java heaps. We could also avoid per-object memory overheads and control memory layout a little bit better. This would also allow us to use the JVM's compressed oops optimization even with really large namespaces, if we could get the Java heap below 32 GB for those cases. This would provide another performance and memory efficiency boost. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads
[ https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074250#comment-14074250 ] Rakesh R commented on HDFS-4265: Hi [~umamaheswararao] Could you please have a look at this when you get some time. Thanks! HDFS-4265 and HDFS-4266 are dependent patches. After committing the first one, the second issue needs to be rebased. I'll update the second patch after the first one goes in. BKJM doesn't take advantage of speculative reads Key: HDFS-4265 URL: https://issues.apache.org/jira/browse/HDFS-4265 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.2.0 Reporter: Ivan Kelly Assignee: Rakesh R Attachments: 001-HDFS-4265.patch, 002-HDFS-4265.patch, 003-HDFS-4265.patch, 004-HDFS-4265.patch BookKeeperEditLogInputStream reads one entry at a time, so it doesn't take advantage of the speculative read mechanism introduced by BOOKKEEPER-336. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074272#comment-14074272 ] Hadoop QA commented on HDFS-6752: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657790/HDFS-6752.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport org.apache.hadoop.hdfs.TestEncryptedTransfer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7464//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7464//console This message is automatically generated. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. 
{noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074293#comment-14074293 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074292#comment-14074292 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time and should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074295#comment-14074295 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074290#comment-14074290 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Yarn-trunk #623 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/623/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing that when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
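The root cause discussed in this issue is that WebHDFS treated the startup-mode IOException as fatal instead of retrying against the other HA namenode. A minimal, hypothetical sketch of the classification idea follows; the actual patch changes NamenodeWebHdfsMethods on the server side to signal a retriable condition, so the class and method names here are purely illustrative:

```java
import java.io.IOException;

// Hypothetical sketch: decide whether an IOException coming back from a
// namenode indicates a transient state (the NN is still starting up) that
// should trigger failover/retry rather than failing the whole job.
public class RetryDecision {
    // Message observed in the HA test logs quoted above.
    private static final String STARTUP_MSG = "Namenode is in startup mode";

    public static boolean shouldFailOver(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains(STARTUP_MSG);
    }

    public static void main(String[] args) {
        // Transient startup-mode error: retry on the other namenode.
        System.out.println(shouldFailOver(new IOException(STARTUP_MSG)));
        // A permanent error: propagate the failure.
        System.out.println(shouldFailOver(new IOException("Permission denied")));
    }
}
```

Message-string matching like this is fragile, which is exactly why the committed fix signals retriability with a dedicated exception type instead.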
[jira] [Created] (HDFS-6753) When one of the disks is full and all the configured volumes are unhealthy, the Datanode does not consider it a failure and the datanode process does not shut down
J.Andreina created HDFS-6753: Summary: When one of the disks is full and all the configured volumes are unhealthy, the Datanode does not consider it a failure and the datanode process does not shut down Key: HDFS-6753 URL: https://issues.apache.org/jira/browse/HDFS-6753 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Env details: Cluster has 3 Datanodes; cluster installed with the REX user. dfs.datanode.failed.volumes.tolerated = 3, dfs.blockreport.intervalMsec = 18000, dfs.datanode.directoryscan.interval = 120. DN_XX1.XX1.XX1.XX1 data dir = /mnt/tmp_Datanode,/home/REX/data/dfs1/data,/home/REX/data/dfs2/data,/opt/REX/dfs/data. For /home/REX/data/dfs1/data, /home/REX/data/dfs2/data and /opt/REX/dfs/data permission is denied (hence the DN considered these volumes as failed). Expected behavior is observed when the disk is not full: Step 1: Change the permissions of /mnt/tmp_Datanode to root. Step 2: Perform write operations (the DN detects that all configured volumes have failed and shuts down). Scenario 1: Step 1: Make the /mnt/tmp_Datanode disk full and change its permissions to root. Step 2: Perform client write operations (a disk-full exception is thrown, but the Datanode does not shut down, even though all configured volumes have failed). {noformat} 2014-07-21 14:10:52,814 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XX1.XX1.XX1.XX1:50010:DataXceiver error processing WRITE_BLOCK operation src: /XX2.XX2.XX2.XX2:10106 dst: /XX1.XX1.XX1.XX1:50010 org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=4096 B) is less than the block size (=134217728 B). at org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy.chooseVolume(RoundRobinVolumeChoosingPolicy.java:60) {noformat} Observations: 1. Write operations do not shut down the Datanode, even though all configured volumes have failed (one disk is full and permission is denied for all the others). 2. 
Directory scanning fails, but the DN still does not shut down. {noformat} 2014-07-21 14:13:00,180 WARN org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: Exception occured while compiling report: java.io.IOException: Invalid directory or I/O error occurred for dir: /mnt/tmp_Datanode/current/BP-1384489961-XX2.XX2.XX2.XX2-845784615183/current/finalized at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1164) at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.compileReport(DirectoryScanner.java:596) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
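The expected behavior the reporter describes can be sketched as a simple decision rule; the class and method names below are hypothetical, not the actual DataNode code, and the rule only models the intent of dfs.datanode.failed.volumes.tolerated:

```java
// Hypothetical sketch of the volume-failure check HDFS-6753 expects:
// the DataNode should shut down once failed volumes exceed the tolerated
// count, and always when every volume has failed, regardless of whether a
// volume failed because of a permission error or a full disk.
public class VolumeFailurePolicy {
    public static boolean shouldShutdown(int totalVolumes, int failedVolumes,
                                         int failedVolumesTolerated) {
        // Losing every volume is always fatal, even if the count is within
        // tolerance: the DN has nowhere left to write.
        if (failedVolumes >= totalVolumes) {
            return true;
        }
        return failedVolumes > failedVolumesTolerated;
    }

    public static void main(String[] args) {
        // The cluster in this report: 4 data dirs, tolerated = 3.
        System.out.println(shouldShutdown(4, 4, 3)); // all failed: shut down
        System.out.println(shouldShutdown(4, 2, 3)); // within tolerance: keep running
    }
}
```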
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074384#comment-14074384 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time. They should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074385#comment-14074385 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074387#comment-14074387 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074382#comment-14074382 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1842/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing: when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6723) New NN webUI no longer displays decommissioned state for dead node
[ https://issues.apache.org/jira/browse/HDFS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074414#comment-14074414 ] Hudson commented on HDFS-6723: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6723. New NN webUI no longer displays decommissioned state for dead node. Contributed by Ming Ma. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613220) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html New NN webUI no longer displays decommissioned state for dead node -- Key: HDFS-6723 URL: https://issues.apache.org/jira/browse/HDFS-6723 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Fix For: 2.5.0 Attachments: HDFS-6723.patch Somehow the new webUI doesn't show if a given dead node is decommissioned or not. JMX does return the correct info. Perhaps some bug in dfshealth.html? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6657) Remove link to 'Legacy UI' in trunk's Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074416#comment-14074416 ] Hudson commented on HDFS-6657: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6657. Remove link to 'Legacy UI' in trunk's Namenode UI. Contributed by Vinayakumar B. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613195) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/index.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/index.html Remove link to 'Legacy UI' in trunk's Namenode UI - Key: HDFS-6657 URL: https://issues.apache.org/jira/browse/HDFS-6657 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 3.0.0 Attachments: HDFS-6657.patch, HDFS-6657.patch A link to the 'Legacy UI' is provided on the namenode's UI. Since all jsp pages are removed in trunk, these links will not work and can be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5919) FileJournalManager doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074413#comment-14074413 ] Hudson commented on HDFS-5919: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-5919. FileJournalManager doesn't purge empty and corrupt inprogress edits files (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613355) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java FileJournalManager doesn't purge empty and corrupt inprogress edits files - Key: HDFS-5919 URL: https://issues.apache.org/jira/browse/HDFS-5919 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-5919.patch, HDFS-5919.patch, HDFS-5919.patch FileJournalManager doesn't purge empty and corrupt inprogress edit files. These stale files accumulate over time. They should be cleared along with the purging of other edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6715) webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode
[ https://issues.apache.org/jira/browse/HDFS-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074411#comment-14074411 ] Hudson commented on HDFS-6715: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1815 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1815/]) HDFS-6715. Webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613237) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java webhdfs wont fail over when it gets java.io.IOException: Namenode is in startup mode Key: HDFS-6715 URL: https://issues.apache.org/jira/browse/HDFS-6715 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.2.0 Reporter: Arpit Gupta Assignee: Jing Zhao Fix For: 2.6.0 Attachments: HDFS-6715.000.patch, HDFS-6715.001.patch Noticed in our HA testing: when we run an MR job with the webhdfs file system, we sometimes run into {code} 2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster. Container killed on request. 
Exit code is 143 Container exited with a non-zero exit code 143 2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job java.io.IOException: Namenode is in startup mode at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074441#comment-14074441 ] Charles Lamb commented on HDFS-6247: 30 seconds would be fine, or maybe even some time based on the socket timeout. Presently the socket timeout is a constant, but I could see that perhaps being turned into a configuration parameter in the future. How about .1 * socketTimeout for the heartbeat interval? Does that make sense? Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while a replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6749: --- Attachment: HDFS-6749.002.patch [~cnauroth], Thanks for the review and that's a good point about adding a unit test. I've added calls to these methods in TestINodeFile and confirmed that each of them fails without the patch and passes with the patch. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Moved] (HDFS-6754) TestNamenodeCapacityReport.testXceiverCount may sometimes fail due to lack of retry
[ https://issues.apache.org/jira/browse/HDFS-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai moved YARN-2358 to HDFS-6754: --- Target Version/s: 2.6.0 (was: 2.6.0) Affects Version/s: (was: 2.6.0) 2.6.0 Key: HDFS-6754 (was: YARN-2358) Project: Hadoop HDFS (was: Hadoop YARN) TestNamenodeCapacityReport.testXceiverCount may sometimes fail due to lack of retry --- Key: HDFS-6754 URL: https://issues.apache.org/jira/browse/HDFS-6754 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai I have seen TestNamenodeCapacityReport.testXceiverCount fail intermittently in our nightly builds with the following error: {noformat} java.io.IOException: Unable to close file because the last block does not have enough number of replicas. at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2151) at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2119) at org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport.testXceiverCount(TestNamenodeCapacityReport.java:281) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6755) Make DFSOutputStream more efficient
Mit Desai created HDFS-6755: --- Summary: Make DFSOutputStream more efficient Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Issue Type: Improvement (was: Bug) Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Description: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down, after decrementing retries. was: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration. We need to move the sleep down, after decrementing retries. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. 
{code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing the exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down, after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
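The proposed reordering can be sketched as a self-contained simulation. The real loop lives in DFSOutputStream's file-completion retry path; this sketch only models the sleep accounting to show that, with the sleep moved after the retries check and decrement, the final failing iteration pays no sleep at all:

```java
// Sketch of the fix proposed in HDFS-6755: sleep only after deciding that
// another attempt will actually happen, so the final (failing) iteration
// does not pay a doubled sleep just to throw an exception.
public class CompleteFileRetry {
    // Returns the total time (ms) the loop would sleep for a given retry
    // budget, assuming completion never succeeds (worst case).
    public static long totalSleep(int retries, long initialTimeout) {
        long slept = 0;
        long localTimeout = initialTimeout;
        while (true) {
            // pretend completeFile() did not succeed on this attempt
            if (retries == 0) {
                // real code: throw new IOException("Unable to close file ...");
                break;
            }
            retries--;
            // sleep now that we know we will retry
            slept += localTimeout;
            localTimeout *= 2;
        }
        return slept;
    }

    public static void main(String[] args) {
        // 2 retries, 400 ms initial timeout: sleeps 400 + 800 = 1200 ms total.
        System.out.println(totalSleep(2, 400));
        // 0 retries: fails immediately with no sleep at all.
        System.out.println(totalSleep(0, 400));
    }
}
```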
[jira] [Commented] (HDFS-6750) The DataNode should use its shared memory segment to mark short-circuit replicas that have been unlinked as stale
[ https://issues.apache.org/jira/browse/HDFS-6750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074587#comment-14074587 ] Colin Patrick McCabe commented on HDFS-6750: Test failures are unrelated. TestNamenodeCapacityReport failure is HDFS-6726. TestPipelinesFailover failure is HDFS-6694. TestBlockRecovery failure is a port in use exception (see HDFS-4744). The DataNode should use its shared memory segment to mark short-circuit replicas that have been unlinked as stale - Key: HDFS-6750 URL: https://issues.apache.org/jira/browse/HDFS-6750 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6750.001.patch The DataNode should mark short-circuit replicas that have been unlinked as stale. This would prevent replicas that had been deleted from lingering in the DFSClient cache. (At least for DFSClients that use shared memory; those without shared memory will still have to use the timeout method.) Note that when a replica is stale, any ongoing reads or mmaps can still complete. But stale replicas will be removed from the DFSClient cache once they're no longer in use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074651#comment-14074651 ] Colin Patrick McCabe commented on HDFS-6729: By default it takes 10 and a half minutes until the NameNode starts re-replicating anything. With the stale DN feature turned on, applications trying to read from the stale node will be re-directed, so the cluster won't experience lag (or at least, not because of applications trying to contact the node under maintenance). So I guess the question is, is it worth adding another state in case the maintenance on the datanode can't be finished in 10 minutes? On the upside, I suppose it probably wouldn't be a lot of code. It would be very similar to the stale datanode stuff we already implemented. Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Some maintenance work (e.g., upgrading RAM or adding disks) on a DataNode only takes a short amount of time (e.g., 10 minutes). In these cases, the users do not want to report missing blocks on this DN because the DN will be online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or the DN being marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074662#comment-14074662 ] Arpit Agarwal commented on HDFS-6742: - Thanks Ming! Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The legacy webUI allows sorting the datanode list by a specific column such as hostname. It is handy so admins can find patterns more quickly, especially on big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074674#comment-14074674 ] Vinayakumar B commented on HDFS-6247: - I feel 30 seconds, or maybe 0.5 * socketTimeout, would be fine. Since this socketTimeout is based on the client-side configuration, we can only assume that the configuration is the same on both the client and the datanode. bq. How about .1 * socketTimeout for the heartbeat interval? Does that make sense? I feel making it the same as socketTimeout can create timeout problems at the boundary: the datanode might send the response, but just before receiving that response the client might time out. So I feel half of it would be better, i.e. *0.5 * socketTimeout*. Will that be fine? Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while a replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-583) HDFS should enforce a max block size
[ https://issues.apache.org/jira/browse/HDFS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074690#comment-14074690 ] Colin Patrick McCabe commented on HDFS-583: --- Most places where we refer to block size use long. I'm not sure where we are limiting this (it would be good to document this somehow, if it is indeed going on). In general, enormous blocks haven't really been all that useful in the past, since they make it harder for execution frameworks to divide up work in a reasonable manner. I can sort of see why you might want a limit in theory, but so far it hasn't really been a feature requested by anyone. With or without giant blocks, evil clients can still fill up the DataNode, up to their designated quota. Small blocks are probably more evil, but we limited those in HDFS-4305 when we introduced {{dfs.namenode.fs-limits.min-block-size}}. HDFS should enforce a max block size Key: HDFS-583 URL: https://issues.apache.org/jira/browse/HDFS-583 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Hairong Kuang When a DataNode creates a replica, it should enforce a max block size, so clients can't go crazy. One way of enforcing this is to make BlockWritesStreams filter streams that check the block size. -- This message was sent by Atlassian JIRA (v6.2#6252)
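The filter-stream idea from the description can be sketched as a small wrapper; the class below is hypothetical, not the actual DataNode code, and simply fails any write that would push a replica past a configured maximum size:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch of the enforcement approach suggested in HDFS-583: wrap the stream
// a replica is written through and reject writes once the total exceeds a
// configured maximum block size. Names are illustrative only.
public class MaxBlockSizeOutputStream extends FilterOutputStream {
    private final long maxBlockSize;
    private long written;

    public MaxBlockSizeOutputStream(OutputStream out, long maxBlockSize) {
        super(out);
        this.maxBlockSize = maxBlockSize;
    }

    private void check(long extra) throws IOException {
        if (written + extra > maxBlockSize) {
            throw new IOException("Block exceeds configured maximum size of "
                + maxBlockSize + " bytes");
        }
        written += extra;
    }

    @Override
    public void write(int b) throws IOException {
        check(1);
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        check(len);
        out.write(b, off, len);
    }

    public static void main(String[] args) throws IOException {
        MaxBlockSizeOutputStream s =
            new MaxBlockSizeOutputStream(new ByteArrayOutputStream(), 8);
        s.write(new byte[8], 0, 8);   // fills the block exactly: accepted
        try {
            s.write('x');             // one byte over the limit: rejected
        } catch (IOException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```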
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074693#comment-14074693 ] Charles Lamb commented on HDFS-6247: bq. I feel making it the same as socketTimeout can create timeout problems at the boundary. Yes, I agree completely. That would be a bad thing. bq. So I feel half of it would be better, i.e. 0.5 * socketTimeout. Will that be fine? Yes, that seems ok to me. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074694#comment-14074694 ] Arpit Agarwal commented on HDFS-6752: - +1 for the patch. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
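The fix described above relies on a standard trick: binding to port 0 asks the OS for a free ephemeral port, so the test no longer collides with a datanode already listening on the default 50075. The sketch below demonstrates the principle with a plain `ServerSocket`; the actual patch instead sets the datanode HTTP address (the `dfs.datanode.http.address` key) to a `:0` port in the test `Configuration`.

```java
// Sketch of why binding to port 0 avoids BindException: the OS picks a
// currently free ephemeral port instead of a fixed one that may be in use.
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortDemo {
    // Returns the OS-assigned port, or -1 if the bind itself failed.
    static int bindEphemeral() {
        try (ServerSocket s = new ServerSocket(0)) { // 0 = "any free port"
            return s.getLocalPort(); // the concrete port the OS chose
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Unlike a fixed port such as 50075, this bind cannot collide with
        // another process's listener.
        System.out.println(bindEphemeral() > 0); // prints "true"
    }
}
```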
[jira] [Updated] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6247: Attachment: HDFS-6247.patch Attached the updated patch. Uses {{0.5 * socketTimeout}} as the interval. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from the target Datanode to the Balancer for replaceBlock() calls. Since block movement for balancing is throttled, complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, while the replaceBlock() call is in progress the Datanode can send IN_PROGRESS status messages to the Balancer, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074719#comment-14074719 ] Vinayakumar B commented on HDFS-6752: - Thanks [~arpit99] for the review. Will commit soon Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074717#comment-14074717 ] Colin Patrick McCabe commented on HDFS-6755: The code here is using exponential backoff to wait for the {{NameNode}} to be available. Getting rid of the sleep won't make anything more efficient... it will just increase the number of cases where a temporary network issue between a client and a NameNode causes a file close to fail. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074722#comment-14074722 ] Vinayakumar B commented on HDFS-6752: - Oops!. Wrong tag. Thanks [~arpitagarwal] for the review. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074729#comment-14074729 ] Vinayakumar B commented on HDFS-6752: - Committed to trunk and branch-2 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-6752: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074730#comment-14074730 ] Hudson commented on HDFS-6752: -- FAILURE: Integrated in Hadoop-trunk-Commit #5968 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5968/]) HDFS-6752. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit (vinayakumarb) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1613486) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeConfig.java Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. 
{noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6743) Put IP address into a new column on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074734#comment-14074734 ] Chen He commented on HDFS-6743: --- Hi [~aw], are you working on this issue? If so, please assign yourself as assignee. If not, I can work on this. Thanks! Put IP address into a new column on the new NN webUI Key: HDFS-6743 URL: https://issues.apache.org/jira/browse/HDFS-6743 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma The new NN webUI combines hostname and IP into one column in the datanode list. It is more convenient for admins if the IP address can be put in a separate column, as in the legacy NN webUI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated HDFS-6755: Attachment: HDFS-6755.patch Hi [~cmccabe], I did not mean to get rid of the sleep. I have uploaded the patch to indicate the change I wanted to make. I wanted to throw the IOException when {{retries == 0}} before {{Thread.sleep(localTimeout)}} is called. Does that seem reasonable? Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
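The reordering being proposed can be sketched as below: check whether any retries remain *before* sleeping, so the final iteration throws immediately instead of first sleeping an exponentially grown timeout. This is a hypothetical stand-in, not the DFSClient code: `NameNodeCall`, `waitForComplete`, and `simulate` are made-up names that mirror the structure of the quoted snippet.

```java
// Hypothetical sketch of the HDFS-6755 fix: throw before the sleep, so the
// backoff delay is only paid when another attempt will actually follow.
import java.io.IOException;

public class CloseRetryLoop {
    interface NameNodeCall { boolean complete(); } // stand-in for the complete() RPC

    static void waitForComplete(NameNodeCall nn, int retries, long localTimeout)
            throws IOException {
        while (!nn.complete()) {
            if (retries == 0) { // give up immediately: no pointless final sleep
                throw new IOException("Unable to close file because the last block"
                    + " does not have enough number of replicas.");
            }
            retries--;
            try {
                Thread.sleep(localTimeout); // only sleeps when we will retry
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            localTimeout *= 2; // exponential backoff between attempts
        }
    }

    // Test helper: returns the number of complete() calls made, negated if
    // the loop gave up and threw.
    static int simulate(int succeedOnAttempt, int retries) {
        final int[] calls = {0};
        try {
            waitForComplete(() -> ++calls[0] >= succeedOnAttempt, retries, 1);
        } catch (IOException e) {
            return -calls[0];
        }
        return calls[0];
    }

    public static void main(String[] args) {
        // Succeeds on the third attempt: two short sleeps, none after success.
        System.out.println(simulate(3, 5));
    }
}
```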
[jira] [Assigned] (HDFS-6742) Support sorting datanode list on the new NN webUI
[ https://issues.apache.org/jira/browse/HDFS-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned HDFS-6742: - Assignee: Chen He Support sorting datanode list on the new NN webUI - Key: HDFS-6742 URL: https://issues.apache.org/jira/browse/HDFS-6742 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ming Ma Assignee: Chen He The legacy webUI allows sorting the datanode list on a specific column such as hostname. It is handy, as admins can find patterns more quickly, especially on big clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6724) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream
[ https://issues.apache.org/jira/browse/HDFS-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-6724. --- Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream - Key: HDFS-6724 URL: https://issues.apache.org/jira/browse/HDFS-6724 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Yi Liu Assignee: Andrew Wang Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: hdfs-6724.001.patch, hdfs-6724.002.patch, hdfs-6724.003.patch In DFSClient, we need to decrypt EDEK before creating CryptoInputStream/CryptoOutputStream, currently edek is used directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
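The bug fixed here is conceptual as much as mechanical: the per-file data encryption key stored with the file is itself encrypted (an EDEK), and it must be unwrapped with the key-provider's master key before it can key a cipher; using the still-wrapped bytes directly, as the description warns, cannot work. The sketch below illustrates the wrap/unwrap round trip with plain JCE key wrapping. It is not Hadoop's KeyProvider API: `unwrap` and `roundTrip` are illustrative names, and the real client delegates decryption to the configured key provider.

```java
// Hypothetical illustration (not Hadoop's KeyProvider API): an EDEK must be
// unwrapped with the master key before it can be used as a cipher key.
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EdekDemo {
    // Recover the data encryption key (DEK) from its encrypted form (EDEK).
    static SecretKey unwrap(byte[] edek, SecretKey masterKey) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap"); // RFC 3394 key wrapping
        c.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) c.unwrap(edek, "AES", Cipher.SECRET_KEY);
    }

    // Full round trip: wrap a fresh DEK under a master key, then unwrap it
    // and confirm the recovered key matches the original.
    static boolean roundTrip() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            SecretKey master = kg.generateKey();
            SecretKey dek = kg.generateKey();

            Cipher wrap = Cipher.getInstance("AESWrap");
            wrap.init(Cipher.WRAP_MODE, master);
            byte[] edek = wrap.wrap(dek); // what gets stored/handed out

            return Arrays.equals(unwrap(edek, master).getEncoded(),
                                 dek.getEncoded());
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints "true"
    }
}
```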
[jira] [Updated] (HDFS-6724) Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream
[ https://issues.apache.org/jira/browse/HDFS-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6724: -- Attachment: hdfs-6724.003.patch Thanks again for the reviews guys, attaching a final patch removing the KPCE change. Committed this to the branch. Decrypt EDEK before creating CryptoInputStream/CryptoOutputStream - Key: HDFS-6724 URL: https://issues.apache.org/jira/browse/HDFS-6724 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Yi Liu Assignee: Andrew Wang Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: hdfs-6724.001.patch, hdfs-6724.002.patch, hdfs-6724.003.patch In DFSClient, we need to decrypt EDEK before creating CryptoInputStream/CryptoOutputStream, currently edek is used directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074758#comment-14074758 ] Arpit Agarwal commented on HDFS-6752: - Thanks for fixing this [~vinayrpet]. Since it is a test-only fix I see no harm in merging it to branch-2.5 too. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) Make DFSOutputStream more efficient
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074759#comment-14074759 ] Colin Patrick McCabe commented on HDFS-6755: Ah. I misinterpreted your description. I thought you wanted to get rid of the sleep completely. But you only wanted to get rid of it for the case that we are not going to retry the connection to the NameNode. +1 for the patch, pending jenkins. Make DFSOutputStream more efficient --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Target Version/s: 2.5.0 Fix Version/s: (was: 2.6.0) 2.5.0 3.0.0 Release Note: Merged to branch-2.5 as r1613492. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Affects Version/s: 3.0.0 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074763#comment-14074763 ] Arpit Agarwal commented on HDFS-6752: - Merged to branch-2.5 as r1613492. Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch Above test failed due to Address Bind Exception. Set the HTTP port to ephemeral port in Configuration. {noformat}java.net.BindException: Port in use: 0.0.0.0:50075 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794) at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970) at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146){noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Description: DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. (was: The following code in DFSOutputStream may have an unnecessary sleep. {code} try { Thread.sleep(localTimeout); if (retries == 0) { throw new IOException("Unable to close file because the last block" + " does not have enough number of replicas."); } retries--; localTimeout *= 2; if (Time.now() - localstart > 5000) { DFSClient.LOG.info("Could not complete " + src + " retrying..."); } } catch (InterruptedException ie) { DFSClient.LOG.warn("Caught exception ", ie); } {code} Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time gets doubled on every iteration, which can have a significant effect if there is more than one iteration, and it would sleep just to throw an exception. We need to move the sleep down after decrementing retries.) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). 
It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
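The corrected control flow described above can be sketched as a simplified, stand-alone model of the retry loop (the method and parameter names here are illustrative, not the actual DFSOutputStream code): the backoff sleep happens only when another attempt will actually follow.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

public class CompleteRetry {
    // Simplified model of the fixed close() retry loop: check whether we
    // are out of retries *before* sleeping, so the thread never blocks
    // after it has already decided to give up and throw.
    public static void completeWithRetries(BooleanSupplier tryComplete,
                                           int retries,
                                           long timeoutMs)
            throws IOException, InterruptedException {
        while (!tryComplete.getAsBoolean()) {
            if (retries == 0) {
                // Give up immediately -- the buggy version slept here first.
                throw new IOException("Unable to close file because the last"
                        + " block does not have enough number of replicas.");
            }
            retries--;
            Thread.sleep(timeoutMs); // back off only before a real retry
            timeoutMs *= 2;          // exponential backoff
        }
    }
}
```

With this ordering, a caller that has exhausted its retries sees the IOException without paying the final (and largest) backoff interval.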
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Affects Version/s: 2.4.1 Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
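The fix relies on a standard trick: binding to port 0 asks the OS for a currently free ephemeral port, so concurrent tests can never collide the way a hard-coded port like 50075 does. A plain-JDK sketch of the mechanism (the exact Configuration key the patch changes is not shown in this thread, so this only illustrates the principle):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class EphemeralBind {
    // Bind to port 0: the OS picks a free ephemeral port, which
    // getLocalPort() then reports. Two processes doing this concurrently
    // never see "Port in use", unlike a fixed port such as 50075.
    public static int bindEphemeral() throws Exception {
        try (ServerSocket s = new ServerSocket()) {
            s.bind(new InetSocketAddress("127.0.0.1", 0));
            return s.getLocalPort();
        }
    }
}
```

In the Hadoop test this would correspond to pointing the DataNode HTTP address at `127.0.0.1:0` in the test's Configuration before starting the DataNode (the specific key used by the patch is an assumption here).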
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Component/s: test Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6752) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit
[ https://issues.apache.org/jira/browse/HDFS-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6752: Release Note: (was: Merged to branch-2.5 as r1613492.) Avoid Address bind errors in TestDatanodeConfig#testMemlockLimit - Key: HDFS-6752 URL: https://issues.apache.org/jira/browse/HDFS-6752 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.4.1 Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6752.patch The above test failed due to an address bind exception. The fix is to set the HTTP port to an ephemeral port in the Configuration.
{noformat}
java.net.BindException: Port in use: 0.0.0.0:50075
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:853)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:794)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:473)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:856)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2046)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1933)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1980)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1970)
	at org.apache.hadoop.hdfs.TestDatanodeConfig.testMemlockLimit(TestDatanodeConfig.java:146)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Status: Patch Available (was: Open) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6755: --- Summary: There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode (was: Make DFSOutputStream more efficient) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch The following code in DFSOutputStream may have an unnecessary sleep.
{code}
try {
  Thread.sleep(localTimeout);
  if (retries == 0) {
    throw new IOException("Unable to close file because the last block"
        + " does not have enough number of replicas.");
  }
  retries--;
  localTimeout *= 2;
  if (Time.now() - localstart > 5000) {
    DFSClient.LOG.info("Could not complete " + src + " retrying...");
  }
} catch (InterruptedException ie) {
  DFSClient.LOG.warn("Caught exception ", ie);
}
{code}
Currently, the code sleeps before throwing an exception, which should not be the case. The sleep time is doubled on every iteration, which can have a significant effect when there is more than one iteration and the loop would sleep just to throw an exception. We need to move the sleep down, after the retries decrement. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074767#comment-14074767 ] Hadoop QA commented on HDFS-6749: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657840/HDFS-6749.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7465//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7465//console This message is automatically generated. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6749) FSNamesystem#getXAttrs and listXAttrs should call resolvePath
[ https://issues.apache.org/jira/browse/HDFS-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074770#comment-14074770 ] Charles Lamb commented on HDFS-6749: The test failure appears to be unrelated. FSNamesystem#getXAttrs and listXAttrs should call resolvePath - Key: HDFS-6749 URL: https://issues.apache.org/jira/browse/HDFS-6749 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6749.001.patch, HDFS-6749.002.patch FSNamesystem#getXAttrs and listXAttrs don't call FSDirectory#resolvePath. They should. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
Juan Yu created HDFS-6756: - Summary: Default ipc.maximum.data.length should be increased to 128MB from 64MB Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4449) When a decommission is awaiting closure of live blocks, show the block IDs on the NameNode's UI report
[ https://issues.apache.org/jira/browse/HDFS-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-4449: -- Assignee: Yongjun Zhang (was: Harsh J) When a decommission is awaiting closure of live blocks, show the block IDs on the NameNode's UI report -- Key: HDFS-4449 URL: https://issues.apache.org/jira/browse/HDFS-4449 Project: Hadoop HDFS Issue Type: Improvement Reporter: Harsh J Assignee: Yongjun Zhang It is rather common for people to complain about 'DN decommission' hangs because of live blocks waiting to be completed by some app (certain HBase specifics in particular cause a file to stay open for a longer time, compared with MR/etc.). While they can see a count of the blocks that are live, we should add more details to that view. In particular, add the list of live blocks waiting to be closed, so that a user may better understand why it's hung and also be able to trace the blocks back to files manually if needed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-964) hdfs-default.xml shouldn't use hadoop.tmp.dir for dfs.data.dir (0.20 and lower) / dfs.datanode.dir (0.21 and up)
[ https://issues.apache.org/jira/browse/HDFS-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074787#comment-14074787 ] Juan Yu commented on HDFS-964: -- Hi [~aw], this seems like a good idea; why was it marked as Won't Fix? Was there a reason, such as an incompatibility issue? hdfs-default.xml shouldn't use hadoop.tmp.dir for dfs.data.dir (0.20 and lower) / dfs.datanode.dir (0.21 and up) Key: HDFS-964 URL: https://issues.apache.org/jira/browse/HDFS-964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HDFS-964.txt This question/problem pops up all the time. Can we *please* eliminate hadoop.tmp.dir's usage from the default in dfs.data.dir. It is confusing to new people and results in all sorts of weird accidents. If we want the same value, fine, but there are a lot of things implied by the variable re-use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-223) Asynchronous IO Handling in Hadoop and HDFS
[ https://issues.apache.org/jira/browse/HDFS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074837#comment-14074837 ] Colin Patrick McCabe commented on HDFS-223: --- I think the existing thread-pool model kind of makes sense for the Datanode. The DN has to compute checksums, which inevitably chews the CPU. You can't really chew the CPU in a non-blocking way. Realistically, if you have 10 disks and 4096 DN threads chugging along at once (the current {{dfs.datanode.max.transfer.threads}}), you're going to have about 400 simultaneous operations per disk. It seems like the CPU consumption for CRC32 or hard disk bandwidth would become a bottleneck long before the number of I/O threads was an issue. Some of the scalability issues here were related to spending too much time creating and tearing down TCP sockets, I think, and were solved by the socket cache. Hedged reads also help with some of the DFSClient latency spikes described here. I think eventually we'll need to re-evaluate this in light of new technology. But for right now, it's hard to see how we'd use non-blocking I/O to get better throughput on the DN (as far as I can see). Asynchronous IO Handling in Hadoop and HDFS --- Key: HDFS-223 URL: https://issues.apache.org/jira/browse/HDFS-223 Project: Hadoop HDFS Issue Type: New Feature Reporter: Raghu Angadi Attachments: GrizzlyEchoServer.patch, MinaEchoServer.patch I think Hadoop needs utilities or a framework to make it simpler to deal with generic asynchronous IO in Hadoop. Example use case: It's been a long-standing problem that the DataNode takes too many threads for data transfers. Each write operation takes up 2 threads at each of the datanodes and each read operation takes one, irrespective of how much activity is on the sockets. The kinds of load that HDFS serves have been expanding quite fast and HDFS should handle these varied loads better. 
If there is a framework for non-blocking IO, read and write pipeline state machines could be implemented with async events on a fixed number of threads. A generic utility is better since it could be used in other places like DFSClient. DFSClient currently creates 2 extra threads for each file it has open for writing. Initially I started writing a primitive selector, then tried to see if such a facility already exists. [Apache MINA|http://mina.apache.org] seemed to do exactly this. My impression after looking at the interface and examples is that it does not give the kind of control we might prefer or need. The first use case I was thinking of implementing using MINA was to replace the response handlers in DataNode. The response handlers are simpler since they don't involve disk I/O. I [asked on the MINA user list|http://www.nabble.com/Async-events-with-existing-NIO-sockets.-td18640767.html], but it looks like it cannot be done, I think mainly because the sockets are already created. Essentially what I have in mind is similar to MINA, except that reading and writing the sockets is done by the event handlers. The lowest layer essentially invokes selectors and invokes event handlers on a single thread or on multiple threads. Each event handler is expected to do some non-blocking work. We would of course have utility handler implementations that do read, write, accept, etc., that are useful for simple processing. Sam Pullara mentioned that [xSockets|http://xsocket.sourceforge.net/] is more flexible. It is under GPL. Are there other such implementations we should look at? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074853#comment-14074853 ] Arpit Agarwal commented on HDFS-6756: - Hi Juan, are you seeing any specific instance where the 64MB limit is a problem? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-223) Asynchronous IO Handling in Hadoop and HDFS
[ https://issues.apache.org/jira/browse/HDFS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074856#comment-14074856 ] Andrew Purtell commented on HDFS-223: - Agreed. CPU becomes more of an issue if the block devices are (increasingly) solid state. Reducing threading overheads would give you more headroom for work like checksumming, and therefore throughput. I'm not sure how much improvement is possible, but it could be worth investigation. There are other options besides rewriting the DataNode; you could look at something like Parallel Universe's lightweight threading library. Asynchronous IO Handling in Hadoop and HDFS --- Key: HDFS-223 URL: https://issues.apache.org/jira/browse/HDFS-223 Project: Hadoop HDFS Issue Type: New Feature Reporter: Raghu Angadi Attachments: GrizzlyEchoServer.patch, MinaEchoServer.patch I think Hadoop needs utilities or a framework to make it simpler to deal with generic asynchronous IO in Hadoop. Example use case: It's been a long-standing problem that the DataNode takes too many threads for data transfers. Each write operation takes up 2 threads at each of the datanodes and each read operation takes one, irrespective of how much activity is on the sockets. The kinds of load that HDFS serves have been expanding quite fast and HDFS should handle these varied loads better. If there is a framework for non-blocking IO, read and write pipeline state machines could be implemented with async events on a fixed number of threads. A generic utility is better since it could be used in other places like DFSClient. DFSClient currently creates 2 extra threads for each file it has open for writing. Initially I started writing a primitive selector, then tried to see if such a facility already exists. [Apache MINA|http://mina.apache.org] seemed to do exactly this. My impression after looking at the interface and examples is that it does not give the kind of control we might prefer or need. 
The first use case I was thinking of implementing using MINA was to replace the response handlers in DataNode. The response handlers are simpler since they don't involve disk I/O. I [asked on the MINA user list|http://www.nabble.com/Async-events-with-existing-NIO-sockets.-td18640767.html], but it looks like it cannot be done, I think mainly because the sockets are already created. Essentially what I have in mind is similar to MINA, except that reading and writing the sockets is done by the event handlers. The lowest layer essentially invokes selectors and invokes event handlers on a single thread or on multiple threads. Each event handler is expected to do some non-blocking work. We would of course have utility handler implementations that do read, write, accept, etc., that are useful for simple processing. Sam Pullara mentioned that [xSockets|http://xsocket.sourceforge.net/] is more flexible. It is under GPL. Are there other such implementations we should look at? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6247) Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer
[ https://issues.apache.org/jira/browse/HDFS-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074909#comment-14074909 ] Hadoop QA commented on HDFS-6247: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657871/HDFS-6247.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7466//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7466//console This message is automatically generated. Avoid timeouts for replaceBlock() call by sending intermediate responses to Balancer Key: HDFS-6247 URL: https://issues.apache.org/jira/browse/HDFS-6247 Project: Hadoop HDFS Issue Type: Bug Components: balancer, datanode Affects Versions: 2.4.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch, HDFS-6247.patch Currently there is no response sent from target Datanode to Balancer for the replaceBlock() calls. 
Since block movement for balancing is throttled, a complete block movement will take time, and this could result in a timeout at the Balancer, which will be trying to read the status message. To avoid this, the Datanode can send IN_PROGRESS status messages to the Balancer while a replaceBlock() call is in progress, so that the Balancer does not time out and treat the block movement as failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6740) FSDataset adds data volumes dynamically
[ https://issues.apache.org/jira/browse/HDFS-6740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6740: Attachment: HDFS-6740.000.patch Uploaded a patch that supports adding volumes to {{FsDatasetAsyncDiskService}}, {{FsVolumeList}} and {{FsDatasetImpl}}. FSDataset adds data volumes dynamically --- Key: HDFS-6740 URL: https://issues.apache.org/jira/browse/HDFS-6740 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6740.000.patch Supporting volume management in the DN (HDFS-1362) requires FSDatasetImpl to be able to add volumes dynamically at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074937#comment-14074937 ] Juan Yu commented on HDFS-6756: --- After upgrading, we got lots of messages like "Requested data length 72293417 is longer than maximum configured RPC length 67108864", and fsck showed thousands of under-replicated blocks. After increasing the RPC length, the remaining messages cleared out. Though the default block size is 64M, 128M seems a more common setting; wouldn't 128M make more sense? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074969#comment-14074969 ] Arpit Agarwal commented on HDFS-6756: - Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 million. We fixed it in Apache Hadoop 2.4 by splitting block reports per storage. This error is likely a symptom of an underlying problem that needs to be fixed. A large protocol message can take seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
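For reference, the cluster-specific override mentioned above would be a property in core-site.xml on the affected daemons; the value below (128MB) is only illustrative:

```xml
<!-- core-site.xml: raise the maximum RPC message size from the 64MB
     default (67108864 bytes) to 128MB on this cluster only. Treat this
     as a workaround; the oversized message itself usually points at an
     underlying problem such as giant block reports. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```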
[jira] [Comment Edited] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074969#comment-14074969 ] Arpit Agarwal edited comment on HDFS-6756 at 7/25/14 9:37 PM: -- Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 million. We fixed it in v2.4 by splitting block reports per storage. A large protocol message can take seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. was (Author: arpitagarwal): Did you figure out which specific RPC call? Was it a block report? Also what version of Hadoop are you running? We used to see this error message when the block count per DataNode would exceed roughly 6 Million. We fixed it in Apache Hadoop 2.4 by splitting block reports per storage. This error is likely a symptom of an underlying problem that needs to be fixed. A arge protocol message take seconds to process and can 'freeze' the callee if there is a lock held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Status: Open (was: Patch Available) add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces possibilities for inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api that provides functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
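The access(2)-style semantics the description asks for can be modeled with plain JDK calls; this sketch only illustrates the idea (the class, enum, and method names here are hypothetical, not the API the attached patches add): the filesystem answers the permission question once, instead of each client re-implementing the owner/group/other and ACL logic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AccessCheck {
    public enum Action { READ, WRITE, EXECUTE }

    // access(2)-like check: ask the filesystem whether the current user
    // may perform the action on the path. Throws if the path is missing,
    // mirroring how access(2) reports ENOENT.
    public static boolean access(Path p, Action a) throws IOException {
        if (!Files.exists(p)) {
            throw new IOException("No such file or directory: " + p);
        }
        switch (a) {
            case READ:  return Files.isReadable(p);
            case WRITE: return Files.isWritable(p);
            default:    return Files.isExecutable(p);
        }
    }
}
```

A Hive server using such a call would ask the filesystem directly whether the user may write the table directory, rather than duplicating the permission-model interpretation.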
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074996#comment-14074996 ] Hadoop QA commented on HDFS-6755: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657875/HDFS-6755.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7467//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7467//console This message is automatically generated. 
There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
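The shape of the fix can be sketched generically: back off exponentially between retries, but skip the sleep entirely once the loop has decided to give up. The names below are illustrative, not the actual DFSOutputStream#close code.

```java
import java.util.concurrent.Callable;

public class GiveUpWithoutSleeping {
    /**
     * Generic retry loop illustrating the idea behind the fix: sleep (with
     * exponential backoff) only between attempts, never after the final
     * failure. Illustrative sketch, not the actual DFSOutputStream code.
     */
    public static <T> T retry(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts - 1) {
                    break;  // giving up: do NOT sleep, just propagate the error
                }
                Thread.sleep(baseSleepMs << attempt);  // exponential backoff
            }
        }
        throw last;
    }
}
```

The bug the patch removes corresponds to placing the sleep before the give-up check, so the caller pays one full backoff delay for nothing.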
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Attachment: HDFS-6570.4.patch Thanks for the reviews, Chris. The updated patch addresses the comments. add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces the possibility of inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api with functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6570) add api that enables checking if a user has certain permissions on a file
[ https://issues.apache.org/jira/browse/HDFS-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-6570: --- Status: Patch Available (was: Open) add api that enables checking if a user has certain permissions on a file - Key: HDFS-6570 URL: https://issues.apache.org/jira/browse/HDFS-6570 Project: Hadoop HDFS Issue Type: Bug Reporter: Thejas M Nair Assignee: Jitendra Nath Pandey Attachments: HDFS-6570-prototype.1.patch, HDFS-6570.2.patch, HDFS-6570.3.patch, HDFS-6570.4.patch For some of the authorization modes in Hive, the servers in Hive check if a given user has permissions on a certain file or directory. For example, the storage based authorization mode allows hive table metadata to be modified only when the user has access to the corresponding table directory on hdfs. There are likely to be such use cases outside of Hive as well. HDFS does not provide an api for such checks. As a result, the logic to check if a user has permissions on a directory gets replicated in Hive. This results in duplicate logic and introduces the possibility of inconsistencies in the interpretation of the permission model. This becomes a bigger problem with the complexity of ACL logic. HDFS should provide an api with functionality similar to the access function in unistd.h - http://linux.die.net/man/2/access . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075005#comment-14075005 ] Lei (Eddy) Xu commented on HDFS-6729: - [~aw] and [~cmccabe] Thanks for looking into this issue! We have customers encountering significant lag time between each decommissioned node (e.g., pulling data away from each other node), as described by [~cmccabe]. This significant lag time _blew users' maintenance windows_. So, I am wondering whether it is possible to allow users to set a maintenance mode for a DN for a given time (e.g., the user specifies the maintenance time as 1 hour); after that, if the DN does not come back, the NN starts the normal re-replication process? Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Some maintenance work (e.g., upgrading RAM or adding disks) on a DataNode takes only a short amount of time (e.g., 10 minutes). In these cases, the users do not want to report missing blocks on this DN because the DN will be online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or have it marked as dead. -- This message was sent by Atlassian JIRA (v6.2#6252)
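The timed-maintenance idea in the comment above can be sketched as follows: the NN records a deadline per DN in maintenance, and only treats the DN's blocks as under-replicated once the deadline passes without the DN re-registering. All names here are hypothetical; no such API existed in HDFS at the time of this discussion.

```java
import java.util.HashMap;
import java.util.Map;

public class MaintenanceTracker {
    // Hypothetical sketch of the HDFS-6729 proposal: per-DN maintenance
    // deadlines. None of these names exist in HDFS; they illustrate the idea.
    private final Map<String, Long> deadlines = new HashMap<>();

    /** Admin puts a DN into maintenance until the given time (millis). */
    public void startMaintenance(String dnId, long untilMillis) {
        deadlines.put(dnId, untilMillis);
    }

    /** DN re-registered with the NN; clear its maintenance state. */
    public void datanodeReturned(String dnId) {
        deadlines.remove(dnId);
    }

    /**
     * Should the NN start re-replicating this DN's blocks now? Only once the
     * maintenance window has expired without the DN coming back.
     */
    public boolean shouldRereplicate(String dnId, long nowMillis) {
        Long until = deadlines.get(dnId);
        return until != null && nowMillis > until;
    }
}
```

Taking the clock as an explicit argument keeps the deadline logic deterministic and testable, rather than reading System.currentTimeMillis() internally.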
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075019#comment-14075019 ] Juan Yu commented on HDFS-6756: --- [~arpitagarwal] You're right, this is in the block report; this is v2.0. The DN has a large number of replicas. Could you point out the JIRA that fixes this issue? Thanks. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075023#comment-14075023 ] Arpit Agarwal commented on HDFS-6756: - HDFS-5153 Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075029#comment-14075029 ] Colin Patrick McCabe commented on HDFS-6755: No new tests are needed, since this is just a one-line change moving a Thread.sleep in an error case. Committing. Thanks, Mit. There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075037#comment-14075037 ] Juan Yu commented on HDFS-6756: --- Thx a lot. One more question: the split is per storage directory / disk volume. Isn't there a chance that a storage dir could still contain more than 10 million blocks in the future? Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-3607. Resolution: Fixed Target Version/s: (was: ) log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6755) There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode
[ https://issues.apache.org/jira/browse/HDFS-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075050#comment-14075050 ] Hudson commented on HDFS-6755: -- FAILURE: Integrated in Hadoop-trunk-Commit #5971 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5971/]) HDFS-6755. There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode (mitdesai21 via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1613522) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java There is an unnecessary sleep in the code path where DFSOutputStream#close gives up its attempt to contact the namenode --- Key: HDFS-6755 URL: https://issues.apache.org/jira/browse/HDFS-6755 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6755.patch DFSOutputStream#close has a loop where it tries to contact the NameNode, to call {{complete}} on the file which is open-for-write. This loop includes a sleep which increases exponentially (exponential backoff). It makes sense to sleep before re-contacting the NameNode, but the code also sleeps even in the case where it has already decided to give up and throw an exception back to the user. It should not sleep after it has already decided to give up, since there's no point. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075044#comment-14075044 ] Colin Patrick McCabe commented on HDFS-3607: We now log a message when FUSE isn't built, from hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/CMakeLists.txt:
{code}
# Find Linux FUSE
IF (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
    find_package(PkgConfig REQUIRED)
    pkg_check_modules(FUSE fuse)
    IF(FUSE_FOUND)
        ...
    ELSE(FUSE_FOUND)
        MESSAGE(STATUS "Failed to find Linux FUSE libraries or include files. Will not build FUSE client.")
    ENDIF(FUSE_FOUND)
ELSE (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
    MESSAGE(STATUS "Non-Linux system detected. Will not build FUSE client.")
ENDIF (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
{code}
log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075051#comment-14075051 ] Arpit Agarwal commented on HDFS-6756: - Possible but unlikely in the near future at least. Even if you assume a conservative 32MB average block size you would need 300TB disks. Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
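The back-of-envelope estimate above checks out: 10 million blocks per storage directory at a conservative 32 MB average works out to roughly 300 TB in one directory. A quick sketch of the arithmetic:

```java
public class BlockCapacityEstimate {
    public static void main(String[] args) {
        // Hypothetical future storage directory, per the discussion above.
        long blocksPerDir = 10_000_000L;
        long avgBlockBytes = 32L * 1024 * 1024;  // conservative 32 MiB average
        long totalBytes = blocksPerDir * avgBlockBytes;
        long tib = totalBytes / (1L << 40);      // bytes -> TiB
        System.out.println(tib + " TiB per storage directory"); // ~305 TiB
    }
}
```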
[jira] [Resolved] (HDFS-6756) Default ipc.maximum.data.length should be increased to 128MB from 64MB
[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu resolved HDFS-6756. --- Resolution: Invalid Default ipc.maximum.data.length should be increased to 128MB from 64MB -- Key: HDFS-6756 URL: https://issues.apache.org/jira/browse/HDFS-6756 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3607: --- Target Version/s: 2.0.2-alpha Fix Version/s: 2.0.2-alpha log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.2-alpha We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3607) log a message when fuse_dfs is not built
[ https://issues.apache.org/jira/browse/HDFS-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075060#comment-14075060 ] Colin Patrick McCabe commented on HDFS-3607: This was fixed by HADOOP-8368. log a message when fuse_dfs is not built Key: HDFS-3607 URL: https://issues.apache.org/jira/browse/HDFS-3607 Project: Hadoop HDFS Issue Type: Improvement Components: fuse-dfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.2-alpha We should log a message when fuse_dfs is not built explaining why -- This message was sent by Atlassian JIRA (v6.2#6252)