[jira] [Commented] (HDFS-8716) introduce a new config specifically for safe mode block count
[ https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627589#comment-14627589 ] Hadoop QA commented on HDFS-8716: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 3s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 20s | The applied patch generated 1 new checkstyle issues (total was 676, now 676). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 4s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 56s | Tests failed in hadoop-hdfs. 
| | | | 204m 32s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745376/HDFS-8716.7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0a16ee6 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11708/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11708/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11708/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11708/console | This message was automatically generated. > introduce a new config specifically for safe mode block count > - > > Key: HDFS-8716 > URL: https://issues.apache.org/jira/browse/HDFS-8716 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, > HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch, > HDFS-8716.7.patch > > > During the start up, namenode waits for n replicas of each block to be > reported by datanodes before exiting the safe mode. Currently n is tied to > the min replicas config. We could set min replicas to more than one but we > might want to exit safe mode as soon as each block has one replica reported. > This can be worked out by introducing a new config variable for safe mode > block count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
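The quoted description above proposes decoupling the safe-mode exit threshold from the min-replication config. A minimal sketch of the idea (illustrative Python, not Hadoop code; the config key names here are hypothetical stand-ins for whatever the patch actually introduces):

```python
# Sketch: decide whether the NameNode may leave safe mode.
# `reported` maps block id -> number of replicas reported so far.
# Config key names below are hypothetical, for illustration only.

def can_leave_safe_mode(reported, conf):
    # Before the change, the threshold was tied to the min-replication
    # config; the proposal reads a dedicated safe-mode key instead,
    # falling back to min replication when it is unset.
    needed = conf.get("safemode.replication.min",
                      conf.get("replication.min", 1))
    return all(count >= needed for count in reported.values())

reported = {"blk_1": 1, "blk_2": 1, "blk_3": 2}

# Min replication 2 alone keeps the NN in safe mode ...
print(can_leave_safe_mode(reported, {"replication.min": 2}))
# ... but a dedicated safe-mode count of 1 lets it exit early.
print(can_leave_safe_mode(reported, {"replication.min": 2,
                                     "safemode.replication.min": 1}))
```

This matches the use case in the description: min replicas set above one, while safe mode only waits for one reported replica per block.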
[jira] [Updated] (HDFS-8762) Erasure Coding: the log of each streamer should show its index
[ https://issues.apache.org/jira/browse/HDFS-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-8762: Status: Patch Available (was: Open) > Erasure Coding: the log of each streamer should show its index > -- > > Key: HDFS-8762 > URL: https://issues.apache.org/jira/browse/HDFS-8762 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-8762-HDFS-7285-001.patch > > > The log in {{DataStreamer}} doesn't show which streamer it's generated from. > In order to make log information more convenient for debugging, each log > should include the index of the streamer it's generated from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8762) Erasure Coding: the log of each streamer should show its index
[ https://issues.apache.org/jira/browse/HDFS-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-8762: Attachment: HDFS-8762-HDFS-7285-001.patch > Erasure Coding: the log of each streamer should show its index > -- > > Key: HDFS-8762 > URL: https://issues.apache.org/jira/browse/HDFS-8762 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-8762-HDFS-7285-001.patch > > > The log in {{DataStreamer}} doesn't show which streamer it's generated from. > In order to make log information more convenient for debugging, each log > should include the index of the streamer it's generated from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627585#comment-14627585 ] Surendra Singh Lilhore commented on HDFS-8742: -- Thanks [~ajisakaa] for review and commit > Inotify: Support event for OP_TRUNCATE > -- > > Key: HDFS-8742 > URL: https://issues.apache.org/jira/browse/HDFS-8742 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Fix For: 2.8.0 > > Attachments: HDFS-8742-001.patch, HDFS-8742.patch > > > Currently inotify is not giving any event for Truncate operation. NN should > send event for "Truncate". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed
[ https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627539#comment-14627539 ] Akira AJISAKA commented on HDFS-6945: - If my understanding is correct, HDFS-8616 suggests that this issue should be cherry-picked to branch-2.7. I'd like to do this on July 18 if there are no further comments. > BlockManager should remove a block from excessReplicateMap and decrement > ExcessBlocks metric when the block is removed > -- > > Key: HDFS-6945 > URL: https://issues.apache.org/jira/browse/HDFS-6945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Critical > Labels: metrics > Fix For: 2.8.0 > > Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, > HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch > > > I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters; > however, there are no over-replicated blocks (confirmed by fsck). > After further research, I noticed that when deleting a block, BlockManager does > not remove the block from excessReplicateMap or decrement excessBlocksCount. > Usually the metric is decremented when processing a block report; however, if > the block has already been deleted, BlockManager does not remove the block from > excessReplicateMap or decrement the metric. > That way the metric and excessReplicateMap can grow indefinitely (i.e. a > memory leak can occur). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8616) Cherry pick HDFS-6495 for excess block leak
[ https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned HDFS-8616: --- Assignee: Akira AJISAKA > Cherry pick HDFS-6495 for excess block leak > --- > > Key: HDFS-8616 > URL: https://issues.apache.org/jira/browse/HDFS-8616 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Akira AJISAKA > > Busy clusters quickly leak tens or hundreds of thousands of excess blocks > which slow BR processing. HDFS-6495 should be cherry picked into 2.7.x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8616) Cherry pick HDFS-6495 for excess block leak
[ https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627537#comment-14627537 ] Akira AJISAKA commented on HDFS-8616: - I'd like to cherry-pick HDFS-6945 to branch-2.7 on July 18 if there are no further comments. > Cherry pick HDFS-6495 for excess block leak > --- > > Key: HDFS-8616 > URL: https://issues.apache.org/jira/browse/HDFS-8616 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp > > Busy clusters quickly leak tens or hundreds of thousands of excess blocks > which slow BR processing. HDFS-6495 should be cherry picked into 2.7.x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
[ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627512#comment-14627512 ] Duo Zhang commented on HDFS-7966: - I'd say sorry... The performance result above is useless, since the flow control part of my code did not work at that time. I found this when I tried to transfer a 512MB block and got an OOM. I have rewritten the flow control part, and set up a cluster with 1 NN and DN to evaluate the performance. There is a netty bug (https://github.com/netty/netty/pull/3929), so I need to modify my code when running different tests. The performance test code is here: https://github.com/Apache9/hadoop/blob/HDFS-7966-POC/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/http2/PerformanceTest.java First I ran a large read test on 1 file with a 1GB block. Each command was run 5 times: {noformat} ./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 1 1 1073741824 ./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 1 1 1073741824 {noformat} Note that I set {{dfs.datanode.transferTo.allowed}} to {{false}} since the http2 implementation cannot use transferTo (I'm currently working on implementing {{FileRegion}} support in netty-http2-codec, see https://github.com/netty/netty/issues/3927). The result is {noformat} *** time based on http2 9953 *** time based on http2 9967 *** time based on http2 9954 *** time based on http2 9985 *** time based on http2 9976 *** time based on tcp 9383 *** time based on tcp 9375 *** time based on tcp 9377 *** time based on tcp 9373 *** time based on tcp 9376 {noformat} The average latency of http2 is 9967ms, and for tcp it is 9376.8ms. 9967/9376.8=1.063, so http2 is about 6% slower than tcp. I think this is an acceptable result? Let me test small reads later and post the results here. Thanks. 
> New Data Transfer Protocol via HTTP/2 > - > > Key: HDFS-7966 > URL: https://issues.apache.org/jira/browse/HDFS-7966 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Haohui Mai >Assignee: Qianqian Shi > Labels: gsoc, gsoc2015, mentor > Attachments: GSoC2015_Proposal.pdf, > TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg > > > The current Data Transfer Protocol (DTP) implements a rich set of features > that span across multiple layers, including: > * Connection pooling and authentication (session layer) > * Encryption (presentation layer) > * Data writing pipeline (application layer) > All these features are HDFS-specific and defined by implementation. As a > result it requires non-trivial amount of work to implement HDFS clients and > servers. > This jira explores to delegate the responsibilities of the session and > presentation layers to the HTTP/2 protocol. Particularly, HTTP/2 handles > connection multiplexing, QoS, authentication and encryption, reducing the > scope of DTP to the application layer only. By leveraging the existing HTTP/2 > library, it should simplify the implementation of both HDFS clients and > servers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
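The summary statistics quoted in the comment above can be re-derived directly from the raw {noformat} numbers:

```python
# Re-derive the averages and slowdown ratio from the five runs of
# each transport reported in the comment (times in ms per 1GB read).
http2 = [9953, 9967, 9954, 9985, 9976]
tcp   = [9383, 9375, 9377, 9373, 9376]

avg_http2 = sum(http2) / len(http2)   # 9967.0 ms
avg_tcp   = sum(tcp) / len(tcp)       # 9376.8 ms
slowdown  = avg_http2 / avg_tcp       # ~1.063, i.e. ~6% slower

print(avg_http2, avg_tcp, round(slowdown, 3))
```

This confirms the comment's arithmetic: 9967 / 9376.8 ≈ 1.063, so the HTTP/2 path is roughly 6% slower than raw TCP with `transferTo` disabled for both.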
[jira] [Updated] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
[ https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8777: Assignee: Rakesh R > Erasure Coding: add tests for taking snapshots on EC files > -- > > Key: HDFS-8777 > URL: https://issues.apache.org/jira/browse/HDFS-8777 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Rakesh R > > We need to add more tests for (EC + snapshots). The tests need to verify the > fsimage saving/loading is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
[ https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627500#comment-14627500 ] Jing Zhao commented on HDFS-8777: - Yes. Let me assign this jira to you. > Erasure Coding: add tests for taking snapshots on EC files > -- > > Key: HDFS-8777 > URL: https://issues.apache.org/jira/browse/HDFS-8777 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao > > We need to add more tests for (EC + snapshots). The tests need to verify the > fsimage saving/loading is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8260) Erasure Coding: system test of writing EC file
[ https://issues.apache.org/jira/browse/HDFS-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627488#comment-14627488 ] Xinwei Qin commented on HDFS-8260: --- Hi [~demongaorui], maybe I have misunderstood what you mean. These jiras under the umbrella (HDFS-8197) need system test results from a real cluster, not a code patch. So the attached patch does not match your intention, right? > Erasure Coding: system test of writing EC file > --- > > Key: HDFS-8260 > URL: https://issues.apache.org/jira/browse/HDFS-8260 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: HDFS-7285 >Reporter: GAO Rui >Assignee: Xinwei Qin > Attachments: HDFS-8260-HDFS-7285.001.patch > > > 1. Normal EC file writing (writing without datanode failure) > 2. Writing an EC file with a tolerable number of datanodes failing. > 3. Writing an EC file with an intolerable number of datanodes failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8260) Erasure Coding: system test of writing EC file
[ https://issues.apache.org/jira/browse/HDFS-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8260: -- Attachment: HDFS-8260-HDFS-7285.001.patch Attached the system test patch for writing an EC file with some datanodes failing. Some tests cannot pass because the issues (HDFS-8704, HDFS-8383) have not been fixed yet. > Erasure Coding: system test of writing EC file > --- > > Key: HDFS-8260 > URL: https://issues.apache.org/jira/browse/HDFS-8260 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: HDFS-7285 >Reporter: GAO Rui >Assignee: Xinwei Qin > Attachments: HDFS-8260-HDFS-7285.001.patch > > > 1. Normal EC file writing (writing without datanode failure) > 2. Writing an EC file with a tolerable number of datanodes failing. > 3. Writing an EC file with an intolerable number of datanodes failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock
[ https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627453#comment-14627453 ] Hadoop QA commented on HDFS-8778: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 7m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 24s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 27s | Tests failed in hadoop-hdfs. 
| | | | 184m 0s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745354/HDFS-8778.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 0a16ee6 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/console | This message was automatically generated. > TestBlockReportRateLimiting#testLeaseExpiration can deadlock > > > Key: HDFS-8778 > URL: https://issues.apache.org/jira/browse/HDFS-8778 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.1 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8778.01.patch > > > {{requestBlockReportLease}} blocks on DataNode registration while holding the > NameSystem read lock. > DataNode registration can block on the NameSystem read lock if a writer gets > in the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
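The deadlock in the quoted description above is a classic wait-for cycle: the lease-granting thread holds the namesystem read lock while waiting on DataNode registration, and registration in turn waits on that same read lock behind a queued writer. A small sketch (pure illustration, not Hadoop code) that models the three edges from the description as a graph and detects the cycle:

```python
# Model each "X waits for Y" edge from the bug description and check
# whether the edges form a cycle (i.e., a deadlock).

def find_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (DFS)."""
    visiting, done = set(), set()
    def dfs(node):
        if node in visiting:
            return True          # back edge: we are inside a cycle
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(n) for n in wait_for.get(node, [])):
            return True
        visiting.discard(node)
        done.add(node)
        return False
    return any(dfs(n) for n in list(wait_for))

# Edges from the JIRA description:
#  - requestBlockReportLease holds the FSN read lock and waits for the
#    DataNode registration to finish;
#  - registration needs the FSN read lock;
#  - a queued writer makes new readers wait until current readers
#    (including the lease thread itself) release the lock.
wait_for = {
    "requestBlockReportLease": ["registration"],
    "registration":            ["fsn-read-lock"],
    "fsn-read-lock":           ["requestBlockReportLease"],
}
print(find_cycle(wait_for))   # True -> deadlock
```

The fix direction implied by the description is to break one edge, e.g. not holding the namesystem read lock across the registration wait.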
[jira] [Updated] (HDFS-8716) introduce a new config specifically for safe mode block count
[ https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-8716: --- Attachment: HDFS-8716.7.patch > introduce a new config specifically for safe mode block count > - > > Key: HDFS-8716 > URL: https://issues.apache.org/jira/browse/HDFS-8716 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, > HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch, > HDFS-8716.7.patch > > > During the start up, namenode waits for n replicas of each block to be > reported by datanodes before exiting the safe mode. Currently n is tied to > the min replicas config. We could set min replicas to more than one but we > might want to exit safe mode as soon as each block has one replica reported. > This can be worked out by introducing a new config variable for safe mode > block count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8763) Incremental blockreport order may replicate unnecessary block
[ https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627439#comment-14627439 ] jiangyu commented on HDFS-8763: --- Hi [~shv], the situation is a datanode reporting RBW after FINALIZED. It is rare when the cluster is not big, but on a larger cluster it is easy to find. I don't know the impact of changing dfs.namenode.replication.min; I will investigate it from the code and test. > Incremental blockreport order may replicate unnecessary block > - > > Key: HDFS-8763 > URL: https://issues.apache.org/jira/browse/HDFS-8763 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.4.0 >Reporter: jiangyu >Priority: Minor > > On our cluster, the NameNode is always very busy, so for every incremental > block report the lock contention is heavy. > The logic of an incremental block report is as follows: the client sends a block to dn1, > dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After finishing the block, every > datanode reports the newly received block to the namenode. On the NameNode side, all > reports go through the method processIncrementalBlockReport in the BlockManager > class. But the status of the block reported from dn2 and dn3 is RECEIVING_BLOCK, > while for dn1 it is RECEIVED_BLOCK. It is okay if dn2 and dn3 report before dn1 (that is > common), but in a busy environment it is easy for dn1 to report before > dn2 or dn3; let's assume dn2 reports first, dn1 second, and dn3 > third. > So dn1's report goes through addStoredBlock, which finds that the replica count of > this block has not reached the original number (which is 3); the block is added to the > neededReplications structure, and soon some node in the pipeline (dn1 or > dn2) is asked to replicate it to dn4. After some time, dn4 and dn3 both report this block, > and then one node is chosen to invalidate. 
> Here is one log i found in our cluster: > 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp. > BP-1386326728-xx.xx.2.131-1382089338395 > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]} > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.75:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.4.62:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask xx.xx.4.62:50010 to > replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010 > 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: 
xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add > blk_3194502674_2121080184 to xx.xx.7.75:50010 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: > (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks > set > 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, > blk_3194497594_2121075104] > On some days the number of such cases can reach 40, which is not good for > performance and wastes network bandwidth. > Our base version is Hadoop 2.4, and I have checked Hadoop 2.7.1 and didn't find > any difference. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
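The ordering problem in the description can be reproduced with a toy replica counter (a sketch, not Hadoop code): only RECEIVED_BLOCK reports count as finalized replicas, so dn1's early report makes the block look under-replicated and triggers the unnecessary fourth copy.

```python
# Toy simulation of incremental block report ordering for one block
# with target replication 3. RECEIVING_BLOCK reports do not count as
# finalized replicas; only RECEIVED_BLOCK does.

TARGET = 3
finalized = set()
actions = []

def process_report(dn, status):
    if status != "RECEIVED_BLOCK":
        return                       # RECEIVING reports change nothing here
    finalized.add(dn)
    if len(finalized) < TARGET:
        # The NN believes the block is under-replicated and may
        # schedule an extra copy: the unnecessary replication.
        actions.append("replicate")
    elif len(finalized) > TARGET:
        actions.append("invalidate")  # excess replica cleanup

# The problematic order from the JIRA: dn2 RECEIVING, then dn1's
# RECEIVED arrives "too early", then the rest finalize, and finally
# the scheduled extra copy lands on dn4.
for dn, status in [("dn2", "RECEIVING_BLOCK"),
                   ("dn1", "RECEIVED_BLOCK"),
                   ("dn3", "RECEIVED_BLOCK"),
                   ("dn2", "RECEIVED_BLOCK"),
                   ("dn4", "RECEIVED_BLOCK")]:
    process_report(dn, status)

print(actions)   # replication scheduled early, invalidation at the end
```

The toy over-counts replication decisions (a real NN checks pending replications too), but it shows the shape of the bug: a replicate decision made before the pipeline's own reports arrive, followed by an invalidate once all four replicas exist.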
[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
[ https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627440#comment-14627440 ] Rakesh R commented on HDFS-8777: [~jingzhao] I hope this task is to add unit tests. Could you please take a look at the HDFS-8266 system test task? As an initial attempt I had attached a patch with unit tests for EC + snapshots in that jira. I think I will move the unit tests here and keep that jira focused only on system test cases. Probably I will include a few more test cases with fsimage saving/loading scenarios. Does this make sense? > Erasure Coding: add tests for taking snapshots on EC files > -- > > Key: HDFS-8777 > URL: https://issues.apache.org/jira/browse/HDFS-8777 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao > > We need to add more tests for (EC + snapshots). The tests need to verify that the > fsimage saving/loading is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627392#comment-14627392 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 14m 58s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 24s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 16s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 38s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 11s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 23s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 187m 37s | Tests failed in hadoop-hdfs. 
| | | | 229m 33s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745335/HDFS-8058-HDFS-7285.009.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 0a93712 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/patchReleaseAuditProblems.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11705/console | This message was automatically generated. 
> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, > HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. > Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped > blocks; the methods there duplicate those in INodeFile, and the current > code needs to check {{isStriped}} and then do different things. Also, if a file is > striped, the {{blocks}} field in INodeFile still occupies a reference's worth of memory. > These are not necessary, and we can use the same {{blocks}} to make the code > clearer. > I keep {{FileWithStripedBlocksFeature}} empty for follow-up use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and storing them in each block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8768) Erasure Coding: block group ID displayed in WebUI is not consistent with fsck
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8768: Summary: Erasure Coding: block group ID displayed in WebUI is not consistent with fsck (was: The display of Erasure Code file block group ID in WebUI is not consistent with fsck command) > Erasure Coding: block group ID displayed in WebUI is not consistent with fsck > - > > Key: HDFS-8768 > URL: https://issues.apache.org/jira/browse/HDFS-8768 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui > Attachments: Screen Shot 2015-07-14 at 15.33.08.png > > > For example, In WebUI( usually, namenode port: 50070) , one Erasure Code > file with one block group was displayed as the attached screenshot [^Screen > Shot 2015-07-14 at 15.33.08.png]. But, with fsck command, the block group of > the same file was displayed like: {{0. > BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 > len=6438256640}} > After checking block file names in datanodes, we believe WebUI may have some > problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8769: Summary: Erasure Coding: unit test for SequentialBlockGroupIdGenerator (was: unit test for SequentialBlockGroupIdGenerator) > Erasure Coding: unit test for SequentialBlockGroupIdGenerator > - > > Key: HDFS-8769 > URL: https://issues.apache.org/jira/browse/HDFS-8769 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Rakesh R > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627342#comment-14627342 ] Benoy Antony commented on HDFS-7483: I don't think the suggested approach will work, but I could be wrong. [~wheat9], could you please provide a working code snippet for this approach? > Display information per tier on the Namenode UI > --- > > Key: HDFS-7483 > URL: https://issues.apache.org/jira/browse/HDFS-7483 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Benoy Antony >Assignee: Benoy Antony > Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, overview.png, > storagetypes.png, storagetypes_withnostorage.png, withOneStorageType.png, > withTwoStorageType.png > > > If a cluster has different types of storage, it is useful to display the > storage information per type. > The information will be available via JMX (HDFS-7390) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI
[ https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627322#comment-14627322 ] Haohui Mai commented on HDFS-7483: -- {code} + var helpers = { +'percentage': function(chunk, context, bodies, params) { + var a = dust.helpers.tap(params.a, chunk, context); + var b = dust.helpers.tap(params.b, chunk, context); + var f = parseFloat(a)/parseFloat(b); + return chunk.write(Math.round(f * 1) / 100 + '%'); +} + }; + + for(var key in helpers) { + dust.helpers[key] = helpers[key]; + } + {code} A better approach is to do it through the math helper (http://www.dustjs.com/guides/dust-helpers/) and to format it with the {{fmt_percentage}} helper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
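The helper quoted above computes a ratio and rounds it for display, but {{Math.round(f * 1) / 100}} rounds the raw ratio instead of scaling it to two decimal places. A plain-Java sketch of the intended two-decimal percentage rounding (the class and method names here are illustrative, not the Dust.js API):

```java
public class PercentFormat {
    // Illustrative stand-in for a templating "percentage" helper:
    // rounds the ratio a/b to a percentage with at most two decimal places.
    static String percentage(double a, double b) {
        double f = a / b;
        // Scale to basis points before rounding, then divide back down.
        // Math.round returns a long; dividing by 100.0 restores the decimals.
        return Math.round(f * 10000) / 100.0 + "%";
    }

    public static void main(String[] args) {
        System.out.println(percentage(1, 3)); // prints "33.33%"
    }
}
```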
[jira] [Updated] (HDFS-8619) Erasure Coding: revisit replica counting for striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8619: Attachment: HDFS-8619-HDFS-7285.001.patch Update the patch. Instead of tracking BlockInfo in CorruptReplicasMap, the 001 patch uses a Block with the same ID of the striped block group. > Erasure Coding: revisit replica counting for striped blocks > --- > > Key: HDFS-8619 > URL: https://issues.apache.org/jira/browse/HDFS-8619 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-8619-HDFS-7285.001.patch, HDFS-8619.000.patch > > > Currently we use the same {{BlockManager#countNodes}} method for striped > blocks, which simply treat each internal block as a replica. However, for a > striped block, we may have more complicated scenario, e.g., we have multiple > replicas of the first internal block while we miss some other internal > blocks. Using the current {{countNodes}} methods can lead to wrong decision > in these scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
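The scenario described above — multiple replicas of one internal block while other internal blocks are missing — shows why a raw replica count misleads for striped groups. A hypothetical sketch (not the actual {{BlockManager#countNodes}} logic) of counting distinct live internal blocks instead:

```java
import java.util.BitSet;

public class StripedCountDemo {
    // For a striped block group, what matters is how many DISTINCT internal
    // block indices are live, not how many replicas exist in total.
    static int distinctLiveInternalBlocks(int[] liveBlockIndices) {
        BitSet seen = new BitSet();
        for (int idx : liveBlockIndices) {
            seen.set(idx); // duplicates of the same internal block collapse here
        }
        return seen.cardinality();
    }

    public static void main(String[] args) {
        // Three copies of internal block 0 plus blocks 1 and 2:
        // five replicas in total, but only three distinct internal blocks.
        int[] reported = {0, 0, 0, 1, 2};
        System.out.println(distinctLiveInternalBlocks(reported)); // prints 3
    }
}
```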
[jira] [Updated] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock
[ https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8778: Attachment: HDFS-8778.01.patch The fix is to make {{requestBlockReportLease}} non-blocking. Also ensure that the cluster is shut down on test failure. > TestBlockReportRateLimiting#testLeaseExpiration can deadlock > > > Key: HDFS-8778 > URL: https://issues.apache.org/jira/browse/HDFS-8778 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.1 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8778.01.patch > > > {{requestBlockReportLease}} blocks on DataNode registration while holding the > NameSystem read lock. > DataNode registration can block on the NameSystem read lock if a writer gets > in the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
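The deadlock described above is a property of fair read-write locks: once a writer queues up, later read-lock requests wait behind it even though only a read lock is currently held. A minimal sketch, assuming a fair {{ReentrantReadWriteLock}} similar to the NameSystem lock (the thread roles in the comments are illustrative):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairRwLockDemo {
    // Returns whether a second reader can acquire the read lock while a
    // writer is queued behind an existing reader on a fair RW lock.
    static boolean secondReaderCanAcquire() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
        lock.readLock().lock(); // thread A (requestBlockReportLease) holds the read lock

        Thread writer = new Thread(() -> lock.writeLock().lock());
        writer.setDaemon(true); // will block forever; daemon so the JVM can exit
        writer.start();
        Thread.sleep(200);      // let the writer queue up behind the reader

        // Thread B (DataNode registration) now needs the read lock too, but the
        // fair lock makes it wait behind the queued writer. If thread A is in
        // turn waiting on thread B, neither can make progress: deadlock.
        final boolean[] acquired = {false};
        Thread reader = new Thread(() -> {
            try {
                acquired[0] = lock.readLock().tryLock(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ignored) { }
        });
        reader.start();
        reader.join();
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second reader acquired: " + secondReaderCanAcquire());
    }
}
```

Making {{requestBlockReportLease}} non-blocking breaks the cycle because thread A never waits for thread B while holding the read lock.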
[jira] [Updated] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock
[ https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8778: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock
Arpit Agarwal created HDFS-8778: --- Summary: TestBlockReportRateLimiting#testLeaseExpiration can deadlock Key: HDFS-8778 URL: https://issues.apache.org/jira/browse/HDFS-8778 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal {{requestBlockReportLease}} blocks on DataNode registration while holding the NameSystem read lock. DataNode registration can block on the NameSystem read lock if a writer gets in the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627215#comment-14627215 ] Walter Su commented on HDFS-8433: - I was thinking of something like {code} public void readFields(DataInput in) throws IOException { ... for (int i = 0; i < length; i++) { modes.add(WritableUtils.readEnum(in, AccessMode.class)); } + idRange = WritableUtils.readVLong(in); } @Override public void write(DataOutput out) throws IOException { ... for (AccessMode aMode : modes) { WritableUtils.writeEnum(out, aMode); } + WritableUtils.writeVLong(out, idRange); } {code} A token generated by a new NN can be parsed by an old DN. A token generated by an old DN can be parsed by an old DN. But a token generated by an old DN can't be parsed by a new DN. If the DN never generated tokens there would be no problem, but {{DataNode.DataTransfer}} actually generates tokens. So now I think this is a bad idea. I miss protobuf here, but there is nothing we can do. > blockToken is not set in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil > -- > > Key: HDFS-8433 > URL: https://issues.apache.org/jira/browse/HDFS-8433 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: Walter Su > Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, > HDFS-8433.01.patch > > > The blockToken provided in LocatedStripedBlock is not used to create > LocatedBlock in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil. > We should also add ec tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
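The incompatibility discussed above — an old writer omits the appended field, so a new reader runs off the end of the stream — can be sketched without Hadoop's {{WritableUtils}}; a plain {{DataInputStream}} fails the same way at end of stream (the serialized fields below are simplified stand-ins, not the real token layout):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class CompatDemo {
    // Simulates an "old" writer that stops after the original fields,
    // read back by a "new" reader that expects one extra trailing field.
    static String parseOldTokenWithNewReader() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(1);  // stand-in for the access-mode count
        out.writeByte(0); // one (fake) access mode; old writer stops here
        out.close();

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        int modes = in.readInt();                 // original fields parse fine
        for (int i = 0; i < modes; i++) in.readByte();
        try {
            in.readLong();                        // new reader expects idRange
            return "parsed";
        } catch (EOFException e) {
            return "EOF: old token has no idRange field";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parseOldTokenWithNewReader());
    }
}
```

The reverse direction works: an old reader simply stops before the appended field, which is why only the new-reader/old-writer pairing breaks.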
[jira] [Commented] (HDFS-8716) introduce a new config specifically for safe mode block count
[ https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627210#comment-14627210 ] Hadoop QA commented on HDFS-8716: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 6s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 4s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 9s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 29s | The applied patch generated 1 new checkstyle issues (total was 676, now 676). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 159m 31s | Tests failed in hadoop-hdfs. 
| | | | 205m 14s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestSecondaryWebUi | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.namenode.TestQuotaByStorageType | | | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete | | | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745311/HDFS-8716.7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 59388a8 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11703/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11703/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11703/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11703/console | This message was automatically generated. > introduce a new config specifically for safe mode block count > - > > Key: HDFS-8716 > URL: https://issues.apache.org/jira/browse/HDFS-8716 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, > HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch > > > During the start up, namenode waits for n replicas of each block to be > reported by datanodes before exiting the safe mode. Currently n is tied to > the min replicas config. We could set min replicas to more than one but we > might want to exit safe mode as soon as each block has one replica reported. 
> This can be worked out by introducing a new config variable for the safe mode > block count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
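A sketch of how such a setting might look in hdfs-site.xml. The property name below is hypothetical (the actual key is whatever the attached patch defines); the point is that it is decoupled from {{dfs.namenode.replication.min}}:

```xml
<!-- Hypothetical key: exit safe mode once each block has this many
     reported replicas, independent of dfs.namenode.replication.min. -->
<property>
  <name>dfs.namenode.safemode.replication.min</name>
  <value>1</value>
</property>
```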
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627196#comment-14627196 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 59s | Pre-patch HDFS-7285 has 5 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 22s | The applied patch generated 10 new checkstyle issues (total was 643, now 634). | | {color:red}-1{color} | whitespace | 0m 8s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 186m 20s | Tests failed in hadoop-hdfs. 
| | | | 233m 10s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745307/HDFS-8058-HDFS-7285.008.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 6ff957b | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/HDFS-7285FindbugsWarningshadoop-hdfs.html | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11702/console | This message was automatically generated. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627181#comment-14627181 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 59s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 39s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 7s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 23s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 0m 27s | Tests failed in hadoop-hdfs. 
| | | | 43m 39s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745335/HDFS-8058-HDFS-7285.009.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 0a93712 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/patchReleaseAuditProblems.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11706/console | This message was automatically generated.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627180#comment-14627180 ] Jing Zhao commented on HDFS-8058: - Looks like the timeout of TestDFSStripedOutputStreamWithFailure is related. Could you please take a look at it, Zhe? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627112#comment-14627112 ] Jing Zhao commented on HDFS-8058: - Created HDFS-8777 to add more tests for (snapshot + EC). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
Jing Zhao created HDFS-8777: --- Summary: Erasure Coding: add tests for taking snapshots on EC files Key: HDFS-8777 URL: https://issues.apache.org/jira/browse/HDFS-8777 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao We need to add more tests for (EC + snapshots). The tests need to verify the fsimage saving/loading is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627104#comment-14627104 ] Jing Zhao commented on HDFS-8058: - Thanks for the update, Zhe. +1 for the 09 patch. {{testTruncateWithDataNodesRestartImmediately}} has been fixed in trunk recently so we can ignore it in the feature branch now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8058: Attachment: HDFS-8058-HDFS-7285.009.patch Good catch Jing! Uploading 09 patch to address the issue. {{testTruncateWithDataNodesRestartImmediately}} fails even without the patch. We should do some more debugging around it. To verify the new change I ran {{TestFSImage}}, {{TestINodeFile}} and {{TestStripedINodeFile}} locally and they all passed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8759) Implement remote block reader in libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627037#comment-14627037 ] Jing Zhao commented on HDFS-8759: - Agree with [~James Clampffer] that we need to have detailed documentation, especially for classes like Status. But it's fine to do this in a separate jira. +1 for the 001 patch. > Implement remote block reader in libhdfspp > -- > > Key: HDFS-8759 > URL: https://issues.apache.org/jira/browse/HDFS-8759 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch > > > This jira tracks the effort of implementing the remote block reader that > communicates with DN in libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627017#comment-14627017 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 13s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 37s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 27s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 12s | Tests failed in hadoop-hdfs. 
| | | | 230m 36s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745196/HDFS-8058-HDFS-7285.007.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / b1e6429 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/console | This message was automatically generated. > Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. 
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped > blocks, the methods there duplicate those in INodeFile, and the current > code needs to check {{isStriped}} and then do different things. Also, if a file is > striped, the {{blocks}} field in INodeFile still occupies the memory of a reference. > These are not necessary, and we can use the same {{blocks}} to make the code > clearer. > I keep {{FileWithStripedBlocksFeature}} empty for follow-up use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and storing them in each block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
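The unification HDFS-8058 proposes above can be pictured with a simplified model. This is a hypothetical sketch only: the class names mirror the real HDFS ones, but the real BlockInfo/INodeFile classes carry far more state, and this is not the patch's actual code.

```java
// Illustrative model of the HDFS-8058 idea: a single blocks array serves
// both layouts, instead of keeping striped blocks in a separate feature list.
public class INodeFileSketch {
    static class BlockInfo { long numBytes; }
    static class BlockInfoContiguous extends BlockInfo { }
    static class BlockInfoStriped extends BlockInfo {
        short dataBlockNum;
        short parityBlockNum;
    }

    // Before the change, striped blocks lived in FileWithStripedBlocksFeature.
    // After it, the same field holds either subtype, selected by isStriped.
    private final boolean isStriped;
    private BlockInfo[] blocks = new BlockInfo[0];

    INodeFileSketch(boolean isStriped) {
        this.isStriped = isStriped;
    }

    void addBlock(BlockInfo b) {
        // Callers add the subtype matching the file's layout.
        BlockInfo[] newBlocks = new BlockInfo[blocks.length + 1];
        System.arraycopy(blocks, 0, newBlocks, 0, blocks.length);
        newBlocks[blocks.length] = b;
        blocks = newBlocks;
    }

    int numBlocks() { return blocks.length; }
    boolean isStriped() { return isStriped; }

    public static void main(String[] args) {
        INodeFileSketch striped = new INodeFileSketch(true);
        striped.addBlock(new BlockInfoStriped());
        System.out.println(striped.isStriped() + " " + striped.numBlocks());
    }
}
```

With one array, methods like numBlocks() need no isStriped branching; only code that actually cares about the erasure-coding schema inspects the subtype.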
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627005#comment-14627005 ] Jing Zhao commented on HDFS-8433: - bq. This time, we don't need BlockIdRange class. We extends the fields of BlockTokenIdentifier. (See BlockTokenIdentifier#readFields(..) / write(..) ) Could you give more details? I'm not sure I follow the idea here. How can we avoid changing readFields/writeFields while still adding a new field? > blockToken is not set in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil > -- > > Key: HDFS-8433 > URL: https://issues.apache.org/jira/browse/HDFS-8433 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: Walter Su > Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, > HDFS-8433.01.patch > > > The blockToken provided in LocatedStripedBlock is not used to create > LocatedBlock in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil. > We should also add EC tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8776) Decom manager should not be active on standby
Daryn Sharp created HDFS-8776: - Summary: Decom manager should not be active on standby Key: HDFS-8776 URL: https://issues.apache.org/jira/browse/HDFS-8776 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The decommission manager should not be actively processing on the standby. The decomm manager goes through the costly computation of determining that every block on the node requires replication, yet doesn't queue any of them for replication - because it's in standby. While the decomm manager holds the namesystem write lock, DNs time out on heartbeats or IBRs; the NN purges the call queue of timed-out clients and processes some heartbeats/IBRs before the decomm manager locks up the namesystem again. Nodes attempting to register will be sending full BRs, which are more costly to send and discard than a heartbeat. If a failover is required, the standby will likely struggle very hard to avoid GC pauses while "catching up" on its queued IBRs as DNs continue to fill the call queue and time out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8775) SASL support for data transfer protocol in libhdfspp
Haohui Mai created HDFS-8775: Summary: SASL support for data transfer protocol in libhdfspp Key: HDFS-8775 URL: https://issues.apache.org/jira/browse/HDFS-8775 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai This jira proposes to implement basic SASL support for the data transfer protocol which allows libhdfspp to talk to secure clusters. Support for encryption is deferred to subsequent jiras. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8734) Erasure Coding: fix one cell need two packets
[ https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8734: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) I've committed this to the feature branch. Thanks for the contribution, Walter! Thanks for the review, Bo! > Erasure Coding: fix one cell need two packets > - > > Key: HDFS-8734 > URL: https://issues.apache.org/jira/browse/HDFS-8734 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Walter Su > Fix For: HDFS-7285 > > Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch > > > The default WritePacketSize is 64k. > Currently the default cellSize is also 64k. > We hope one cell consumes exactly one packet, but in fact it doesn't. > By default, > chunkSize = 516 (512 data + 4 checksum) > packetSize = 64k > chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for > details) > numBytes of data in one packet = 64512 > cellSize = 65536 > When the first packet is full (with 64512 bytes of data), there are still 65536 - 64512 = > 1024 bytes of the cell left. > {code} > super.writeChunk(bytes, offset, len, checksum, ckoff, cklen); > // cell is full and current packet has not been enqueued, > if (cellFull && currentPacket != null) { > enqueueCurrentPacketFull(); > } > {code} > When the last 1024 bytes of the cell are written, we hit {{cellFull}} and > create another packet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8734) Erasure Coding: fix one cell need two packets
[ https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626973#comment-14626973 ] Jing Zhao commented on HDFS-8734: - The patch looks good to me. +1. One concern is that we now have more and more variables tracking the state of the streamers. We can revisit them later and maybe do some code cleanup. > Erasure Coding: fix one cell need two packets > - > > Key: HDFS-8734 > URL: https://issues.apache.org/jira/browse/HDFS-8734 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch > > > The default WritePacketSize is 64k. > Currently the default cellSize is also 64k. > We hope one cell consumes exactly one packet, but in fact it doesn't. > By default, > chunkSize = 516 (512 data + 4 checksum) > packetSize = 64k > chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for > details) > numBytes of data in one packet = 64512 > cellSize = 65536 > When the first packet is full (with 64512 bytes of data), there are still 65536 - 64512 = > 1024 bytes of the cell left. > {code} > super.writeChunk(bytes, offset, len, checksum, ckoff, cklen); > // cell is full and current packet has not been enqueued, > if (cellFull && currentPacket != null) { > enqueueCurrentPacketFull(); > } > {code} > When the last 1024 bytes of the cell are written, we hit {{cellFull}} and > create another packet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
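The packet/cell arithmetic in the HDFS-8734 description above can be checked with a short sketch. The 33-byte packet header length is an assumption for illustration (the real value comes from the client's PacketHeader constant); everything else follows the numbers quoted in the issue.

```java
// Sketch of the DFSOutputStream#computePacketChunkSize arithmetic that
// explains why one 64k cell spills into a second packet.
public class PacketCellMath {
    public static void main(String[] args) {
        final int bytesPerChecksum = 512;      // data bytes covered by one CRC
        final int checksumSize = 4;            // a CRC32 checksum is 4 bytes
        final int chunkSize = bytesPerChecksum + checksumSize;   // 516
        final int writePacketSize = 64 * 1024; // default write packet size
        final int packetHeaderLen = 33;        // ASSUMED max header length
        final int cellSize = 64 * 1024;        // default EC cell size

        int bodySize = writePacketSize - packetHeaderLen;            // 65503
        int chunksPerPacket = Math.max(bodySize / chunkSize, 1);     // 126
        int dataBytesPerPacket = chunksPerPacket * bytesPerChecksum; // 64512
        int leftoverInCell = cellSize - dataBytesPerPacket;          // 1024

        System.out.println(chunksPerPacket + " chunks/packet, "
            + dataBytesPerPacket + " data bytes/packet, "
            + leftoverInCell + " bytes of the cell spill into a second packet");
    }
}
```

The 1024 leftover bytes are exactly the tail that triggers {{cellFull}} with a fresh, nearly empty packet, which is what the patch avoids enqueueing separately.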
[jira] [Created] (HDFS-8774) Implement FileSystem and InputStream API for libhdfspp
Haohui Mai created HDFS-8774: Summary: Implement FileSystem and InputStream API for libhdfspp Key: HDFS-8774 URL: https://issues.apache.org/jira/browse/HDFS-8774 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: HDFS-8707 This jira proposes to implement FileSystem and InputStream APIs for libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626969#comment-14626969 ] Arun Suresh commented on HDFS-7858: --- [~arpitagarwal], apologies for sitting on this... I was trying to refactor this per [~jingzhao]'s suggestion (replacing RetryInvocationHandler with RequestHedgingInvocationHandler). Unfortunately, it turned out to have a far wider impact (technically, request hedging is different from retry, so the whole policy framework etc. would need to be refactored). If everyone is OK with the current approach, we can punt the larger refactoring to another JIRA; I can incorporate [~arpitagarwal]'s suggestion (skip the standby for subsequent requests) and provide a quick patch. > Improve HA Namenode Failover detection on the client > > > Key: HDFS-7858 > URL: https://issues.apache.org/jira/browse/HDFS-7858 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, > HDFS-7858.3.patch > > > In an HA deployment, clients are configured with the hostnames of both the > Active and Standby Namenodes. Clients will first try one of the NNs > (non-deterministically), and if it is the standby NN, it will respond to the > client to retry the request on the other Namenode. > If the client happens to talk to the Standby first, and the standby is > undergoing a GC pause / is busy, then those clients might not get a response > soon enough to try the other NN. > Proposed approach to solve this: > 1) Since ZooKeeper is already used as the failover controller, the clients > could talk to ZK and find out which is the active namenode before contacting > it. 
> 2) Long-lived DFSClients would have a ZK watch configured which fires when > there is a failover, so they do not have to query ZK every time to find out the > active NN. > 3) Clients can also cache the last active NN in the user's home directory > (~/.lastNN) so that short-lived clients can try that Namenode first before > querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626963#comment-14626963 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12702886/HDFS-7858.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 979c9ca | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11704/console | This message was automatically generated. > Improve HA Namenode Failover detection on the client > > > Key: HDFS-7858 > URL: https://issues.apache.org/jira/browse/HDFS-7858 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, > HDFS-7858.3.patch > > > In an HA deployment, clients are configured with the hostnames of both the > Active and Standby Namenodes. Clients will first try one of the NNs > (non-deterministically), and if it is the standby NN, it will respond to the > client to retry the request on the other Namenode. > If the client happens to talk to the Standby first, and the standby is > undergoing a GC pause / is busy, then those clients might not get a response > soon enough to try the other NN. > Proposed approach to solve this: > 1) Since ZooKeeper is already used as the failover controller, the clients > could talk to ZK and find out which is the active namenode before contacting > it. 
> 2) Long-lived DFSClients would have a ZK watch configured which fires when > there is a failover, so they do not have to query ZK every time to find out the > active NN. > 3) Clients can also cache the last active NN in the user's home directory > (~/.lastNN) so that short-lived clients can try that Namenode first before > querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
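The last point in the HDFS-7858 proposal above, caching the last active NN in ~/.lastNN so short-lived clients can try it first, could look something like the sketch below. The class name and file handling are illustrative assumptions, not code from any attached patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: a best-effort cache of the last known active NN address,
// used as a hint before falling back to a ZooKeeper lookup.
public class LastActiveNNCache {
    private final Path cacheFile;

    public LastActiveNNCache(Path cacheFile) {
        this.cacheFile = cacheFile;
    }

    public static LastActiveNNCache inHomeDir() {
        return new LastActiveNNCache(
            Paths.get(System.getProperty("user.home"), ".lastNN"));
    }

    /** Returns the cached NN address, or null if nothing is cached yet. */
    public String readLastActive() {
        try {
            return new String(Files.readAllBytes(cacheFile)).trim();
        } catch (IOException e) {
            return null;  // no cache; the caller falls back to querying ZK
        }
    }

    /** Best-effort write; failures are ignored since this is only a hint. */
    public void recordActive(String nnAddress) {
        try {
            Files.write(cacheFile, nnAddress.getBytes());
        } catch (IOException ignored) {
        }
    }

    public static void main(String[] args) throws IOException {
        // Use a temp file here so the demo does not touch the real home dir.
        LastActiveNNCache cache =
            new LastActiveNNCache(Files.createTempFile("lastNN", ""));
        cache.recordActive("nn1.example.com:8020");
        System.out.println(cache.readLastActive());
    }
}
```

The cached entry is only a hint: if the recorded NN turns out to be standby, the client still retries the other NN or queries ZK, so a stale cache costs at most one wasted attempt.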
[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626957#comment-14626957 ] Hudson commented on HDFS-8742: -- FAILURE: Integrated in Hadoop-trunk-Commit #8164 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8164/]) HDFS-8742. Inotify: Support event for OP_TRUNCATE. Contributed by Surendra Singh Lilhore. (aajisaka: rev 979c9ca2ca89e99dc7165abfa29c78d66de43d9a) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/inotify.proto * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/InotifyFSEditLogOpTranslator.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/inotify/Event.java > Inotify: Support event for OP_TRUNCATE > -- > > Key: HDFS-8742 > URL: https://issues.apache.org/jira/browse/HDFS-8742 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Fix For: 2.8.0 > > Attachments: HDFS-8742-001.patch, HDFS-8742.patch > > > Currently inotify is not giving any event for Truncate operation. NN should > send event for "Truncate". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626956#comment-14626956 ] Jing Zhao commented on HDFS-8058: - Here {{fileInPb.getIsStriped()}} should be {{file.isStriped}} since we have not persisted the isStriped information into FileDiff. Or we can move "setIsStriped(n.isStriped())" into {{buildINodeFile}}. Maybe the latter way is cleaner. Let's also create a separate jira to add more tests for the (EC + snapshot) scenario. {code} - (byte)fileInPb.getStoragePolicyID(), xAttrs); + (byte)fileInPb.getStoragePolicyID(), xAttrs, fileInPb.getIsStriped()); {code} Other than this, +1 if Jenkins runs fine. > Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. > Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped > blocks, the methods there duplicate those in INodeFile, and the current > code needs to check {{isStriped}} and then do different things. Also, if a file is > striped, the {{blocks}} field in INodeFile still occupies the memory of a reference. > These are not necessary, and we can use the same {{blocks}} to make the code > clearer. 
> I keep {{FileWithStripedBlocksFeature}} empty for follow-up use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and storing them in each block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8764) Generate Hadoop RPC stubs from protobuf definitions
[ https://issues.apache.org/jira/browse/HDFS-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8764: - Attachment: HDFS-8764.000.patch > Generate Hadoop RPC stubs from protobuf definitions > --- > > Key: HDFS-8764 > URL: https://issues.apache.org/jira/browse/HDFS-8764 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8764.000.patch > > > It would be nice to have the RPC stubs generated from the protobuf > definitions, similar to what HADOOP-10388 achieved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626950#comment-14626950 ] Arpit Agarwal commented on HDFS-7858: - Hi [~asuresh], were you thinking of posting an updated patch? The overall approach looks good. One comment from a quick look - RequestHedgingProxyProvider sends all requests to both NNs. Should it skip the standby for subsequent requests? > Improve HA Namenode Failover detection on the client > > > Key: HDFS-7858 > URL: https://issues.apache.org/jira/browse/HDFS-7858 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, > HDFS-7858.3.patch > > > In an HA deployment, clients are configured with the hostnames of both the > Active and Standby Namenodes. Clients will first try one of the NNs > (non-deterministically), and if it is the standby NN, it will respond to the > client to retry the request on the other Namenode. > If the client happens to talk to the Standby first, and the standby is > undergoing a GC pause / is busy, then those clients might not get a response > soon enough to try the other NN. > Proposed approach to solve this: > 1) Since ZooKeeper is already used as the failover controller, the clients > could talk to ZK and find out which is the active namenode before contacting > it. > 2) Long-lived DFSClients would have a ZK watch configured which fires when > there is a failover, so they do not have to query ZK every time to find out the > active NN. > 3) Clients can also cache the last active NN in the user's home directory > (~/.lastNN) so that short-lived clients can try that Namenode first before > querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8758) Implement the continuation library for libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-8758. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-8707 Target Version/s: HDFS-8707 Committed to HDFS-8707. Thanks Jing for reviews. > Implement the continuation library for libhdfspp > > > Key: HDFS-8758 > URL: https://issues.apache.org/jira/browse/HDFS-8758 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: HDFS-8707 > > Attachments: HDFS-8758.000.patch > > > libhdfspp uses continuations as basic building blocks to implement > asynchronous operations. This jira imports the continuation library into the > repository. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-8742: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks [~surendrasingh] for the contribution! > Inotify: Support event for OP_TRUNCATE > -- > > Key: HDFS-8742 > URL: https://issues.apache.org/jira/browse/HDFS-8742 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Fix For: 2.8.0 > > Attachments: HDFS-8742-001.patch, HDFS-8742.patch > > > Currently inotify is not giving any event for Truncate operation. NN should > send event for "Truncate". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8759) Implement remote block reader in libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8759: - Attachment: HDFS-8759.001.patch > Implement remote block reader in libhdfspp > -- > > Key: HDFS-8759 > URL: https://issues.apache.org/jira/browse/HDFS-8759 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch > > > This jira tracks the effort of implementing the remote block reader that > communicates with DN in libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626921#comment-14626921 ] Akira AJISAKA commented on HDFS-8742: - +1, the test failure looks unrelated to the patch. I confirmed the test passed locally. > Inotify: Support event for OP_TRUNCATE > -- > > Key: HDFS-8742 > URL: https://issues.apache.org/jira/browse/HDFS-8742 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Attachments: HDFS-8742-001.patch, HDFS-8742.patch > > > Currently inotify is not giving any event for Truncate operation. NN should > send event for "Truncate". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626895#comment-14626895 ] Hudson commented on HDFS-8722: -- FAILURE: Integrated in Hadoop-trunk-Commit #8163 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8163/]) HDFS-8722. Optimize datanode writes for small writes and flushes. Contributed by Kihwal Lee (kihwal: rev 59388a801514d6af64ef27fbf246d8054f1dcc74) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Optimize datanode writes for small writes and flushes > - > > Key: HDFS-8722 > URL: https://issues.apache.org/jira/browse/HDFS-8722 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 2.7.2 > > Attachments: HDFS-8722.patch, HDFS-8722.v1.patch > > > After the data corruption fix by HDFS-4660, the CRC recalculation for partial > chunk is executed more frequently, if the client repeats writing few bytes > and calling hflush/hsync. This is because the generic logic forces CRC > recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, > datanode blindly accepted whatever CRC client provided, if the incoming data > is chunk-aligned. This was the source of the corruption. > We can still optimize for the most common case where a client is repeatedly > writing small number of bytes followed by hflush/hsync with no pipeline > recovery or append, by allowing the previous behavior for this specific case. > If the incoming data has a duplicate portion and that is at the last > chunk-boundary before the partial chunk on disk, datanode can use the > checksum supplied by the client without redoing the checksum on its own. > This reduces disk reads as well as CPU load for the checksum calculation. 
> If the incoming packet data goes back further than the last on-disk chunk > boundary, datanode will still do a recalculation, but this occurs rarely > during pipeline recoveries. Thus the optimization for this specific case > should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8722: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.2 Status: Resolved (was: Patch Available) > Optimize datanode writes for small writes and flushes > - > > Key: HDFS-8722 > URL: https://issues.apache.org/jira/browse/HDFS-8722 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 2.7.2 > > Attachments: HDFS-8722.patch, HDFS-8722.v1.patch > > > After the data corruption fix by HDFS-4660, the CRC recalculation for partial > chunk is executed more frequently, if the client repeats writing few bytes > and calling hflush/hsync. This is because the generic logic forces CRC > recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, > datanode blindly accepted whatever CRC client provided, if the incoming data > is chunk-aligned. This was the source of the corruption. > We can still optimize for the most common case where a client is repeatedly > writing small number of bytes followed by hflush/hsync with no pipeline > recovery or append, by allowing the previous behavior for this specific case. > If the incoming data has a duplicate portion and that is at the last > chunk-boundary before the partial chunk on disk, datanode can use the > checksum supplied by the client without redoing the checksum on its own. > This reduces disk reads as well as CPU load for the checksum calculation. > If the incoming packet data goes back further than the last on-disk chunk > boundary, datanode will still do a recalculation, but this occurs rarely > during pipeline recoveries. Thus the optimization for this specific case > should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626888#comment-14626888 ] Kihwal Lee commented on HDFS-8722: -- Thanks for the review, Arpit. I've committed this to trunk, branch-2 and branch-2.7. > Optimize datanode writes for small writes and flushes > - > > Key: HDFS-8722 > URL: https://issues.apache.org/jira/browse/HDFS-8722 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 2.7.2 > > Attachments: HDFS-8722.patch, HDFS-8722.v1.patch > > > After the data corruption fix by HDFS-4660, the CRC recalculation for partial > chunk is executed more frequently, if the client repeats writing few bytes > and calling hflush/hsync. This is because the generic logic forces CRC > recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, > datanode blindly accepted whatever CRC client provided, if the incoming data > is chunk-aligned. This was the source of the corruption. > We can still optimize for the most common case where a client is repeatedly > writing small number of bytes followed by hflush/hsync with no pipeline > recovery or append, by allowing the previous behavior for this specific case. > If the incoming data has a duplicate portion and that is at the last > chunk-boundary before the partial chunk on disk, datanode can use the > checksum supplied by the client without redoing the checksum on its own. > This reduces disk reads as well as CPU load for the checksum calculation. > If the incoming packet data goes back further than the last on-disk chunk > boundary, datanode will still do a recalculation, but this occurs rarely > during pipeline recoveries. Thus the optimization for this specific case > should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
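The reuse condition described in the HDFS-8722 messages above — the datanode may trust the client-supplied checksum only when the packet's duplicate region begins exactly at the last chunk boundary before the on-disk partial chunk — can be sketched as follows. This is a simplified stand-in with hypothetical method names, not the actual BlockReceiver code from the patch.

```java
// Simplified stand-in for the decision HDFS-8722 describes.
public class ChecksumReuseSketch {
    /**
     * The on-disk replica ends mid-chunk at onDiskLen. After an hflush/hsync,
     * the next packet retransmits the partial chunk's bytes. Reusing the
     * client's checksum is safe only when that duplicate region starts exactly
     * at the last chunk boundary before the on-disk partial chunk; if the
     * packet reaches back further (e.g. pipeline recovery), the datanode must
     * recalculate the CRC from the on-disk data.
     */
    static boolean canReuseClientChecksum(long onDiskLen, long packetStartOffset,
                                          int bytesPerChecksum) {
        long lastChunkBoundary = (onDiskLen / bytesPerChecksum) * bytesPerChecksum;
        return packetStartOffset == lastChunkBoundary;
    }

    public static void main(String[] args) {
        // 700 bytes on disk, 512-byte chunks: the last boundary is offset 512.
        System.out.println(canReuseClientChecksum(700, 512, 512)); // reuse OK
        System.out.println(canReuseClientChecksum(700, 256, 512)); // recalculate
    }
}
```

In the common small-write-plus-hflush pattern, every packet starts at that boundary, so the optimization applies on almost every flush and avoids both the disk read-back and the CPU cost of recomputing the CRC.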
[jira] [Updated] (HDFS-8716) introduce a new config specifically for safe mode block count
[ https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-8716: --- Attachment: HDFS-8716.7.patch add a unit test > introduce a new config specifically for safe mode block count > - > > Key: HDFS-8716 > URL: https://issues.apache.org/jira/browse/HDFS-8716 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, > HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch > > > During the start up, namenode waits for n replicas of each block to be > reported by datanodes before exiting the safe mode. Currently n is tied to > the min replicas config. We could set min replicas to more than one but we > might want to exit safe mode as soon as each block has one replica reported. > This can be worked out by introducing a new config variable for safe mode > block count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
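For readers following along, the proposal amounts to one new knob next to {{dfs.namenode.replication.min}}. A sketch of the resulting hdfs-site.xml entry — the property name is taken from the patch discussion and should be treated as an assumption until the change lands:

```xml
<!-- Hypothetical sketch (HDFS-8716): exit safe mode once each block has this
     many reported replicas, decoupled from dfs.namenode.replication.min.
     Property name assumed from the patch under review. -->
<property>
  <name>dfs.namenode.safemode.replication.min</name>
  <value>1</value>
</property>
```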
[jira] [Commented] (HDFS-8763) Incremental blockreport order may replicate unnecessary block
[ https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626839#comment-14626839 ] Konstantin Shvachko commented on HDFS-8763: --- Blocks should not be scheduled for replication while they are UNDER_CONSTRUCTION. Could you check whether that is the case? If so, it is a bug. If not, then your DNs seem to be quite busy because of too many small blocks. - You can try increasing {{dfs.namenode.replication.min}} to 2 (the default is 1). That way the NN should wait for 2 replicas to be reported by DNs before scheduling replication. - Also, as you know, HDFS does not "like" small files, where small means < 128MB. You may try to encourage your users to combine data into larger blobs. > Incremental blockreport order may replicate unnecessary block > - > > Key: HDFS-8763 > URL: https://issues.apache.org/jira/browse/HDFS-8763 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.4.0 >Reporter: jiangyu >Priority: Minor > > For our cluster, the NameNode is always very busy, so for every incremental > block report the lock contention is heavy. > The logic of an incremental block report is as follows: the client sends a block to dn1, > dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After finishing this block, all > datanodes report the newly received block to the namenode. On the NameNode side, > these reports all go to the method processIncrementalBlockReport in the BlockManager > class. But the status of the block reported from dn2 and dn3 is RECEIVING_BLOCK, > while for dn1 it is RECEIVED_BLOCK. It is fine if dn2 and dn3 report before dn1 (that is > common), but in a busy environment it is easy for dn1 to report before > dn2 or dn3; let's assume dn2 reports first, dn1 second, and dn3 third.
> So dn1's report will go through addStoredBlock, which finds that the replica count of this block has not reached > the original number (which is 3), so the block is added to > neededReplications, and soon some node in the pipeline (dn1 or > dn2) is asked to replicate it to dn4. After some time, dn4 and dn3 both report this block, > and then one node is chosen to invalidate. > Here is one log I found in our cluster: > 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp. > BP-1386326728-xx.xx.2.131-1382089338395 > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]} > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.75:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.4.62:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask
xx.xx.4.62:50010 to > replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010 > 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add > blk_3194502674_2121080184 to xx.xx.7.75:50010 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: > (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks > set > 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, > blk_3194497594_2121075104] > Some day, the number of this situation can be 40, that is not good for > the performance and wa
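The ordering race in this report can be modeled in a few lines (a toy model, not NameNode code; names like {{needsReplication}} are illustrative, and counting only RECEIVED replicas as live is the relevant simplification):

```java
// Toy model of the HDFS-8763 race: replica states as seen by the NN after
// the incremental reports arrive in the order dn2(RECEIVING), dn1(RECEIVED),
// dn3(RECEIVING). Only RECEIVED replicas are counted as live.
import java.util.HashMap;
import java.util.Map;

public class ReportOrderSketch {
    enum State { RECEIVING, RECEIVED }

    static boolean needsReplication(Map<String, State> reported, int expected) {
        long live = reported.values().stream()
                .filter(s -> s == State.RECEIVED).count();
        return live < expected; // NN would queue the block in neededReplications
    }

    public static void main(String[] args) {
        Map<String, State> reported = new HashMap<>();
        reported.put("dn2", State.RECEIVING);
        reported.put("dn1", State.RECEIVED);
        reported.put("dn3", State.RECEIVING);
        // dn1 reported RECEIVED before dn2/dn3, so only 1 of 3 replicas looks
        // live and an unnecessary replication (to dn4) gets scheduled.
        System.out.println(needsReplication(reported, 3)); // true
    }
}
```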
[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8058: Attachment: HDFS-8058-HDFS-7285.008.patch Thanks Jing for reviewing again! Uploading new patch to address the 2 issues. When creating a snapshot copy in {{FSImageFormat}}, {{isStriped}} is set to false. IIRC we don't support EC on non-PB images. Let me know if that's correct. > Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. > Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped > blocks, and the methods there duplicate with those in INodeFile, and current > code need to judge {{isStriped}} then do different things. Also if file is > striped, the {{blocks}} in INodeFile occupy a reference memory space. > These are not necessary, and we can use the same {{blocks}} to make code more > clear. > I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626810#comment-14626810 ] Jing Zhao commented on HDFS-8058: - The 007 patch looks good to me. Just two minors: # In {{INodeFileAttributes}}, it's better to keep the {{isStriped}} attribute of the snapshot copy the same with the INodeFile. Thus maybe we can add a boolean parameter to {{INodeFileAttributes.SnapshotCopy}}'s constructor, and in {{FSImageFormatPBSnapshot#loadFileDiffList}} we pass in {{file.isStriped}}. {code} - header = HeaderFormat.toLong(preferredBlockSize, replication, + header = HeaderFormat.toLong(preferredBlockSize, replication, false, storagePolicyID); {code} # Looks like we do not need to add a new constructor for INodeFile. > Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. > Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped > blocks, and the methods there duplicate with those in INodeFile, and current > code need to judge {{isStriped}} then do different things. Also if file is > striped, the {{blocks}} in INodeFile occupy a reference memory space. > These are not necessary, and we can use the same {{blocks}} to make code more > clear. 
> I keep {{FileWithStripedBlocksFeature}} empty for follow-up use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and storing them in every block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7608) hdfs dfsclient newConnectedPeer has no write timeout
[ https://issues.apache.org/jira/browse/HDFS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626783#comment-14626783 ] Hudson commented on HDFS-7608: -- FAILURE: Integrated in Hadoop-trunk-Commit #8162 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8162/]) HDFS-7608: hdfs dfsclient newConnectedPeer has no write timeout (Xiaoyu Yao via Colin P. McCabe) (cmccabe: rev 1d74ccececaefffaa90c0c18b40a3645dbc819d9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java HDFS-7608: add CHANGES.txt (cmccabe: rev b7fb6ec4513de7d342c541eb3d9e14642286e2cf) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > hdfs dfsclient newConnectedPeer has no write timeout > - > > Key: HDFS-7608 > URL: https://issues.apache.org/jira/browse/HDFS-7608 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, hdfs-client >Affects Versions: 2.3.0, 2.6.0 > Environment: hdfs 2.3.0 hbase 0.98.6 >Reporter: zhangshilong >Assignee: Xiaoyu Yao > Fix For: 2.8.0 > > Attachments: HDFS-7608.0.patch, HDFS-7608.1.patch, HDFS-7608.2.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > problem: > HBase's compactSplitThread may lock forever when reading datanode blocks. > debugging found: the epoll_wait timeout is set to 0, so epoll_wait never times out. > cause: in HDFS 2.3.0, > HBase uses DFSClient to read and write blocks. > DFSClient creates a socket using newConnectedPeer(addr), but sets no read > or write timeout. > In 2.6.0, newConnectedPeer added a read timeout to deal with the > problem, but did not add a write timeout. Why was a write timeout not added? > I think NioInetPeer needs a default socket timeout, so applications will not > need to add the timeout themselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7608) hdfs dfsclient newConnectedPeer has no write timeout
[ https://issues.apache.org/jira/browse/HDFS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7608: --- Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) committed to 2.8, thanks all > hdfs dfsclient newConnectedPeer has no write timeout > - > > Key: HDFS-7608 > URL: https://issues.apache.org/jira/browse/HDFS-7608 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, hdfs-client >Affects Versions: 2.3.0, 2.6.0 > Environment: hdfs 2.3.0 hbase 0.98.6 >Reporter: zhangshilong >Assignee: Xiaoyu Yao > Fix For: 2.8.0 > > Attachments: HDFS-7608.0.patch, HDFS-7608.1.patch, HDFS-7608.2.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > problem: > HBase's compactSplitThread may lock forever when reading datanode blocks. > debugging found: the epoll_wait timeout is set to 0, so epoll_wait never times out. > cause: in HDFS 2.3.0, > HBase uses DFSClient to read and write blocks. > DFSClient creates a socket using newConnectedPeer(addr), but sets no read > or write timeout. > In 2.6.0, newConnectedPeer added a read timeout to deal with the > problem, but did not add a write timeout. Why was a write timeout not added? > I think NioInetPeer needs a default socket timeout, so applications will not > need to add the timeout themselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
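For context on why the fix needs Hadoop's own stream wrapper rather than a plain socket option, here is a minimal sketch; the remarks about {{SocketOutputStream}} and {{Peer#setWriteTimeout}} reflect my reading of the commit and should be treated as assumptions:

```java
// Sketch of the underlying gap: java.net.Socket only exposes a read timeout
// (SO_TIMEOUT); a blocking write() can stall forever on a hung peer. Hadoop
// therefore routes writes through org.apache.hadoop.net.SocketOutputStream,
// which wraps a non-blocking channel and enforces a timeout per write — the
// patch is understood to set that via the Peer created in newConnectedPeer.
import java.net.Socket;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        Socket s = new Socket();
        s.setSoTimeout(60_000); // read timeout only: reads fail after 60s
        System.out.println(s.getSoTimeout()); // prints 60000
        // Note there is no s.setWriteTimeout(...) in java.net.Socket; that is
        // exactly why the write timeout must be added at the Peer level.
        s.close();
    }
}
```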
[jira] [Commented] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX
[ https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626770#comment-14626770 ] Andrew Wang commented on HDFS-8694: --- Hi Eddy, overall this patch looks great, thanks for working on it. Some review comments: High-level: * I have a hard time understanding when we should handle the disk error vs. just let it bubble up; since it bubbles up, there seems to be a danger of handling the same root IOE more than once. What's the methodology here? Is it possible to move handling to the top level somewhere? I can manually examine all the current callsites and callers, but that's not very future-proof. * Related to the above, our unit tests do not cover anywhere close to all the locations that handle an IOError. Adding tests for all of these would be onerous, so I'm very interested in a solution to the above. * Since we now have the volume as context, we should really move the disk checker to be per-volume rather than DN-wide. One volume throwing an error is no reason to check all of them. This can be deferred to a follow-up; I think it's a slam dunk. Nits: * FsDatasetImpl#moveBlockAcrossStorage, can we get rid of volume and move targetVolume's declaration outside of the try? Looks equal to volume. * In places like BlockSender#close we could actually hit an IOE multiple times but only increment once. Thoughts? * Extra debug print in TestDataTransferKeepalive#testSlowReader * Linebreak here is kind of awkward, move the end parens or the try up? {code} try (ReplicaHandler replica = dataset.append( block, block.getGenerationStamp() + 1, block.getNumBytes()) ) { {code} * TestMover adds a test timeout, looks unrelated to this patch? 
> Expose the stats of IOErrors on each FsVolume through JMX > - > > Key: HDFS-8694 > URL: https://issues.apache.org/jira/browse/HDFS-8694 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, HDFS >Affects Versions: 2.7.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-8694.000.patch, HDFS-8694.001.patch > > > Currently, once DataNode hits an {{IOError}} when writing / reading block > files, it starts a background {{DiskChecker.checkDirs()}} thread. But if this > thread successfully finishes, DN does not record this {{IOError}}. > We need one measurement to count all {{IOErrors}} for each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8702: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) I've committed this to the feature branch. Thanks for the contribution, [~kaisasak]! Thanks for the review, [~walter.k.su]! > Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped > block > --- > > Key: HDFS-8702 > URL: https://issues.apache.org/jira/browse/HDFS-8702 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Kai Sasaki > Fix For: HDFS-7285 > > Attachments: HDFS-8702-HDFS-7285.00.patch, > HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, > HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch > > > Currently blockHasEnoughRacks(..) only guarantees 2 racks. Logic needs > updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626748#comment-14626748 ] Jing Zhao commented on HDFS-8702: - bq. I think we can use getRealDataBlockNum(); Hmm, you're right. The parameter is the expected storage number, not the rack number. +1 for the latest patch. I will commit it shortly. > Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped > block > --- > > Key: HDFS-8702 > URL: https://issues.apache.org/jira/browse/HDFS-8702 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Kai Sasaki > Attachments: HDFS-8702-HDFS-7285.00.patch, > HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, > HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch > > > Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be > updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626709#comment-14626709 ] Arpit Agarwal commented on HDFS-8722: - +1 for the patch. Verified that it brings unaligned writes on par with 2.6.0. > Optimize datanode writes for small writes and flushes > - > > Key: HDFS-8722 > URL: https://issues.apache.org/jira/browse/HDFS-8722 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Attachments: HDFS-8722.patch, HDFS-8722.v1.patch > > > After the data corruption fix by HDFS-4660, the CRC recalculation for partial > chunk is executed more frequently, if the client repeats writing few bytes > and calling hflush/hsync. This is because the generic logic forces CRC > recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, > datanode blindly accepted whatever CRC client provided, if the incoming data > is chunk-aligned. This was the source of the corruption. > We can still optimize for the most common case where a client is repeatedly > writing small number of bytes followed by hflush/hsync with no pipeline > recovery or append, by allowing the previous behavior for this specific case. > If the incoming data has a duplicate portion and that is at the last > chunk-boundary before the partial chunk on disk, datanode can use the > checksum supplied by the client without redoing the checksum on its own. > This reduces disk reads as well as CPU load for the checksum calculation. > If the incoming packet data goes back further than the last on-disk chunk > boundary, datanode will still do a recalculation, but this occurs rarely > during pipeline recoveries. Thus the optimization for this specific case > should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
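The boundary condition in the description can be made concrete with a toy model (plain Java, default 512-byte checksum chunks; the method and constant names are illustrative, not the actual BlockReceiver code): the datanode may reuse the client's checksum only when the incoming data does not reach back before the last on-disk chunk boundary.

```java
// Sketch of the HDFS-8722 decision: given the bytes already on disk and the
// block offset where the incoming packet starts, does the datanode need to
// recalculate the CRC of the partial chunk, or can it trust the client's?
public class PartialChunkCrcCheck {
    static final int BYTES_PER_CHUNK = 512; // default dfs.bytes-per-checksum

    static boolean needsCrcRecalculation(long onDiskLen, long packetOffset) {
        // Last chunk-aligned boundary at or below the on-disk length.
        long lastChunkBoundary = (onDiskLen / BYTES_PER_CHUNK) * BYTES_PER_CHUNK;
        // If the (possibly duplicate) packet data starts at or after that
        // boundary — the common hflush/hsync rewrite of the partial chunk —
        // the client's checksum of that chunk can be reused. Only data that
        // reaches back before the boundary (e.g. pipeline recovery or append)
        // forces a recalculation.
        return packetOffset < lastChunkBoundary;
    }

    public static void main(String[] args) {
        // 600 bytes on disk -> last chunk boundary at 512
        System.out.println(needsCrcRecalculation(600, 512)); // false: reuse CRC
        System.out.println(needsCrcRecalculation(600, 0));   // true: recalculate
    }
}
```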
[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626713#comment-14626713 ] Steve Loughran commented on HDFS-8736: -- You will also need to guard against untrusted code trying to open a network port and talking to hadoop direct, and doing the same for webhdfs. Given a socket and sufficient code, I can talk to an HDFS filesystem. There is a well defined way to stop untrusted code talking to HDFS, it is called Kerberos. Yes, we all hate it. Yes, we all fear it, Yes, none of us understand it properly. But we know that it does lock things down so that not only are untrusted applications forbidden access, the caller gets the specific rights associated with the identity of the user making the operation. (There's also a little detail of that patch still being un-applicable, but that's a detail here). As I stated on the related MR JIRA, file an uber-JIRA where the whole aspect of running Hadoop (client?) in a sandbox can be discussed, rather than piece by piece patches which will probably get rejected on a case-by-case basis. > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626645#comment-14626645 ] Zhe Zhang commented on HDFS-8058: - Triggering Jenkins again. The last run generated a lot of "Class not found" errors. > Erasure coding: use BlockInfo[] for both striped and contiguous blocks in > INodeFile > --- > > Key: HDFS-8058 > URL: https://issues.apache.org/jira/browse/HDFS-8058 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Yi Liu >Assignee: Zhe Zhang > Attachments: HDFS-8058-HDFS-7285.003.patch, > HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, > HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, > HDFS-8058.001.patch, HDFS-8058.002.patch > > > This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous > blocks in INodeFile. > Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped > blocks, the methods there duplicate those in INodeFile, and the current code > needs to check {{isStriped}} and then do different things. Also, if a file is > striped, the {{blocks}} field in INodeFile still occupies the space of a reference. > These are not necessary, and we can use the same {{blocks}} to make the code > clearer. > I keep {{FileWithStripedBlocksFeature}} empty for follow-up use: I will file > a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from > *BlockInfoStriped* to INodeFile, since ideally they are the same for all > striped blocks in a file, and storing them in every block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626642#comment-14626642 ] Allen Wittenauer commented on HDFS-8344: +1 > NameNode doesn't recover lease for files with missing blocks > > > Key: HDFS-8344 > URL: https://issues.apache.org/jira/browse/HDFS-8344 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, > HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch > > > I found another\(?) instance in which the lease is not recovered. This is > easily reproducible on a pseudo-distributed single-node cluster. > # Before you start, it helps to set the following. This is not necessary, but it simply > reduces how long you have to wait > {code} > public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; > public static final long LEASE_HARDLIMIT_PERIOD = 2 * > LEASE_SOFTLIMIT_PERIOD; > {code} > # The client starts to write a file. (It could be less than 1 block, but it is hflushed, > so some of the data has landed on the datanodes.) (I'm copying the client code > I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) > # The client crashes. (I simulate this by kill -9 on the $(hadoop jar > TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".) > # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was > only 1.) > I believe the lease should be recovered and the block should be marked > missing. However, this is not happening; the lease is never recovered. > The effect of this bug for us was that nodes could not be decommissioned > cleanly. Although we knew that the client had crashed, the Namenode never > released the leases (even after restarting the Namenode) (even months > afterwards). 
There are actually several other cases too where we don't > consider what happens if ALL the datanodes die while the file is being > written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626633#comment-14626633 ] Allen Wittenauer commented on HDFS-8736: Trying to solve server security problems from the client side never works. bq. with the caveat that you'd need to also guard against users trying to instantiate the file system implementation directly using other permissions. ... which is nearly impossible. There's not a lot of work here to do exactly that: java -Dfs.hdfs.impl=myclass or java -Dfs.s3.impl=DistributedFileSystem or whatever Now what? > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626625#comment-14626625 ] Haohui Mai commented on HDFS-8767: -- It looks like a cleaner approach is to call {{list()}} only when the file is a directory. > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
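The suggestion above can be sketched outside Hadoop: {{java.io.File#list()}} returns null for anything that is not a directory (a FIFO included), so branching on {{isDirectory()}} first avoids the null. A hedged sketch with illustrative method names, not the actual RawLocalFileSystem patch:

```java
import java.io.File;

public class ListStatusSketch {
    // Sketch of the suggested fix: only call File.list() for directories;
    // a pipe or regular file lists as itself rather than producing null.
    static String[] listNames(File localf) {
        if (!localf.isDirectory()) {
            return new String[] { localf.getName() };
        }
        String[] names = localf.list(); // non-null for a readable directory
        return names == null ? new String[0] : names;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("pipe-sketch", ".tmp");
        f.deleteOnExit();
        // A non-directory yields a one-element listing, not null.
        System.out.println(listNames(f).length); // 1
    }
}
```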
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626596#comment-14626596 ] Walter Su commented on HDFS-8433: - I have another idea: pick up the {{BlockIdRange}} idea from the 01 patch again. This time we don't need a {{BlockIdRange}} class; we extend the fields of {{BlockTokenIdentifier}}. (See BlockTokenIdentifier#readFields(..) / write(..).) Just add a field {{IdRange}} with a default value of 0. I think the performance impact on contiguous blocks is small. And I think it also supports old DNs; an old DN just doesn't read the last field. I prefer to implement this against trunk, and just merge the 02 patch (the multiple-tokens method) into the feature branch to see how it works. Then we can decide whether it's worth picking up BlockIdRange. Hi, [~jingzhao], [~szetszwo]! Any idea? > blockToken is not set in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil > -- > > Key: HDFS-8433 > URL: https://issues.apache.org/jira/browse/HDFS-8433 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: Walter Su > Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, > HDFS-8433.01.patch > > > The blockToken provided in LocatedStripedBlock is not used to create > LocatedBlock in constructInternalBlock and parseStripedBlockGroup in > StripedBlockUtil. > We should also add ec tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
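The backward-compatibility argument above — an old reader simply never consumes the trailing field — can be checked with a toy serialization model (plain DataOutput/DataInput; the real {{BlockTokenIdentifier}} uses Writable variable-length encoding, so treat this as a simplified, assumption-laden sketch):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class TrailingFieldSketch {
    // "New" writer: legacy field(s) followed by the extra idRange field.
    static byte[] writeNew(long blockId, long idRange) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeLong(blockId);  // legacy field
        out.writeLong(idRange);  // new trailing field, default 0
        return bos.toByteArray();
    }

    // "Old" reader: stops after the legacy fields and never touches idRange.
    static long readOld(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return in.readLong(); // trailing bytes are simply left unread
    }

    public static void main(String[] args) throws IOException {
        byte[] serialized = writeNew(42L, 9L);
        System.out.println(readOld(serialized)); // 42: old reader unaffected
    }
}
```

This only works because the new field sits strictly after every legacy field; inserting it anywhere earlier would break old readers.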
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626500#comment-14626500 ] Hadoop QA commented on HDFS-8767: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 26s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 7s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 4s | Tests passed in hadoop-common. 
| | | | 61m 3s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745256/HDFS-8767-01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/console | This message was automatically generated. > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626490#comment-14626490 ] Hudson commented on HDFS-8541: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retry when failed to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626492#comment-14626492 ] Hudson commented on HDFS-8143: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover is not coming out in case of failed to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some blocks > migration failed. So hasRemaining never become false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
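The {code} snippet quoted in the HDFS-8143 description is the crux: because hasRemaining is OR-accumulated, a single persistently failing move keeps it true on every pass. A minimal, self-contained sketch (hypothetical names and retry bound, not the actual Mover/Dispatcher code) of the bounded-retry behavior the issue asks for:

```java
// Hypothetical sketch, not the actual Mover/Dispatcher code: it only
// illustrates why an OR-accumulated hasRemaining flag spins forever when
// some moves keep failing, and how a bounded retry counter ends the loop.
public class MoverRetrySketch {
    static final int MAX_RETRIES = 3; // assumed bound, in the spirit of HDFS-8143

    // Stands in for Dispatcher.waitForMoveCompletion(): reports true as
    // long as any scheduled move failed and would need re-dispatching.
    static boolean waitForMoveCompletion(boolean anyMoveFailed) {
        return anyMoveFailed;
    }

    /** Returns the number of retries consumed before the loop exited. */
    static int run(boolean movesAlwaysFail) {
        int retries = 0;
        boolean hasRemaining = true;
        while (hasRemaining) {
            hasRemaining = false;
            // the pattern quoted in the issue: once true, stays true this pass
            hasRemaining |= waitForMoveCompletion(movesAlwaysFail);
            if (hasRemaining && ++retries >= MAX_RETRIES) {
                return retries; // bounded exit instead of looping forever
            }
        }
        return retries;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // failing moves: hits the retry bound
        System.out.println(run(false)); // no failures: exits immediately
    }
}
```

Without the retry counter, run(true) would never return, which is exactly the "Mover is not coming out" symptom described above.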
[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626472#comment-14626472 ] Hadoop QA commented on HDFS-8578: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 57s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 1 new checkstyle issues (total was 597, now 592). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 1s | Tests failed in hadoop-hdfs. 
| | | | 204m 22s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745242/HDFS-8578-07.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/console | This message was automatically generated. > On upgrade, Datanode should process all storage/data dirs in parallel > - > > Key: HDFS-8578 > URL: https://issues.apache.org/jira/browse/HDFS-8578 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Raju Bairishetti >Assignee: Vinayakumar B >Priority: Critical > Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, > HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, > HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch > > > Right now, during upgrades datanode is processing all the storage dirs > sequentially. Assume it takes ~20 mins to process a single storage dir then > datanode which has ~10 disks will take around 3hours to come up. 
> *BlockPoolSliceStorage.java* > {code} >for (int idx = 0; idx < getNumStorageDirs(); idx++) { > doTransition(datanode, getStorageDir(idx), nsInfo, startOpt); > assert getCTime() == nsInfo.getCTime() > : "Data-node and name-node CTimes must be the same."; > } > {code} > It would save lots of time during major upgrades if datanode process all > storagedirs/disks parallelly. > Can we make datanode to process all storage dirs parallelly? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
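The sequential loop quoted in the HDFS-8578 description is the bottleneck: total upgrade time is the sum over all dirs. A hedged sketch of the parallel idea, with processDir standing in for doTransition() (illustrative names, not the actual BlockPoolSliceStorage API), where wall-clock time becomes roughly the slowest dir instead of the sum:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of processing storage dirs in parallel; not the
// actual datanode upgrade code.
public class ParallelDirsSketch {
    static String processDir(String dir) throws InterruptedException {
        Thread.sleep(10); // stands in for doTransition() on one dir
        return dir + ":done";
    }

    static List<String> processAll(List<String> dirs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(dirs.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String dir : dirs) {
                futures.add(pool.submit(() -> processDir(dir)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // get() surfaces any per-dir failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(List.of("data1", "data2", "data3")));
    }
}
```

Collecting every Future before returning keeps the original loop's property that a failure in any single dir aborts the whole transition, while still letting the healthy dirs proceed concurrently.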
[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626463#comment-14626463 ] Purvesh Patel commented on HDFS-8736: - There is a little confusion in the description of this issue. This patch is intended to prevent untrusted user code from accessing HDFS, not the local file system. It's written in such a way that it could potentially be used to block access to any type of FileSystem, with the caveat that you'd also need to guard, using other permissions, against users trying to instantiate the file system implementation directly. The additional permission also prevents users from getting access to instances of the HDFS FileSystem that were created when the user code was off-stack and that have pre-cached network connections. > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
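The check-before-cache ordering described in the HDFS-8736 discussion can be sketched as follows (all names here are illustrative, not the actual patch code; a tiny Gate interface stands in for the SecurityManager/AccessFileSystemPermission check): a denied caller can obtain neither a cached FileSystem instance nor a freshly created one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of checking a permission in a get()-style factory
// before consulting the instance cache; not the actual HDFS-8736 patch.
public class FsGateSketch {
    /** Stands in for the security-permission check in FileSystem#get. */
    interface Gate { void check(String scheme); }

    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Gate gate;

    FsGateSketch(Gate gate) { this.gate = gate; }

    Object get(String scheme) {
        gate.check(scheme); // deny before touching the cache, so pre-built
                            // instances with live connections never leak out
        return cache.computeIfAbsent(scheme, s -> new Object());
    }

    public static void main(String[] args) {
        FsGateSketch fs = new FsGateSketch(scheme -> {
            if ("file".equals(scheme)) throw new SecurityException("denied: " + scheme);
        });
        System.out.println(fs.get("hdfs") != null); // permitted scheme works
        try {
            fs.get("file");
            System.out.println("unexpected: no exception");
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Placing the check first is what closes the comment's "off-stack" concern: an instance created earlier under different stack permissions is never handed back to code that fails the check now.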
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626452#comment-14626452 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover is not coming out in case of failed to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some blocks > migration failed. So hasRemaining never become false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626450#comment-14626450 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retry when failed to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Summary: ability to deny access to HDFS filesystems (was: ability to deny access to different filesystems) > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Attachment: (was: Patch.pdf) > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Attachment: HDFS-8736-1.patch > ability to deny access to HDFS filesystems > -- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: HDFS-8736-1.patch > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626401#comment-14626401 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retry when failed to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626403#comment-14626403 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover is not coming out in case of failed to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some blocks > migration failed. So hasRemaining never become false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626397#comment-14626397 ] kanaka kumar avvaru commented on HDFS-8767: --- Updated patch with the test case on UNIX based systems as per [~ste...@apache.org] comment > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8767: -- Attachment: HDFS-8767-01.patch > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626387#comment-14626387 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover is not coming out in case of failed to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some blocks > migration failed. So hasRemaining never become false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626385#comment-14626385 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retry when failed to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626381#comment-14626381 ] kanaka kumar avvaru commented on HDFS-8622: --- Thanks for the update [~jagadesh.kiran], "-02.patch" looks good to me. +1 (Non-binding). [~ajisakaa], please give your view. > Implement GETCONTENTSUMMARY operation for WebImageViewer > --- > > Key: HDFS-8622 > URL: https://issues.apache.org/jira/browse/HDFS-8622 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, > HDFS-8622-02.patch > > > It would be better for administrators if {code} GETCONTENTSUMMARY {code} is > supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626312#comment-14626312 ] Hadoop QA commented on HDFS-8772: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 5m 40s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 28s | Tests failed in hadoop-hdfs. 
| | | | 180m 16s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745227/HDFS-8772.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / ac94ba3 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/console | This message was automatically generated. > fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails > > > Key: HDFS-8772 > URL: https://issues.apache.org/jira/browse/HDFS-8772 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-8772.01.patch > > > https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ 
> https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ > https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/ > {noformat} > java.lang.AssertionError: expected:<0> but was:<4> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
Rakesh R created HDFS-8773: -- Summary: Few FSNamesystem metrics are not documented in the Metrics page Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. Following are not documented: {code} MissingReplOneBlocks NumFilesUnderConstruction NumActiveClients HAState FSState {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8475) Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available
[ https://issues.apache.org/jira/browse/HDFS-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626288#comment-14626288 ] Vinod Valecha commented on HDFS-8475: - Hi Team, Could this be a configuration issue for Hadoop? Can you please point us to the configuration that we should be looking at in order to correct this? Thanks a lot! > Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no > length prefix available > > > Key: HDFS-8475 > URL: https://issues.apache.org/jira/browse/HDFS-8475 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Vinod Valecha >Priority: Blocker > > Scenario: > = > write a file > corrupt block manually > Exception stack trace- > 2015-05-24 02:31:55.291 INFO [T-33716795] > [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Exception in > createBlockOutputStream > java.io.EOFException: Premature EOF: no length prefix available > at > org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514) > [5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer createBlockOutputStream > Exception in createBlockOutputStream > java.io.EOFException: Premature EOF: no > length prefix available > at > org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514) > 2015-05-24 02:31:55.291 INFO
[T-33716795] > [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Abandoning > BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579 > [5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream > Abandoning BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579 > 2015-05-24 02:31:55.299 INFO [T-33716795] > [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Excluding datanode > 10.108.106.59:50010 > [5/24/15 2:31:55:299 UTC] 02027a3b DFSClient I > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream > Excluding datanode 10.108.106.59:50010 > 2015-05-24 02:31:55.300 WARNING [T-33716795] > [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] DataStreamer Exception > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag > could only be replicated to 0 nodes instead of minReplication (=1). There > are 1 datanode(s) running and 1 node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555) > [5/24/15 2:31:55:300 UTC] 02027a3b DFSClient W > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run DataStreamer Exception > > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag > could only be replicated to 0 nodes instead of minReplication (=1). There > are 1 datanode(s) running and 1 node(s) are excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolPro
[jira] [Updated] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8578: Attachment: HDFS-8578-07.patch Fixed {{TestReplication}}, which fails intermittently; it is not directly related to this Jira. {{TestHDFSCLI}} failed due to a collision with the hadoop-common precommit job. {{TestDataNodeRollingUpgrade}} passes locally. > On upgrade, Datanode should process all storage/data dirs in parallel > - > > Key: HDFS-8578 > URL: https://issues.apache.org/jira/browse/HDFS-8578 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Raju Bairishetti >Assignee: Vinayakumar B >Priority: Critical > Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, > HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, > HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch > > > Right now, during upgrades the datanode processes all the storage dirs > sequentially. Assuming it takes ~20 minutes to process a single storage dir, a > datanode with ~10 disks will take around 3 hours to come up. > *BlockPoolSliceStorage.java* > {code} >for (int idx = 0; idx < getNumStorageDirs(); idx++) { > doTransition(datanode, getStorageDir(idx), nsInfo, startOpt); > assert getCTime() == nsInfo.getCTime() > : "Data-node and name-node CTimes must be the same."; > } > {code} > It would save lots of time during major upgrades if the datanode processed all > storage dirs/disks in parallel. > Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
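The sequential {{doTransition}} loop quoted above can be fanned out across a thread pool. The following is a minimal, hypothetical sketch — the task body and class names stand in for the real {{BlockPoolSliceStorage}} logic and are not the actual patch. It collects one Future per storage dir and still aborts the upgrade on the first failure, matching the sequential loop's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelDirUpgrade {

  // Hypothetical stand-in for processing one storage dir
  // (BlockPoolSliceStorage.doTransition in the real code).
  static String doTransition(String dir) {
    return dir + ":upgraded";
  }

  // Process every storage dir concurrently; Future.get() rethrows the first
  // failure, so a bad dir still fails the upgrade like the sequential loop.
  static List<String> upgradeAll(List<String> dirs) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(
        Math.max(1, Math.min(dirs.size(),
            Runtime.getRuntime().availableProcessors())));
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (String dir : dirs) {
        futures.add(pool.submit(() -> doTransition(dir)));
      }
      List<String> results = new ArrayList<>();
      for (Future<String> f : futures) {
        results.add(f.get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(upgradeAll(Arrays.asList("disk1", "disk2", "disk3")));
  }
}
```

With ~10 disks at ~20 minutes each, this brings startup closer to the time of the slowest single dir instead of the sum of all of them.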
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626241#comment-14626241 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/986/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover does not exit when it fails to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some block > migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
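The symptom above — {{hasRemaining}} staying true forever when some moves keep failing — can be bounded with a retry cap in the outer loop. A minimal sketch with hypothetical names and a made-up cap, not the actual Mover code:

```java
public class BoundedMoverLoop {

  // One dispatch round; in the real Mover this would run the Dispatcher
  // and report whether any scheduled moves are still outstanding.
  interface Iteration {
    boolean hasRemaining();
  }

  // Returns true if all moves completed, false if the retry cap was hit.
  // Without the cap, a permanently failing move keeps hasRemaining true
  // and the loop spins forever.
  static boolean run(Iteration iter, int maxRetries) {
    int attempts = 0;
    while (iter.hasRemaining()) {
      attempts++;
      if (attempts >= maxRetries) {
        return false; // give up instead of looping on failed moves
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // A round that never completes: the cap turns an infinite loop into an exit.
    System.out.println(run(() -> true, 5));
  }
}
```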
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626239#comment-14626239 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/986/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retries when it fails to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
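Suggestion #1 above — increment the retry counter only when every move in a round fails, and reset it on any progress — can be sketched as a small tracker. The class name, method signature, and cap are hypothetical, not Hadoop's actual implementation:

```java
public class RetryTracker {
  private final int maxRetries;
  private int failedRetries = 0;

  public RetryTracker(int maxRetries) {
    this.maxRetries = maxRetries;
  }

  // Called once per Mover iteration with that round's move counts.
  // Returns true when the Mover should give up (the NO_MOVE_PROGRESS case).
  public boolean update(int succeededMoves, int failedMoves) {
    if (failedMoves > 0 && succeededMoves == 0) {
      failedRetries++;   // no progress at all this round
    } else {
      failedRetries = 0; // some moves succeeded: reset, per suggestion #1
    }
    return failedRetries >= maxRetries;
  }
}
```

Resetting on partial progress keeps a slow-but-advancing migration from being killed by the retry cap, while a fully stuck one still exits promptly.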
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626231#comment-14626231 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS Mover tool should exit after some retry when failed to move blocks. > > > Key: HDFS-8143 > URL: https://issues.apache.org/jira/browse/HDFS-8143 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.7.1 > > Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, > HDFS-8143_3.patch > > > Mover does not exit when it fails to move blocks. > {code} > hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values()); > {code} > {{Dispatcher.waitForMoveCompletion()}} will always return true if some block > migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626228#comment-14626228 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java > Mover should exit with NO_MOVE_PROGRESS if there is no move progress > > > Key: HDFS-8541 > URL: https://issues.apache.org/jira/browse/HDFS-8541 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch > > > HDFS-8143 changed Mover to exit after some retries when it fails to move blocks. > Two additional suggestions: > # Mover retry counter should be incremented only if all moves fail. If there > are some successful moves, the counter should be reset. > # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626209#comment-14626209 ] Hadoop QA commented on HDFS-8767: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 11s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 7s | The applied patch generated 1 new checkstyle issues (total was 21, now 21). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 53s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 4s | Tests failed in hadoop-common. 
| | | | 62m 44s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.crypto.key.TestValueQueue | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745231/HDFS-8767-00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/diffcheckstylehadoop-common.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/console | This message was automatically generated. > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626184#comment-14626184 ] Steve Loughran commented on HDFS-8767: -- This'll need a test for Unix which at least downgrades on Windows. > RawLocalFileSystem.listStatus() returns null for UNIX pipefile > -- > > Key: HDFS-8767 > URL: https://issues.apache.org/jira/browse/HDFS-8767 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: kanaka kumar avvaru >Priority: Critical > Attachments: HDFS-8767-00.patch > > > Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of > the file. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
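The platform guard Steve asks for could look like this standalone sketch: create a FIFO with the POSIX {{mkfifo}} tool, list its parent directory, and skip the whole check on Windows. The class and method names are hypothetical; a real patch would wrap the same guard around a {{FileSystem.listStatus()}} call as a JUnit assumption.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class PipeListingCheck {

  // Guard so the check "downgrades" (skips) on Windows, which has no mkfifo.
  static boolean isWindows() {
    return System.getProperty("os.name").toLowerCase().startsWith("windows");
  }

  public static void main(String[] args) throws Exception {
    if (isWindows()) {
      System.out.println("SKIPPED: POSIX named pipes are not available on Windows");
      return;
    }
    Path dir = Files.createTempDirectory("pipe-listing");
    Path pipe = dir.resolve("fifo");
    // Create a named pipe with the POSIX mkfifo tool.
    new ProcessBuilder("mkfifo", pipe.toString()).start().waitFor();
    // A listStatus()-style listing of the parent directory should include
    // the pipe rather than returning null or skipping it.
    try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
      for (Path p : ds) {
        System.out.println(p.getFileName());
      }
    }
  }
}
```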
[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626168#comment-14626168 ] Hadoop QA commented on HDFS-8578: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 1s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 37s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 158m 41s | Tests failed in hadoop-hdfs. 
| | | | 201m 29s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestReplication | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745204/HDFS-8578-06.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / a431ed9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/console | This message was automatically generated. > On upgrade, Datanode should process all storage/data dirs in parallel > - > > Key: HDFS-8578 > URL: https://issues.apache.org/jira/browse/HDFS-8578 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Raju Bairishetti >Assignee: Vinayakumar B >Priority: Critical > Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, > HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, > HDFS-8578-06.patch, HDFS-8578-branch-2.6.0.patch > > > Right now, during upgrades the datanode processes all the storage dirs > sequentially. Assuming it takes ~20 minutes to process a single storage dir, a > datanode with ~10 disks will take around 3 hours to come up. > *BlockPoolSliceStorage.java* > {code} >for (int idx = 0; idx < getNumStorageDirs(); idx++) { > doTransition(datanode, getStorageDir(idx), nsInfo, startOpt); > assert getCTime() == nsInfo.getCTime() > : "Data-node and name-node CTimes must be the same."; > } > {code} > It would save lots of time during major upgrades if the datanode processed all > storage dirs/disks in parallel. 
> Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)