[jira] [Commented] (HDFS-5688) Wire-encryption in QJM
[ https://issues.apache.org/jira/browse/HDFS-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908067#comment-13908067 ] Suresh Srinivas commented on HDFS-5688: --- [~wheat9], can you please comment on this issue? > Wire-encryption in QJM > -- > > Key: HDFS-5688 > URL: https://issues.apache.org/jira/browse/HDFS-5688 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, security >Affects Versions: 2.2.0 >Reporter: Juan Carlos Fernandez >Priority: Blocker > Labels: security > Attachments: core-site.xml, hdfs-site.xml, jaas.conf, ssl-client.xml, > ssl-server.xml > > > When HA is implemented with QJM and using Kerberos, it is not possible to enable > wire encryption for the data. > If the property hadoop.rpc.protection is set to anything other than > authentication, it does not work properly, and the following error appears: > ERROR security.UserGroupInformation: PriviledgedActionException > as:principal@REALM (auth:KERBEROS) cause:javax.security.sasl.SaslException: > No common protection layer between client and server > With NFS as shared storage everything works like a charm. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
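For context on the setting being discussed: hadoop.rpc.protection must be set to the same value on the client and on every server-side daemon (NameNodes and JournalNodes); the "No common protection layer between client and server" error quoted above is the usual symptom of the two sides negotiating disjoint SASL QOP levels. Below is a minimal, hedged sketch of setting it programmatically using only the standard Configuration/FileSystem API; the path and printout are illustrative.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RpcProtectionExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Equivalent to setting this property in core-site.xml on both client and servers.
    // Values map to SASL QOP: authentication -> auth, integrity -> auth-int,
    // privacy -> auth-conf (full wire encryption).
    conf.set("hadoop.rpc.protection", "privacy");
    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.exists(new Path("/")));   // illustrative call
  }
}
{code}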
[jira] [Resolved] (HDFS-5993) org.apache.hadoop.fs.loadGenerator.TestLoadGenerator failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-5993. - Resolution: Duplicate Closed as duplicate of HADOOP-10355. But thanks for the report [~yzhangal]! > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator failure in trunk > - > > Key: HDFS-5993 > URL: https://issues.apache.org/jira/browse/HDFS-5993 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang > > With today's latest trunk at > commit d926e51bdc27f08e916534567a1edcfd994e2784 > When running it locally, I can consistently see the following test fails as: > --- > T E S T S > --- > --- > T E S T S > --- > Running org.apache.hadoop.fs.loadGenerator.TestLoadGenerator > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 14.888 sec > <<< FAILURE! - in org.apache.hadoop.fs.loadGenerator.TestLoadGenerator > testLoadGenerator(org.apache.hadoop.fs.loadGenerator.TestLoadGenerator) Time > elapsed: 14.285 sec <<< ERROR! > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:115) > at java.io.BufferedReader.readLine(BufferedReader.java:310) > at java.io.BufferedReader.readLine(BufferedReader.java:382) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > Results : > Tests in error: > TestLoadGenerator.testLoadGenerator:231 » IO Stream closed > This failure is also reported in one upstream test for HDFS-5939 patch. > (I can see the same problem locally without applying this patch). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5993) org.apache.hadoop.fs.loadGenerator.TestLoadGenerator failure in trunk
Yongjun Zhang created HDFS-5993: --- Summary: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator failure in trunk Key: HDFS-5993 URL: https://issues.apache.org/jira/browse/HDFS-5993 Project: Hadoop HDFS Issue Type: Bug Environment: CentOS release 6.5 (Final) cpe:/o:centos:linux:6:GA Reporter: Yongjun Zhang With today's latest trunk at commit d926e51bdc27f08e916534567a1edcfd994e2784 When running it locally, I can consistently see the following test fails as: --- T E S T S --- --- T E S T S --- Running org.apache.hadoop.fs.loadGenerator.TestLoadGenerator Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 14.888 sec <<< FAILURE! - in org.apache.hadoop.fs.loadGenerator.TestLoadGenerator testLoadGenerator(org.apache.hadoop.fs.loadGenerator.TestLoadGenerator) Time elapsed: 14.285 sec <<< ERROR! java.io.IOException: Stream closed at java.io.BufferedReader.ensureOpen(BufferedReader.java:115) at java.io.BufferedReader.readLine(BufferedReader.java:310) at java.io.BufferedReader.readLine(BufferedReader.java:382) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) at org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) Results : Tests in error: TestLoadGenerator.testLoadGenerator:231 » IO Stream closed This failure is also reported in one upstream test for HDFS-5939 patch. (I can see the same problem locally without applying this patch). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908054#comment-13908054 ] Hadoop QA commented on HDFS-5274: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630259/ss-5274v8-get.png against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6204//console This message is automatically generated. > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, HDFS-5274-8.patch, Zipkin Trace a06e941b0172ec73.png, > Zipkin Trace d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908051#comment-13908051 ] Yongjun Zhang commented on HDFS-5939: - I think the two failed tests are unrelated to the patch I submitted: org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator The latest patch I submitted is no different from the previous one, except for some cosmetic changes. My previous versions passed all tests successfully. I can consistently reproduce the TestLoadGenerator failure locally with and without my changes. My local run of TestSafeMode is always successful with and without my change. So the upstream failure may be specific to the upstream test environment. > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch, > HDFS-5939.003.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5981: Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) I committed this patch to trunk, branch-2 and branch-2.4. Thank you for the patch, [~wheat9]. > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 3.0.0, 2.4.0 > > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs a malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
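The root cause described above — sections being opened and closed out of order — is easy to guard against with a little bookkeeping. The sketch below is purely illustrative (it is not the committed HDFS-5981 patch; the class and method names are made up) and shows one way to enforce that every section written is closed in LIFO order:

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: enforce balanced <Section>...</Section> output.
public class BalancedXmlWriter {
  private final StringBuilder out = new StringBuilder();
  private final Deque<String> openSections = new ArrayDeque<>();

  void openSection(String name) {
    openSections.push(name);
    out.append('<').append(name).append('>');
  }

  void closeSection(String name) {
    String expected = openSections.pop();
    if (!expected.equals(name)) {
      throw new IllegalStateException(
          "Closing <" + name + "> but <" + expected + "> is still open");
    }
    out.append("</").append(name).append('>');
  }

  String result() {
    if (!openSections.isEmpty()) {
      throw new IllegalStateException("Unclosed sections: " + openSections);
    }
    return out.toString();
  }
}
{code}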
[jira] [Commented] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908039#comment-13908039 ] Hudson commented on HDFS-5981: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5202 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5202/]) HDFS-5981. PBImageXmlWriter generates malformed XML. Contributed by Haohui Mai. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570468) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageXmlWriter.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs a malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908033#comment-13908033 ] Masatake Iwasaki commented on HDFS-5274: bq. Masatake Iwasaki You know, I was thinking... Maybe it's ok that there are so many spans? Tracing doesn't cost unless enabled. When debugging, you might want to see in the trace that HDFS is doing a bunch of small reads? I just missed your comment when uploading the v8 patch. Because there were many more spans than I expected and the receiver's queue was filled, I think disabling those spans is safer as a starting point. {noformat} 14/02/19 22:25:23 ERROR impl.ZipkinSpanReceiver: Error trying to append span (DFSOutputStream.write) to the queue. Blocking Queue was full. {noformat} > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, HDFS-5274-8.patch, Zipkin Trace a06e941b0172ec73.png, > Zipkin Trace d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908027#comment-13908027 ] Hadoop QA commented on HDFS-5939: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630229/HDFS-5939.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestSafeMode org.apache.hadoop.fs.loadGenerator.TestLoadGenerator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6202//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6202//console This message is automatically generated. > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch, > HDFS-5939.003.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-5274: --- Attachment: ss-5274v8-get.png ss-5274v8-put.png HDFS-5274-8.patch I am attaching an updated patch and screenshots of traces of putting and getting a 200MB file. bq. Fix these in next patch: fixed. bq. Is formatting ok here? fixed. bq. In BlockReceiver, should traceSpan be getting closed? added description to span and calling close(). {quote} Is it possible that below throws an exception? + scope.getSpan().addKVAnnotation( + "stream".getBytes(), + jas.getCurrentStream().toString().getBytes()); i.e. we can hop out w/o closing the span since the try/finally only happens later. This is in JournalSet in a few places. {quote} I moved this code into the try block to make sure. bq. TraceInfo and RPCTInfo seem to be same datastructure? Should we define it one time only and share? I prefer keeping this as is because of simplicity and independence between the datatransfer protocol and o.a.h.ipc. bq. I checked the trace of putting and getting a big file by Zipkin today. There seem to be too many spans concerning "DFSInputStream.read" and "DFSOutputStream.write". I will fix this in the next version of the patch. Just removed those spans from DFSInputStream and DFSOutputStream. > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, HDFS-5274-8.patch, Zipkin Trace a06e941b0172ec73.png, > Zipkin Trace d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
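The review exchange above boils down to a standard resource-safety pattern: the span annotation itself can throw (e.g. from toString()), so it must happen inside the try whose finally closes the span. A hedged sketch of the pattern follows; it is not the committed patch, and the Trace/TraceScope/Sampler names assume the pre-Apache HTrace API of that era, so the exact packages may differ on a given classpath.

{code}
import java.nio.charset.StandardCharsets;
// Assumed pre-Apache HTrace packages; adjust to the HTrace version in use.
import org.htrace.Sampler;
import org.htrace.Trace;
import org.htrace.TraceScope;

public class SpanCloseSketch {
  // Do the annotation inside try so that scope.close() in finally always runs,
  // even if addKVAnnotation (or toString()) throws.
  static void flushWithSpan(Object currentStream) {
    TraceScope scope = Trace.startSpan("JournalSet.flush", Sampler.ALWAYS);
    try {
      scope.getSpan().addKVAnnotation(
          "stream".getBytes(StandardCharsets.UTF_8),
          String.valueOf(currentStream).getBytes(StandardCharsets.UTF_8));
      // ... the actual flush work would go here ...
    } finally {
      scope.close(); // the span is reported exactly once, exception or not
    }
  }
}
{code}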
[jira] [Updated] (HDFS-5396) FSImage.getFsImageName should check whether fsimage exists
[ https://issues.apache.org/jira/browse/HDFS-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated HDFS-5396: --- Attachment: HDFS-5396-branch-1.2.patch Updated patch. > FSImage.getFsImageName should check whether fsimage exists > -- > > Key: HDFS-5396 > URL: https://issues.apache.org/jira/browse/HDFS-5396 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.1 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Fix For: 1.3.0 > > Attachments: HDFS-5396-branch-1.2.patch, HDFS-5396-branch-1.2.patch > > > Per https://issues.apache.org/jira/browse/HDFS-5367, the fsimage may not be written to > all IMAGE dirs, so we need to check whether the fsimage exists before > FSImage.getFsImageName returns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
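The fix being requested is essentially a caller-visible existence check. The sketch below is illustrative only — it is not the branch-1 patch, and the directory layout and method names are assumptions — but it shows the intended behavior: return an fsimage only from a directory that actually contains one, and make the caller handle the case where none is found.

{code}
import java.io.File;
import java.util.List;

public class FsImageLookupSketch {
  // Illustrative only: return the first candidate fsimage that really exists,
  // instead of blindly returning the file from the first IMAGE directory.
  static File getExistingFsImage(List<File> imageDirs) {
    for (File dir : imageDirs) {
      File image = new File(dir, "current/fsimage");   // layout assumed here
      if (image.exists() && image.isFile()) {
        return image;
      }
    }
    return null; // caller must handle the case where no valid fsimage is found
  }
}
{code}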
[jira] [Commented] (HDFS-5496) Make replication queue initialization asynchronous
[ https://issues.apache.org/jira/browse/HDFS-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907955#comment-13907955 ] Vinayakumar B commented on HDFS-5496: - These failures are not present in the second patch's test report. I think the first patch was missing the LightWeightGSet changes, so those tests failed. > Make replication queue initialization asynchronous > -- > > Key: HDFS-5496 > URL: https://issues.apache.org/jira/browse/HDFS-5496 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Kihwal Lee >Assignee: Vinayakumar B > Fix For: HDFS-5535 (Rolling upgrades) > > Attachments: HDFS-5496.patch, HDFS-5496.patch, HDFS-5496.patch, > HDFS-5496.patch, HDFS-5496.patch > > > Today, initialization of replication queues blocks safe mode exit and certain > HA state transitions. For a big name space, this can take hundreds of seconds > with the FSNamesystem write lock held. During this time, important requests > (e.g. initial block reports, heartbeats, etc.) are blocked. > The effect of delaying the initialization would be not starting replication > right away, but I think the benefit outweighs the cost. If we make it asynchronous, > the work per iteration should be limited, so that the lock duration is > capped. > If full/incremental block reports and any other requests that modify block > state properly perform replication checks while the blocks are scanned and > the queues are populated in the background, every block will be processed. (Some may > be done twice.) The replication monitor should run even before all blocks are > processed. > This will allow the namenode to exit safe mode and start serving immediately even > with a big name space. It will also reduce the HA failover latency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
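The key idea in the description — do the queue initialization in the background while capping how much work is done per lock acquisition — can be sketched as follows. This is purely illustrative of the locking pattern (a fair write lock, chunked iteration), not the HDFS-5496 patch; processBlock and the chunk size are made-up placeholders.

{code}
import java.util.Iterator;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class AsyncQueueInitSketch {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

  // Cap the work done under the write lock per iteration so that block
  // reports, heartbeats, and readers are not starved during initialization.
  void initReplicationQueues(Iterator<Long> blockIds, int blocksPerIteration) {
    while (blockIds.hasNext()) {
      fsLock.writeLock().lock();
      try {
        for (int i = 0; i < blocksPerIteration && blockIds.hasNext(); i++) {
          processBlock(blockIds.next());   // hypothetical per-block check
        }
      } finally {
        fsLock.writeLock().unlock();       // let queued readers/writers run
      }
    }
  }

  private void processBlock(long blockId) { /* placeholder */ }
}
{code}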
[jira] [Updated] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades
[ https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5535: - Attachment: h5535_20140220b.patch h5535_20140220b.patch: includes HDFS-5992. > Umbrella jira for improved HDFS rolling upgrades > > > Key: HDFS-5535 > URL: https://issues.apache.org/jira/browse/HDFS-5535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, ha, hdfs-client, namenode >Affects Versions: 3.0.0, 2.2.0 >Reporter: Nathan Roberts > Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, > h5535_20140219.patch, h5535_20140220-1554.patch, h5535_20140220b.patch > > > In order to roll a new HDFS release through a large cluster quickly and > safely, a few enhancements are needed in HDFS. An initial High level design > document will be attached to this jira, and sub-jiras will itemize the > individual tasks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5992) Fix NPE in MD5FileUtils
[ https://issues.apache.org/jira/browse/HDFS-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5992: - Attachment: editsStored h5992_20140220.patch h5992_20140220.patch: fix MD5FileUtils and TestOfflineEditsViewer. editsStored: the new binary file. > Fix NPE in MD5FileUtils > --- > > Key: HDFS-5992 > URL: https://issues.apache.org/jira/browse/HDFS-5992 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: editsStored, h5992_20140220.patch > > > MD5FileUtils.readStoredMd5(File md5File) may return null but the callers may > not check it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
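A hedged illustration of the caller-side check this issue asks for is below. It assumes the org.apache.hadoop.hdfs.util.MD5FileUtils class named in the description and that readStoredMd5 returns an org.apache.hadoop.io.MD5Hash (null when the .md5 side file is missing or unparseable); the wrapper method name is made up.

{code}
import java.io.File;
import java.io.IOException;
import org.apache.hadoop.hdfs.util.MD5FileUtils;
import org.apache.hadoop.io.MD5Hash;

public class Md5NullCheckSketch {
  // Caller-side pattern: treat a null return as an explicit error
  // instead of dereferencing it later and hitting an NPE.
  static MD5Hash readRequiredMd5(File md5File) throws IOException {
    MD5Hash md5 = MD5FileUtils.readStoredMd5(md5File);  // may return null
    if (md5 == null) {
      throw new IOException("No stored MD5 found in " + md5File);
    }
    return md5;
  }
}
{code}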
[jira] [Updated] (HDFS-5778) Document new commands and parameters for improved rolling upgrades
[ https://issues.apache.org/jira/browse/HDFS-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5778: - Attachment: (was: h5992_20140220.patch) > Document new commands and parameters for improved rolling upgrades > -- > > Key: HDFS-5778 > URL: https://issues.apache.org/jira/browse/HDFS-5778 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Affects Versions: HDFS-5535 (Rolling upgrades) >Reporter: Akira AJISAKA >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h5778_20140220.patch > > > "hdfs dfsadmin -rollingUpgrade" command was newly added in HDFS-5752, and > some other commands and parameters will be added in the future. This issue > exists to flag undocumented commands and parameters when HDFS-5535 branch is > merging to trunk. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5778) Document new commands and parameters for improved rolling upgrades
[ https://issues.apache.org/jira/browse/HDFS-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5778: - Attachment: h5778_20140220.patch h5992_20140220.patch h5778_20140220.patch: wrote a few sections but not yet finished. > Document new commands and parameters for improved rolling upgrades > -- > > Key: HDFS-5778 > URL: https://issues.apache.org/jira/browse/HDFS-5778 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Affects Versions: HDFS-5535 (Rolling upgrades) >Reporter: Akira AJISAKA >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h5778_20140220.patch > > > "hdfs dfsadmin -rollingUpgrade" command was newly added in HDFS-5752, and > some other commands and parameters will be added in the future. This issue > exists to flag undocumented commands and parameters when HDFS-5535 branch is > merging to trunk. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907945#comment-13907945 ] stack commented on HDFS-5274: - [~iwasakims] You know, I was thinking... Maybe it's ok that there are so many spans? Tracing doesn't cost unless enabled. When debugging, you might want to see in the trace that HDFS is doing a bunch of small reads? > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, Zipkin Trace a06e941b0172ec73.png, Zipkin Trace > d0f0d66b8a258a69.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907930#comment-13907930 ] Hadoop QA commented on HDFS-5935: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630215/HDFS-5935-4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6201//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6201//console This message is automatically generated. > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch, > HDFS-5935-4.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5992) Fix NPE in MD5FileUtils
Tsz Wo (Nicholas), SZE created HDFS-5992: Summary: Fix NPE in MD5FileUtils Key: HDFS-5992 URL: https://issues.apache.org/jira/browse/HDFS-5992 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE MD5FileUtils.readStoredMd5(File md5File) may return null but the callers may not check it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907687#comment-13907687 ] Suresh Srinivas edited comment on HDFS-5840 at 2/21/14 3:38 AM: [~atm], sorry for the late reply. I had lost track of this. {quote} As for handling the partial upgrade failure as you've described, I'd like to add one more RPC call to the JournalManager to initiate analysis/recovery of the storage dirs upon first contact, and then refactor the contents of FSImage#recoverStorageDirs into NNUpgradeUtil just like was done with the other upgrade-related procedures. If this sounds OK to you, I'll go ahead and add that stuff and appropriate tests. {quote} Why not always recover in preupgrade step, instead of adding another RPC? With rolling upgrade getting ready, some of the functionality added in that may be useful. For partial failures related to JournalNodes, the choice made in that feature to make the operation to rollback JournalNode idempotent. It looks like lot of rolling upgrade related code can be leveraged here, since upgrade is a special case of rolling upgrade. Should we explore that? was (Author: sureshms): [~atm], sorry for the late reply. I had lost track of this. {quote} As for handling the partial upgrade failure as you've described, I'd like to add one more RPC call to the JournalManager to initiate analysis/recovery of the storage dirs upon first contact, and then refactor the contents of FSImage#recoverStorageDirs into NNUpgradeUtil just like was done with the other upgrade-related procedures. If this sounds OK to you, I'll go ahead and add that stuff and appropriate tests. {quote} Why not always recover in preupgrade/upgrade step, instead of adding another RPC? With rolling upgrade getting ready, some of the functionality added in that may be useful. For partial failures related to JournalNodes, the choice made in that feature to make the operation to rollback JournalNode idempotent. It looks like lot of rolling upgrade related code can be leveraged here, since upgrade is a special case of rolling upgrade. Should we explore that? > Follow-up to HDFS-5138 to improve error handling during partial upgrade > failures > > > Key: HDFS-5840 > URL: https://issues.apache.org/jira/browse/HDFS-5840 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 3.0.0 > > Attachments: HDFS-5840.patch > > > Suresh posted some good comment in HDFS-5138 after that patch had already > been committed to trunk. This JIRA is to address those. See the first comment > of this JIRA for the full content of the review. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades
[ https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907921#comment-13907921 ] Hadoop QA commented on HDFS-5535: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630196/h5535_20140220-1554.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 39 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: org.apache.hadoop.hdfs.server.namenode.TestStorageRestore org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade org.apache.hadoop.hdfs.server.namenode.snapshot.TestCheckpointsWithSnapshots org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults org.apache.hadoop.fs.loadGenerator.TestLoadGenerator org.apache.hadoop.hdfs.TestRollingUpgrade org.apache.hadoop.hdfs.TestRollingUpgradeRollback org.apache.hadoop.hdfs.qjournal.server.TestJournalNode org.apache.hadoop.hdfs.util.TestMD5FileUtils org.apache.hadoop.hdfs.qjournal.TestNNWithQJM org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager org.apache.hadoop.hdfs.server.namenode.TestStartup org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.namenode.TestCheckpoint {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6199//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6199//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6199//console This message is automatically generated. 
> Umbrella jira for improved HDFS rolling upgrades > > > Key: HDFS-5535 > URL: https://issues.apache.org/jira/browse/HDFS-5535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, ha, hdfs-client, namenode >Affects Versions: 3.0.0, 2.2.0 >Reporter: Nathan Roberts > Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, > h5535_20140219.patch, h5535_20140220-1554.patch > > > In order to roll a new HDFS release through a large cluster quickly and > safely, a few enhancements are needed in HDFS. An initial High level design > document will be attached to this jira, and sub-jiras will itemize the > individual tasks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5064) Standby checkpoints should not block concurrent readers
[ https://issues.apache.org/jira/browse/HDFS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907920#comment-13907920 ] Andrew Wang commented on HDFS-5064: --- Hi ATM, I looked at this patch. It needs a small rebase for the lock fairness change, but I was still able to review. I have just one nit: 64-bit reads are not atomic in the current Java memory model, so we need to slap a volatile on {{NNStorage#mostRecentCheckpointId}} since the getter is no longer synchronized. At a high-level, this makes sense to me as an intermediate solution for the specific issue of the SbNN and checkpointing, until we actually separate out block management from the namespace. Kihwal, do you have any reservations about this approach? Otherwise, I'm +1 for this change pending rebase and Jenkins. > Standby checkpoints should not block concurrent readers > --- > > Key: HDFS-5064 > URL: https://issues.apache.org/jira/browse/HDFS-5064 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Affects Versions: 2.3.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-5064.patch > > > We've observed an issue which causes fetches of the {{/jmx}} page of the NN > to take a long time to load when the standby is in the process of creating a > checkpoint. > Even though both creating the checkpoint and gathering the statistics for > {{/jmx}} take only the FSNS read lock, the issue is that since the FSNS uses > a _fair_ RW lock, a single writer attempting to get the lock will block all > threads attempting to get only the read lock for the duration of the > checkpoint. This will cause {{/jmx}}, and really any thread only attempting > to get the read lock, to block for the duration of the checkpoint, even > though they should be able to proceed concurrently with the checkpointing > thread. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
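Andrew's nit refers to a real Java memory model rule: reads and writes of non-volatile long (and double) fields are not guaranteed to be atomic, so an unsynchronized getter can observe a torn value. A minimal sketch of the fix being requested is below; the field name is modeled on the comment, and the surrounding class is illustrative, not the actual NNStorage code.

{code}
public class CheckpointIdSketch {
  // Without volatile, a plain long read is not guaranteed atomic under the JMM,
  // so an unsynchronized getter could observe a torn 64-bit value.
  private volatile long mostRecentCheckpointId;

  synchronized void setMostRecentCheckpointId(long id) {
    this.mostRecentCheckpointId = id;
  }

  // Getter intentionally not synchronized; volatile makes the read atomic and visible.
  long getMostRecentCheckpointId() {
    return mostRecentCheckpointId;
  }
}
{code}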
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907917#comment-13907917 ] Suresh Srinivas commented on HDFS-5274: --- bq. Wouldn't adding htrace to the common pom.xml make it "...available in Hadoop common"? Thanks. I agree with the comment [~cutting] had made - https://issues.apache.org/jira/browse/HADOOP-10311?focusedCommentId=13886809&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13886809. I would love to see the code in the Hadoop community itself, if possible. Agreed, this is no different from using Google Guava or other such libraries. But I am afraid this could start a trend of capabilities hosted by/attributed to vendor companies making their way into Hadoop. With that said, I am -0 and would rather not see this becoming a trend. > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, Zipkin Trace a06e941b0172ec73.png, Zipkin Trace > d0f0d66b8a258a69.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907910#comment-13907910 ] Hudson commented on HDFS-5988: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5201 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5201/]) HDFS-5988. Bad fsimage always generated after upgrade. (wang) (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570429) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/LsrPBImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Fix For: 2.4.0 > > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5988: -- Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) Committed to trunk, branch-2, branch-2.4. Thanks for the quick +1 Jing! > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Fix For: 2.4.0 > > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907891#comment-13907891 ] Andrew Wang commented on HDFS-5988: --- I believe the test failure is HDFS-5991, known flake. Will commit. > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907882#comment-13907882 ] Hadoop QA commented on HDFS-5988: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630183/hdfs-5988-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6198//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6198//console This message is automatically generated. > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907875#comment-13907875 ] Colin Patrick McCabe commented on HDFS-5957: I talked to [~kkambatl] about this. It seems that YARN is monitoring the process' {{RSS}} (resident set size), which does seem to include the physical memory taken up by memory-mapped files. I think this is unfortunate. The physical memory taken up by mmapped files is basically part of the page cache. If there is any memory pressure at all, it's easy to purge this memory (the pages are "clean") Charging an application for this memory is similar to charging it for the page cache consumed by calls to read(2)-- it doesn't really make sense for this application. I think this is a problem within YARN, which has to be fixed inside YARN. bq. It sounds like you really do need a deterministic way to trigger the munmap calls, i.e. LRU caching or no caching at all described above. The {{munmap}} calls are deterministic now. You can control the number of unused mmaps that we'll store by changing {{dfs.client.mmap.cache.size}}. It's very important to keep in mind that {{dfs.client.mmap.cache.size}} controls the size of the cache, *not* the total number of mmaps. So if my application has 10 threads that each use an mmap at a time, and the maximum cache size is 10, I may have 20 mmaps in existence at any given time. The maximum size of any mmap is going to be the size of a block, so you should be able to use this to calculate how much RSS you will need. bq. For small 200Gb data-sets (~1.4x tasks per container), ZCR does give a perf boost because we get to use HADOOP-10047 instead of shuffling it between byte[] buffers for decompression. As a workaround, have you considered reading into a direct {{ByteBuffer}} that you allocated yourself? {{DFSInputStream}} implements the {{ByteBufferReadable}} interface, which lets you read into any {{ByteBuffer}}. This would avoid the array copy that you're talking about. I hope we can fix this within YARN soon, since otherwise the perf benefit of zero-copy reads will be substantially reduced or eliminated (as well as people's ability to use ZCR in the first place) > Provide support for different mmap cache retention policies in > ShortCircuitCache. > - > > Key: HDFS-5957 > URL: https://issues.apache.org/jira/browse/HDFS-5957 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Chris Nauroth > > Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by > multiple reads of the same block or by multiple threads. The eventual > {{munmap}} executes on a background thread after an expiration period. Some > client usage patterns would prefer strict bounds on this cache and > deterministic cleanup by calling {{munmap}}. This issue proposes additional > support for different caching policies that better fit these usage patterns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
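The workaround Colin suggests — reading into a caller-allocated direct ByteBuffer via the ByteBufferReadable interface instead of copying through a byte[] — looks roughly like the sketch below. The read(ByteBuffer) call is the standard FSDataInputStream/ByteBufferReadable API; the path, buffer size, and printout are illustrative.

{code}
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectBufferReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path("/tmp/example");                  // illustrative path
    ByteBuffer buf = ByteBuffer.allocateDirect(4 * 1024 * 1024);
    try (FSDataInputStream in = fs.open(path)) {
      // DFSInputStream implements ByteBufferReadable, so this fills the
      // caller's direct buffer without an intermediate byte[] copy.
      int n = in.read(buf);                                // -1 at end of stream
      buf.flip();
      System.out.println("read " + n + " bytes");
    }
  }
}
{code}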
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907872#comment-13907872 ] Yongjun Zhang commented on HDFS-5939: - Hi Haohui and Tsz, Thanks a lot for your earlier review and the good info you provided. I just uploaded a modified version (003) to address all the comments. I found a bug when I'm doing the test (filed HDFS-5989). After working around HDFS-5989, my test of the updated fix is fine. Would you please review again and help to commit it if it's fine with you? Thanks. > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch, > HDFS-5939.003.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-5939: Attachment: HDFS-5939.003.patch > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch, > HDFS-5939.003.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907863#comment-13907863 ] Liang Xie commented on HDFS-5776: - bq. little discernible overall difference in spite of my flushing file system cache You need a test data set much larger than physical memory, so that lots of HBase reads go to disk. If the disk contention is high enough (e.g. await from iostat reaching tens or even hundreds of ms), then the slow disk will make the difference obvious. :) That's why I set up my test env with only one SATA disk per DN instance; that requires less test data to be loaded to observe a difference. > Support 'hedged' reads in DFSClient > --- > > Key: HDFS-5776 > URL: https://issues.apache.org/jira/browse/HDFS-5776 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie > Attachments: HDFS-5776-v10.txt, HDFS-5776-v11.txt, HDFS-5776-v12.txt, > HDFS-5776-v12.txt, HDFS-5776-v13.wip.txt, HDFS-5776-v14.txt, > HDFS-5776-v15.txt, HDFS-5776-v17.txt, HDFS-5776-v17.txt, HDFS-5776-v2.txt, > HDFS-5776-v3.txt, HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt, > HDFS-5776-v7.txt, HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt, > HDFS-5776v18.txt, HDFS-5776v21.txt > > > This is a placeholder for the HDFS-related stuff backported from > https://issues.apache.org/jira/browse/HBASE-7509 > The quorum read ability should be helpful especially to optimize read outliers. > We can utilize "dfs.dfsclient.quorum.read.threshold.millis" & > "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read > ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics, we > could export the metric values of interest into the client system (e.g. HBase's > regionserver metrics). > The core logic is in the pread code path; we decide to go to the original > fetchBlockByteRange or the newly introduced fetchBlockByteRangeSpeculative per > the above config items. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
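For readers wanting to experiment, the knobs named in the description are plain client-side Configuration properties. The sketch below simply sets them programmatically; the property names are taken verbatim from the description above, but the names ultimately committed to a given release may differ, and the threshold and pool-size values here are arbitrary examples.

{code}
import org.apache.hadoop.conf.Configuration;

public class HedgedReadConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Property names as used in this JIRA's description; check the release
    // documentation for the final names in a shipped version.
    conf.setLong("dfs.dfsclient.quorum.read.threshold.millis", 150);
    conf.setInt("dfs.dfsclient.quorum.read.threadpool.size", 10);
    // A client FileSystem created from this conf would then issue a second,
    // "hedged" pread to another DataNode when the first one is slow.
  }
}
{code}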
[jira] [Assigned] (HDFS-2538) option to disable fsck dots
[ https://issues.apache.org/jira/browse/HDFS-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam reassigned HDFS-2538: --- Assignee: Mohammad Kamrul Islam > option to disable fsck dots > > > Key: HDFS-2538 > URL: https://issues.apache.org/jira/browse/HDFS-2538 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.20.204.0, 1.0.0 >Reporter: Allen Wittenauer >Assignee: Mohammad Kamrul Islam >Priority: Minor > Labels: newbie > Attachments: HDFS-2538-branch-0.20-security-204.patch, > HDFS-2538-branch-0.20-security-204.patch, HDFS-2538-branch-1.0.patch > > > this patch turns the dots during fsck off by default and provides an option > to turn them back on if you have a fetish for millions and millions of dots > on your terminal. i haven't done any benchmarks, but i suspect fsck is now > 300% faster to boot. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907853#comment-13907853 ] Jing Zhao commented on HDFS-5991: - +1 for the patch. I will commit it shortly. > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: HDFS-5991.000.patch, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5931) Potential bugs and improvements for exception handlers
[ https://issues.apache.org/jira/browse/HDFS-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907848#comment-13907848 ] Ding Yuan commented on HDFS-5931: - Hi [~atm], I took another close look at the test output. It seems the SocketTimeoutException might not be caused by my patch (I ran the test on my machine and it passed). Is there any chance you could comment on this patch? If the test was indeed broken by the patch, or if there are any other problems with my patch, I can fix it further. Thanks, > Potential bugs and improvements for exception handlers > -- > > Key: HDFS-5931 > URL: https://issues.apache.org/jira/browse/HDFS-5931 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, namenode >Affects Versions: 2.2.0 >Reporter: Ding Yuan > Attachments: hdfs-5931-v2.patch, hdfs-5931-v3.patch, hdfs-5931.patch > > > This is to report some improvements and potential bug fixes to some error > handling code. Also attaching a patch for review. > Details in the first comment. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907843#comment-13907843 ] Hadoop QA commented on HDFS-5991: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630203/HDFS-5991.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6200//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6200//console This message is automatically generated. > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: HDFS-5991.000.patch, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HDFS-5990) Create options to search files/dirs in OfflineImageViewer
[ https://issues.apache.org/jira/browse/HDFS-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned HDFS-5990: --- Assignee: Akira AJISAKA > Create options to search files/dirs in OfflineImageViewer > - > > Key: HDFS-5990 > URL: https://issues.apache.org/jira/browse/HDFS-5990 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > > The enhancement of HDFS-5975. > I suggest options to search files/dirs in OfflineImageViewer. > An example command is as follows: > {code} > hdfs oiv -i input -o output -p Ls -owner theuser -group supergroup -minSize > 1024 -maxSize 1048576 > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Thompson updated HDFS-5935: -- Attachment: HDFS-5935-4.patch > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch, > HDFS-5935-4.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Thompson updated HDFS-5935: -- Attachment: (was: HDFS-5935-4.patch) > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Thompson updated HDFS-5935: -- Attachment: HDFS-5935-4.patch Update with combined if statement > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907814#comment-13907814 ] Travis Thompson commented on HDFS-5935: --- Looks like you are right: [http://jsfiddle.net/GD4zN/] I'll update the patch with the combined if. > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907816#comment-13907816 ] stack commented on HDFS-5274: - [~iwasakims] Thanks. When you are done, I'll try hooking it up w/ hbase to make sure we get a trace that spans the two systems. [~sureshms] Wouldn't adding htrace to the common pom.xml make it "...available in Hadoop common"? Thanks. > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, Zipkin Trace a06e941b0172ec73.png, Zipkin Trace > d0f0d66b8a258a69.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
[ https://issues.apache.org/jira/browse/HDFS-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907804#comment-13907804 ] Yongjun Zhang commented on HDFS-5989: - Hi Chris, I was writing my last update and just saw yours. Cool you figured out the root cause of this bug! Thanks for following-up! --Yongjun > merge of HDFS-4685 to trunk introduced trunk test failure > - > > Key: HDFS-5989 > URL: https://issues.apache.org/jira/browse/HDFS-5989 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang >Assignee: Chris Nauroth > > HI, > I'm seeing trunk branch test failure locally (centOs6) today. And I > identified it's this commit that caused the failure. > Author: Chris Nauroth 2014-02-19 10:34:52 > Committer: Chris Nauroth 2014-02-19 10:34:52 > Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA > handling of include/exclude node-lists to be available across RM failover by > making using of a remote configuration-provider. Contributed by Xuan Gong.) > Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor > whitespace difference in FSImageSerialization.java in preparation for trunk > merge.) > Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default > queue properties to Fair Scheduler documentation (Naren Koneru via Sandy > Ryza)) > Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, > testv4, testv7 > Follows: testv5 > Precedes: > Merge HDFS-4685 to trunk. > > git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 > 13f79535-47bb-0310-9956-ffa450edef68 > I'm not sure whether other folks are seeing the same, or maybe related to my > environment. But prior to chis change, I don't see this problem. > The failures are in TestWebHDFS: > Running org.apache.hadoop.hdfs.web.TestWebHDFS > Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< > FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS > testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 2.478 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) 
> at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) > at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) > at > org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) > testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 0.342 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanod
[jira] [Commented] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
[ https://issues.apache.org/jira/browse/HDFS-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907803#comment-13907803 ] Yongjun Zhang commented on HDFS-5989: - Hi Jing. Good to know that you also saw the same problem. I guess most developers take the default setting of ACL. I think it would be nice if the unit test is self-contained so its success/failure is not subject to which env setting we are running. So I wonder if the test itself can be modified to loose the ACL restriction, rather than to change the setting of a specific machine. What do you think? Thanks. > merge of HDFS-4685 to trunk introduced trunk test failure > - > > Key: HDFS-5989 > URL: https://issues.apache.org/jira/browse/HDFS-5989 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang >Assignee: Chris Nauroth > > HI, > I'm seeing trunk branch test failure locally (centOs6) today. And I > identified it's this commit that caused the failure. > Author: Chris Nauroth 2014-02-19 10:34:52 > Committer: Chris Nauroth 2014-02-19 10:34:52 > Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA > handling of include/exclude node-lists to be available across RM failover by > making using of a remote configuration-provider. Contributed by Xuan Gong.) > Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor > whitespace difference in FSImageSerialization.java in preparation for trunk > merge.) > Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default > queue properties to Fair Scheduler documentation (Naren Koneru via Sandy > Ryza)) > Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, > testv4, testv7 > Follows: testv5 > Precedes: > Merge HDFS-4685 to trunk. > > git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 > 13f79535-47bb-0310-9956-ffa450edef68 > I'm not sure whether other folks are seeing the same, or maybe related to my > environment. But prior to chis change, I don't see this problem. > The failures are in TestWebHDFS: > Running org.apache.hadoop.hdfs.web.TestWebHDFS > Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< > FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS > testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 2.478 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) 
> at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) > at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) > at > org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) > testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 0.342 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(
[jira] [Commented] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907801#comment-13907801 ] Andrew Wang commented on HDFS-5988: --- Thanks for the review Jing, I'll commit if test-patch comes back clean. > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907800#comment-13907800 ] Todd Lipcon commented on HDFS-5274: --- Hi Suresh. As one of the primary authors of HTrace I'd say that there are no plans to put it in Hadoop Common. Personally I think Hadoop Common as a grab-bag of all things Hadoop gets pretty messy, since it's harder to piece-meal upgrade different components. It also means that dependent apps like HBase would need to wait on new versions of Common and have even more difficulty building the same code against different versions of Hadoop. Happy to take contributions to HTrace via github pull request if you like, though. > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch, HDFS-5274-2.patch, > HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, HDFS-5274-6.patch, > HDFS-5274-7.patch, Zipkin Trace a06e941b0172ec73.png, Zipkin Trace > d0f0d66b8a258a69.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
[ https://issues.apache.org/jira/browse/HDFS-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907799#comment-13907799 ] Chris Nauroth commented on HDFS-5989: - Hi, [~yzhangal]. Thank you for the bug report. I don't have a repro locally for this. I suspect that you're running with Smack enabled on your local file system. I believe the extra '.' indicator in the permission string indicates the presence of a Smack label. For Jing, it would be a similar situation with '+' appended to files with an ACL on the local file system. I suspect I know the root cause of this bug. I'll post a patch later tonight. > merge of HDFS-4685 to trunk introduced trunk test failure > - > > Key: HDFS-5989 > URL: https://issues.apache.org/jira/browse/HDFS-5989 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang > > HI, > I'm seeing trunk branch test failure locally (centOs6) today. And I > identified it's this commit that caused the failure. > Author: Chris Nauroth 2014-02-19 10:34:52 > Committer: Chris Nauroth 2014-02-19 10:34:52 > Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA > handling of include/exclude node-lists to be available across RM failover by > making using of a remote configuration-provider. Contributed by Xuan Gong.) > Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor > whitespace difference in FSImageSerialization.java in preparation for trunk > merge.) > Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default > queue properties to Fair Scheduler documentation (Naren Koneru via Sandy > Ryza)) > Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, > testv4, testv7 > Follows: testv5 > Precedes: > Merge HDFS-4685 to trunk. > > git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 > 13f79535-47bb-0310-9956-ffa450edef68 > I'm not sure whether other folks are seeing the same, or maybe related to my > environment. But prior to chis change, I don't see this problem. > The failures are in TestWebHDFS: > Running org.apache.hadoop.hdfs.web.TestWebHDFS > Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< > FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS > testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 2.478 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) 
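For context on the failure above: ls appends an eleventh character to the mode string when extra metadata is present ('+' for an ACL, '.' for a security label), while the FsPermission.valueOf() shown in the stack trace rejects anything that is not exactly ten characters, hence the IllegalArgumentException. The sketch below shows one tolerant-parsing workaround; it is not the patch Chris refers to, and it assumes hadoop-common is on the classpath.
{code}
import org.apache.hadoop.fs.permission.FsPermission;

// Sketch of a tolerant parse for ls-style permission strings. This is NOT the
// HDFS-5989 patch, only an illustration of the length mismatch and one way to
// neutralize it. Assumes hadoop-common is on the classpath.
public class PermissionParseSketch {
  static FsPermission parseLenient(String unixSymbolicPermission) {
    String s = unixSymbolicPermission;
    if (s.length() == 11 && (s.endsWith("+") || s.endsWith("."))) {
      s = s.substring(0, 10); // drop the trailing ACL/security-label indicator
    }
    return FsPermission.valueOf(s);
  }

  public static void main(String[] args) {
    // "drwxrwxr-x." is what the failing environments report for directories
    // carrying a security label; the plain ten-character form parses fine.
    System.out.println(parseLenient("drwxrwxr-x."));
    System.out.println(FsPermission.valueOf("drwxrwxr-x"));
  }
}
{code}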
> at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) > at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) > at > org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) > testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 0.342 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(Dat
[jira] [Assigned] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
[ https://issues.apache.org/jira/browse/HDFS-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth reassigned HDFS-5989: --- Assignee: Chris Nauroth > merge of HDFS-4685 to trunk introduced trunk test failure > - > > Key: HDFS-5989 > URL: https://issues.apache.org/jira/browse/HDFS-5989 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang >Assignee: Chris Nauroth > > HI, > I'm seeing trunk branch test failure locally (centOs6) today. And I > identified it's this commit that caused the failure. > Author: Chris Nauroth 2014-02-19 10:34:52 > Committer: Chris Nauroth 2014-02-19 10:34:52 > Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA > handling of include/exclude node-lists to be available across RM failover by > making using of a remote configuration-provider. Contributed by Xuan Gong.) > Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor > whitespace difference in FSImageSerialization.java in preparation for trunk > merge.) > Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default > queue properties to Fair Scheduler documentation (Naren Koneru via Sandy > Ryza)) > Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, > testv4, testv7 > Follows: testv5 > Precedes: > Merge HDFS-4685 to trunk. > > git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 > 13f79535-47bb-0310-9956-ffa450edef68 > I'm not sure whether other folks are seeing the same, or maybe related to my > environment. But prior to chis change, I don't see this problem. > The failures are in TestWebHDFS: > Running org.apache.hadoop.hdfs.web.TestWebHDFS > Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< > FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS > testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 2.478 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) > at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) > at > org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) > testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 0.342 sec <<< ERROR! 
> java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apach
[jira] [Commented] (HDFS-5865) Update OfflineImageViewer document
[ https://issues.apache.org/jira/browse/HDFS-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907795#comment-13907795 ] Hadoop QA commented on HDFS-5865: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630161/HDFS-5865.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6197//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6197//console This message is automatically generated. > Update OfflineImageViewer document > -- > > Key: HDFS-5865 > URL: https://issues.apache.org/jira/browse/HDFS-5865 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.4.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Labels: newbie > Attachments: HDFS-5865.patch > > > OfflineImageViewer is renewed to handle the new format of fsimage by > HDFS-5698 (fsimage in protobuf). > We should document followings: > * The tool can handle the layout version of Hadoop 2.4 and up. (If you want > to handle the older version, you can use OfflineImageViewer of Hadoop 2.3) > * Remove deprecated options such as Delimited and Indented processor. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907793#comment-13907793 ] Haohui Mai commented on HDFS-5935: -- Does {code} if (jqxhr.responseJSON !== undefined && jqxhr.responseJSON.RemoteException !== undefined) { ... {code} work? > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5982) Need to update snapshot manager when applying editlog for deleting a snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907791#comment-13907791 ] Hudson commented on HDFS-5982: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5200 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5200/]) HDFS-5982. Need to update snapshot manager when applying editlog for deleting a snapshottable directory. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570395) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java > Need to update snapshot manager when applying editlog for deleting a > snapshottable directory > > > Key: HDFS-5982 > URL: https://issues.apache.org/jira/browse/HDFS-5982 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jing Zhao >Priority: Critical > Fix For: 2.4.0 > > Attachments: HDFS-5982.000.patch, HDFS-5982.001.patch, > HDFS-5982.001.patch > > > Currently after deleting a snapshottable directory which does not have > snapshots any more, we also remove the directory from the snapshottable > directory list in SnapshotManager. This works fine when handling a delete > request from user. However, when we apply the OP_DELETE editlog, > FSDirectory#unprotectedDelete(String, long) is called, which does not contain > the "updating snapshot manager" process. This may leave an non-existent inode > id in the snapshottable directory list, and can even lead to FSImage > corruption. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5935) New Namenode UI FS browser should throw smarter error messages
[ https://issues.apache.org/jira/browse/HDFS-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907780#comment-13907780 ] Travis Thompson commented on HDFS-5935: --- The reason I didn't combine them is because I was worried it would throw an error checking for {{jqxhr.responseJSON.RemoteException}} if {{jqxhr.responseJSON}} is undefined. If you don't think this is something to worry about, I can combine them. > New Namenode UI FS browser should throw smarter error messages > -- > > Key: HDFS-5935 > URL: https://issues.apache.org/jira/browse/HDFS-5935 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Travis Thompson >Assignee: Travis Thompson >Priority: Minor > Attachments: HDFS-5935-1.patch, HDFS-5935-2.patch, HDFS-5935-3.patch > > > When browsing using the new FS browser in the namenode, if I try to browse a > folder that I don't have permission to view, it throws the error: > {noformat} > Failed to retreive data from /webhdfs/v1/system?op=LISTSTATUS, cause: > Forbidden > WebHDFS might be disabled. WebHDFS is required to browse the filesystem. > {noformat} > The reason I'm not allowed to see /system is because I don't have permission, > not because WebHDFS is disabled. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
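Because && short-circuits, the right-hand operand is never evaluated once the left-hand operand is false, so the combined check cannot dereference an undefined responseJSON; that is what the jsfiddle above demonstrates. The actual patch is JavaScript; the self-contained Java sketch below only illustrates the same short-circuit guarantee, and its class and field names are made up for illustration.
{code}
// Demonstrates the short-circuit guard pattern discussed above: the second
// operand of && is only evaluated after the first has established non-null.
public class ShortCircuitGuard {
  static class RemoteException { String message = "Permission denied"; }
  static class Response { RemoteException remoteException; }

  static String describe(Response r) {
    // Safe: r.remoteException is only read after r != null has been checked.
    if (r != null && r.remoteException != null) {
      return r.remoteException.message;
    }
    return "WebHDFS might be disabled.";
  }

  public static void main(String[] args) {
    System.out.println(describe(null));        // falls through, no NPE
    Response withError = new Response();
    withError.remoteException = new RemoteException();
    System.out.println(describe(withError));   // "Permission denied"
  }
}
{code}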
[jira] [Updated] (HDFS-5982) Need to update snapshot manager when applying editlog for deleting a snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5982: Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) Thanks for the review, Chris! I've committed this to trunk, branch-2 and branch-2.4.0. > Need to update snapshot manager when applying editlog for deleting a > snapshottable directory > > > Key: HDFS-5982 > URL: https://issues.apache.org/jira/browse/HDFS-5982 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jing Zhao >Priority: Critical > Fix For: 2.4.0 > > Attachments: HDFS-5982.000.patch, HDFS-5982.001.patch, > HDFS-5982.001.patch > > > Currently after deleting a snapshottable directory which does not have > snapshots any more, we also remove the directory from the snapshottable > directory list in SnapshotManager. This works fine when handling a delete > request from user. However, when we apply the OP_DELETE editlog, > FSDirectory#unprotectedDelete(String, long) is called, which does not contain > the "updating snapshot manager" process. This may leave an non-existent inode > id in the snapshottable directory list, and can even lead to FSImage > corruption. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
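A schematic of the invariant this fix restores: every delete path, whether driven by a client request or by replaying an OP_DELETE edit, must also drop snapshottable directories under the deleted subtree from the snapshot registry, or the registry keeps ids for inodes that no longer exist. The sketch below uses invented types rather than real HDFS classes; it only illustrates the rule, not the committed patch.
{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: a registry of snapshottable directory paths that must be kept in
// sync with subtree deletions on every delete path.
public class SnapshottableRegistrySketch {
  static final Set<String> snapshottableDirs = new HashSet<>();

  // Collect registered snapshottable dirs that live under the deleted subtree.
  static List<String> collectUnder(String deletedPath) {
    List<String> hits = new ArrayList<>();
    for (String dir : snapshottableDirs) {
      if (dir.equals(deletedPath) || dir.startsWith(deletedPath + "/")) {
        hits.add(dir);
      }
    }
    return hits;
  }

  // Shared by both the client-facing delete and the editlog-replay delete.
  static void deleteSubtree(String path) {
    List<String> stale = collectUnder(path);
    snapshottableDirs.removeAll(stale); // the step the editlog path was missing
    // ... actual namespace removal would happen here ...
  }

  public static void main(String[] args) {
    snapshottableDirs.add("/user/foo/snapdir");
    deleteSubtree("/user/foo");                       // simulates replaying OP_DELETE
    System.out.println(snapshottableDirs.isEmpty());  // true: no dangling entry
  }
}
{code}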
[jira] [Commented] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907766#comment-13907766 ] Haohui Mai commented on HDFS-5981: -- HDFS-5991 is tracking the failure of {{TestLoadGenerator}}. The failure of {{TestCacheDirectives}} is unrelated. > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
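For readers following the malformed-XML description above, the usual defence against mismatched section tags is to track open sections on a stack so that a close must match the most recent open. The sketch below illustrates only that well-formedness rule; it is not the PBImageXmlWriter patch, and the section names are taken from the issue description.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Tiny section writer that refuses to emit a closing tag that does not match
// the most recently opened section, which is how mismatched closes surface.
public class BalancedSectionWriter {
  private final StringBuilder out = new StringBuilder();
  private final Deque<String> open = new ArrayDeque<>();

  void openSection(String name) {
    open.push(name);
    out.append('<').append(name).append('>');
  }

  void closeSection(String name) {
    if (open.isEmpty() || !open.peek().equals(name)) {
      throw new IllegalStateException("closing " + name + " but open section is "
          + (open.isEmpty() ? "none" : open.peek()));
    }
    open.pop();
    out.append("</").append(name).append('>');
  }

  public static void main(String[] args) {
    BalancedSectionWriter w = new BalancedSectionWriter();
    w.openSection("NameSection");
    w.closeSection("NameSection");            // fine: tags balance
    w.openSection("SnapshotDiffSection");
    w.closeSection("INodeReferenceSection");  // throws: would produce malformed XML
  }
}
{code}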
[jira] [Updated] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5991: - Attachment: HDFS-5991.000.patch > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: HDFS-5991.000.patch, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5951) Provide diagnosis information in the Web UI
[ https://issues.apache.org/jira/browse/HDFS-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907764#comment-13907764 ] Travis Thompson commented on HDFS-5951: --- I have to agree with [~sureshms], especially since the WebUI is hitting the JMX REST API, which I think is a fairly common way to monitor these things with other tools like Nagios. To me, it seems more like a convenience to expose useful diagnosis information, like the missing-files message, rather than an attempt to replace things like Nagios checks. > Provide diagnosis information in the Web UI > --- > > Key: HDFS-5951 > URL: https://issues.apache.org/jira/browse/HDFS-5951 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-5951.000.patch, diagnosis-failure.png, > diagnosis-succeed.png > > > HDFS should provide operation statistics in its UI. It can go one step > further by leveraging the information to diagnose common problems. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5991: - Status: Patch Available (was: Open) > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: HDFS-5991.000.patch, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907763#comment-13907763 ] Hadoop QA commented on HDFS-5981: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630158/HDFS-5981.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6196//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6196//console This message is automatically generated. > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907757#comment-13907757 ] Jing Zhao commented on HDFS-5988: - Hi Andrew, I think you're right: getLayoutVersion() returns the layout version of the old fsimage, but we should update the current inode map any way. Thanks for the fix. +1 for the patch. > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
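A schematic of the bug pattern Jing describes: the loader for the old fsimage format gated its inode-map update on the layout version of the image being read, so the check never fired for pre-feature images, the map stayed incomplete, and the next PB-format save produced a broken image. None of the identifiers below are real HDFS names; the sketch only contrasts the version-gated update with the unconditional one in the fix.
{code}
import java.util.HashMap;
import java.util.Map;

// Schematic only: Hadoop-style layout versions are negative, and a feature is
// "supported" when the image's version is <= the feature's version. The
// constants and method names below are invented for illustration.
public class OldImageLoaderSketch {
  static final int IMAGE_LAYOUT_VERSION = -40;  // pretend old, pre-feature image
  static final int ADD_INODE_ID_VERSION = -45;  // pretend version introducing inode ids

  static final Map<Long, String> inodeMap = new HashMap<>();

  static void loadINodeBuggy(long id, String name) {
    if (IMAGE_LAYOUT_VERSION <= ADD_INODE_ID_VERSION) { // false for old images
      inodeMap.put(id, name);                           // never reached on upgrade
    }
  }

  static void loadINodeFixed(long id, String name) {
    inodeMap.put(id, name); // always track the inode in the current namespace
  }

  public static void main(String[] args) {
    loadINodeBuggy(16386L, "/user");
    System.out.println("buggy loader tracked " + inodeMap.size() + " inodes"); // 0
    loadINodeFixed(16386L, "/user");
    System.out.println("fixed loader tracked " + inodeMap.size() + " inodes"); // 1
  }
}
{code}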
[jira] [Updated] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5991: Attachment: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai reassigned HDFS-5991: Assignee: Haohui Mai > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Haohui Mai > Attachments: > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator-output.txt, > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.txt > > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907752#comment-13907752 ] Hadoop QA commented on HDFS-5981: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630155/HDFS-5981.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6195//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6195//console This message is automatically generated. > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907751#comment-13907751 ] Akira AJISAKA commented on HDFS-5991: - I could reproduce this failure. I'll attach the logs. > TestLoadGenerator#testLoadGenerator fails on trunk > -- > > Key: HDFS-5991 > URL: https://issues.apache.org/jira/browse/HDFS-5991 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira AJISAKA > > From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ > {code} > java.io.IOException: Stream closed > at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) > at java.io.BufferedReader.readLine(BufferedReader.java:292) > at java.io.BufferedReader.readLine(BufferedReader.java:362) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) > at > org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) > at > org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5924) Utilize OOB upgrade message processing for writes
[ https://issues.apache.org/jira/browse/HDFS-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907748#comment-13907748 ] Brandon Li commented on HDFS-5924: -- {quote}This is no worse than the current behavior.{quote} The application could experience a much higher write failure rate during a datanode upgrade. For example, it is now more likely that all datanodes in a pipeline become inaccessible at once. Shutting down each datanode only after the previously shut-down datanode is back up might minimize write failures, but it would increase the total upgrade time. Recall there was some discussion that, after a datanode sends the OOB ack, its shutdown could be paused until the client agrees to let it go; I don't remember why we did not take that approach. > Utilize OOB upgrade message processing for writes > - > > Key: HDFS-5924 > URL: https://issues.apache.org/jira/browse/HDFS-5924 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, ha, hdfs-client, namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-5924_RBW_RECOVERY.patch, > HDFS-5924_RBW_RECOVERY.patch > > > After HDFS-5585 and HDFS-5583, clients and datanodes can coordinate > shutdown-restart in order to minimize failures or locality loss. > In this jira, the HDFS client is made aware of the restart OOB ack and performs > special write pipeline recovery. The datanode is also modified to load marked RBW > replicas as RBW instead of RWR as long as the restart did not take long. > For clients, it considers doing this kind of recovery only when there is only > one node left in the pipeline or the restarting node is a local datanode. > For both clients and datanodes, the timeout or expiration is configurable, > meaning this feature can be turned off by setting timeout variables to 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5991) TestLoadGenerator#testLoadGenerator fails on trunk
Akira AJISAKA created HDFS-5991: --- Summary: TestLoadGenerator#testLoadGenerator fails on trunk Key: HDFS-5991 URL: https://issues.apache.org/jira/browse/HDFS-5991 Project: Hadoop HDFS Issue Type: Bug Reporter: Akira AJISAKA >From https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ {code} java.io.IOException: Stream closed at java.io.BufferedReader.ensureOpen(BufferedReader.java:97) at java.io.BufferedReader.readLine(BufferedReader.java:292) at java.io.BufferedReader.readLine(BufferedReader.java:362) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.loadScriptFile(LoadGenerator.java:511) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.init(LoadGenerator.java:418) at org.apache.hadoop.fs.loadGenerator.LoadGenerator.run(LoadGenerator.java:324) at org.apache.hadoop.fs.loadGenerator.TestLoadGenerator.testLoadGenerator(TestLoadGenerator.java:231) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades
[ https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5535: - Attachment: h5535_20140220-1554.patch h5535_20140220-1554.patch: try again after fixing some tests and warnings. > Umbrella jira for improved HDFS rolling upgrades > > > Key: HDFS-5535 > URL: https://issues.apache.org/jira/browse/HDFS-5535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, ha, hdfs-client, namenode >Affects Versions: 3.0.0, 2.2.0 >Reporter: Nathan Roberts > Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, > h5535_20140219.patch, h5535_20140220-1554.patch > > > In order to roll a new HDFS release through a large cluster quickly and > safely, a few enhancements are needed in HDFS. An initial High level design > document will be attached to this jira, and sub-jiras will itemize the > individual tasks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
[ https://issues.apache.org/jira/browse/HDFS-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907734#comment-13907734 ] Jing Zhao commented on HDFS-5989: - I've also run into the same test failure. The cause of the failure on my machine is that I had ACLs enabled on my MacBook; after disabling them the test passed. In the meantime, though, do we want to loosen the check a little for the unit test? > merge of HDFS-4685 to trunk introduced trunk test failure > - > > Key: HDFS-5989 > URL: https://issues.apache.org/jira/browse/HDFS-5989 > Project: Hadoop HDFS > Issue Type: Bug > Environment: CentOS release 6.5 (Final) > cpe:/o:centos:linux:6:GA >Reporter: Yongjun Zhang > > HI, > I'm seeing trunk branch test failure locally (centOs6) today. And I > identified it's this commit that caused the failure. > Author: Chris Nauroth 2014-02-19 10:34:52 > Committer: Chris Nauroth 2014-02-19 10:34:52 > Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA > handling of include/exclude node-lists to be available across RM failover by > making using of a remote configuration-provider. Contributed by Xuan Gong.) > Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor > whitespace difference in FSImageSerialization.java in preparation for trunk > merge.) > Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default > queue properties to Fair Scheduler documentation (Naren Koneru via Sandy > Ryza)) > Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, > testv4, testv7 > Follows: testv5 > Precedes: > Merge HDFS-4685 to trunk. > > git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 > 13f79535-47bb-0310-9956-ffa450edef68 > I'm not sure whether other folks are seeing the same, or maybe related to my > environment. But prior to chis change, I don't see this problem. > The failures are in TestWebHDFS: > Running org.apache.hadoop.hdfs.web.TestWebHDFS > Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< > FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS > testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 2.478 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) 
> at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) > at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) > at > org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) > testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: > 0.342 sec <<< ERROR! > java.lang.IllegalArgumentException: length != > 10(unixSymbolicPermission=drwxrwxr-x.) > at > org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) > at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) >
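The "length != 10" failure quoted above happens because FsPermission.valueOf expects the 10-character symbolic mode string, but on systems where ls appends an ACL or SELinux marker the string becomes 11 characters (e.g. "drwxrwxr-x."). Below is a minimal, hedged sketch of the kind of tolerance the parser would need; the class and method names are illustrative only and this is not necessarily how the issue was eventually fixed in Hadoop.
{code}
// Hedged sketch, not the committed fix: tolerate the extra ACL/SELinux marker
// ('+', '.', or '@') that some ls implementations append to the 10-character
// symbolic mode string before handing it to FsPermission.valueOf.
class SymbolicPermission {
  static String stripMarker(String unixSymbolicPermission) {
    if (unixSymbolicPermission.length() == 11) {
      char last = unixSymbolicPermission.charAt(10);
      if (last == '+' || last == '.' || last == '@') {
        return unixSymbolicPermission.substring(0, 10); // "drwxrwxr-x." -> "drwxrwxr-x"
      }
    }
    return unixSymbolicPermission;
  }
}
{code}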
[jira] [Commented] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907729#comment-13907729 ] Hudson commented on HDFS-5944: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5199 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5199/]) HDFS-5944. LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ and cause SecondaryNameNode failed do checkpoint. Contributed by Yunjiong Zhao (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570366) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestLeaseManager.java > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Fix For: 1.3.0, 2.4.0 > > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A open file /XXX/20140206/04_30/_SUCCESS.slc.log for write. > And Client A continue refresh it's lease. > Client B deleted /XXX/20140206/04_30/ > Client C open file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then secondaryNameNode try to do checkpoint and failed due to failed to > delete lease hold by Client A when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here when prefix is /XXX/20140206/04_30/, and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srcllen) is '_'. > The fix is simple, I'll upload patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
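The snippet quoted in the description shows where the matching goes wrong: when the prefix already ends with a separator ("/a/b/"), p.charAt(srclen) lands on the first character of the child name ('_' in the example), so the lease is never collected and removed. A minimal sketch of the kind of guard the fix needs is below; the class, method, and variable names are illustrative and this is not necessarily the committed HDFS-5944 patch.
{code}
import org.apache.hadoop.fs.Path;

// Hedged sketch (not the committed patch): match paths under 'prefix'
// while tolerating a prefix that already ends with a separator, e.g. "/a/b/".
class LeasePrefixCheck {
  static boolean isUnderPrefix(String p, String prefix) {
    if (!p.startsWith(prefix)) {
      return false;
    }
    int srclen = prefix.length();
    // A prefix ending in '/' already guarantees p is inside that directory.
    if (prefix.charAt(srclen - 1) == Path.SEPARATOR_CHAR) {
      return true;
    }
    // Otherwise require an exact match or a separator right after the prefix,
    // so that "/a/bc" is not treated as being under "/a/b".
    return p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR;
  }
}
{code}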
[jira] [Commented] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907731#comment-13907731 ] Chris Nauroth commented on HDFS-5957: - bq. Chris Nauroth: mmap() does take up physical memory, assuming those pages are mapped into RAM and are not disk-resident. Yes, most definitely. I think Colin was trying to clarify that the initial mmap call dings virtual memory: call mmap for a 1 MB file and you'll immediately see virtual memory increase by 1 MB, but not physical memory. Certainly as the pages get accessed and mapped in, we'll start to consume physical memory. bq. For small 200Gb data-sets (~1.4x tasks per container), ZCR does give a perf boost because we get to use HADOOP-10047 instead of shuffling it between byte[] buffers for decompression. Thanks, that clarifies why zero-copy read was still useful. It sounds like you really do need a deterministic way to trigger the {{munmap}} calls, i.e. LRU caching or no caching at all described above. > Provide support for different mmap cache retention policies in > ShortCircuitCache. > - > > Key: HDFS-5957 > URL: https://issues.apache.org/jira/browse/HDFS-5957 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Chris Nauroth > > Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by > multiple reads of the same block or by multiple threads. The eventual > {{munmap}} executes on a background thread after an expiration period. Some > client usage patterns would prefer strict bounds on this cache and > deterministic cleanup by calling {{munmap}}. This issue proposes additional > support for different caching policies that better fit these usage patterns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
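To make the virtual-versus-physical distinction in the comment above concrete, here is a small, self-contained illustration (not HDFS code, and only an assumption about typical Linux behavior): mapping a file grows the process's virtual size immediately, while resident memory grows only as the mapped pages are actually touched, which is what a physical-memory check ends up counting.
{code}
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustration only: map a file, then touch each page. Virtual size grows at
// map() time; RSS grows as pages are faulted in (watch /proc/<pid>/status while it runs).
class MmapTouch {
  public static void main(String[] args) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile(args[0], "r");
         FileChannel ch = raf.getChannel()) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      // At this point virtual memory has grown by ch.size(); RSS has not.
      long sum = 0;
      for (int i = 0; i < buf.limit(); i += 4096) { // assume 4 KB pages
        sum += buf.get(i); // touching each page faults it in, so RSS grows
      }
      System.out.println(sum);
    }
  }
}
{code}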
[jira] [Commented] (HDFS-5982) Need to update snapshot manager when applying editlog for deleting a snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907724#comment-13907724 ] Jing Zhao commented on HDFS-5982: - The failed test should be unrelated. I will commit the patch shortly. > Need to update snapshot manager when applying editlog for deleting a > snapshottable directory > > > Key: HDFS-5982 > URL: https://issues.apache.org/jira/browse/HDFS-5982 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-5982.000.patch, HDFS-5982.001.patch, > HDFS-5982.001.patch > > > Currently after deleting a snapshottable directory which does not have > snapshots any more, we also remove the directory from the snapshottable > directory list in SnapshotManager. This works fine when handling a delete > request from user. However, when we apply the OP_DELETE editlog, > FSDirectory#unprotectedDelete(String, long) is called, which does not contain > the "updating snapshot manager" process. This may leave an non-existent inode > id in the snapshottable directory list, and can even lead to FSImage > corruption. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5982) Need to update snapshot manager when applying editlog for deleting a snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907708#comment-13907708 ] Hadoop QA commented on HDFS-5982: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630107/HDFS-5982.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.loadGenerator.TestLoadGenerator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6194//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6194//console This message is automatically generated. > Need to update snapshot manager when applying editlog for deleting a > snapshottable directory > > > Key: HDFS-5982 > URL: https://issues.apache.org/jira/browse/HDFS-5982 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-5982.000.patch, HDFS-5982.001.patch, > HDFS-5982.001.patch > > > Currently after deleting a snapshottable directory which does not have > snapshots any more, we also remove the directory from the snapshottable > directory list in SnapshotManager. This works fine when handling a delete > request from user. However, when we apply the OP_DELETE editlog, > FSDirectory#unprotectedDelete(String, long) is called, which does not contain > the "updating snapshot manager" process. This may leave an non-existent inode > id in the snapshottable directory list, and can even lead to FSImage > corruption. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5990) Create options to search files/dirs in OfflineImageViewer
Akira AJISAKA created HDFS-5990: --- Summary: Create options to search files/dirs in OfflineImageViewer Key: HDFS-5990 URL: https://issues.apache.org/jira/browse/HDFS-5990 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Akira AJISAKA Priority: Minor The enhancement of HDFS-5975. I suggest options to search files/dirs in OfflineImageViewer. An example command is as follows: {code} hdfs oiv -i input -o output -p Ls -owner theuser -group supergroup -minSize 1024 -maxSize 1048576 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
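The options proposed above boil down to a predicate evaluated against each inode while the image is walked. The sketch below only illustrates that shape; the class and field names are hypothetical and are not existing OfflineImageViewer code.
{code}
// Hypothetical filter for an "Ls"-style processor: owner/group/size predicate.
class LsFilter {
  final String owner, group;
  final long minSize, maxSize;

  LsFilter(String owner, String group, long minSize, long maxSize) {
    this.owner = owner;
    this.group = group;
    this.minSize = minSize;
    this.maxSize = maxSize;
  }

  // Returns true if the inode's attributes satisfy every supplied option.
  boolean accept(String fileOwner, String fileGroup, long fileSize) {
    return (owner == null || owner.equals(fileOwner))
        && (group == null || group.equals(fileGroup))
        && fileSize >= minSize && fileSize <= maxSize;
  }
}
{code}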
[jira] [Commented] (HDFS-4685) Implementation of ACLs in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907703#comment-13907703 ] Yongjun Zhang commented on HDFS-4685: - Hi [~cnauroth], I'm seeing a trunk branch test failure locally (CentOS 6) today, and I identified the merge of this fix as the cause. I'm not sure whether other people are seeing the same problem, or whether it's specific to my environment; prior to this change I don't see the problem. I filed HDFS-5989 to log the issue, in case it's a real one. Would you please take a look at it? Thanks. > Implementation of ACLs in HDFS > -- > > Key: HDFS-4685 > URL: https://issues.apache.org/jira/browse/HDFS-4685 > Project: Hadoop HDFS > Issue Type: New Feature > Components: hdfs-client, namenode, security >Affects Versions: 1.1.2 >Reporter: Sachin Jose >Assignee: Chris Nauroth > Fix For: 3.0.0 > > Attachments: HDFS-4685.1.patch, HDFS-4685.2.patch, HDFS-4685.3.patch, > HDFS-4685.4.patch, HDFS-ACLs-Design-1.pdf, HDFS-ACLs-Design-2.pdf, > HDFS-ACLs-Design-3.pdf, Test-Plan-for-Extended-Acls-1.pdf > > > Currently hdfs doesn't support Extended file ACL. In unix extended ACL can be > achieved using getfacl and setfacl utilities. Is there anybody working on > this feature ? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907700#comment-13907700 ] Gopal V commented on HDFS-5957: --- [~cnauroth]: mmap() does take up physical memory, assuming those pages are mapped into RAM and are not disk-resident. As long as we're on Linux, it will show up in RSS as well as being marked in the Shared_Clean/Referenced fields in /proc/<pid>/smaps. YARN could do a better job of calculating "How much memory will be free'd up if this process is killed" vs "How much memory does this process use". But that is a completely different issue. When I set the mmap timeout to 1000ms, some of my queries succeeded - mostly the queries which were taking > 50 seconds. But the really fast ORC queries which take ~10 seconds to run still managed to hit around ~50x task failures out of ~3000 map tasks. The perf dip happens because of some of those failures. For small 200Gb data-sets (~1.4x tasks per container), ZCR does give a perf boost because we get to use HADOOP-10047 instead of shuffling it between byte[] buffers for decompression. > Provide support for different mmap cache retention policies in > ShortCircuitCache. > - > > Key: HDFS-5957 > URL: https://issues.apache.org/jira/browse/HDFS-5957 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Chris Nauroth > > Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by > multiple reads of the same block or by multiple threads. The eventual > {{munmap}} executes on a background thread after an expiration period. Some > client usage patterns would prefer strict bounds on this cache and > deterministic cleanup by calling {{munmap}}. This issue proposes additional > support for different caching policies that better fit these usage patterns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
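For reference on the RSS accounting discussed in these comments: a process's resident set can be read straight out of procfs (field 24 of /proc/<pid>/stat is the RSS in pages, per proc(5)) and multiplied by the page size, which is why pages faulted in through the client's mmap'ed blocks count against the container's physical-memory limit. The sketch below is a simplified, hedged illustration of that calculation, not YARN's actual process-tree code, and it assumes a 4 KB page size.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hedged sketch: derive RSS in bytes from /proc/<pid>/stat (Linux only).
class RssFromProcStat {
  static long rssBytes(int pid) throws IOException {
    String stat = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/stat")));
    // Field 2 (the command name) is parenthesized and may contain spaces,
    // so parse from the text after the closing ')'.
    String afterComm = stat.substring(stat.lastIndexOf(')') + 2);
    String[] fields = afterComm.split("\\s+");
    long rssPages = Long.parseLong(fields[21]); // overall field 24 = index 21 after the comm
    return rssPages * 4096L;                    // assumed page size
  }

  public static void main(String[] args) throws IOException {
    System.out.println(rssBytes(Integer.parseInt(args[0])));
  }
}
{code}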
[jira] [Updated] (HDFS-5981) PBImageXmlWriter generates malformed XML
[ https://issues.apache.org/jira/browse/HDFS-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5981: Component/s: tools Target Version/s: 3.0.0, 2.4.0 Affects Version/s: 2.4.0 3.0.0 +1 for the patch. Thank you for adding the test. I'll commit this later today. > PBImageXmlWriter generates malformed XML > > > Key: HDFS-5981 > URL: https://issues.apache.org/jira/browse/HDFS-5981 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 3.0.0, 2.4.0 >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-5981.000.patch, HDFS-5981.001.patch, > HDFS-5981.002.patch, HDFS-5981.003.patch > > > {{PBImageXmlWriter}} outputs malformed XML file because it closes the > {{SnapshotDiffSection}}, {{NameSection}} and {{INodeReferenceSection}} > incorrectly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5989) merge of HDFS-4685 to trunk introduced trunk test failure
Yongjun Zhang created HDFS-5989: --- Summary: merge of HDFS-4685 to trunk introduced trunk test failure Key: HDFS-5989 URL: https://issues.apache.org/jira/browse/HDFS-5989 Project: Hadoop HDFS Issue Type: Bug Environment: CentOS release 6.5 (Final) cpe:/o:centos:linux:6:GA Reporter: Yongjun Zhang HI, I'm seeing trunk branch test failure locally (centOs6) today. And I identified it's this commit that caused the failure. Author: Chris Nauroth 2014-02-19 10:34:52 Committer: Chris Nauroth 2014-02-19 10:34:52 Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA handling of include/exclude node-lists to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong.) Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor whitespace difference in FSImageSerialization.java in preparation for trunk merge.) Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default queue properties to Fair Scheduler documentation (Naren Koneru via Sandy Ryza)) Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, testv4, testv7 Follows: testv5 Precedes: Merge HDFS-4685 to trunk. git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 13f79535-47bb-0310-9956-ffa450edef68 I'm not sure whether other folks are seeing the same, or maybe related to my environment. But prior to chis change, I don't see this problem. The failures are in TestWebHDFS: Running org.apache.hadoop.hdfs.web.TestWebHDFS Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: 2.478 sec <<< ERROR! java.lang.IllegalArgumentException: length != 10(unixSymbolicPermission=drwxrwxr-x.) at org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) at org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229) testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: 0.342 sec <<< ERROR! java.lang.IllegalArgumentException: length != 10(unixSymbolicPermission=drwxrwxr-x.) 
at org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323) at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572) at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540) at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764) at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699) at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:359) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340) at org.apache.hadoop.hdfs.TestDFSClientRetries.namenodeRestartTest(TestDFSClientRetries.java:886) at org.apache.hadoop.hdfs.web.TestWebHDFS.testNamenode
[jira] [Resolved] (HDFS-5987) Fix findbugs warnings in Rolling Upgrade branch
[ https://issues.apache.org/jira/browse/HDFS-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-5987. - Resolution: Fixed Fix Version/s: HDFS-5535 (Rolling upgrades) Target Version/s: HDFS-5535 (Rolling upgrades) Hadoop Flags: Reviewed +1 for the patch. Thanks for fixing these Nicholas! > Fix findbugs warnings in Rolling Upgrade branch > --- > > Key: HDFS-5987 > URL: https://issues.apache.org/jira/browse/HDFS-5987 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, ha, hdfs-client, namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Fix For: HDFS-5535 (Rolling upgrades) > > Attachments: h5987_20140220.patch > > > {noformat} > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.mkdirs() > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.renameTo(File) > RV > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$ReplicaFileDeleteTask.moveFiles() > ignores exceptional return value of java.io.File.mkdirs() > ISInconsistent synchronization of > org.apache.hadoop.hdfs.qjournal.server.Journal.committedTxnId; locked 92% of > time > NPDereference of the result of readLine() without nullcheck in > org.apache.hadoop.hdfs.util.MD5FileUtils.renameMD5File(File, File) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907687#comment-13907687 ] Suresh Srinivas commented on HDFS-5840: --- [~atm], sorry for the late reply. I had lost track of this. {quote} As for handling the partial upgrade failure as you've described, I'd like to add one more RPC call to the JournalManager to initiate analysis/recovery of the storage dirs upon first contact, and then refactor the contents of FSImage#recoverStorageDirs into NNUpgradeUtil just like was done with the other upgrade-related procedures. If this sounds OK to you, I'll go ahead and add that stuff and appropriate tests. {quote} Why not always recover in the preupgrade/upgrade step, instead of adding another RPC? With rolling upgrade getting ready, some of the functionality added there may be useful. For partial failures related to JournalNodes, the choice made in that feature was to make the JournalNode rollback operation idempotent. It looks like a lot of the rolling-upgrade-related code can be leveraged here, since upgrade is a special case of rolling upgrade. Should we explore that? > Follow-up to HDFS-5138 to improve error handling during partial upgrade > failures > > > Key: HDFS-5840 > URL: https://issues.apache.org/jira/browse/HDFS-5840 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 3.0.0 > > Attachments: HDFS-5840.patch > > > Suresh posted some good comment in HDFS-5138 after that patch had already > been committed to trunk. This JIRA is to address those. See the first comment > of this JIRA for the full content of the review. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907681#comment-13907681 ] Chris Nauroth commented on HDFS-5957: - bq. mmap regions don't consume physical memory. They do consume virtual memory. YARN has checks on both physical and virtual memory. I reviewed the logs from the application, and it is in fact the physical memory threshold that was exceeded. YARN calculates this by checking /proc/pid/stat for the RSS and multiplying by page size. The process was well within the virtual memory threshold, so virtual address space was not a problem. {code} containerID=container_1392067467498_0193_01_000282] is running beyond physical memory limits. Current usage: 4.5 GB of 4 GB physical memory used; 9.4 GB of 40 GB virtual memory used. Killing container. Dump of the process-tree for container_1392067467498_0193_01_000282 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 27095 27015 27015 27015 (java) 8640 1190 9959014400 1189585 /grid/0/jdk/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -server -Xmx3584m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/grid/4/cluster/yarn/logs/application_1392067467498_0193/container_1392067467498_0193_01_000282 -Dtez.root.logger=INFO,CLA -Djava.io.tmpdir=/grid/4/cluster/yarn/local/usercache/gopal/appcache/application_1392067467498_0193/container_1392067467498_0193_01_000282/tmp org.apache.hadoop.mapred.YarnTezDagChild 172.19.0.45 38627 container_1392067467498_0193_01_000282 application_1392067467498_0193 1 {code} bq. I don't think YARN should limit the consumption of virtual memory. virtual memory imposes almost no cost on the system and limiting it leads to problems like this one. I don't know the full history behind the virtual memory threshold. I've always assumed that it was in place to guard against virtual address space exhaustion and possible intervention by the OOM killer. So far, the virtual memory threshold doesn't appear to be a factor in this case. bq. It should be possible to limit the consumption of actual memory (not virtual address space) and solve this problem that way. What do you think? Yes, I agree that the issue here is physical memory based on the logs. What we know at this point is that short-circuit reads were counted against the process's RSS, eventually triggering YARN's physical memory check. Then, downtuning {{dfs.client.mmap.cache.timeout.ms}} made the problem go away. I think we can come up with a minimal repro that demonstrates it. Gopal might even already have this. bq. In our tests, mmap provided no performance advantage unless it was reused. If Gopal needs to purge mmaps immediately after using them, the correct thing is simply not to use zero-copy reads. Yes, something doesn't quite jive here. [~gopalv], can you comment on whether or not you're seeing a performance benefit with zero-copy read after down-tuning {{dfs.client.mmap.cache.timeout.ms}} like I advised? If so, then did I miss something in the description of your application's access pattern? > Provide support for different mmap cache retention policies in > ShortCircuitCache. 
> - > > Key: HDFS-5957 > URL: https://issues.apache.org/jira/browse/HDFS-5957 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Chris Nauroth > > Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by > multiple reads of the same block or by multiple threads. The eventual > {{munmap}} executes on a background thread after an expiration period. Some > client usage patterns would prefer strict bounds on this cache and > deterministic cleanup by calling {{munmap}}. This issue proposes additional > support for different caching policies that better fit these usage patterns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5988) Bad fsimage always generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5988: -- Summary: Bad fsimage always generated after upgrade (was: Bad fsimage generated after upgrade) > Bad fsimage always generated after upgrade > -- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HDFS-5988) Bad fsimage generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-5988 started by Andrew Wang. > Bad fsimage generated after upgrade > --- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5988) Bad fsimage generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5988: -- Attachment: hdfs-5988-1.patch {{FSImageFormat#Loader}} was incorrectly basing the decision to populate {{FSDirectory#inodeMap}} on whether the old fsimage layout version supported inodes. We only see the error with the new PB-based image, since it iterates through {{inodeMap}}, while the old fsimage saver would traverse the directory structure instead. I also added a bunch of trace/debug logging to OIV, which was helpful in tracking down this issue. Trust me, a lot of effort for a one-line fix :) > Bad fsimage generated after upgrade > --- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5988) Bad fsimage generated after upgrade
[ https://issues.apache.org/jira/browse/HDFS-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5988: -- Status: Patch Available (was: In Progress) > Bad fsimage generated after upgrade > --- > > Key: HDFS-5988 > URL: https://issues.apache.org/jira/browse/HDFS-5988 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Blocker > Attachments: hdfs-5988-1.patch > > > Internal testing revealed an issue where, after upgrading from an earlier > release, we always fail to save a correct PB-based fsimage (namely, missing > inodes leading to an inconsistent namespace). This results in substantial > data loss, since the upgraded fsimage is broken, as well as the fsimages > generated by saveNamespace and checkpointing. > This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5988) Bad fsimage generated after upgrade
Andrew Wang created HDFS-5988: - Summary: Bad fsimage generated after upgrade Key: HDFS-5988 URL: https://issues.apache.org/jira/browse/HDFS-5988 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Internal testing revealed an issue where, after upgrading from an earlier release, we always fail to save a correct PB-based fsimage (namely, missing inodes leading to an inconsistent namespace). This results in substantial data loss, since the upgraded fsimage is broken, as well as the fsimages generated by saveNamespace and checkpointing. This ended up being a bug in the old fsimage loading code, patch coming. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5944: - Fix Version/s: 1.3.0 > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Fix For: 1.3.0, 2.4.0 > > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A open file /XXX/20140206/04_30/_SUCCESS.slc.log for write. > And Client A continue refresh it's lease. > Client B deleted /XXX/20140206/04_30/ > Client C open file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then secondaryNameNode try to do checkpoint and failed due to failed to > delete lease hold by Client A when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here when prefix is /XXX/20140206/04_30/, and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srcllen) is '_'. > The fix is simple, I'll upload patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907661#comment-13907661 ] Tsz Wo (Nicholas), SZE commented on HDFS-5939: -- Users won't see the log. I think we don't need to add the log statement. This is the same as we won't log when a file is not found. I also suggest not to add the new NoDatanodeException - simply use IOException, put the detail message there and put InvalidTopologyException as the cause. > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5951) Provide diagnosis information in the Web UI
[ https://issues.apache.org/jira/browse/HDFS-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907653#comment-13907653 ] Suresh Srinivas commented on HDFS-5951: --- I think the scope of this jira is probably misunderstood. The proposal is not to do away with the monitoring systems. I frequently see issues that could be flagged by HDFS itself. To name a few: # Configuration issues #* Using /tmp for storage #* Getting the ipc handler count, the number of datanode transceivers, and the ulimit for daemons wrong for a given cluster size, etc. #* JVM heap size misconfigured for the size of the cluster and the number of objects, etc. # Issues that need to be addressed but are sometimes missed even with monitoring in place, because alerts are categorized incorrectly or ignored: #* Checkpoints not happening (I know of instances where missing this resulted in cluster startup times of over 18 hours!) #* Growth in editlog size or in the number of editlog files #* Corruption during fsimage/editlog checkpointing being silently ignored. Some of these are covered in best-practices documents that vendors put out or in Hadoop operations tech talks. Some of them can be covered in this web UI, where the issues described above can be flagged along with information on why they matter and how to address them (a sketch of one such check follows this message). > Provide diagnosis information in the Web UI > --- > > Key: HDFS-5951 > URL: https://issues.apache.org/jira/browse/HDFS-5951 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-5951.000.patch, diagnosis-failure.png, > diagnosis-succeed.png > > > HDFS should provide operation statistics in its UI. it can go one step > further by leveraging the information to diagnose common problems. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
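As a flavour of what such a self-diagnosis check could look like, here is a hedged sketch of the first item in the list above: flagging datanode storage directories configured under /tmp. The class is hypothetical; only the Configuration API and the dfs.datanode.data.dir key are existing Hadoop pieces, and an actual diagnosis module would surface the result in the web UI rather than return a list.
{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;

// Hypothetical diagnosis rule: report any configured data directory under /tmp.
class TmpStorageCheck {
  static List<String> findTmpStorageDirs(Configuration conf) {
    List<String> flagged = new ArrayList<String>();
    for (String dir : conf.getTrimmedStrings("dfs.datanode.data.dir")) {
      if (dir.startsWith("/tmp") || dir.startsWith("file:///tmp")) {
        flagged.add(dir); // data under /tmp can be wiped on reboot
      }
    }
    return flagged;
  }
}
{code}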
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5944: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A open file /XXX/20140206/04_30/_SUCCESS.slc.log for write. > And Client A continue refresh it's lease. > Client B deleted /XXX/20140206/04_30/ > Client C open file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then secondaryNameNode try to do checkpoint and failed due to failed to > delete lease hold by Client A when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here when prefix is /XXX/20140206/04_30/, and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srcllen) is '_'. > The fix is simple, I'll upload patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5944: - Fix Version/s: 2.4.0 > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Fix For: 2.4.0 > > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A open file /XXX/20140206/04_30/_SUCCESS.slc.log for write. > And Client A continue refresh it's lease. > Client B deleted /XXX/20140206/04_30/ > Client C open file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then secondaryNameNode try to do checkpoint and failed due to failed to > delete lease hold by Client A when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here when prefix is /XXX/20140206/04_30/, and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srcllen) is '_'. > The fix is simple, I'll upload patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907644#comment-13907644 ] Brandon Li commented on HDFS-5944: -- I've committed the patch. > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Fix For: 2.4.0 > > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A open file /XXX/20140206/04_30/_SUCCESS.slc.log for write. > And Client A continue refresh it's lease. > Client B deleted /XXX/20140206/04_30/ > Client C open file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then secondaryNameNode try to do checkpoint and failed due to failed to > delete lease hold by Client A when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here when prefix is /XXX/20140206/04_30/, and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srcllen) is '_'. > The fix is simple, I'll upload patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907640#comment-13907640 ] Yongjun Zhang commented on HDFS-5939: - Hi Haohui. At least we got a report from the field that we need to provide a better message so users can quickly tell what's going on. I wonder if a WARN instead of an ERROR would be more acceptable? Thanks. > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch > > > When trying to access hdfs via webhdfs, and when datanode is dead, user will > see an exception below without any clue that it's caused by dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > Need to fix the report to give user hint about dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HDFS-5977) FSImageFormatPBINode does not respect "-renameReserved" upgrade flag
[ https://issues.apache.org/jira/browse/HDFS-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-5977. -- Resolution: Later > FSImageFormatPBINode does not respect "-renameReserved" upgrade flag > > > Key: HDFS-5977 > URL: https://issues.apache.org/jira/browse/HDFS-5977 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang > Labels: protobuf > > HDFS-5709 added a new upgrade flag "-renameReserved" which can be used to > automatically rename reserved paths like "/.reserved" encountered during > upgrade. The new protobuf loading code does not have a similar facility, so > future reserved paths cannot be automatically renamed via "-renameReserved". -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5987) Fix findbugs warnings in Rolling Upgrade branch
[ https://issues.apache.org/jira/browse/HDFS-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5987: - Attachment: h5987_20140220.patch h5987_20140220.patch: fixes the findbugs warning and adds more cases to TestRollingUpgrade.testRollback(). > Fix findbugs warnings in Rolling Upgrade branch > --- > > Key: HDFS-5987 > URL: https://issues.apache.org/jira/browse/HDFS-5987 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, ha, hdfs-client, namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Attachments: h5987_20140220.patch > > > {noformat} > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.mkdirs() > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.renameTo(File) > RV > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$ReplicaFileDeleteTask.moveFiles() > ignores exceptional return value of java.io.File.mkdirs() > ISInconsistent synchronization of > org.apache.hadoop.hdfs.qjournal.server.Journal.committedTxnId; locked 92% of > time > NPDereference of the result of readLine() without nullcheck in > org.apache.hadoop.hdfs.util.MD5FileUtils.renameMD5File(File, File) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5986) Capture the number of blocks pending deletion on namenode webUI
[ https://issues.apache.org/jira/browse/HDFS-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907636#comment-13907636 ] Suresh Srinivas commented on HDFS-5986: --- Yes. It is the PendingDeletionBlocksCount from invalidateBlocks. I like what @atm suggested as well. I do not think there is a metric corresponding to this. > Capture the number of blocks pending deletion on namenode webUI > --- > > Key: HDFS-5986 > URL: https://issues.apache.org/jira/browse/HDFS-5986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Suresh Srinivas > > When a directory that has large number of directories and files are deleted, > the namespace deletes the corresponding inodes immediately. However it is > hard to to know when the invalidated blocks are actually deleted on the > datanodes, which could take a while. > I propose adding on namenode webUI, along with under replicated blocks, the > number of blocks that are pending deletion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5987) Fix findbugs warnings in Rolling Upgrade branch
[ https://issues.apache.org/jira/browse/HDFS-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5987: - Description: {noformat} RV org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) ignores exceptional return value of java.io.File.mkdirs() RV org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) ignores exceptional return value of java.io.File.renameTo(File) RV org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$ReplicaFileDeleteTask.moveFiles() ignores exceptional return value of java.io.File.mkdirs() IS Inconsistent synchronization of org.apache.hadoop.hdfs.qjournal.server.Journal.committedTxnId; locked 92% of time NP Dereference of the result of readLine() without nullcheck in org.apache.hadoop.hdfs.util.MD5FileUtils.renameMD5File(File, File) {noformat} > Fix findbugs warnings in Rolling Upgrade branch > --- > > Key: HDFS-5987 > URL: https://issues.apache.org/jira/browse/HDFS-5987 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, ha, hdfs-client, namenode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > > {noformat} > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.mkdirs() > RV > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.restoreBlockFilesFromTrash(File) > ignores exceptional return value of java.io.File.renameTo(File) > RV > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$ReplicaFileDeleteTask.moveFiles() > ignores exceptional return value of java.io.File.mkdirs() > ISInconsistent synchronization of > org.apache.hadoop.hdfs.qjournal.server.Journal.committedTxnId; locked 92% of > time > NPDereference of the result of readLine() without nullcheck in > org.apache.hadoop.hdfs.util.MD5FileUtils.renameMD5File(File, File) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
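The RV warnings listed above all follow the same findbugs pattern: the boolean result of java.io.File.mkdirs() or File.renameTo() is dropped, so a failed filesystem operation goes unnoticed. Below is a hedged sketch of the usual remedy; it is illustrative only and not the attached h5987 patch.
{code}
import java.io.File;
import java.io.IOException;

// Generic remedy for the "ignores exceptional return value" findbugs pattern:
// check the boolean result and fail loudly instead of silently continuing.
class SafeFileOps {
  static void mkdirsOrThrow(File dir) throws IOException {
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IOException("Failed to create directory " + dir);
    }
  }

  static void renameOrThrow(File from, File to) throws IOException {
    if (!from.renameTo(to)) {
      throw new IOException("Failed to rename " + from + " to " + to);
    }
  }
}
{code}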
[jira] [Commented] (HDFS-5939) WebHdfs returns misleading error code and logs nothing if trying to create a file with no DNs in cluster
[ https://issues.apache.org/jira/browse/HDFS-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907615#comment-13907615 ] Haohui Mai commented on HDFS-5939: -- bq. The case reported in this bug is about no datanode running, which indicates an unhealthy cluster and definitely needs to catch the operator's attention. So I think it makes sense to log a message in the server log. Do you still think we don't need to log an error there? It could save the operator time investigating the problem. Personally I think it is overkill. Note that if this happens, it means that either (1) all datanodes are dead, or (2) at least one block in HDFS is missing (i.e., no datanode can serve it). Both the web UI and the monitoring applications (e.g., Ambari / CDH) would catch it much earlier, before the operator looks into the log. The log has little value since it cannot flag the error in the first place, nor does it provide sufficient information to reproduce the error (in this case only the client can reproduce it in a reliable way). > WebHdfs returns misleading error code and logs nothing if trying to create a > file with no DNs in cluster > > > Key: HDFS-5939 > URL: https://issues.apache.org/jira/browse/HDFS-5939 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-5939.001.patch, HDFS-5939.002.patch > > > When trying to access hdfs via webhdfs while the datanode is dead, the user will > see the exception below without any clue that it's caused by a dead datanode: > $ curl -i -X PUT > ".../webhdfs/v1/t1?op=CREATE&user.name=&overwrite=false" > ... > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"n > must be positive"}} > We need to fix the report to give the user a hint about the dead datanode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
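For readers wondering where the opaque "n must be positive" message comes from: it is the standard IllegalArgumentException thrown by java.util.Random.nextInt(int) when its argument is zero (older JDKs word it exactly this way), which can happen when code picks a random datanode from an empty list. The sketch below only illustrates that failure mode and a clearer pre-check; the class and method names are invented for the example and are not the actual WebHDFS code path.
{noformat}
import java.io.IOException;
import java.util.List;
import java.util.Random;

public class DatanodeChooserSketch {
  private static final Random RAND = new Random();

  // Misleading: with an empty list, nextInt(0) throws
  // IllegalArgumentException ("n must be positive" on older JDKs),
  // which says nothing about the real problem (no live datanodes).
  static String chooseMisleading(List<String> liveDatanodes) {
    return liveDatanodes.get(RAND.nextInt(liveDatanodes.size()));
  }

  // Clearer: fail with a message that points at the actual condition.
  static String chooseWithCheck(List<String> liveDatanodes) throws IOException {
    if (liveDatanodes.isEmpty()) {
      throw new IOException("Failed to choose a datanode: no live datanodes in the cluster");
    }
    return liveDatanodes.get(RAND.nextInt(liveDatanodes.size()));
  }
}
{noformat}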
[jira] [Created] (HDFS-5987) Fix findbugs warnings in Rolling Upgrade branch
Tsz Wo (Nicholas), SZE created HDFS-5987: Summary: Fix findbugs warnings in Rolling Upgrade branch Key: HDFS-5987 URL: https://issues.apache.org/jira/browse/HDFS-5987 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5986) Capture the number of blocks pending deletion on namenode webUI
[ https://issues.apache.org/jira/browse/HDFS-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-5986: - Issue Type: Improvement (was: Bug) Seems like a decent idea to me. We should expose this as a metric as well, if not also in the NN web UI. > Capture the number of blocks pending deletion on namenode webUI > --- > > Key: HDFS-5986 > URL: https://issues.apache.org/jira/browse/HDFS-5986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Suresh Srinivas > > When a directory that has a large number of directories and files is deleted, > the namespace deletes the corresponding inodes immediately. However it is > hard to know when the invalidated blocks are actually deleted on the > datanodes, which could take a while. > I propose adding on the namenode webUI, along with under-replicated blocks, the > number of blocks that are pending deletion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5986) Capture the number of blocks pending deletion on namenode webUI
[ https://issues.apache.org/jira/browse/HDFS-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907627#comment-13907627 ] Kihwal Lee commented on HDFS-5986: -- The jmx on the NN already has {{PendingDeletionBlocks}} and [~wheat9] made the NN webUI render on the client side using the jmx data, so it should be a relatively simple change. Is {{PendingDeletionBlocks}} what we want, or is it something else? > Capture the number of blocks pending deletion on namenode webUI > --- > > Key: HDFS-5986 > URL: https://issues.apache.org/jira/browse/HDFS-5986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Suresh Srinivas > > When a directory that has a large number of directories and files is deleted, > the namespace deletes the corresponding inodes immediately. However it is > hard to know when the invalidated blocks are actually deleted on the > datanodes, which could take a while. > I propose adding on the namenode webUI, along with under-replicated blocks, the > number of blocks that are pending deletion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
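As a quick way to see the value Kihwal mentions, the NN JMX servlet can be queried over HTTP. The sketch below assumes the default NameNode HTTP port (50070) and that the counter is published on the FSNamesystemState bean; both are assumptions for illustration and should be checked against the running cluster.
{noformat}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PendingDeletionBlocksProbe {
  public static void main(String[] args) throws Exception {
    // Assumption: NN HTTP server on port 50070 and PendingDeletionBlocks
    // exposed under the FSNamesystemState bean; adjust for your cluster.
    String nn = args.length > 0 ? args[0] : "localhost:50070";
    URL url = new URL("http://" + nn
        + "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState");
    StringBuilder json = new StringBuilder();
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        json.append(line);
      }
    } finally {
      in.close();
    }
    // Naive extraction; a real client would use a JSON parser.
    Matcher m = Pattern.compile("\"PendingDeletionBlocks\"\\s*:\\s*(\\d+)").matcher(json);
    System.out.println(m.find()
        ? "PendingDeletionBlocks = " + m.group(1)
        : "PendingDeletionBlocks not found in JMX output");
  }
}
{noformat}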
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5944: - Summary: LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint (was: LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint) > LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and > cause SecondaryNameNode failed do checkpoint > > > Key: HDFS-5944 > URL: https://issues.apache.org/jira/browse/HDFS-5944 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.2.0, 2.2.0 >Reporter: zhaoyunjiong >Assignee: zhaoyunjiong > Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, > HDFS-5944.test.txt, HDFS-5944.trunk.patch > > > In our cluster, we encountered an error like this: > java.io.IOException: saveLeases found path > /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949) > What happened: > Client A opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write, > and Client A continued to refresh its lease. > Client B deleted /XXX/20140206/04_30/ > Client C opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write > Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log > Then the SecondaryNameNode tried to do a checkpoint and failed, because the lease > held by Client A was not removed when Client B deleted /XXX/20140206/04_30/. > The reason is a bug in findLeaseWithPrefixPath: > int srclen = prefix.length(); > if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) { > entries.put(entry.getKey(), entry.getValue()); > } > Here, when prefix is /XXX/20140206/04_30/ and p is > /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_'. > The fix is simple; I'll upload a patch later. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
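For reference, one way to make the prefix test quoted in the description tolerate a trailing separator is to shorten the effective prefix length before the charAt comparison. The sketch below only illustrates that idea against a plain map of path strings; it is not the actual HDFS-5944 patch, and the LeaseManager types are simplified away.
{noformat}
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixPathSketch {
  private static final char SEPARATOR_CHAR = '/';

  // Collect every entry whose path equals the prefix or lives under it.
  // Trimming a trailing '/' from the prefix handles the "/a/b/" case that
  // defeats the original p.charAt(srclen) check.
  static SortedMap<String, String> findLeaseWithPrefixPath(
      String prefix, SortedMap<String, String> leasesByPath) {
    int srclen = prefix.length();
    if (srclen > 1 && prefix.charAt(srclen - 1) == SEPARATOR_CHAR) {
      srclen -= 1;  // treat "/a/b/" the same as "/a/b"
    }
    SortedMap<String, String> entries = new TreeMap<String, String>();
    for (Map.Entry<String, String> entry : leasesByPath.entrySet()) {
      String p = entry.getKey();
      if (!p.startsWith(prefix)) {
        continue;
      }
      if (p.length() == srclen || p.charAt(srclen) == SEPARATOR_CHAR) {
        entries.put(entry.getKey(), entry.getValue());
      }
    }
    return entries;
  }
}
{noformat}
With this adjustment, a prefix of /a/b/ matches /a/b/_SUCCESS.slc.log because the character checked at the shortened length is the separator itself, while unrelated siblings such as /a/bc are still excluded.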
[jira] [Commented] (HDFS-5977) FSImageFormatPBINode does not respect "-renameReserved" upgrade flag
[ https://issues.apache.org/jira/browse/HDFS-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907623#comment-13907623 ] Haohui Mai commented on HDFS-5977: -- Thanks [~andrew.wang] and [~sureshms] for the info. Let me resolve this jira as Later and keep it around. We can reopen it if we need to add another reserved path. > FSImageFormatPBINode does not respect "-renameReserved" upgrade flag > > > Key: HDFS-5977 > URL: https://issues.apache.org/jira/browse/HDFS-5977 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Andrew Wang > Labels: protobuf > > HDFS-5709 added a new upgrade flag "-renameReserved" which can be used to > automatically rename reserved paths like "/.reserved" encountered during > upgrade. The new protobuf loading code does not have a similar facility, so > future reserved paths cannot be automatically renamed via "-renameReserved". -- This message was sent by Atlassian JIRA (v6.1.5#6160)