[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6135: Attachment: HDFS-6135.002.patch

Another option is to completely ignore the future layoutVersion check in the JournalNode. Since the JN no longer decodes the edits (it uses the length field to scan through them), it has some capability to handle edits with a future layoutVersion. However, this capability is limited: if something more significant changes in the future (e.g., the segment-based edits maintenance mechanism), old software will not be able to process edits generated by new software. A better solution may be to split the NameNode layoutVersion into a separate NameNode layoutVersion and JournalNode layoutVersion, just as we did for the NN and the DN.

> In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump
> when rolling back
> --
>
> Key: HDFS-6135
> URL: https://issues.apache.org/jira/browse/HDFS-6135
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Priority: Blocker
> Attachments: HDFS-6135.000.patch, HDFS-6135.001.patch, HDFS-6135.002.patch, HDFS-6135.test.txt
>
> While doing an HDFS upgrade with an HA setup, if the layoutVersion gets changed in
> the upgrade, the rollback may trigger the following exception in the JournalNodes
> (suppose the new software bumped the layoutVersion from -55 to -56):
> {code}
> 14/03/21 01:01:53 FATAL namenode.NameNode: Exception in namenode join
> org.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not check if
> roll back possible for one or more JournalNodes. 1 exceptions thrown:
> Unexpected version of storage directory /grid/1/tmp/journal/mycluster.
> Reported: -56. Expecting = -55.
> at org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:203)
> at org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:156)
> at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:135)
> at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeStorage(JNStorage.java:202)
> at org.apache.hadoop.hdfs.qjournal.server.JNStorage.<init>(JNStorage.java:73)
> at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:142)
> at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:87)
> at org.apache.hadoop.hdfs.qjournal.server.JournalNode.canRollBack(JournalNode.java:304)
> at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.canRollBack(JournalNodeRpcServer.java:228)
> {code}
> Looks like for rollback, a JN running the old software cannot handle the future
> layoutVersion brought by the new software.

-- This message was sent by Atlassian JIRA (v6.2#6252)
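Jing's point about the JN scanning edits via the length field can be illustrated with a small sketch. This is purely illustrative framing (a bare [int length][payload] layout), not the actual HDFS edits wire format: a scanner that only reads the length prefix never interprets the payload, which is why it tolerates records written under a newer layout version, up to a point.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch (not JN code): records framed as [int length][payload]
 * can be walked without decoding the payload at all.
 */
public class LengthPrefixedScan {

  /** Collects each record's payload as opaque bytes; never interprets them. */
  static List<byte[]> scan(byte[] stream) {
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(stream))) {
      List<byte[]> records = new ArrayList<>();
      while (in.available() > 0) {
        int len = in.readInt();        // length field written by the producer
        byte[] payload = new byte[len];
        in.readFully(payload);         // opaque: layout version never consulted
        records.add(payload);
      }
      return records;
    } catch (IOException e) {
      throw new IllegalStateException("truncated or corrupt stream", e);
    }
  }

  public static void main(String[] args) {
    // Two framed records: lengths 1 and 3.
    byte[] stream = {0, 0, 0, 1, 42, 0, 0, 0, 3, 1, 2, 3};
    System.out.println(scan(stream).size()); // prints 2
  }
}
```

The limitation Jing notes falls out directly: as long as only payload contents change across versions, this scan keeps working, but any change to the framing itself (e.g., a segment-based maintenance mechanism) breaks old scanners.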
[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6135: Attachment: HDFS-6135.001.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944706#comment-13944706 ] Hadoop QA commented on HDFS-6135: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636271/HDFS-6135.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.shell.TestHdfsTextCommand org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength org.apache.hadoop.hdfs.server.namenode.TestNameNodeJspHelper org.apache.hadoop.hdfs.server.namenode.TestAclConfigFlag org.apache.hadoop.hdfs.TestClose org.apache.hadoop.hdfs.TestShortCircuitLocalRead org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs org.apache.hadoop.hdfs.server.namenode.TestStorageRestore org.apache.hadoop.hdfs.TestFSInputChecker org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.TestDataTransferProtocol org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting org.apache.hadoop.hdfs.server.namenode.TestFSNamesystemMBean org.apache.hadoop.hdfs.TestDFSClientFailover org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotNameWithInvalidCharacters org.apache.hadoop.tools.TestJMXGet org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.server.namenode.TestFSImage org.apache.hadoop.hdfs.security.TestDelegationToken org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.fs.permission.TestStickyBit org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.server.datanode.TestCachingStrategy org.apache.hadoop.hdfs.server.namenode.snapshot.TestCheckpointsWithSnapshots org.apache.hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks 
org.apache.hadoop.fs.TestFcHdfsCreateMkdir org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.TestAppendDifferentChecksum org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite org.apache.hadoop.hdfs.security.TestDelegationTokenForProxyUser org.apache.hadoop.hdfs.server.namenode.TestSequentialBlockId org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap org.apache.hadoop.hdfs.TestDFSPermission org.apache.hadoop.hdfs.server.namenode.snapshot.TestDisallowModifyROSnapshot org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestListFilesInFileContext org.apache.hadoop.hdfs.server.namenode.TestNameNodeResourceChecker org.apache.hadoop.fs.viewfs.TestViewFsDefaultValue org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
[jira] [Updated] (HDFS-5138) Support HDFS upgrade in HA
[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5138: Attachment: HDFS-5138.branch-2.001.patch

Rebased the patch for branch-2.

> Support HDFS upgrade in HA
> --
>
> Key: HDFS-5138
> URL: https://issues.apache.org/jira/browse/HDFS-5138
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.1.1-beta
> Reporter: Kihwal Lee
> Assignee: Aaron T. Myers
> Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5138.branch-2.001.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, hdfs-5138-branch-2.txt
>
> With HA enabled, the NN won't start with "-upgrade". Since there has been a layout
> version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way
> to get around this was to disable HA and upgrade.
> The NN and the cluster cannot be flipped back to HA until the upgrade is
> finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned
> back on without involving the DNs, things will work, but finalizeUpgrade won't
> work (the NN is in HA and it cannot be in upgrade mode) and the DNs' upgrade
> snapshots won't get removed.
> We will need a different way of doing the layout upgrade and upgrade snapshot.
> I am marking this as a 2.1.1-beta blocker based on feedback from others. If
> there is a reasonable workaround that does not increase the maintenance window
> greatly, we can lower its priority from blocker to critical.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6135: Affects Version/s: (was: 2.4.0) 3.0.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6135: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6135: Attachment: HDFS-6135.000.patch

Uploaded a simple fix that lets the JN understand the corresponding StartupOption, and bypasses the layoutVersion check when creating the Journal object if the StartupOption is Rollback.

-- This message was sent by Atlassian JIRA (v6.2#6252)
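The shape of the fix described in the 000 patch can be sketched roughly as follows. The names here (checkLayoutVersion, the StartupOption enum, the version constant) are illustrative stand-ins, not the actual HDFS-6135 patch code:

```java
/**
 * Hypothetical sketch of the patch idea: skip the strict layout-version
 * check when the node was started with -rollback, since a rollback is
 * precisely the case where the on-disk version is "from the future".
 */
public class LayoutCheck {
  enum StartupOption { REGULAR, ROLLBACK }

  static final int CURRENT_LAYOUT_VERSION = -55;

  /** Throws if the on-disk version is unexpected, unless rolling back. */
  static void checkLayoutVersion(int storedVersion, StartupOption opt) {
    if (opt == StartupOption.ROLLBACK) {
      return; // old software tolerates the future version it is discarding
    }
    if (storedVersion != CURRENT_LAYOUT_VERSION) {
      throw new IllegalStateException("Unexpected version of storage directory."
          + " Reported: " + storedVersion
          + ". Expecting = " + CURRENT_LAYOUT_VERSION + ".");
    }
  }

  public static void main(String[] args) {
    checkLayoutVersion(-56, StartupOption.ROLLBACK); // tolerated
    try {
      checkLayoutVersion(-56, StartupOption.REGULAR);
    } catch (IllegalStateException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```

The REGULAR branch reproduces the "Reported: -56. Expecting = -55." failure from the stack trace above; the ROLLBACK branch is the bypass the patch adds.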
[jira] [Assigned] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao reassigned HDFS-6135: --- Assignee: Jing Zhao -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA
[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944591#comment-13944591 ] Suresh Srinivas commented on HDFS-5138: --- [~acmurthy], I have HDFS-6135 and HDFS-5840 as blockers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5840: -- Priority: Blocker (was: Major)

> Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
> --
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
> Suresh posted some good comments in HDFS-5138 after that patch had already
> been committed to trunk. This JIRA is to address those. See the first comment
> of this JIRA for the full content of the review.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
[ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6135: -- Priority: Blocker (was: Major) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944588#comment-13944588 ] Suresh Srinivas commented on HDFS-5840: --- [~atm], any updates on this? [~jingzhao] found some issues and posted comments on HDFS-5138 as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6143: -- Summary: HftpFileSystem open should throw FileNotFoundException for non-existing paths (was: HftpFileSystem open should through FileNotFoundException for non-existing paths)

> HftpFileSystem open should throw FileNotFoundException for non-existing paths
> -
>
> Key: HDFS-6143
> URL: https://issues.apache.org/jira/browse/HDFS-6143
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Priority: Blocker
> Attachments: HDFS-6143.v01.patch
>
> HftpFileSystem.open incorrectly handles non-existing paths.
> - 'open' does not really open anything, i.e., it does not contact the
> server, and therefore cannot discover FileNotFound; it's deferred until the next
> read. It's counterintuitive and not how the local FS or HDFS work. In POSIX you
> get ENOENT on open.
> [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java]
> is an example of code that's broken because of this.
> - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST
> instead of SC_NOT_FOUND for non-existing paths

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) HftpFileSystem open should through FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944532#comment-13944532 ] Steve Loughran commented on HDFS-6143: --

This is important: everything expects `open(path)` to fail if the path isn't there. (FWIW, it'd be a great optimisation for HTTP filesystems if there were an `open(offset)` operation, since if you look at the logs, it's usually a pair of open & seek.)
# Could you change the code to use {{e.toString()}} and not {{e.getMessage()}}? I know some exceptions have null messages, and others may include more detail in their string operations.
# Tests that match on string error messages will be brittle and increase maintenance costs. Could you use some String.format() operations to create both the error text in the exception and the strings to search for in the exceptions? This would guarantee that a change in the source is reflected in the destination.
Sorry to add more work; it's just that I think we shouldn't copy bad practises of the past when writing new code & tests.

-- This message was sent by Atlassian JIRA (v6.2#6252)
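Steve's second suggestion, deriving both the thrown message and the test's expected string from one shared format, could look like the following sketch. The class, constant, and message text here are made up for illustration, not the HDFS-6143 patch:

```java
/**
 * Hypothetical sketch: one String.format template feeds both the production
 * exception and the test's expectation, so the two cannot drift apart.
 */
public class SharedErrorFormat {
  static final String NOT_FOUND_FMT = "File does not exist: %s";

  /** Single source of truth for the error text. */
  static String notFoundMessage(String path) {
    return String.format(NOT_FOUND_FMT, path);
  }

  /** Production side: throw using the shared format. */
  static void open(String path) throws java.io.FileNotFoundException {
    throw new java.io.FileNotFoundException(notFoundMessage(path));
  }

  public static void main(String[] args) {
    String path = "/no/such/file";
    try {
      open(path);
    } catch (java.io.FileNotFoundException e) {
      // Test side: compare against the same format, not a hand-copied literal.
      // Note e.toString() is never null, unlike e.getMessage().
      System.out.println(e.toString().contains(notFoundMessage(path))); // prints true
    }
  }
}
```

A rename of the path or a rewording of the message now changes both sides at once, which is exactly the maintenance property Steve asks for.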
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944517#comment-13944517 ] Steve Loughran commented on HDFS-6134: -- Alejandro, sorry, what I meant to say is that the PDF refers to other JIRAs; they should be added as links to this JIRA.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: security
> Affects Versions: 2.3.0
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
> Because of privacy and security regulations, for many industries, sensitive
> data at rest must be in encrypted form. For example: the healthcare industry
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can
> be used transparently by any application accessing HDFS via the Hadoop Filesystem
> Java API, the Hadoop libhdfs C library, or the WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with
> different regulation requirements.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6146) retry if cannot report Bad Blocks to namenode
Ding Yuan created HDFS-6146: --- Summary: retry if cannot report Bad Blocks to namenode Key: HDFS-6146 URL: https://issues.apache.org/jira/browse/HDFS-6146 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.2.0 Reporter: Ding Yuan

Line: 255, File: "org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java"
{noformat}
249:    try {
250:      bpNamenode.reportBadBlocks(blocks);
251:    } catch (IOException e){
252:      /* One common reason is that NameNode could be in safe mode.
253:       * Should we keep on retrying in that case?
254:       */
255:      LOG.warn("Failed to report bad block " + block + " to namenode : "
256:          + " Exception", e);
257:    }
{noformat}
The comment seems to suggest this should be retried. A similar case is at:
Line: 1430, File: "org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java"
-- This message was sent by Atlassian JIRA (v6.2#6252)
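A bounded retry around the reportBadBlocks call, as the code comment hints at, might look like the sketch below. The Reporter interface and all names here are stand-ins for illustration, not the actual BPServiceActor API:

```java
import java.io.IOException;

/**
 * Hypothetical retry wrapper for a call like reportBadBlocks, which can
 * fail transiently (e.g., while the NameNode is still in safe mode).
 */
public class RetryReport {
  /** Stand-in for the RPC call; not the real DatanodeProtocol. */
  interface Reporter {
    void reportBadBlocks() throws IOException;
  }

  /** Retries with a fixed delay; gives up after maxAttempts. */
  static boolean reportWithRetry(Reporter r, int maxAttempts, long delayMs)
      throws InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        r.reportBadBlocks();
        return true;                 // reported successfully
      } catch (IOException e) {
        // e.g. NameNode in safe mode: wait and try again, unless out of tries
        if (attempt < maxAttempts) {
          Thread.sleep(delayMs);
        }
      }
    }
    return false;                    // caller decides how to log the give-up
  }

  public static void main(String[] args) throws InterruptedException {
    int[] calls = {0};
    // Fails twice (as if the NN were in safe mode), then succeeds.
    Reporter flaky = () -> {
      if (++calls[0] < 3) throw new IOException("NameNode in safe mode");
    };
    System.out.println(reportWithRetry(flaky, 5, 1)); // prints true
  }
}
```

In real DataNode code the delay and attempt cap would need tuning (or an exponential backoff) so a long-lived safe mode doesn't block the actor thread indefinitely.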
[jira] [Created] (HDFS-6145) Stopping unexpected exception from propagating to avoid serious consequences
Ding Yuan created HDFS-6145:
---

Summary: Stopping unexpected exception from propagating to avoid serious consequences
Key: HDFS-6145
URL: https://issues.apache.org/jira/browse/HDFS-6145
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Ding Yuan

There are a few cases where an exception should never have occurred, but the code simply logged it and let the execution continue. Since these exceptions should never occur, a safer approach may be to terminate execution and stop them from propagating into unexpected consequences.

==
Case 1: Line: 336, File: "org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java"
{noformat}
325:    try {
326:      Quota.Counts counts = cleanSubtree(snapshot, prior, collectedBlocks,
327:          removedINodes, true);
328:      INodeDirectory parent = getParent();
..
..
335:    } catch(QuotaExceededException e) {
336:      LOG.error("BUG: removeSnapshot increases namespace usage.", e);
337:    }
{noformat}
Since this shouldn't happen unless an unexpected bug occurs, should the NN simply stop execution to prevent bad state from propagating?
Similar handling of QuotaExceededException can be found at:
Line: 544, File: "org/apache/hadoop/hdfs/server/namenode/INodeReference.java"
Line: 657, File: "org/apache/hadoop/hdfs/server/namenode/INodeReference.java"
Line: 669, File: "org/apache/hadoop/hdfs/server/namenode/INodeReference.java"
==

==
Case 2: Line: 601, File: "org/apache/hadoop/hdfs/server/namenode/JournalSet.java"
{noformat}
591:  public synchronized RemoteEditLogManifest getEditLogManifest(long fromTxId,
..
595:    for (JournalAndStream j : journals) {
..
598:      try {
599:        allLogs.addAll(fjm.getRemoteEditLogs(fromTxId, forReading, false));
600:      } catch (Throwable t) {
601:        LOG.warn("Cannot list edit logs in " + fjm, t);
602:      }
{noformat}
An exception thrown here will result in some edit log files not being considered and not included in the checkpoint, which may result in data loss.
==

==
Case 3: Line: 4029, File: "org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java"
{noformat}
4010:    try {
4011:      while (fsRunning && shouldNNRmRun) {
4012:        checkAvailableResources();
4013:        if(!nameNodeHasResourcesAvailable()) {
4014:          String lowResourcesMsg = "NameNode low on available disk space. ";
4015:          if (!isInSafeMode()) {
4016:            FSNamesystem.LOG.warn(lowResourcesMsg + "Entering safe mode.");
4017:          } else {
4018:            FSNamesystem.LOG.warn(lowResourcesMsg + "Already in safe mode.");
4019:          }
4020:          enterSafeMode(true);
4021:        }
..
..
4027:      }
4028:    } catch (Exception e) {
4029:      FSNamesystem.LOG.error("Exception in NameNodeResourceMonitor: ", e);
4030:    }
{noformat}
enterSafeMode might throw an exception. If the NN fails to enter safe mode, should the execution simply terminate?
==

-- This message was sent by Atlassian JIRA (v6.2#6252)
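A minimal sketch of the fail-fast alternative the report is asking about (all names are illustrative, not HDFS code): instead of logging an "impossible" exception and continuing, wrap the action so that a violated invariant terminates the operation immediately rather than letting corrupted state propagate.

```java
import java.util.concurrent.Callable;

public class FailFast {
    /**
     * Runs an action whose exceptions indicate a violated invariant
     * (e.g. a QuotaExceededException that "can never happen").
     * Rather than log-and-continue, rethrow as a fatal error; a real
     * daemon might call Runtime.getRuntime().halt(...) here instead.
     */
    static <T> T orAbort(Callable<T> action, String invariant) {
        try {
            return action.call();
        } catch (Exception e) {
            throw new IllegalStateException(
                "BUG: invariant violated: " + invariant, e);
        }
    }
}
```

The trade-off is availability: aborting the NN is drastic, but it guarantees the bug surfaces immediately instead of silently corrupting namespace accounting.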
[jira] [Created] (HDFS-6144) Failing to delete balancer.id in close could prevent balancer from starting the next time
Ding Yuan created HDFS-6144:
---

Summary: Failing to delete balancer.id in close could prevent balancer from starting the next time
Key: HDFS-6144
URL: https://issues.apache.org/jira/browse/HDFS-6144
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Ding Yuan

Line: 215, File: "org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java"
{noformat}
199:  void close() {
212:    try {
213:      fs.delete(BALANCER_ID_PATH, true);
214:    } catch(IOException ioe) {
215:      LOG.warn("Failed to delete " + BALANCER_ID_PATH, ioe);
216:    }
{noformat}
If the FS cannot delete this file for some reason, it will prevent any balancers from running in the future. Should we at least retry in this case (or warn the users and suggest that they manually delete this file)?

-- This message was sent by Atlassian JIRA (v6.2#6252)
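One hedged sketch of the suggested mitigation (illustrative names only; the `Deleter` interface stands in for the `fs.delete(BALANCER_ID_PATH, true)` call): retry the deletion a few times and, on final failure, produce an explicit operator-facing message instead of a quiet warning.

```java
import java.io.IOException;

public class BalancerIdCleanup {
    /** Illustrative stand-in for fs.delete(BALANCER_ID_PATH, true). */
    interface Deleter {
        void delete() throws IOException;
    }

    /**
     * Tries the delete up to maxAttempts times. Returns null on success,
     * or an operator-facing message on final failure so the problem is
     * not silently swallowed.
     */
    static String deleteWithRetry(Deleter d, String path, int maxAttempts) {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                d.delete();
                return null; // success
            } catch (IOException ioe) {
                last = ioe;
            }
        }
        return "Failed to delete " + path + " after " + maxAttempts
                + " attempts (" + last + "); future balancer runs will be"
                + " blocked until it is removed manually.";
    }
}
```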
[jira] [Commented] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944397#comment-13944397 ]

Hadoop QA commented on HDFS-6143:
-

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12636242/HDFS-6143.v01.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6467//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6467//console

This message is automatically generated.

> HftpFileSystem open should throw FileNotFoundException for non-existing
> paths
> ---
>
> Key: HDFS-6143
> URL: https://issues.apache.org/jira/browse/HDFS-6143
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Priority: Blocker
> Attachments: HDFS-6143.v01.patch
>
>
> HftpFileSystem.open incorrectly handles non-existing paths.
> - 'open' does not really open anything, i.e., it does not contact the
> server, and therefore cannot discover FileNotFound; the error is deferred
> until the next read. It's counterintuitive and not how local FS or HDFS work.
> In POSIX you get ENOENT on open.
> [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java]
> is an example of code that's broken because of this.
> - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST
> instead of SC_NOT_FOUND for non-existing paths

-- This message was sent by Atlassian JIRA (v6.2#6252)
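The eager-check behavior the report asks for can be illustrated with a toy sketch (a `Map`-backed fake filesystem, not the actual HftpFileSystem code): `open` verifies existence up front and throws `FileNotFoundException` immediately — the ENOENT-on-open semantics POSIX and HDFS provide — rather than deferring the failure to the first `read`.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.Map;

public class EagerOpen {
    /**
     * Opens a path in a toy Map-backed filesystem. Unlike the deferred
     * behavior described in the report, a missing path fails here, at
     * open() time, rather than on the first read().
     */
    static InputStream open(Map<String, byte[]> fs, String path)
            throws FileNotFoundException {
        byte[] data = fs.get(path);
        if (data == null) {
            throw new FileNotFoundException("File does not exist: " + path);
        }
        return new ByteArrayInputStream(data);
    }
}
```

Callers like LzoInputFormat.getSplits, which probe paths by opening them, rely on exactly this contract.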
[jira] [Commented] (HDFS-6110) adding more slow action log in critical write path
[ https://issues.apache.org/jira/browse/HDFS-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944387#comment-13944387 ]

Liang Xie commented on HDFS-6110:
-

It seems this is not HBase-only; I just saw HDFS-6139. MR applications should also benefit if this patch is in.

> adding more slow action log in critical write path
> --
>
> Key: HDFS-6110
> URL: https://issues.apache.org/jira/browse/HDFS-6110
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.0.0, 2.3.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HDFS-6110-v2.txt, HDFS-6110.txt
>
>
> After digging into an HBase write spike caused by slow buffer IO in our
> cluster, we realized we should add more abnormal-latency warning logs in the
> write flow, so that if other people hit an HLog sync spike, we can see more
> detailed info from the HDFS side at the same time.
> Patch will be uploaded soon.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6139) JobTracker blocked on TIMED_WAITING DFSOutputStream.waitForAckedSeqno() running
[ https://issues.apache.org/jira/browse/HDFS-6139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944386#comment-13944386 ] Liang Xie commented on HDFS-6139: - If HDFS-6110 was in, it should be very helpful for diagnoses. > JobTracker blocked on TIMED_WAITING DFSOutputStream.waitForAckedSeqno() > running > --- > > Key: HDFS-6139 > URL: https://issues.apache.org/jira/browse/HDFS-6139 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.0.0-alpha >Reporter: Muga Nishizawa > > We're using CDH 4.2.1. The following is a part of threaddump on our > JobTracker. > {code} > "IPC Server handler 249 on 8021" daemon prio=10 tid=0x7fce80e2c000 > nid=0x718e in Object.wait() [0x7fc92afe6000] >java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0x7fc95f5ffba8> (a java.util.LinkedList) > at > org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1708) > - locked <0x7fc95f5ffba8> (a java.util.LinkedList) > at > org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:1694) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1778) > - locked <0x7fc95f5ff898> (a > org.apache.hadoop.hdfs.DFSOutputStream) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:66) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:99) > at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3562) > - locked <0x7fc9652787e0> (a org.apache.hadoop.mapred.JobTracker) > at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3475) > at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > 
org.apache.hadoop.ipc.WritableRpcEngine$Server$WritableRpcInvoker.call(WritableRpcEngine.java:474) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689) > ... ... > "Thread-9489990" daemon prio=10 tid=0x7fce6c01b000 nid=0x14e1 runnable > [0x7fc8f38f7000] >java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) > at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) > - locked <0x7fc9631201d0> (a sun.nio.ch.Util$1) > - locked <0x7fc9631201e8> (a > java.util.Collections$UnmodifiableSet) > - locked <0x7fc963120158> (a sun.nio.ch.EPollSelectorImpl) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) > at > org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336) > at > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:158) > at > org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156) > at > org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129) > at > org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:117) > at java.io.FilterInputStream.read(FilterInputStream.java:66) > at java.io.FilterInputStream.read(FilterInputStream.java:66) > at > org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:169) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1105) > at > 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov updated HDFS-6143:
Attachment: HDFS-6143.v01.patch

v01 of the patch for review

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov updated HDFS-6143:
Status: Patch Available (was: Open)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov updated HDFS-6143:
Priority: Blocker (was: Major)
Target Version/s: 2.4.0

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6143) HftpFileSystem open should throw FileNotFoundException for non-existing paths
Gera Shegalov created HDFS-6143:
---

Summary: HftpFileSystem open should throw FileNotFoundException for non-existing paths
Key: HDFS-6143
URL: https://issues.apache.org/jira/browse/HDFS-6143
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov

HftpFileSystem.open incorrectly handles non-existing paths.
- 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that's broken because of this.
- On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths.

-- This message was sent by Atlassian JIRA (v6.2#6252)