[jira] [Created] (HDFS-15464) ViewFsOverloadScheme should work when -fs option pointing to remote cluster without mount links
Uma Maheswara Rao G created HDFS-15464: -- Summary: ViewFsOverloadScheme should work when -fs option pointing to remote cluster without mount links Key: HDFS-15464 URL: https://issues.apache.org/jira/browse/HDFS-15464 Project: Hadoop HDFS Issue Type: Sub-task Components: viewfsOverloadScheme Affects Versions: 3.2.1 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G When users try to connect to a remote cluster from a cluster env where ViewFSOverloadScheme is enabled, fs init expects at least one mount link to be configured in order to succeed. Unfortunately, you might not have configured any mount links for that remote cluster in your current env; you would have configured mount points only for your local clusters. In this case, fs init will fail with "no mount points configured" for the mount table matching that remote cluster URI's authority. One idea is that, when there are no mount links configured, we should just treat that remote cluster as the default cluster; that can be achieved by treating it as the fallback option automatically. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
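A minimal configuration sketch of what the proposed behavior would amount to, assuming the existing ViewFS {{linkFallback}} property and a hypothetical remote authority {{remotens}} (neither is taken from a patch on this issue):

{code:xml}
<!-- Hypothetical example: no explicit mount links exist for the authority
     "remotens", so the proposal is to treat the remote cluster root as the
     fallback instead of failing fs init. The equivalent manual form today: -->
<property>
  <name>fs.viewfs.mounttable.remotens.linkFallback</name>
  <value>hdfs://remotens/</value>
</property>
{code}

With the proposed change, this fallback would be implied automatically whenever the mount table for the target URI's authority has no links.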
[jira] [Commented] (HDFS-14498) LeaseManager can loop forever on the file for which create has failed
[ https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155075#comment-17155075 ] Xiaoqiao He commented on HDFS-14498: Hi [~weichiu] and all watchers, any other review comments here? I would like to wait a couple more days and then commit. [~sodonnell] I just checked all active branches; it seems all of them have the issue, so we may need to backport to them. Fortunately, [^HDFS-14498.002.patch] cherry-picks cleanly locally. Would you like to double-check? > LeaseManager can loop forever on the file for which create has failed > -- > > Key: HDFS-14498 > URL: https://issues.apache.org/jira/browse/HDFS-14498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0 >Reporter: Sergey Shelukhin >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-14498.001.patch, HDFS-14498.002.patch > > > The logs from file creation are long gone due to infinite lease logging, > however it presumably failed... the client who was trying to write this file > is definitely long dead. > The version includes HDFS-4882. > We get this log pattern repeating infinitely: > {noformat} > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease. Holder: > DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard > limit > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. > Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src= > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: > Failed to release lease for file . Committed blocks are waiting to be > minimally replicated. Try again later. > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path > in the lease [Lease. Holder: DFSClient_NONMAPREDUCE_-20898906_61, > pending creates: 1]. It will be retried. > org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* > NameSystem.internalReleaseLease: Failed to release lease for file . > Committed blocks are waiting to be minimally replicated. Try again later. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509) > at java.lang.Thread.run(Thread.java:745) > $ grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: > 1" hdfs_nn* > hdfs_nn.log:1068035 > hdfs_nn.log.2019-05-16-14:1516179 > hdfs_nn.log.2019-05-16-15:1538350 > {noformat} > Aside from an actual bug fix, it might make sense to make LeaseManager not > log so much, in case there are more bugs like this... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
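To make the failure mode in the logs above concrete, here is an illustrative sketch of the retry loop's shape — not the actual Hadoop source; the helper methods are hypothetical. Release always throws while the file's last block stays COMMITTED but never minimally replicated, so the same lease is requeued and logged on every monitor cycle:

{code:java}
import java.io.IOException;
import java.util.List;

// Illustrative sketch only -- not the real LeaseManager$Monitor.
abstract class LeaseMonitorSketch implements Runnable {
  abstract List<String> expiredHardLimitLeases();              // hypothetical helper
  abstract void internalReleaseLease(String lease) throws IOException;

  public void run() {
    while (!Thread.interrupted()) {
      for (String lease : expiredHardLimitLeases()) {
        try {
          // Throws AlreadyBeingCreatedException while the last block is
          // COMMITTED but under-replicated -- forever, if the create failed
          // and no DataNode ever reports the block.
          internalReleaseLease(lease);
        } catch (IOException e) {
          // Lease stays in the queue -> identical WARN lines every cycle.
          System.err.println("Cannot release the path in the lease " + lease
              + ". It will be retried.");
        }
      }
      try { Thread.sleep(2000); } catch (InterruptedException e) { return; }
    }
  }
}
{code}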
[jira] [Updated] (HDFS-15462) Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml
[ https://issues.apache.org/jira/browse/HDFS-15462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15462: --- Fix Version/s: 3.1.5 3.2.2 > Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml > - > > Key: HDFS-15462 > URL: https://issues.apache.org/jira/browse/HDFS-15462 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, viewfs, viewfsOverloadScheme >Affects Versions: 3.2.1 >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5 > > > HDFS-15394 added the existing impls in core-default.xml except ofs. Let's add > ofs to core-default here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
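For context, the entry being added is one XML property in core-default.xml. A sketch of what it plausibly looks like — the value class name is an assumption based on the ofs (rooted Ozone filesystem) implementation, not quoted from the committed patch:

{code:xml}
<property>
  <name>fs.viewfs.overload.scheme.target.ofs.impl</name>
  <!-- Assumed target class for the ofs:// scheme -->
  <value>org.apache.hadoop.fs.ozone.RootedOzoneFileSystem</value>
  <description>The target FileSystem implementation to delegate to when
    ViewFsOverloadScheme intercepts the ofs scheme.</description>
</property>
{code}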
[jira] [Updated] (HDFS-15462) Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml
[ https://issues.apache.org/jira/browse/HDFS-15462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-15462: --- Fix Version/s: 3.3.1 > Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml > - > > Key: HDFS-15462 > URL: https://issues.apache.org/jira/browse/HDFS-15462 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, viewfs, viewfsOverloadScheme >Affects Versions: 3.2.1 >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Fix For: 3.3.1, 3.4.0 > > > HDFS-15394 added the existing impls in core-default.xml except ofs. Let's add > ofs to core-default here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15463) Add a tool to validate FsImage
[ https://issues.apache.org/jira/browse/HDFS-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDFS-15463: -- Attachment: FsImageValidation20200709.patch > Add a tool to validate FsImage > -- > > Key: HDFS-15463 > URL: https://issues.apache.org/jira/browse/HDFS-15463 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Attachments: FsImageValidation20200709.patch > > > Due to some snapshot related bugs, a fsimage may become corrupted. Using a > corrupted fsimage may further result in data loss. > In some cases, we found that reference counts are incorrect in some corrupted > FsImage. One of the goals of the validation tool is to check reference > counts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15463) Add a tool to validate FsImage
[ https://issues.apache.org/jira/browse/HDFS-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDFS-15463: -- Description: Due to some snapshot related bugs, a fsimage may become corrupted. Using a corrupted fsimage may further result in data loss. In some cases, we found that reference counts are incorrect in some corrupted FsImage. One of the goals of the validation tool is to check reference counts. was: In some cases, we found that reference counts in some corrupted FsImage > Add a tool to validate FsImage > -- > > Key: HDFS-15463 > URL: https://issues.apache.org/jira/browse/HDFS-15463 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > > Due to some snapshot related bugs, a fsimage may become corrupted. Using a > corrupted fsimage may further result in data loss. > In some cases, we found that reference counts are incorrect in some corrupted > FsImage. One of the goals of the validation tool is to check reference > counts. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
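As a rough illustration of what "checking reference counts" could mean here — a hypothetical sketch of the idea, not the attached patch; both traversal helpers are invented for illustration:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: count the inode references actually reachable in a
// loaded image and compare with the count each referred inode claims to have.
abstract class RefCountCheckSketch {
  abstract Iterable<Long> reachableReferenceTargets();   // hypothetical helper
  abstract Map<Long, Integer> storedReferenceCounts();   // hypothetical helper

  void validate() {
    Map<Long, Integer> observed = new HashMap<>();
    for (long inodeId : reachableReferenceTargets()) {
      observed.merge(inodeId, 1, Integer::sum);
    }
    storedReferenceCounts().forEach((inodeId, stored) -> {
      int found = observed.getOrDefault(inodeId, 0);
      if (stored != found) {
        System.err.printf("Bad ref count for inode %d: stored=%d, found=%d%n",
            inodeId, stored, found);
      }
    });
  }
}
{code}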
[jira] [Updated] (HDFS-15463) Add a tool to validate FsImage
[ https://issues.apache.org/jira/browse/HDFS-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDFS-15463: -- Attachment: FsImageValidation20200708.patch > Add a tool to validate FsImage > -- > > Key: HDFS-15463 > URL: https://issues.apache.org/jira/browse/HDFS-15463 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > > In some cases, we found that reference counts in some corrupted FsImage -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15463) Add a tool to validate FsImage
[ https://issues.apache.org/jira/browse/HDFS-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDFS-15463: -- Description: In some cases, we found that reference counts in some corrupted FsImage > Add a tool to validate FsImage > -- > > Key: HDFS-15463 > URL: https://issues.apache.org/jira/browse/HDFS-15463 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > > In some cases, we found that reference counts in some corrupted FsImage -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15463) Add a tool to validate FsImage
[ https://issues.apache.org/jira/browse/HDFS-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDFS-15463: -- Attachment: (was: FsImageValidation20200708.patch) > Add a tool to validate FsImage > -- > > Key: HDFS-15463 > URL: https://issues.apache.org/jira/browse/HDFS-15463 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > > In some cases, we found that reference counts in some corrupted FsImage -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15463) Add a tool to validate FsImage
Tsz-wo Sze created HDFS-15463: - Summary: Add a tool to validate FsImage Key: HDFS-15463 URL: https://issues.apache.org/jira/browse/HDFS-15463 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Reporter: Tsz-wo Sze Assignee: Tsz-wo Sze -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15462) Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml
[ https://issues.apache.org/jira/browse/HDFS-15462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154941#comment-17154941 ] Hudson commented on HDFS-15462: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18423 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18423/]) HDFS-15462. Add fs.viewfs.overload.scheme.target.ofs.impl to (github: rev 0e694b20b9d59cc46882df506dcea386020b1e4d) * (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestCommonConfigurationFields.java > Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml > - > > Key: HDFS-15462 > URL: https://issues.apache.org/jira/browse/HDFS-15462 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, viewfs, viewfsOverloadScheme >Affects Versions: 3.2.1 >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Fix For: 3.4.0 > > > HDFS-15394 added the existing impls in core-default.xml except ofs. Let's add > ofs to core-default here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15462) Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml
[ https://issues.apache.org/jira/browse/HDFS-15462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDFS-15462: -- Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml > - > > Key: HDFS-15462 > URL: https://issues.apache.org/jira/browse/HDFS-15462 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: configuration, viewfs, viewfsOverloadScheme >Affects Versions: 3.2.1 >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Fix For: 3.4.0 > > > HDFS-15394 added the existing impls in core-default.xml except ofs. Let's add > ofs to core-default here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13082) cookieverf mismatch error over NFS gateway on Linux
[ https://issues.apache.org/jira/browse/HDFS-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154055#comment-17154055 ] Daniel Howard edited comment on HDFS-13082 at 7/9/20, 6:16 PM: --- I found a comment that implies that Linux doesn't handle cookie verification: [1] {quote}This discussion comes up pretty much every time someone writes a new NFS server and/or filesystem. The thing that neither RFC1813, RFC3530, or RFC5661 have done is come up with sane semantics for how a NFS client is supposed to recover from the above scenario. What do I do with things like telldir()/seekdir() cookies? How do I recover my 'current position' in the readdir() stream? IOW: how do I fake up POSIX semantics to the applications? Until the recovery question is answered, the Linux client will continue to ignore the whole "cookie verifier" junk... {quote} [1]: [https://linuxlists.cc/l/17/linux-nfs/t/2933109/readdir_from_linux_nfs4_client_when_cookieverf_is_no_longer_valid] Here is a reference to cookieverf being removed from an Android kernel: [https://gitlab.incom.co/CM-Shield/android_kernel_nvidia_shieldtablet/commit/c3f52af3e03013db5237e339c817beaae5ec9e3a] was (Author: dannyman): I found a comment that implies that Linux doesn't handle cookie verification:[1] bq. This discussion comes up pretty much every time someone writes a new NFS server and/or filesystem. The thing that neither RFC1813, RFC3530, or RFC5661 have done is come up with sane semantics for how a NFS client is supposed to recover from the above scenario. What do I do with things like telldir()/seekdir() cookies? How do I recover my 'current position' in the readdir() stream? bq. IOW: how do I fake up POSIX semantics to the applications? bq. bq. Until the recovery question is answered, the Linux client will continue to ignore the whole "cookie verifier" junk... [1]: https://linuxlists.cc/l/17/linux-nfs/t/2933109/readdir_from_linux_nfs4_client_when_cookieverf_is_no_longer_valid Here is a reference to cookieverf being removed from an Android kernel: https://gitlab.incom.co/CM-Shield/android_kernel_nvidia_shieldtablet/commit/c3f52af3e03013db5237e339c817beaae5ec9e3a > cookieverf mismatch error over NFS gateway on Linux > --- > > Key: HDFS-13082 > URL: https://issues.apache.org/jira/browse/HDFS-13082 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3 >Reporter: Dan Moraru >Priority: Minor > > Running 'ls' on some directories over an HDFS-NFS gateway sometimes fails to > list the contents of those directories. Running 'ls' on those same > directories mounted via FUSE works. The NFS gateway logs errors like the > following: > 2018-01-29 11:53:01,130 ERROR org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3: > cookieverf mismatch. request cookieverf: 1513390944415 dir cookieverf: > 1516920857335 > Reviewing > hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java > suggested that these errors can be avoided by setting > nfs.aix.compatibility.mode.enabled=true, and that is indeed the case. The > documentation lists https://issues.apache.org/jira/browse/HDFS-6549 as a > known issue, but also goes on to say that "regular, non-AIX clients should > NOT enable AIX compatibility mode. The work-arounds implemented by AIX > compatibility mode effectively disable safeguards to ensure that listing of > directory contents via NFS returns consistent results, and that all data sent > to the NFS server can be assured to have been committed." > Server and client in this case are one and the same, running Scientific Linux 7.4. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
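For anyone hitting this, the work-around named in the description is a boolean property of the NFS gateway; a sketch of its configuration (hdfs-site.xml placement assumed), subject to the consistency caveats quoted above:

{code:xml}
<property>
  <name>nfs.aix.compatibility.mode.enabled</name>
  <value>true</value>
  <!-- Work-around only: relaxes the cookieverf safeguards (see HDFS-6549);
       the documentation warns against enabling it for non-AIX clients. -->
</property>
{code}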
[jira] [Commented] (HDFS-14498) LeaseManager can loop forever on the file for which create has failed
[ https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154657#comment-17154657 ] Wei-Chiu Chuang commented on HDFS-14498: It looks very similar to HDFS-8406 / HDFS-11817. I thought this class of lease leakage bugs was settled. Surprised to see there are still corner cases! > LeaseManager can loop forever on the file for which create has failed > -- > > Key: HDFS-14498 > URL: https://issues.apache.org/jira/browse/HDFS-14498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0 >Reporter: Sergey Shelukhin >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-14498.001.patch, HDFS-14498.002.patch > > > The logs from file creation are long gone due to infinite lease logging, > however it presumably failed... the client who was trying to write this file > is definitely long dead. > The version includes HDFS-4882. > We get this log pattern repeating infinitely: > {noformat} > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease. Holder: > DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard > limit > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. > Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src= > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: > Failed to release lease for file . Committed blocks are waiting to be > minimally replicated. Try again later. > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path > in the lease [Lease. Holder: DFSClient_NONMAPREDUCE_-20898906_61, > pending creates: 1]. It will be retried. > org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* > NameSystem.internalReleaseLease: Failed to release lease for file . > Committed blocks are waiting to be minimally replicated. Try again later. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509) > at java.lang.Thread.run(Thread.java:745) > $ grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: > 1" hdfs_nn* > hdfs_nn.log:1068035 > hdfs_nn.log.2019-05-16-14:1516179 > hdfs_nn.log.2019-05-16-15:1538350 > {noformat} > Aside from an actual bug fix, it might make sense to make LeaseManager not > log so much, in case there are more bugs like this... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongbing Wang reassigned HDFS-15425: Assignee: Hongbing Wang > Review Logging of DFSClient > --- > > Key: HDFS-15425 > URL: https://issues.apache.org/jira/browse/HDFS-15425 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Reporter: Hongbing Wang >Assignee: Hongbing Wang >Priority: Minor > Fix For: 3.4.0 > > Attachments: HDFS-15425.001.patch, HDFS-15425.002.patch, > HDFS-15425.003.patch > > > Review use of SLF4J for DFSClient.LOG. > Make the code more concise and readable. > Less is more! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
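As an illustration of the kind of cleanup such an SLF4J review typically covers — a generic example, not taken from the attached patches — parameterized messages replace string concatenation and the explicit guard checks it requires:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class LoggingStyleDemo {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingStyleDemo.class);

  void before(String src, long len) {
    // Verbose pre-SLF4J style: manual guard plus string concatenation.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Opening " + src + " with length " + len);
    }
  }

  void after(String src, long len) {
    // SLF4J style: {} placeholders defer formatting until the level is enabled.
    LOG.debug("Opening {} with length {}", src, len);
  }
}
{code}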
[jira] [Assigned] (HDFS-15425) Review Logging of DFSClient
[ https://issues.apache.org/jira/browse/HDFS-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangzhaohui reassigned HDFS-15425: -- Assignee: (was: Hongbing Wang) > Review Logging of DFSClient > --- > > Key: HDFS-15425 > URL: https://issues.apache.org/jira/browse/HDFS-15425 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Reporter: Hongbing Wang >Priority: Minor > Fix For: 3.4.0 > > Attachments: HDFS-15425.001.patch, HDFS-15425.002.patch, > HDFS-15425.003.patch > > > Review use of SLF4J for DFSClient.LOG. > Make the code more concise and readable. > Less is more! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14396) Failed to load image from FSImageFile when downgrade from 3.x to 2.x
[ https://issues.apache.org/jira/browse/HDFS-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154478#comment-17154478 ] fengwu edited comment on HDFS-14396 at 7/9/20, 12:10 PM: - Hi, in my test of a rolling downgrade from 3.1.3 to 2.7.2, the namenode succeeded but the datanode failed (2.8+ succeeds), because the datanode layout version is different: -56 in HDFS 2.7. So, is there a way to handle a datanode rolling downgrade from layout version -57 to -56? {code:java} // code placeholder 2020-07-06 14:45:01,313 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/hadoop/dfs/in_use.lock acquired by nodename 21258@test-v03 2020-07-06 14:45:01,315 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to test-v01/10.110.228.21:8020. Exiting. java.io.IOException: All specified directories are failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802) at java.lang.Thread.run(Thread.java:748) {code} was (Author: fengwu99): Hi, in my test of a rolling downgrade from 3.1.3 to 2.7.2, the namenode succeeded but the datanode failed (2.8+ succeeds), because the datanode layout version is different: -56 in HDFS 2.7. So, is there a way to handle a datanode rolling downgrade from layout version -57 to -56? // code placeholder2020-07-06 14:45:01,313 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/hadoop/dfs/in_use.lock acquired by nodename 21258@test-v03 2020-07-06 14:45:01,315 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to test-v01/10.110.228.21:8020. Exiting. java.io.IOException: All specified directories are failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802) at java.lang.Thread.run(Thread.java:748) > Failed to load image from FSImageFile when downgrade from 3.x to 2.x > > > Key: HDFS-14396 > URL: https://issues.apache.org/jira/browse/HDFS-14396 > Project: Hadoop HDFS > Issue Type: Bug > Components: rolling upgrades >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Blocker > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14396.001.patch, HDFS-14396.002.patch > > > After fixing HDFS-13596, try to downgrade from 3.x to 2.x. But namenode can't > start because exception occurs. The message follows > {code:java} > 2019-01-23 17:22:18,730 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: > Failed to load image from > FSImageFile(file=/data1/hadoopdata/hadoop-namenode/current/fsimage_0025310, > cpktTxId=00 > 25310) > java.lang.NullPointerException > at >
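The layout version the DataNode rejects here is recorded in the VERSION file of each storage directory. A quick way to confirm what a given directory was written with (storage path taken from the log above; the surrounding fields vary by deployment):

{noformat}
$ cat /data/hadoop/dfs/current/VERSION
...
layoutVersion=-57
...
{noformat}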
[jira] [Commented] (HDFS-14396) Failed to load image from FSImageFile when downgrade from 3.x to 2.x
[ https://issues.apache.org/jira/browse/HDFS-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154478#comment-17154478 ] fengwu commented on HDFS-14396: --- Hi, in my test of a rolling downgrade from 3.1.3 to 2.7.2, the namenode succeeded but the datanode failed (2.8+ succeeds), because the datanode layout version is different: -56 in HDFS 2.7. So, is there a way to handle a datanode rolling downgrade from layout version -57 to -56? // code placeholder2020-07-06 14:45:01,313 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/hadoop/dfs/in_use.lock acquired by nodename 21258@test-v03 2020-07-06 14:45:01,315 WARN org.apache.hadoop.hdfs.server.common.Storage: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /data/hadoop/dfs. Reported: -57. Expecting = -56. 2020-07-06 14:45:01,315 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to test-v01/10.110.228.21:8020. Exiting. java.io.IOException: All specified directories are failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802) at java.lang.Thread.run(Thread.java:748) > Failed to load image from FSImageFile when downgrade from 3.x to 2.x > > > Key: HDFS-14396 > URL: https://issues.apache.org/jira/browse/HDFS-14396 > Project: Hadoop HDFS > Issue Type: Bug > Components: rolling upgrades >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Blocker > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14396.001.patch, HDFS-14396.002.patch > > > After fixing HDFS-13596, try to downgrade from 3.x to 2.x. But namenode can't > start because exception occurs.
The message follows > {code:java} > 2019-01-23 17:22:18,730 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: > Failed to load image from > FSImageFile(file=/data1/hadoopdata/hadoop-namenode/current/fsimage_0025310, > cpktTxId=00 > 25310) > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:243) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:179) > at > org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:885) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:869) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:742) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:673) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:998) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:612) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:672) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:839) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1517) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1583) > 2019-01-23 17:22:19,023 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > java.io.IOException: Failed to load FSImage file, see error(s) above for more > info. > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:688) > at >
[jira] [Comment Edited] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154413#comment-17154413 ] Vinayakumar B edited comment on HDFS-15098 at 7/9/20, 11:19 AM: Thanks [~zZtai] for the contribution. Overall, the changes look good. Following are my comments. Please check. 1. Adding this provider should be configurable. And update the document as required. As already mentioned by [~lindongdong], no need to add to JDK dirs. Maybe the issue description can be updated. So, the following addition of the Provider needs to be done only if it is configured, because directly adding the {{BouncyCastleProvider}} seems to change the existing default behavior in some cases. Ex: {{TestKeyShell#createInvalidKeySize()}} is supposed to fail with key size 56, but it passes when the provider is BC. So it should be used only on the user's demand, and making it configurable would be a wise choice. {code:java} + Security.addProvider(new BouncyCastleProvider()); {code} In KeyProvider.java it can be added as below. {code:java} String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } {code} In JceSm4CtrCryptoCodec.java it should be added in setConf() instead of the constructor. {code:java} provider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, BouncyCastleProvider.PROVIDER_NAME); final String secureRandomAlg = conf.get( HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY, HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT); if (BouncyCastleProvider.PROVIDER_NAME.equals(provider)) { Security.addProvider(new BouncyCastleProvider()); } {code} 2. With the above change, {{TestKeyShell#testInvalidKeySize()}} will not fail anymore, as the BC provider will not be added by default. So the changes in {{TestKeyShell}} can be reverted. 3. In {{TestCryptoCodec.java}} remove these lines from every test. {code:java} try { KeyGenerator keyGenerator = KeyGenerator.getInstance("SM4"); } catch (Exception e) { Assume.assumeTrue(false); } {code} 4. In {{TestCryptoCodec#testJceSm4CtrCryptoCodec}} change this config as below. {code:java} conf.set(HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_SM4_CTR_NOPADDING_KEY, JceSm4CtrCryptoCodec.class.getName());{code} Uncomment the following lines {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} 5. Avoid import statements with * in all classes; import only the required classes directly. 6. {{HdfsKMSUtil.getCryptoCodec()}} is not logging {{JceSm4CTRCodec}}. Maybe it can log all class names when it's not null, without checking instanceof? 7. I can see a lot of the code is the same between the AES and SM4 codecs, except for the class names and algorithm names. Maybe refactoring would help to reduce the duplicate code. 8. I think in {{hdfs.proto}} the SM4 enum value can be changed to 3 directly. {code}enum CipherSuiteProto { UNKNOWN = 1; AES_CTR_NOPADDING = 2; SM4_CTR_NOPADDING = 3; }{code} 9. In {{OpenSecureRandom.c}} the following functions' declarations and definitions can be kept within the {{OPENSSL_VERSION_NUMBER < 0x1010L}} block, i.e. the following functions should be used only when {{OPENSSL_VERSION_NUMBER < 0x1010L}} is true: {code} static void locks_setup(void) static void locks_cleanup(void) static void pthreads_locking_callback(int mode, int type, char *file, int line) static unsigned long pthreads_thread_id(void) {code} was (Author: vinayrpet): Thanks [~zZtai] for the contribution. Overall, the changes look good. Following are my comments. Please check. 1. Adding this provider should be configurable. And update the document as required. As already mentioned by [~lindongdong], no need to add to JDK dirs. Maybe the issue description can be updated. So, the following addition of the Provider needs to be done only if it is configured, because directly adding the {{BouncyCastleProvider}} seems to change the existing default behavior in some cases. Ex: {{TestKeyShell#createInvalidKeySize()}} is supposed to fail with key size 56, but it passes when the provider is BC. So it should be used only on the user's demand, and making it configurable would be a wise choice. {code:java} + Security.addProvider(new BouncyCastleProvider()); {code} In KeyProvider.java it can be added as below. {code:java} String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } {code} In JceSm4CtrCryptoCodec.java it should be added in setConf() instead of the constructor.
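For readers unfamiliar with the provider mechanics in item 1, here is a self-contained JCE example of what conditional registration enables — generic illustration only, requiring bcprov on the classpath and leaving the Hadoop configuration key names out:

{code:java}
import java.security.SecureRandom;
import java.security.Security;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.spec.IvParameterSpec;
import org.bouncycastle.jce.provider.BouncyCastleProvider;

public class Sm4Demo {
  public static void main(String[] args) throws Exception {
    // Register BC only on demand -- the point of making it configurable is
    // to avoid changing default JCE behavior for users who never asked for SM4.
    if (Security.getProvider(BouncyCastleProvider.PROVIDER_NAME) == null) {
      Security.addProvider(new BouncyCastleProvider());
    }
    KeyGenerator kg =
        KeyGenerator.getInstance("SM4", BouncyCastleProvider.PROVIDER_NAME);
    kg.init(128);  // SM4 uses 128-bit keys
    byte[] iv = new byte[16];  // SM4 block size is 16 bytes
    new SecureRandom().nextBytes(iv);
    Cipher cipher =
        Cipher.getInstance("SM4/CTR/NoPadding", BouncyCastleProvider.PROVIDER_NAME);
    cipher.init(Cipher.ENCRYPT_MODE, kg.generateKey(), new IvParameterSpec(iv));
    byte[] ct = cipher.doFinal("hello".getBytes("UTF-8"));
    System.out.println("ciphertext bytes: " + ct.length);
  }
}
{code}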
[jira] [Commented] (HDFS-15392) DistributedFileSystem#concat api can create large number of small blocks
[ https://issues.apache.org/jira/browse/HDFS-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154422#comment-17154422 ] jianghua zhu commented on HDFS-15392: - [~weichiu], I very much agree with your suggestion. > DistributedFileSystem#concat api can create large number of small blocks > --- > > Key: HDFS-15392 > URL: https://issues.apache.org/jira/browse/HDFS-15392 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Lokesh Jain >Priority: Major > > DistributedFileSystem#concat moves blocks from source to target. If the api is > repeatedly used on small files it can create a large number of small blocks in > the target file. The Jira aims to optimize the api to avoid the issue of > small blocks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
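For context, a short usage sketch of the API in question (paths hypothetical), showing why repeated calls accumulate blocks: concat moves the source files' blocks onto the target as-is, so each small source contributes a small block:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcatDemo {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at an HDFS cluster; concat is only
    // supported by DistributedFileSystem, not the base FileSystem.
    FileSystem fs = FileSystem.get(new Configuration());
    Path target = new Path("/data/merged");                    // hypothetical
    Path[] sources = { new Path("/data/part-0"), new Path("/data/part-1") };
    // Each source's blocks are moved onto the target unchanged; if every
    // part is a few KB, the target ends up with one tiny block per part.
    fs.concat(target, sources);
  }
}
{code}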
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154414#comment-17154414 ] fengwu commented on HDFS-13596: --- [~_ph], at first I had the same view as you, until I saw this comment: [https://issues.apache.org/jira/browse/HDFS-13596?focusedCommentId=16911102=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16911102] > NN restart fails after RollingUpgrade from 2.x to 3.x > - > > Key: HDFS-13596 > URL: https://issues.apache.org/jira/browse/HDFS-13596 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Hanisha Koneru >Assignee: Fei Hui >Priority: Blocker > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, > HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, > HDFS-13596.006.patch, HDFS-13596.007.patch, HDFS-13596.008.patch, > HDFS-13596.009.patch, HDFS-13596.010.patch > > > After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails > while replaying edit logs. > * After NN is started with rollingUpgrade, the layoutVersion written to > editLogs (before finalizing the upgrade) is the pre-upgrade layout version > (so as to support downgrade). > * When writing transactions to log, NN writes as per the current layout > version. In 3.x, erasureCoding bits are added to the editLog transactions. > * So any edit log written after the upgrade and before finalizing the > upgrade will have the old layout version but the new format of transactions. > * When NN is restarted and the edit logs are replayed, the NN reads the old > layout version from the editLog file. When parsing the transactions, it > assumes that the transactions are also from the previous layout and hence > skips parsing the erasureCoding bits. > * This cascades into reading the wrong set of bits for other fields and > leads to NN shutting down.
> Sample error output: > {code:java} > java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected > length 16 > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:74) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:86) > at > org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.(RetryCache.java:163) > at > org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:937) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:910) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643) > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710) > 2018-05-17 19:10:06,522 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > java.io.IOException: java.lang.IllegalStateException: Cannot skip to less > than the current value (=16389), where newValue=16388 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086) > at >
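An illustrative sketch of the layout-gated parsing the bullets above describe — not the actual Hadoop source; the helper and the version threshold are hypothetical. Readers decide per-field whether to consume bytes based on the layout version from the file header, so a header/format mismatch shifts every subsequent field:

{code:java}
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative only: how layout-gated edit-log parsing desynchronizes when
// the file header says "old layout" but the ops were written in the new
// (erasure-coding-aware) format.
class EditOpReaderSketch {
  boolean supportsErasureCoding(int logVersion) {
    return logVersion <= -64;   // hypothetical threshold, not Hadoop's
  }

  void readAddOp(DataInputStream in, int logVersion) throws IOException {
    long inodeId = in.readLong();
    // During a rolling upgrade the header still carries the pre-upgrade
    // version, so this test is false and the EC bits written by the 3.x
    // NameNode are never consumed...
    if (supportsErasureCoding(logVersion)) {
      in.readByte();            // erasure-coding policy bits
    }
    // ...which means every later field is read from the wrong offset
    // (hence the garbage clientId length in the stack trace above).
    int clientIdLen = in.readInt();
  }
}
{code}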
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154413#comment-17154413 ] Vinayakumar B commented on HDFS-15098: -- Thanks [~zZtai] for the contribution. Overall, the changes look good. Following are my comments. Please check. 1. Adding this provider should be configurable. And update the document as required. As already mentioned by [~lindongdong], no need to add to JDK dirs. Maybe the issue description can be updated. So, the following addition of the Provider needs to be done only if it is configured, because directly adding the {{BouncyCastleProvider}} seems to change the existing default behavior in some cases. Ex: {{TestKeyShell#createInvalidKeySize()}} is supposed to fail with key size 56, but it passes when the provider is BC. So it should be used only on the user's demand, and making it configurable would be a wise choice. {code:java} + Security.addProvider(new BouncyCastleProvider()); {code} In KeyProvider.java it can be added as below. {code:java} String jceProvider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY); if (BouncyCastleProvider.PROVIDER_NAME.equals(jceProvider)) { Security.addProvider(new BouncyCastleProvider()); } {code} In JceSm4CtrCryptoCodec.java it should be added in setConf() instead of the constructor. {code:java} provider = conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, BouncyCastleProvider.PROVIDER_NAME); final String secureRandomAlg = conf.get( HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY, HADOOP_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT); if (BouncyCastleProvider.PROVIDER_NAME.equals(provider)) { Security.addProvider(new BouncyCastleProvider()); } {code} 2. With the above change, {{TestKeyShell#testInvalidKeySize()}} will not fail anymore, as the BC provider will not be added by default. So the changes in {{TestKeyShell}} can be reverted. 3. In {{TestCryptoCodec.java}} remove these lines from every test. {code:java} try { KeyGenerator keyGenerator = KeyGenerator.getInstance("SM4"); } catch (Exception e) { Assume.assumeTrue(false); } {code} 4. In {{TestCryptoCodec#testJceSm4CtrCryptoCodec}} change this config as below. {code:java} conf.set(HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_SM4_CTR_NOPADDING_KEY, JceSm4CtrCryptoCodec.class.getName());{code} Uncomment the following lines {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} {code:java} //cryptoCodecTest(conf, seed, count, //jceSm4CodecClass, opensslSm4CodecClass, iv); {code} 5. Avoid import statements with * in all classes; import only the required classes directly. 6. {{HdfsKMSUtil.getCryptoCodec()}} is not logging {{JceSm4CTRCodec}}. Maybe it can log all class names when it's not null, without checking instanceof? 7. I can see a lot of the code is the same between the AES and SM4 codecs, except for the class names and algorithm names. Maybe refactoring would help to reduce the duplicate code. > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: zZtai >Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch > > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (WLAN Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use sm4 on hdfs as follows:* > 1.download Bouncy Castle Crypto APIs from bouncycastle.org > [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar] > 2.Configure JDK > Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory, > add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" > to $JAVA_HOME/jre/lib/security/java.security file > 3.Configure Hadoop KMS > 4.test HDFS sm4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *requires:* > 1.openssl version >=1.1.1 > 2.configure Bouncy Castle Crypto on JDK -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154401#comment-17154401 ] Yuriy Malygin commented on HDFS-13596: -- [~fengwu99] downgrade is not possible between major versions, because the layout version changes (in your case, layout versions -57/-56). Note from the official documentation: {quote} A newer release is downgradable to the pre-upgrade release only if both the namenode layout version and the datanode layout version are not changed between these two releases. {quote} URL: https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#Downgrade > NN restart fails after RollingUpgrade from 2.x to 3.x > - > > Key: HDFS-13596 > URL: https://issues.apache.org/jira/browse/HDFS-13596 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Hanisha Koneru >Assignee: Fei Hui >Priority: Blocker > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, > HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, > HDFS-13596.006.patch, HDFS-13596.007.patch, HDFS-13596.008.patch, > HDFS-13596.009.patch, HDFS-13596.010.patch > > > After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails > while replaying edit logs. > * After NN is started with rollingUpgrade, the layoutVersion written to > editLogs (before finalizing the upgrade) is the pre-upgrade layout version > (so as to support downgrade). > * When writing transactions to log, NN writes as per the current layout > version. In 3.x, erasureCoding bits are added to the editLog transactions. > * So any edit log written after the upgrade and before finalizing the > upgrade will have the old layout version but the new format of transactions. > * When NN is restarted and the edit logs are replayed, the NN reads the old > layout version from the editLog file. When parsing the transactions, it > assumes that the transactions are also from the previous layout and hence > skips parsing the erasureCoding bits. > * This cascades into reading the wrong set of bits for other fields and > leads to NN shutting down.
> Sample error output: > {code:java} > java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected > length 16 > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:74) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:86) > at > org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.(RetryCache.java:163) > at > org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:937) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:910) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643) > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710) > 2018-05-17 19:10:06,522 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > java.io.IOException: java.lang.IllegalStateException: Cannot skip to less > than the current value (=16389), where newValue=16388 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at >
[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154336#comment-17154336 ] fengwu edited comment on HDFS-13596 at 7/9/20, 9:12 AM: [~_ph] Thanks. In your case, rollback was used, not downgrade. Rollback will lose data; I want a way to do a rolling downgrade. Do you think it works? was (Author: fengwu99): [~_ph] Thanks. In your case, rollback was used, not downgrade. Rollback will lose data; I want a way to do a rolling downgrade. > NN restart fails after RollingUpgrade from 2.x to 3.x > - > > Key: HDFS-13596 > URL: https://issues.apache.org/jira/browse/HDFS-13596 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Hanisha Koneru >Assignee: Fei Hui >Priority: Blocker > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, > HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, > HDFS-13596.006.patch, HDFS-13596.007.patch, HDFS-13596.008.patch, > HDFS-13596.009.patch, HDFS-13596.010.patch > > > After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails > while replaying edit logs. > * After NN is started with rollingUpgrade, the layoutVersion written to > editLogs (before finalizing the upgrade) is the pre-upgrade layout version > (so as to support downgrade). > * When writing transactions to log, NN writes as per the current layout > version. In 3.x, erasureCoding bits are added to the editLog transactions. > * So any edit log written after the upgrade and before finalizing the > upgrade will have the old layout version but the new format of transactions. > * When NN is restarted and the edit logs are replayed, the NN reads the old > layout version from the editLog file. When parsing the transactions, it > assumes that the transactions are also from the previous layout and hence > skips parsing the erasureCoding bits. > * This cascades into reading the wrong set of bits for other fields and > leads to NN shutting down.
> Sample error output: > {code:java} > java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected > length 16 > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:74) > at org.apache.hadoop.ipc.RetryCache$CacheEntry.(RetryCache.java:86) > at > org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.(RetryCache.java:163) > at > org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:937) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:910) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643) > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710) > 2018-05-17 19:10:06,522 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > java.io.IOException: java.lang.IllegalStateException: Cannot skip to less > than the current value (=16389), where newValue=16388 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323) > at >
[jira] [Commented] (HDFS-15444) mkdir should not create dir in fallback if the dir already in mount Path
[ https://issues.apache.org/jira/browse/HDFS-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154339#comment-17154339 ] jianghua zhu commented on HDFS-15444: - [~umamaheswararao], can you tell me in which version this happens? Is there a more detailed scenario? > mkdir should not create dir in fallback if the dir already in mount Path > > > Key: HDFS-15444 > URL: https://issues.apache.org/jira/browse/HDFS-15444 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major >
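For context, the behavior in question can be sketched with a toy mount-table resolver (all names here are hypothetical; this is not the real ViewFileSystem/InodeTree API): mkdir should resolve the path against the mount table first, and only create in the fallback when no mount link covers the path.

{code:java}
import java.util.Map;
import java.util.TreeMap;

class ToyViewFs {
  // mount point -> target, e.g. "/user" -> "hdfs://nn1/user"
  private final TreeMap<String, String> mounts = new TreeMap<>();
  private final String fallbackRoot;

  ToyViewFs(Map<String, String> mountLinks, String fallbackRoot) {
    this.mounts.putAll(mountLinks);
    this.fallbackRoot = fallbackRoot;
  }

  // Returns where mkdir should run. The behavior HDFS-15444 objects to is
  // creating the dir in the fallback even when a mount link already owns it.
  String resolveMkdirTarget(String path) {
    Map.Entry<String, String> e = mounts.floorEntry(path);
    if (e != null && (path.equals(e.getKey()) || path.startsWith(e.getKey() + "/"))) {
      return e.getValue() + path.substring(e.getKey().length());
    }
    return fallbackRoot + path; // only when no mount link matches
  }
}
{code}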
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154336#comment-17154336 ] fengwu commented on HDFS-13596: --- [~_ph] Thanks. In your case a rollback was used, not a downgrade. A rollback will lose data; I want a way to do a rolling downgrade.
[jira] [Commented] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154329#comment-17154329 ] jianghua zhu commented on HDFS-15452: - OK. Thank you very much for your suggestion, [~hexiaoqiao]. > Dynamically initialize the capacity of BlocksMap > > > Key: HDFS-15452 > URL: https://issues.apache.org/jira/browse/HDFS-15452 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.3 >Reporter: jianghua zhu >Assignee: jianghua zhu >Priority: Major > Attachments: HDFS-15452.001.patch > > > The value used to initialize BlocksMap in the BlockManager class is hard-coded to 2 (percent of total memory). > This could be made a dynamic, configurable value. > BlockManager#BlockManager() {
>   ..
>   // Compute the map capacity by allocating 2% of total memory
>   blocksMap = new BlocksMap(
>       LightWeightGSet.computeCapacity(2.0, "BlocksMap"));
>   ..
> }
[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154318#comment-17154318 ] Yuriy Malygin edited comment on HDFS-13596 at 7/9/20, 8:21 AM: --- [~fengwu99] yes, in my case (https://issues.apache.org/jira/browse/HDFS-13596?focusedCommentId=16826162=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16826162 + https://issues.apache.org/jira/browse/HDFS-13596?focusedCommentId=16826939=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16826939) the downgrade to 2.7.3 was successful from both 3.1.2 and 3.3.0-SNAPSHOT. was (Author: _ph): [~fengwu99] yes, in my case (https://issues.apache.org/jira/browse/HDFS-13596?focusedCommentId=16826162=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16826162) the downgrade to 2.7.3 was successful from both 3.1.2 and 3.3.0-SNAPSHOT.
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154318#comment-17154318 ] Yuriy Malygin commented on HDFS-13596: -- [~fengwu99] yes, in my case (https://issues.apache.org/jira/browse/HDFS-13596?focusedCommentId=16826162=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16826162) the downgrade to 2.7.3 was successful from both 3.1.2 and 3.3.0-SNAPSHOT.
[jira] [Commented] (HDFS-15452) Dynamically initialize the capacity of BlocksMap
[ https://issues.apache.org/jira/browse/HDFS-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154314#comment-17154314 ] Xiaoqiao He commented on HDFS-15452: Thanks for involving me here. Sorry, I don't quite get what issue you are facing. IIUC, tuning this percentage of memory capacity will obviously impact performance (GSet collision ratio or heap occupancy). I have not traced why it is 2% rather than some other percentage, since this is very old logic. About patch [^HDFS-15452.001.patch]: a. it would be better to enforce a valid value range, rather than accepting any value as it does now. b. do `INodeMap`, `RetryCache` and `cachedBlocks` also need to be configurable? I prefer to keep this a static value; in my practice it runs well with over 600 GB of heap memory, and I have not found any heap-occupancy cost or performance issue.
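As a concrete illustration of point (a) above, here is a minimal sketch of the change under discussion (the configuration key name and default below are hypothetical, not necessarily what the patch introduces): read the percentage from Configuration and validate the range before handing it to LightWeightGSet.computeCapacity.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.LightWeightGSet;

class BlocksMapCapacity {
  // Hypothetical key and default; HDFS-15452's actual patch may differ.
  static final String CAPACITY_PCT_KEY = "dfs.namenode.blocksmap.capacity.percentage";
  static final double CAPACITY_PCT_DEFAULT = 2.0;

  static int computeBlocksMapCapacity(Configuration conf) {
    double pct = conf.getDouble(CAPACITY_PCT_KEY, CAPACITY_PCT_DEFAULT);
    // Point (a): reject out-of-range values instead of passing anything through.
    if (pct <= 0.0 || pct > 100.0) {
      throw new IllegalArgumentException(
          "Invalid value for " + CAPACITY_PCT_KEY + ": " + pct);
    }
    return LightWeightGSet.computeCapacity(pct, "BlocksMap");
  }
}
{code}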
[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154238#comment-17154238 ] fengwu edited comment on HDFS-13596 at 7/9/20, 7:02 AM: Hi [~_ph]! Yes, the datanodes were downgraded before the namenodes, and the upgrade was not finalized. Have you tested downgrading from 3.x to 2.x? Did the datanodes downgrade successfully? was (Author: fengwu99): Hi [~_ph]! Yes, the datanodes were downgraded before the namenodes, and the upgrade was not finalized.
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154238#comment-17154238 ] fengwu commented on HDFS-13596: --- Hi [~_ph]! Yes, the datanodes were downgraded before the namenodes, and the upgrade was not finalized.
[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x
[ https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154228#comment-17154228 ] Yuriy Malygin commented on HDFS-13596: -- Hi [~fengwu99]! Did you downgrade the datanodes before the namenodes?