[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253445#comment-17253445 ]

Hemanth Boyina commented on HDFS-15569:
---------------------------------------

Thanks to everyone involved here, and thanks [~brahmareddy] for the review. Committed to trunk.

> Speed up the Storage#doRecover during datanode rolling upgrade
> --------------------------------------------------------------
>
>                 Key: HDFS-15569
>                 URL: https://issues.apache.org/jira/browse/HDFS-15569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch, HDFS-15569.003.patch
>
>
> While upgrading a datanode from Hadoop 2.7.2 to 3.1.1, the upgrade failed because the JVM did not have enough memory. We adjusted the memory configuration and upgraded the datanode again.
> The retried upgrade took more time. On analysis, we found that Storage#deleteDir was taking the time in the RECOVER_UPGRADE state.
> {code:java}
> "Thread-28" #270 daemon prio=5 os_prio=0 tid=0x7fed5a9b8000 nid=0x2b5c runnable [0x7fdcdad2a000]
>    java.lang.Thread.State: RUNNABLE
>     at java.io.UnixFileSystem.delete0(Native Method)
>     at java.io.UnixFileSystem.delete(UnixFileSystem.java:265)
>     at java.io.File.delete(File.java:1041)
>     at org.apache.hadoop.fs.FileUtil.deleteImpl(FileUtil.java:229)
>     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:270)
>     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:153)
>     at org.apache.hadoop.hdfs.server.common.Storage.deleteDir(Storage.java:1348)
>     at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.doRecover(Storage.java:782)
>     at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:174)
>     at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:224)
>     at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:253)
>     at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:455)
>     at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:389)
>     - locked <0x7fdf08ec7548> (a org.apache.hadoop.hdfs.server.datanode.DataStorage)
>     at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1761)
>     - locked <0x7fdf08ec7598> (a org.apache.hadoop.hdfs.server.datanode.DataNode)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1697)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822)
>     at java.lang.Thread.run(Thread.java:748)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253082#comment-17253082 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 42s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 20m 47s | | trunk passed |
| +1 | compile | 1m 18s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | checkstyle | 0m 48s | | trunk passed |
| +1 | mvnsite | 1m 20s | | trunk passed |
| +1 | shadedclient | 17m 17s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 55s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javadoc | 1m 26s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| 0 | spotbugs | 3m 1s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 59s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 13s | | the patch passed |
| +1 | compile | 1m 10s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javac | 1m 10s | | the patch passed |
| +1 | compile | 1m 4s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | javac | 1m 4s | | the patch passed |
| +1 | checkstyle | 0m 40s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 64 unchanged - 3 fixed = 64 total (was 67) |
| +1 | mvnsite | 1m 13s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 34s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 50s |
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250983#comment-17250983 ]

Brahma Reddy Battula commented on HDFS-15569:
---------------------------------------------

[~hemanthboyina] thanks for reporting and working on this. Changes LGTM.
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227012#comment-17227012 ]

Stephen O'Donnell commented on HDFS-15569:
------------------------------------------

I just thought it would be almost as easy to solve this for the general case of multiple failed upgrade attempts, rather than assume there can only be one failed directory to remove. You could have CM auto-restarting, or someone may not increase the memory enough and need a further restart. However, by the time the DN fails again, there is a good chance that current.tmp has already been removed, because the DN does not fail immediately on startup, but only after running for some time.

I am happy to go ahead with the proposal here (renaming to current.tmp), as it will solve the problem in most cases, where a single restart is enough.
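The current.tmp proposal mentioned in the comment above can be illustrated with a minimal sketch. This is not the committed patch: the class and method names (RenameThenDeleteSketch, moveAsideAndDeleteAsync, deleteTree) are hypothetical, and the sketch uses only plain java.io.File calls rather than Hadoop's Storage or FileUtil classes. It assumes the rename stays on the same volume, so it is cheap compared with the recursive delete seen in the stack trace.

```java
import java.io.File;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only; not code from HDFS-15569.
class RenameThenDeleteSketch {

  /** Recursively deletes a file or directory tree; true if nothing is left. */
  static boolean deleteTree(File f) {
    File[] children = f.listFiles(); // null when f is not a directory
    if (children != null) {
      for (File child : children) {
        deleteTree(child);
      }
    }
    return f.delete() || !f.exists();
  }

  /**
   * Renames "current" to "current.tmp" and deletes the renamed tree on a
   * background thread, so the caller does not wait for the slow delete.
   */
  static CompletableFuture<Boolean> moveAsideAndDeleteAsync(File current) {
    File tmp = new File(current.getParentFile(), current.getName() + ".tmp");
    // Fails if a non-empty current.tmp is still there from an earlier attempt.
    if (!current.renameTo(tmp)) {
      throw new IllegalStateException("Cannot rename " + current + " to " + tmp);
    }
    return CompletableFuture.supplyAsync(() -> deleteTree(tmp));
  }
}
```

In this sketch a second failed attempt that finds a leftover, non-empty current.tmp would make the rename fail, which is the multiple-restart case raised in the comment; a real implementation would have to wait for, or otherwise handle, that leftover directory.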
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226918#comment-17226918 ]

Wei-Chiu Chuang commented on HDFS-15569:
----------------------------------------

This is not directly related to Hadoop, but Cloudera Manager has a feature that attempts to restart a failed node multiple times. I am not sure if that behavior also applies to upgrades.

[~sodonnell] is that where your concern comes from?
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216188#comment-17216188 ]

Hemanth Boyina commented on HDFS-15569:
---------------------------------------

[~weichiu] [~brahmareddy] can we move this forward?
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
    [ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208165#comment-17208165 ]

Hemanth Boyina commented on HDFS-15569:
---------------------------------------

Thanks for the comment, [~sodonnell].
{quote}Do you think it would be better, to rename the folder to current. and then the async delete thread would simply delete all folders
{quote}
Your suggestion makes sense. However, I think the upgrade failure here is mostly caused by the memory configuration. Once an upgrade has failed, we adjust the configuration so that the next attempt succeeds. So I don't think there will be multiple restarts of the DN, and hence there won't be repeated deletions of current.tmp.
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201450#comment-17201450 ]

Stephen O'Donnell commented on HDFS-15569:
------------------------------------------

If the DN is restarted multiple times, you would get current.tmp after the first restart, and the second restart would then need to wait for it to be deleted. Do you think it would be better to rename the folder to current. and then have the async delete thread simply delete all folders matching that pattern, one by one?

This delete may also have an impact on the disks while the upgrade step is creating the hard links in the new directory, since the delete will be competing for disk bandwidth. I wonder whether it would be better to delay the delete until after the hard link creation has completed. One possible downside is that the overhead of the delete is then postponed until the DN is actually in service, which might impact workloads.

> Speed up the Storage#doRecover during datanode rolling upgrade
> --------------------------------------------------------------
>
>                 Key: HDFS-15569
>                 URL: https://issues.apache.org/jira/browse/HDFS-15569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>         Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch, HDFS-15569.003.patch
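The rename-then-delete idea raised in this comment can be illustrated with a small, self-contained sketch. It uses only java.nio.file and is not the code from any of the attached patches; the class name StaleDirCleaner, the "current.stale-" prefix, and the random suffix are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Hadoop: rename now, delete later. */
final class StaleDirCleaner {
  // Assumed naming scheme for directories that are waiting to be deleted.
  static final String TRASH_PREFIX = "current.stale-";

  /** Moves dir aside under a unique sibling name so startup is not blocked. */
  static Path moveAside(Path dir) throws IOException {
    Path target = dir.resolveSibling(TRASH_PREFIX + UUID.randomUUID());
    return Files.move(dir, target, StandardCopyOption.ATOMIC_MOVE);
  }

  /** Deletes every child of parent whose name matches the pattern, one by one. */
  static int deleteAllStale(Path parent) throws IOException {
    int deleted = 0;
    try (DirectoryStream<Path> stale =
             Files.newDirectoryStream(parent, TRASH_PREFIX + "*")) {
      for (Path dir : stale) {
        deleteRecursively(dir);
        deleted++;
      }
    }
    return deleted;
  }

  private static void deleteRecursively(Path root) throws IOException {
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order visits children before the directory that holds them.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

Because every restart moves the old directory to a fresh name, a second restart would not have to wait for an earlier delete to finish; a background thread could call deleteAllStale at whatever point is judged least disruptive to disk bandwidth.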
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201411#comment-17201411 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 9s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 19m 41s | | trunk passed |
| +1 | compile | 1m 16s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 52s | | trunk passed |
| +1 | mvnsite | 1m 24s | | trunk passed |
| +1 | shadedclient | 16m 22s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 55s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 22s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 4s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 2s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 7s | | the patch passed |
| +1 | compile | 1m 10s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 10s | | the patch passed |
| +1 | compile | 1m 6s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 6s | | the patch passed |
| +1 | checkstyle | 0m 40s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 64 unchanged - 3 fixed = 64 total (was 67) |
| +1 | mvnsite | 1m 9s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 57s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 44s |
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199698#comment-17199698 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 47s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 49s | trunk passed |
| +1 | compile | 1m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 14s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 49s | trunk passed |
| +1 | mvnsite | 1m 16s | trunk passed |
| +1 | shadedclient | 17m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 57s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 21s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 19s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 13s | the patch passed |
| +1 | compile | 1m 15s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 15s | the patch passed |
| +1 | compile | 1m 3s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 64 unchanged - 3 fixed = 64 total (was 67) |
| +1 | mvnsite | 1m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 6s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 58s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 34s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 3m 28s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 82m 33s
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199503#comment-17199503 ]

Hemanth Boyina commented on HDFS-15569:
---------------------------------------

Thanks for the review, [~weichiu].
{quote}and why log rootPath instead of the curTmp?{quote}
For any upgrade failure, the log statements in Storage#doRecover already report the root path, so I kept the same kind of log message to be consistent, for example:
{code:java}
case RECOVER_ROLLBACK: // mv removed.tmp -> current
  LOG.info("Recovering storage directory {} from previous rollback", rootPath);
{code}
That said, your point makes sense to me. I have updated the patch to address your comments; please review.

> Speed up the Storage#doRecover during datanode rolling upgrade
> --------------------------------------------------------------
>
>                 Key: HDFS-15569
>                 URL: https://issues.apache.org/jira/browse/HDFS-15569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>         Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch, HDFS-15569.003.patch
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198849#comment-17198849 ]

Wei-Chiu Chuang commented on HDFS-15569:
----------------------------------------

Nit:
{code}
LOG.info("Deleting storage directory {} from previous upgrade", rootPath);
{code}
This had better be a LOG.warn(). Also, the message looks misleading: it should say something like "deleting storage ... failed". And why log rootPath instead of curTmp?

The thread will take a while to complete, so it would be great to give it a meaningful name.

> Speed up the Storage#doRecover during datanode rolling upgrade
> --------------------------------------------------------------
>
>                 Key: HDFS-15569
>                 URL: https://issues.apache.org/jira/browse/HDFS-15569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>         Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch
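The two review points in this comment (warn on failure while naming the directory actually being deleted, and give the long-running thread a descriptive name) could look roughly like the sketch below. It is an illustration only, written against java.util.logging rather than the logger Hadoop uses, and it is not the committed patch; the class NamedAsyncDelete, its method names, and the thread-name format are assumptions.

```java
import java.io.File;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch, not the committed patch. */
final class NamedAsyncDelete {
  private static final Logger LOG =
      Logger.getLogger(NamedAsyncDelete.class.getName());

  /** Builds a thread name that says which directory is being removed. */
  static String threadNameFor(File dir) {
    return "Async delete of " + dir.getPath();
  }

  /** Recursively deletes dir; returns false if anything could not be removed. */
  static boolean deleteTree(File dir) {
    boolean ok = true;
    File[] children = dir.listFiles(); // null when dir is not a directory
    if (children != null) {
      for (File child : children) {
        ok &= deleteTree(child);
      }
    }
    return dir.delete() && ok;
  }

  /** Starts the delete on a named daemon thread and warns only on failure. */
  static Thread startDelete(File curTmp) {
    Thread t = new Thread(() -> {
      if (!deleteTree(curTmp)) {
        // Log the temporary directory itself, not the storage root.
        LOG.log(Level.WARNING, "Deleting storage directory {0} failed", curTmp);
      }
    }, threadNameFor(curTmp));
    t.setDaemon(true);
    t.start();
    return t;
  }
}
```

With a name like this, the thread is easy to spot in a jstack dump, unlike a generic "Thread-28".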
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197173#comment-17197173 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 34m 17s | trunk passed |
| +1 | compile | 1m 15s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 8s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 47s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | shadedclient | 17m 36s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 49s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 20s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 9s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 7s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 1m 11s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 11s | the patch passed |
| +1 | compile | 1m 3s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 41s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 64 unchanged - 3 fixed = 64 total (was 67) |
| +1 | mvnsite | 1m 6s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 48s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 28s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 3m 26s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 130m 55s
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17196554#comment-17196554 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 25m 42s | trunk passed |
| +1 | compile | 1m 35s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 32s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 1m 0s | trunk passed |
| +1 | mvnsite | 1m 32s | trunk passed |
| +1 | shadedclient | 20m 45s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 9s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 35s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 4m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 4m 9s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 32s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 32s | the patch passed |
| +1 | compile | 1m 26s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 26s | the patch passed |
| -0 | checkstyle | 0m 51s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 64 unchanged - 3 fixed = 65 total (was 67) |
| +1 | mvnsite | 1m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 28s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 45s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 16s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 3m 10s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 118m 34s
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17194527#comment-17194527 ]

Hadoop QA commented on HDFS-15569:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 3s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 25m 59s | trunk passed |
| +1 | compile | 1m 43s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 23s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 56s | trunk passed |
| +1 | mvnsite | 1m 35s | trunk passed |
| +1 | shadedclient | 20m 48s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 8s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 38s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 41s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 35s | the patch passed |
| +1 | compile | 1m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 41s | the patch passed |
| +1 | compile | 1m 22s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 22s | the patch passed |
| -0 | checkstyle | 0m 48s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 64 unchanged - 3 fixed = 65 total (was 67) |
| +1 | mvnsite | 1m 30s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 17m 29s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 53s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 25s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 3m 21s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 117m 30s
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17194457#comment-17194457 ]

Hemanth Boyina commented on HDFS-15569:
---
Attached the patch; please review.

> Speed up the Storage#doRecover during datanode rolling upgrade
> ---
>
> Key: HDFS-15569
> URL: https://issues.apache.org/jira/browse/HDFS-15569
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Hemanth Boyina
> Assignee: Hemanth Boyina
> Priority: Major
> Attachments: HDFS-15569.001.patch
>
> While upgrading a datanode from Hadoop 2.7.2 to 3.1.1, the upgrade failed
> because the JVM did not have enough memory. After adjusting the memory
> configuration, we upgraded the datanode again.
> This second upgrade took much longer. Analysis showed that
> Storage#deleteDir was taking a long time in the RECOVER_UPGRADE state:
> {code:java}
> "Thread-28" #270 daemon prio=5 os_prio=0 tid=0x7fed5a9b8000 nid=0x2b5c runnable [0x7fdcdad2a000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.UnixFileSystem.delete0(Native Method)
>         at java.io.UnixFileSystem.delete(UnixFileSystem.java:265)
>         at java.io.File.delete(File.java:1041)
>         at org.apache.hadoop.fs.FileUtil.deleteImpl(FileUtil.java:229)
>         at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:270)
>         at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>         at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>         at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>         at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>         at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>         at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285)
>         at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182)
>         at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:153)
>         at org.apache.hadoop.hdfs.server.common.Storage.deleteDir(Storage.java:1348)
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.doRecover(Storage.java:782)
>         at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:174)
>         at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:224)
>         at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:253)
>         at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:455)
>         at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:389)
>         - locked <0x7fdf08ec7548> (a org.apache.hadoop.hdfs.server.datanode.DataStorage)
>         at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1761)
>         - locked <0x7fdf08ec7598> (a org.apache.hadoop.hdfs.server.datanode.DataNode)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1697)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822)
>         at java.lang.Thread.run(Thread.java:748)
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193701#comment-17193701 ]

Hemanth Boyina commented on HDFS-15569:
---
One solution is to rename the current directory to current.tmp and delete current.tmp in parallel. Since the current directory no longer exists after the rename, the datanode upgrade can proceed without waiting for the delete to finish.

> Speed up the Storage#doRecover during datanode rolling upgrade
> ---
>
> Key: HDFS-15569
> URL: https://issues.apache.org/jira/browse/HDFS-15569
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Hemanth Boyina
> Assignee: Hemanth Boyina
> Priority: Major
>
> While upgrading a datanode from Hadoop 2.7.2 to 3.1.1, the upgrade failed
> because the JVM did not have enough memory. After adjusting the memory
> configuration, we upgraded the datanode again.
> This second upgrade took much longer. Analysis showed that
> Storage#deleteDir was taking a long time in the RECOVER_UPGRADE state.
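
The rename-then-delete idea from the comment above can be sketched roughly as follows. This is only an illustration using plain java.nio, not the code from the attached patches: the class name RenameThenDeleteSketch and both method names are hypothetical, and it assumes that current and current.tmp sit on the same file system and that no leftover current.tmp exists from an earlier run.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

/** Hypothetical sketch; not the implementation committed for HDFS-15569. */
final class RenameThenDeleteSketch {

  /**
   * Renames dir to a sibling named "<name>.tmp" and deletes the renamed tree
   * on a background thread. The returned future completes when the delete ends.
   */
  static CompletableFuture<Void> renameThenDeleteAsync(Path dir) {
    Path tmp = dir.resolveSibling(dir.getFileName() + ".tmp");
    try {
      // A rename within one file system is a metadata operation, so its cost
      // does not grow with the number of files under dir.
      Files.move(dir, tmp, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The slow recursive delete runs off the caller's thread.
    return CompletableFuture.runAsync(() -> deleteRecursively(tmp));
  }

  /** Deletes root and everything below it. */
  static void deleteRecursively(Path root) {
    try (Stream<Path> paths = Files.walk(root)) {
      // Reverse order visits children before their parent directory.
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The caller can continue as soon as renameThenDeleteAsync returns, because the original path is already free at that point. A real implementation would also have to decide what to do with a current.tmp left behind by a crash, and how to report delete failures.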