[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175932#comment-17175932 ]

Chengwei Wang commented on HDFS-15493:
--
Thank you [~sodonnell], it's my pleasure to be a contributor to the HDFS project :D

> Update block map and name cache in parallel while loading fsimage.
> --
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Chengwei Wang
> Assignee: Chengwei Wang
> Priority: Major
> Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch,
> HDFS-15493.003.patch, HDFS-15493.004.patch, HDFS-15493.005.patch,
> HDFS-15493.006.patch, HDFS-15493.007.patch, HDFS-15493.008.patch,
> fsimage-loading.log
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and
> block map are updated after each inode file is added to its inode directory.
> Running these steps in parallel would reduce the time cost of fsimage loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to load
> the fsimage (220M files & 240M blocks) is 470s; with this patch, the time cost
> is reduced to 410s.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175913#comment-17175913 ]

Hadoop QA commented on HDFS-15493:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 42s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 18m 44s | trunk passed |
| +1 | compile | 1m 16s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 11s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 48s | trunk passed |
| +1 | mvnsite | 1m 18s | trunk passed |
| +1 | shadedclient | 15m 48s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 50s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 24s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 2s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 | findbugs | 2m 59s | hadoop-hdfs-project/hadoop-hdfs in trunk has 2 extant findbugs warnings. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 1m 7s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 7s | the patch passed |
| +1 | compile | 1m 5s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 5s | the patch passed |
| +1 | checkstyle | 0m 40s | the patch passed |
| +1 | mvnsite | 1m 8s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 32s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 47s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 17s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 2m 57s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 94m 49s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 35s | The patch
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175872#comment-17175872 ]

Stephen O'Donnell commented on HDFS-15493:
--
The find bugs warnings have been introduced by HDFS-15520. The checkstyle issues are due to my modifications to the test. I will upload a new patch with the changes to trigger the CI run again.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175759#comment-17175759 ]

Hadoop QA commented on HDFS-15493:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 3m 44s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 8s | trunk passed |
| +1 | compile | 1m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 21s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 48s | trunk passed |
| +1 | mvnsite | 1m 28s | trunk passed |
| +1 | shadedclient | 19m 12s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 56s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 25s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 | findbugs | 3m 23s | hadoop-hdfs-project/hadoop-hdfs in trunk has 2 extant findbugs warnings. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 17s | the patch passed |
| +1 | compile | 1m 20s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 20s | the patch passed |
| +1 | compile | 1m 11s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 11s | the patch passed |
| -0 | checkstyle | 0m 46s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 44 unchanged - 0 fixed = 46 total (was 44) |
| +1 | mvnsite | 1m 18s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 29s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 47s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 19s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | findbugs | 3m 14s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 118m 34s | hadoop-hdfs in the patch passed. |
| +1 |
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175584#comment-17175584 ]

Hemanth Boyina commented on HDFS-15493:
---
good work here [~smarthan] [~sodonnell]
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175562#comment-17175562 ]

Stephen O'Donnell commented on HDFS-15493:
--
+1 on 007 patch. I will commit it later pending the CI results coming back.

I re-ran the benchmark tests on the final patch:

With Patch + Parallel loading: 202 / 203 seconds. (73612009 blocks)
No patch + parallel loading: 237 / 233 seconds.
Approx 14% improvement.

With Patch (parallel load disabled): 345 / 340 seconds.
No patch (parallel load disabled): 400 / 384 seconds.
Approx 13% improvement.

The above image has significant snapshots present. [~smarthan] saw about a 20% improvement in a large image with no snapshots.

Thanks for all the work on this [~smarthan]!
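The approximate percentages quoted in the benchmark can be checked with a little arithmetic. The sketch below assumes each pair of timings is two runs of the same configuration and compares their means; the class and method names are hypothetical and not part of Hadoop.

```java
// Hypothetical helper for checking the benchmark arithmetic; not Hadoop code.
final class LoadTimeImprovement {

    /** Mean of two timing runs, in seconds. */
    static double mean(double first, double second) {
        return (first + second) / 2.0;
    }

    /** Reduction in load time as a percentage of the unpatched baseline. */
    static double percentImprovement(double baselineSeconds, double patchedSeconds) {
        return (baselineSeconds - patchedSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // Timings copied from the comment above (assumed to be two runs each).
        double parallel = percentImprovement(mean(237, 233), mean(202, 203));
        double serial = percentImprovement(mean(400, 384), mean(345, 340));
        System.out.printf("parallel loading enabled:  %.1f%%%n", parallel);
        System.out.printf("parallel loading disabled: %.1f%%%n", serial);
    }
}
```

Under that assumption the two figures come out at about 13.8% and 12.6%, which matches the rounded 14% and 13% reported.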
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175422#comment-17175422 ]

Chengwei Wang commented on HDFS-15493:
--
Submitted patch v007 [^HDFS-15493.007.patch].

Hi [~sodonnell], thanks for your advice. I have removed the blank line and added the unit test, which passes in my testing. Please help review this patch.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174438#comment-17174438 ]

Stephen O'Donnell commented on HDFS-15493:
--
[~smarthan] Thanks for the update. I think we are mostly good now. Just 2 more things:

1. You have a blank line at line 309 in FSImageFormatPBINode.java:

{code}
  private void addToCacheAndBlockMap(final ArrayList inodeList) {
    >> This line is blank
    final ArrayList inodes = new ArrayList<>(inodeList);
    nameCacheUpdateExecutor.submit(
{code}

2. I discussed this change with one of my colleagues, and he suggested we extend the unit test you added to take some snapshots and rename some files, as this will create some inodeReference objects, and hence test that code path too. Then we can dump the filesystem tree before and after saving the namespace and ensure they are identical. I have adjusted your test to do this:

{code}
  @Test
  public void testUpdateBlocksMapAndNameCacheAsync() throws IOException {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
    cluster.waitActive();
    DistributedFileSystem fs = cluster.getFileSystem();
    FSDirectory fsdir = cluster.getNameNode().namesystem.getFSDirectory();
    File workingDir = GenericTestUtils.getTestDir();

    File preRestartTree = new File(workingDir, "preRestartTree");
    File postRestartTree = new File(workingDir, "postRestartTree");

    Path baseDir = new Path("/user/foo");
    fs.mkdirs(baseDir);
    fs.allowSnapshot(baseDir);
    for (int i = 0; i < 5; i++) {
      Path dir = new Path(baseDir, Integer.toString(i));
      fs.mkdirs(dir);
      for (int j = 0; j < 5; j++) {
        Path file = new Path(dir, Integer.toString(j));
        FSDataOutputStream os = fs.create(file);
        os.write((byte) j);
        os.close();
      }
      // Snapshot, then rename, so that inode references are created.
      fs.createSnapshot(baseDir, "snap_" + i);
      fs.rename(new Path(dir, "0"), new Path(dir, "renamed"));
    }
    SnapshotTestHelper.dumpTree2File(fsdir, preRestartTree);

    // checkpoint
    fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
    fs.saveNamespace();
    fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);

    cluster.restartNameNode();
    cluster.waitActive();
    fs = cluster.getFileSystem();
    fsdir = cluster.getNameNode().namesystem.getFSDirectory();

    // Ensure all the files created above exist, and blocks are correct.
    for (int i = 0; i < 5; i++) {
      Path dir = new Path(baseDir, Integer.toString(i));
      assertTrue(fs.getFileStatus(dir).isDirectory());
      for (int j = 0; j < 5; j++) {
        Path file = new Path(dir, Integer.toString(j));
        if (j == 0) {
          file = new Path(dir, "renamed");
        }
        FSDataInputStream in = fs.open(file);
        int n = in.readByte();
        assertEquals(j, n);
        in.close();
      }
    }
    SnapshotTestHelper.dumpTree2File(fsdir, postRestartTree);
    SnapshotTestHelper.compareDumpedTreeInFile(
        preRestartTree, postRestartTree, true);
  }
{code}

If you could fix the blank line and add in the above unit test, I am +1 to commit this. Thanks for all your work on this.
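The `addToCacheAndBlockMap` fragment quoted in point 1 copies the incoming list before submitting work to an executor. A minimal, self-contained sketch of that pattern follows. It is a toy model under stated assumptions, not the patch itself: strings stand in for inodes, two plain `HashMap`s stand in for the name cache and block map, and every class, field, and method name other than `addToCacheAndBlockMap` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: none of these types are Hadoop APIs.
final class AsyncBatchUpdateSketch {
    // Each map is written by exactly one single-threaded executor, so a
    // plain HashMap needs no extra locking.
    private final Map<String, Integer> nameCache = new HashMap<>();
    private final Map<String, Integer> blockMap = new HashMap<>();
    private final ExecutorService nameCacheExecutor = Executors.newSingleThreadExecutor();
    private final ExecutorService blockMapExecutor = Executors.newSingleThreadExecutor();

    /** Hands a batch to both executors without sharing the caller's list. */
    void addToCacheAndBlockMap(List<String> batch) {
        // Copy first: the caller may clear and reuse its list immediately.
        final List<String> copy = new ArrayList<>(batch);
        nameCacheExecutor.submit(() -> copy.forEach(n -> nameCache.merge(n, 1, Integer::sum)));
        blockMapExecutor.submit(() -> copy.forEach(n -> blockMap.merge(n, 1, Integer::sum)));
    }

    /** Stops accepting work and waits for the queued batches to finish. */
    void shutdownAndWait() throws InterruptedException {
        nameCacheExecutor.shutdown();
        blockMapExecutor.shutdown();
        nameCacheExecutor.awaitTermination(1, TimeUnit.MINUTES);
        blockMapExecutor.awaitTermination(1, TimeUnit.MINUTES);
    }

    int nameCacheSize() {
        return nameCache.size();
    }

    int blockMapSize() {
        return blockMap.size();
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncBatchUpdateSketch sketch = new AsyncBatchUpdateSketch();
        List<String> batch = new ArrayList<>();
        batch.add("a");
        batch.add("b");
        sketch.addToCacheAndBlockMap(batch);
        batch.clear(); // safe: the executors work on their own copy
        batch.add("c");
        sketch.addToCacheAndBlockMap(batch);
        sketch.shutdownAndWait();
        System.out.println(sketch.nameCacheSize() + " names, "
            + sketch.blockMapSize() + " block entries");
    }
}
```

The copy is what lets the loading thread keep reusing its batch list while the two background threads drain earlier batches. Whether the real patch shares one copy between both executors is not shown in the quoted fragment, so that detail is an assumption here.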
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174108#comment-17174108 ]

Hadoop QA commented on HDFS-15493:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 54s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 23m 52s | trunk passed |
| +1 | compile | 1m 32s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 16s | trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +1 | checkstyle | 0m 53s | trunk passed |
| +1 | mvnsite | 1m 23s | trunk passed |
| +1 | shadedclient | 19m 3s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 32s | trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| 0 | spotbugs | 3m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 16s | the patch passed |
| +1 | compile | 1m 22s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 22s | the patch passed |
| +1 | compile | 1m 9s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +1 | javac | 1m 9s | the patch passed |
| +1 | checkstyle | 0m 45s | the patch passed |
| +1 | mvnsite | 1m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 38s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 20s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 22s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +1 | findbugs | 2m 58s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 124m 45s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| |
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174042#comment-17174042 ]

Chengwei Wang commented on HDFS-15493:
--
Submitted patch v006 [^HDFS-15493.006.patch].

Removed the redundant `fillUpInodeList(...)` call and fixed the checkstyle and JavaDoc problems.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174019#comment-17174019 ]

Chengwei Wang commented on HDFS-15493:
--
Hi [~sodonnell], thanks for your detailed review.

{quote}Before this change, there may have been some inodes which were only referred to in references, but those INodes must be in the inode section of the image. Therefore we will handle all inodes mentioned in the references already with the change in `loadINodesInSection(...)` and hence should remove the `fillUpInodeList(...)` call below - do you think that is correct?
{quote}

I think you are absolutely right. Each INodeReference is constructed with an inode obtained from the inodeMap, which has already been loaded during `loadINodesInSection(...)`, so it makes no sense to call `fillUpInodeList(...)` again.

I will remove the redundant call and fix the checkstyle and JavaDoc problems.
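The argument above reduces to a set-containment claim: every inode that a reference points at is already present in the inode map built from the inode section, so a second pass over the references adds nothing new. The toy sketch below only illustrates that reasoning. Inode ids are plain `long` values and all names are hypothetical; it does not use or model the real `INodeReference` class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration of the containment argument; not Hadoop code.
final class ReferenceCoverageSketch {

    /** Returns the referred inode ids that the inode-section pass did not cover. */
    static Set<Long> missedByInodePass(Map<Long, String> inodeMap, List<Long> referredIds) {
        Set<Long> missed = new HashSet<>();
        for (long id : referredIds) {
            if (!inodeMap.containsKey(id)) {
                missed.add(id);
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        // Inodes loaded from the (hypothetical) inode section.
        Map<Long, String> inodeMap = new HashMap<>();
        inodeMap.put(1L, "dir");
        inodeMap.put(2L, "file-a");
        inodeMap.put(3L, "file-b");
        // References created by snapshots and renames point at existing inodes.
        List<Long> referred = Arrays.asList(2L, 3L);
        System.out.println("ids missed by the inode pass: "
            + missedByInodePass(inodeMap, referred).size());
    }
}
```

If the premise holds, the missed set is always empty, which is why the extra `fillUpInodeList(...)` call would only repeat work.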
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173653#comment-17173653 ]

Stephen O'Donnell commented on HDFS-15493:
--

Hi [~smarthan]

Thanks for the latest patch, and sorry for the slow review. I missed the email telling me you had uploaded it. There are just a few minor points to address and one more difficult one:

1. There is a checkstyle warning on line 34 in FSImageFormatPBINode.java for an unused import.

2. In waitExecutorTerminated, I think the line `long start = System.currentTimeMillis();` should be above the while loop, so that start is not reset on each iteration?

{code}
private void waitExecutorTerminated(ExecutorService executorService)
    throws IOException {
  executorService.shutdown();
  while (!executorService.isTerminated()) {
    long start = System.currentTimeMillis();
    try {
      executorService.awaitTermination(1, TimeUnit.SECONDS);
      if (LOG.isDebugEnabled()) {
        LOG.debug("Waiting to executor service terminated duration {}ms.",
            (System.currentTimeMillis() - start));
      }
    } catch (InterruptedException e) {
      LOG.error("Interrupted waiting for executor terminated.", e);
      throw new IOException(e);
    }
{code}

3. Our change loads the inode cache and block map for all inodes in the image while loading the inode section. However, when we load the inodeDirectory section, we still call `fillUpInodeList(...)` for the `RefChildrenList`. Looking at the code, an inodeReference always refers to another inode, and it is that referred inode which is added to the cache and blocks map. Before this change, there may have been some inodes which were only referred to in references, but those INodes must still be in the inode section of the image. Therefore the change in `loadINodesInSection(...)` already handles every inode mentioned in the references, and hence we should remove the `fillUpInodeList(...)` call below. Do you think that is correct? Note that I don't think these extra calls will do any harm, but they will probably result in more work on an image with lots of references (i.e. snapshots).

{code}
for (int refId : e.getRefChildrenList()) {
  INodeReference ref = refList.get(refId);
  if (addToParent(p, ref)) {
    fillUpInodeList(inodeList, ref);
  } else {
    LOG.warn("Failed to add the inode reference {} to the directory {}",
        ref.getId(), p.getId());
  }
}
{code}

4. Please add a short JavaDoc to `addToCacheInternal(...)` and `updateBlockMapInternal(...)` indicating that they can only be run by a single thread, as they modify non-thread-safe data structures. This will help ensure someone does not make a mistake using them in the future.

5. Maybe add a comment above the two Executor definitions, e.g. "These executors must be single threaded, as they are used to modify structures which are not thread safe", again to warn someone in the future not to change this:

{code}
blocksMapUpdateExecutor = Executors.newSingleThreadExecutor();
nameCacheUpdateExecutor = Executors.newSingleThreadExecutor();
{code}

After addressing these few points, I think this change will be good to commit.
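To illustrate review point 2, here is a minimal sketch of the wait loop with the start timestamp captured once, before the loop. It is not the patch's code: the class name `ExecutorWaitSketch`, the `long` return value, and the use of `System.out` in place of the `LOG` logger are assumptions made so that the example runs on its own.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone class; only the loop structure mirrors the review comment.
public class ExecutorWaitSketch {

  /**
   * Shuts the executor down and blocks until all queued tasks have finished.
   * Returns the total wait in milliseconds (an addition for this sketch).
   */
  static long waitExecutorTerminated(ExecutorService executorService)
      throws IOException {
    executorService.shutdown();
    // Captured once, so the reported duration accumulates across iterations
    // instead of being reset every second.
    long start = System.currentTimeMillis();
    while (!executorService.isTerminated()) {
      try {
        executorService.awaitTermination(1, TimeUnit.SECONDS);
        System.out.printf("Waited %d ms for executor to terminate.%n",
            System.currentTimeMillis() - start);
      } catch (InterruptedException e) {
        // Restore the interrupt flag before converting to IOException.
        Thread.currentThread().interrupt();
        throw new IOException(e);
      }
    }
    return System.currentTimeMillis() - start;
  }

  public static void main(String[] args) throws IOException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(() -> {
      try {
        Thread.sleep(1500); // simulated long-running update task
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    long waited = waitExecutorTerminated(executor);
    System.out.println("Total wait: " + waited + " ms");
  }
}
```

With the timestamp inside the loop, each log line could never report much more than the one-second `awaitTermination` timeout, however long the executor actually took to drain.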
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173604#comment-17173604 ]

Chengwei Wang commented on HDFS-15493:
--

Hi [~sodonnell], could you help review this new patch?
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169722#comment-17169722 ]

Hadoop QA commented on HDFS-15493:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 46s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 1s | trunk passed |
| +1 | compile | 1m 13s | trunk passed |
| +1 | checkstyle | 0m 48s | trunk passed |
| +1 | mvnsite | 1m 15s | trunk passed |
| +1 | shadedclient | 16m 4s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 45s | trunk passed |
| 0 | spotbugs | 2m 57s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 1m 1s | the patch passed |
| +1 | javac | 1m 1s | the patch passed |
| -0 | checkstyle | 0m 39s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 44 unchanged - 0 fixed = 45 total (was 44) |
| +1 | mvnsite | 1m 7s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 43s | the patch passed |
| +1 | findbugs | 3m 1s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 93m 42s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 44s | The patch does not generate ASF License warnings. |
| | | 158m 53s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| | hadoop.hdfs.server.datanode.TestBPOfferService |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/28/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15493 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008908/HDFS-15493.005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8c7d76282d14 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169672#comment-17169672 ]

Chengwei Wang commented on HDFS-15493:
--

Submitted a new patch, [^HDFS-15493.005.patch], which removes the switch for this feature and simplifies some related code.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169665#comment-17169665 ]

Chengwei Wang commented on HDFS-15493:
--

Hi [~sodonnell], thanks for your review.

{quote}I think we should create a new patch where the feature cannot be enabled / disabled - just have it always on, as I cannot think of a good reason someone should turn it off, and it will make the code simpler if we just remove the switch. What do you think?{quote}

I agree with you. It is better to remove the switch for this feature. I will remove it and submit a new patch soon.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169228#comment-17169228 ]

Hadoop QA commented on HDFS-15493:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 43s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 11s | trunk passed |
| +1 | compile | 1m 9s | trunk passed |
| +1 | checkstyle | 0m 59s | trunk passed |
| +1 | mvnsite | 1m 17s | trunk passed |
| +1 | shadedclient | 15m 55s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 46s | trunk passed |
| 0 | spotbugs | 3m 3s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 0s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 9s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvnsite | 1m 8s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 42s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | the patch passed |
| +1 | findbugs | 3m 0s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 95m 17s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | 160m 40s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| | hadoop.tools.TestHdfsConfigFields |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/22/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15493 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008825/HDFS-15493.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 1c92a7f6c99f 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / a7fda2e38f2 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| unit |
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169199#comment-17169199 ]

Hadoop QA commented on HDFS-15493:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 30s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 36s | trunk passed |
| +1 | compile | 1m 8s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 1m 13s | trunk passed |
| +1 | shadedclient | 17m 16s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | trunk passed |
| 0 | spotbugs | 3m 10s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 7s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 47s | the patch passed |
| +1 | mvnsite | 1m 7s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 18s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 38s | the patch passed |
| +1 | findbugs | 3m 8s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 108m 49s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
| | | 179m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader |
| | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
| | hadoop.hdfs.TestGetFileChecksum |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
| | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.tools.TestHdfsConfigFields |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/17/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15493 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008825/HDFS-15493.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux de273de581b4 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168917#comment-17168917 ]

Stephen O'Donnell commented on HDFS-15493:
--

I tested with the 004 patch:

* Parallel Load on + this feature on = 209 / 207 seconds
* Parallel Load on + this feature off = 225 / 228 seconds
* Parallel Load off + this feature off = 370 / 408 seconds
* Parallel Load off + this feature on = 325 / 341 seconds

This new patch improves things significantly, so I think we should go forward with this technique.

I think we should create a new patch where the feature cannot be enabled / disabled - just have it always on, as I cannot think of a good reason someone should turn it off, and it will make the code simpler if we just remove the switch. What do you think?
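As a rough worked calculation on the four timings above, the following sketch averages each pair of runs and computes the relative reduction in load time with the feature on. The class and method names are hypothetical, and averaging two runs is an assumption of this sketch, not something stated in the comment.

```java
// Hypothetical helper for comparing the reported fsimage load times.
public class LoadTimeComparison {

  /** Mean of two run times, in seconds. */
  static double mean(double a, double b) {
    return (a + b) / 2.0;
  }

  /** Relative reduction of improved versus baseline, as a percentage. */
  static double reductionPercent(double baseline, double improved) {
    return 100.0 * (baseline - improved) / baseline;
  }

  public static void main(String[] args) {
    // Timings in seconds, copied from the test results in the comment.
    double parallelOnFeatureOff = mean(225, 228);
    double parallelOnFeatureOn = mean(209, 207);
    double parallelOffFeatureOff = mean(370, 408);
    double parallelOffFeatureOn = mean(325, 341);

    System.out.printf("Parallel load on:  %.1f%% faster with the feature%n",
        reductionPercent(parallelOnFeatureOff, parallelOnFeatureOn));
    System.out.printf("Parallel load off: %.1f%% faster with the feature%n",
        reductionPercent(parallelOffFeatureOff, parallelOffFeatureOn));
  }
}
```

On these numbers the reduction works out to roughly 8% with parallel loading on and roughly 14% with it off, which fits the reading that the patch helps in both configurations.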
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168670#comment-17168670 ]

Stephen O'Donnell commented on HDFS-15493:
--

{quote}After reviewed code about update blocks map and name cache carefully,I found that it's feasible to start to do these when started loading INodeSection, and shutdown the executors when completed loading INodeDirectorySection{quote}

That is a good idea - I had not thought of doing this. Both the cache and the block map work with inodes, so it's strange that the existing code performed these steps in the Directory section. I will try to test performance on trunk today.

One suggestion / question - can you think of any reason someone would want to disable this new feature? Making it optional makes the code slightly more complex, and I cannot really think of a reason why it would make sense to disable it (assuming it has no bugs). My preference would be to remove the configuration switch and just make it always on.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168635#comment-17168635 ]

Chengwei Wang commented on HDFS-15493:
--

After carefully reviewing the code that updates the blocks map and name cache, I found that it is feasible to start these updates when loading of the INodeSection begins, and to shut down the executors once loading of the INodeDirectorySection completes. That way, almost no time is spent waiting for the executors to terminate.

I submitted a patch, [^HDFS-15493.004.patch], based on this approach. It uses two single-thread executors and updates without a lock. I tested this patch twice.

{code:java}
Test1.
20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
20/07/31 18:27:51 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 367 seconds.

Test2.
20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
20/07/31 18:48:04 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 363 seconds.
{code}

Based on my tests, this gives a speedup of about 20%, reducing the load time from 460s+ to 360s+. I think this patch may be the best choice. [~sodonnell], could you help me test it on trunk?
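The approach described in the comment above (two single-thread executors, each owning one non-thread-safe structure, fed while inodes are loaded) can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the class `ParallelUpdateSketch`, the `HashMap` stand-ins for the name cache and blocks map, and the `loadInode` and `finish` methods are hypothetical. Only the executor names and the `addToCacheInternal` / `updateBlockMapInternal` method names are taken from the thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone model of the update pipeline.
public class ParallelUpdateSketch {
  // Stand-ins for the name cache and blocks map; HashMap is not thread safe.
  private final Map<String, Integer> nameCache = new HashMap<>();
  private final Map<Long, Long> blocksMap = new HashMap<>();

  // These executors must be single threaded, as they are used to modify
  // structures which are not thread safe.
  private final ExecutorService nameCacheUpdateExecutor =
      Executors.newSingleThreadExecutor();
  private final ExecutorService blocksMapUpdateExecutor =
      Executors.newSingleThreadExecutor();

  /** Must only be run by the single name cache update thread. */
  private void addToCacheInternal(String name) {
    nameCache.merge(name, 1, Integer::sum);
  }

  /** Must only be run by the single blocks map update thread. */
  private void updateBlockMapInternal(long blockId, long inodeId) {
    blocksMap.put(blockId, inodeId);
  }

  /** Called by the loading thread for each inode; updates run asynchronously. */
  void loadInode(long inodeId, String name, long blockId) {
    nameCacheUpdateExecutor.execute(() -> addToCacheInternal(name));
    blocksMapUpdateExecutor.execute(() -> updateBlockMapInternal(blockId, inodeId));
  }

  /** Called once loading completes; drains both queues before returning. */
  void finish() throws InterruptedException {
    nameCacheUpdateExecutor.shutdown();
    blocksMapUpdateExecutor.shutdown();
    nameCacheUpdateExecutor.awaitTermination(1, TimeUnit.MINUTES);
    blocksMapUpdateExecutor.awaitTermination(1, TimeUnit.MINUTES);
  }

  int nameCount() { return nameCache.size(); }

  int blockCount() { return blocksMap.size(); }

  public static void main(String[] args) throws InterruptedException {
    ParallelUpdateSketch sketch = new ParallelUpdateSketch();
    for (long i = 0; i < 1000; i++) {
      sketch.loadInode(i, "name" + (i % 10), i + 5000);
    }
    sketch.finish();
    System.out.println(sketch.nameCount() + " names, "
        + sketch.blockCount() + " blocks");
  }
}
```

Because each structure is only ever touched by its own executor thread, no lock is needed, and submitting work from the start of inode loading leaves little queued work to wait for at the end.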
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168560#comment-17168560 ]

Chengwei Wang commented on HDFS-15493:
--

Submitted the v003 patch, [^HDFS-15493.003.patch]. It is based on two single-thread executors and removes the update lock. I tested this patch twice:

{code:java}
Test1.
20/07/31 16:12:17 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 16:12:36 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 18615
20/07/31 16:12:37 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 431 seconds.

Test2.
20/07/31 16:39:20 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 16:39:27 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 7151
20/07/31 16:39:28 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 425 seconds.
{code}
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168506#comment-17168506 ] Chengwei Wang commented on HDFS-15493: -- Submit v002 patch [^HDFS-15493.002.patch]. Base on one executor with 4 threads, added a unit test, refactor code to shutdown executor and added waiting time logging. I had tested this patch twice: {code:java} Test 1. 20/07/31 14:21:17 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 14:21:22 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, waiting timeduration(ms): 5161 20/07/31 14:21:23 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 409 seconds. Test 2. 20/07/31 16:00:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections 20/07/31 16:00:16 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, waiting time duration(ms): 12105 20/07/31 16:00:17 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 424 seconds. {code} > Update block map and name cache in parallel while loading fsimage. > -- > > Key: HDFS-15493 > URL: https://issues.apache.org/jira/browse/HDFS-15493 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Chengwei Wang >Priority: Major > Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, > fsimage-loading.log > > > While loading INodeDirectorySection of fsimage, it will update name cache and > block map after added inode file to inode directory. It would reduce time > cost of fsimage loading to enable these steps run in parallel. > In our test case, with patch HDFS-13694 and HDFS-14617, the time cost to load > fsimage (220M files & 240M blocks) is 470s, with this patch , the time cost > reduc to 410s. 
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168347#comment-17168347 ]

Chengwei Wang commented on HDFS-15493:

{quote}Therefore setting it to 500 or 1000ms and logging a message each time around the loop should not give any time penalty, but should give us some information about what is happening. {quote}
Yes, you are exactly right! A longer waiting time and the logging would be useful; I will add both.
{quote}How long does the shutdown take with the single 4 thread executor? {quote}
I just assumed the waiting time was the time from `Completed loading all INodeDirectory sub-sections` until fsimage loading finished.
{code:java}
20/07/31 10:25:59 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 10:26:22 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 431 seconds.
{code}
{quote}Are you testing this on the trunk code + this patch, or a different version plus this patch? {quote}
I tested this patch on our dev branch, which is based on CDH5.10.0 with many patches; the version should be somewhere between 2.6.0 and 2.8.0.
{quote}Could you try testing 2 executors with 2 threads each? {quote}
I tested this after testing two single-thread executors; the time cost was between 420s and 430s.

I will submit 3 new patches:
# one executor with 4 threads, with waiting time logging
# two single-thread executors, with waiting time logging and without the lock
# two fixed 2-thread executors, with the lock and waiting time logging

Let's test which one performs best.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167831#comment-17167831 ]

Stephen O'Donnell commented on HDFS-15493:

{quote} So, awaitTermination 1 ms would make executor shutdown quickly. {quote}
I believe if you specify a timeout of 500ms, and the threads all finish in 5ms, the call will return as soon as they finish. Therefore setting it to 500 or 1000ms and logging a message each time around the loop should not give any time penalty, but should give us some information about what is happening.
{quote} with the same fsimage, the time cost would increase to 430s with about 10s+ time to wait two executors shutdown. {quote}
How long does the shutdown take with the single 4 thread executor? I cannot see how multiple threads help, as both the methods have a lock right at the start. If multiple threads make it faster, then it would suggest the time taken to pick the task from the queue and start it running is significant.

Are you testing this on the trunk code + this patch, or a different version plus this patch?

Could you try testing 2 executors with 2 threads each?
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167604#comment-17167604 ]

Chengwei Wang commented on HDFS-15493:

Hi [~sodonnell], sorry for missing some messages.
{quote}I mis-understood how it worked, as I thought `awaitTermination(...)` threw an exception after the timeout, which is not the case. {quote}
I think there was a misunderstanding about `awaitTermination(...)`. It works like Object.wait(long timeout): when the timeout elapses it simply stops blocking rather than throwing an exception.
{code:java}
/**
 * Blocks until all tasks have completed execution after a shutdown
 * request, or the timeout occurs, or the current thread is
 * interrupted, whichever happens first.
 *
 * @param timeout the maximum time to wait
 * @param unit the time unit of the timeout argument
 * @return {@code true} if this executor terminated and
 *         {@code false} if the timeout elapsed before termination
 * @throws InterruptedException if interrupted while waiting
 */
boolean awaitTermination(long timeout, TimeUnit unit)
    throws InterruptedException;
{code}
So awaitTermination with a 1 ms timeout lets us detect the executor shutdown quickly.
{quote}Did you find the runtime was about the same with a single executor with 4 threads and two executors with a single thread? As my testing showed a small improvement with the two single threaded executors case, and as locking prevents more than one thread to run concurrently, I think it would be better to go with the two executors with a single thread. {quote}
I understand what you mean about the runtime. Intuitively, two single-thread executors should perform better than one fixed-size multi-thread executor. But yesterday, after replying, I tested updating the blocks map and name cache with two single-thread executors and the lock removed. With the same fsimage, the time cost increased to 430s, including more than 10s spent waiting for the two executors to shut down. So I'm not sure that two single-thread executors would perform better.
For more info, our fsimage has few snapshots, so loading the fsimage finishes almost as soon as loadINodeDirectorySection finishes. In other words, delaying the executor shutdown would not help in our case.
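The `awaitTermination(...)` behaviour discussed in this comment can be checked with a small standalone program. This is only a sketch: the class name `AwaitTerminationDemo` and the helper `terminatesWithin` are made up for illustration and are not part of the patch. It relies only on the standard `java.util.concurrent` API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AwaitTerminationDemo {

  /**
   * Hypothetical helper: submits one task that sleeps for taskMillis,
   * requests shutdown, then waits at most timeoutMillis.
   * Returns whatever awaitTermination returned.
   */
  public static boolean terminatesWithin(long taskMillis, long timeoutMillis)
      throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(() -> {
      try {
        Thread.sleep(taskMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } finally {
      // Demo cleanup only: interrupt the task if it is still sleeping.
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Timeout shorter than the task: the call returns false, nothing is thrown.
    System.out.println("1 ms timeout, 300 ms task: " + terminatesWithin(300, 1));

    // Timeout much longer than the task: the call returns true as soon as
    // the task is done, without waiting for the full timeout.
    long start = System.nanoTime();
    boolean done = terminatesWithin(5, 5000);
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("5 s timeout, 5 ms task: " + done + " after ~" + elapsedMs + " ms");
  }
}
```

A large timeout is therefore an upper bound on the wait rather than a fixed delay, which is why a 500 ms or 1000 ms timeout should not slow down the case where the executor has already caught up.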
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167143#comment-17167143 ]

Stephen O'Donnell commented on HDFS-15493:

{quote} I had tested loading the caches and blocks by two single thread executors, same to your test result, there would be a long time to wait the executors terminated, so the time cost was not better than the one executor with four threads. {quote}
Did you find the runtime was about the same with a single executor with 4 threads and two executors with a single thread? As my testing showed a small improvement with the two single threaded executors case, and as locking prevents more than one thread from running concurrently, I think it would be better to go with the two executors with a single thread. I think the time required for the executors to shut down should be about the same in both cases.

I also made an earlier comment on this code:
{code}
if (blocksMapUpdateExecutor != null) {
  blocksMapUpdateExecutor.shutdown();
  try {
    while (!blocksMapUpdateExecutor.isTerminated()) {
      blocksMapUpdateExecutor.awaitTermination(1, TimeUnit.MILLISECONDS);
    }
  } catch (InterruptedException e) {
    LOG.error("Interrupted waiting for blocksMap update threads.", e);
    throw new IOException(e);
  }
}
{code}
I misunderstood how it worked, as I thought `awaitTermination(...)` threw an exception after the timeout, which is not the case. However, I think it makes sense to wait 500 or 1000ms rather than 1ms, and log a message indicating the executor is not yet shut down. Or, we could time how long it takes to shut down and log a message after the shutdown completes. That means we will get some visibility into how long the executors take to catch up.

Also, for info, I ran my tests on trunk and the image also had some snapshots which will have extended the load time.
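The two ideas in the comment above, a longer poll interval with a progress message and a single message with the total shutdown time, could be combined roughly as follows. This is a sketch under stated assumptions, not the code from any attached patch: the class `ExecutorDrain`, the method `shutdownAndWait` and the message texts are hypothetical, and `System.out` stands in for the NameNode logger so the example runs on its own.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExecutorDrain {

  /**
   * Hypothetical helper: requests shutdown, then polls awaitTermination
   * every pollMillis until the executor has terminated. Prints a message
   * for every poll that timed out and one final message with the total
   * wait. Returns the number of polls that timed out.
   */
  public static int shutdownAndWait(ExecutorService executor, String name,
      long pollMillis) throws InterruptedException {
    long start = System.nanoTime();
    int timedOutPolls = 0;
    executor.shutdown();
    while (!executor.awaitTermination(pollMillis, TimeUnit.MILLISECONDS)) {
      timedOutPolls++;
      System.out.println(name + " executor not yet terminated after "
          + elapsedMillis(start) + " ms");
    }
    System.out.println(name + " executor terminated, total wait "
        + elapsedMillis(start) + " ms");
    return timedOutPolls;
  }

  private static long elapsedMillis(long startNanos) {
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    // Simulated backlog: one task that needs about 250 ms to finish.
    executor.execute(() -> {
      try {
        Thread.sleep(250);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    shutdownAndWait(executor, "blocksMap", 100);
  }
}
```

Because `awaitTermination` returns as soon as the executor terminates, the poll interval only controls how often the progress message appears; it does not add to the total load time.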
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166818#comment-17166818 ]

Chengwei Wang commented on HDFS-15493:

Hi [~sodonnell], thanks for your detailed review and testing.
{quote}When you tested, are you sure the parallel loading in HDFS-14617 was enabled correctly, by first saving the image to create the sub-sections in the image index? If it is working correctly, you should see log messages like: {quote}
I'm sure that the parallel loading was enabled correctly. I tested again yesterday following your test suggestions and attached a summary log here: [^fsimage-loading.log]

In my tests (240M inodes + 220M blocks), when the async block update is enabled, the time cost of loading the fsimage drops from 467s to 420s. So I wonder whether the scale of the fsimage makes the loading improvement less obvious.
{quote}It would be very interesting to check the performance of my earlier suggestion with two single threaded executors and see how it performs. {quote}
I tested loading the caches and blocks with two single-thread executors. As in your test results, there was a long wait for the executors to terminate, so the time cost was no better than with one executor with four threads.
{quote}If we could move the executor shutdown to the end of image loading, rather than wait on it, we would see a good improvement in the parallel case too. However, I am not sure if that is a safe thing to do - other sections may depend on the block map / cache being loaded fully when the inode directory section has completed. {quote}
I agree this idea is better. I will try to check whether it is safe and report a test result.

By the way, I will refactor some code following your suggestions and submit a patch soon.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166393#comment-17166393 ]

Stephen O'Donnell commented on HDFS-15493:

I did a bit more testing:

1. Changed the code to have two single threaded executors - one for cache Map and one for Block Map
2. Added a debug message to let me know how long the executor service is taking to shut down.

With the parallel image loading disabled - the runtime is about the same or marginally better with the two single thread executors vs 1 executor with 4 threads.

With parallel on and:

* Two single threaded executors: 229 / 226 seconds (about 26 seconds waiting on executors to shut down)
* One executor with 4 threads: 243 / 238 seconds (this is a small performance degradation)
* Feature disabled: 235 / 230 seconds

There are two times for each run, as I ran each option twice.

From this, I believe two single threaded executors are the best choice.

An interesting point from the parallel case with the single thread executors - the threadpools are taking about 25 - 30 seconds to shut down. This means that the single thread cannot keep up with processing the number of tasks. Adding more threads will not help due to locking. In the serial case the executors shut down almost immediately, indicating they can keep up.

If we could move the executor shutdown to the end of image loading, rather than wait on it, we would see a good improvement in the parallel case too. However, I am not sure if that is a safe thing to do - other sections may depend on the block map / cache being loaded fully when the inode directory section has completed.
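The "two single threaded executors" layout tested above can be sketched as follows. Everything here is illustrative: `TwoExecutorSketch`, `addFile` and `finish` are hypothetical names, and the two `HashMap` fields are stand-ins for the real name cache and block map, which live in the NameNode and take their own locks. In this simplified sketch each map is only ever touched by its own executor thread, so the sketch itself needs no lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TwoExecutorSketch {
  // Hypothetical stand-ins for the name cache and the block map.
  private final Map<String, Integer> nameCache = new HashMap<>();
  private final Map<Long, String> blocksMap = new HashMap<>();

  // One single-threaded executor per structure: tasks for the same
  // structure run one at a time, in submission order.
  private final ExecutorService nameCacheExecutor = Executors.newSingleThreadExecutor();
  private final ExecutorService blocksMapExecutor = Executors.newSingleThreadExecutor();

  /** Called by the loading thread; both updates are queued and run in the background. */
  public void addFile(String name, List<Long> blockIds) {
    List<Long> ids = new ArrayList<>(blockIds); // defensive copy for the background task
    nameCacheExecutor.execute(() -> nameCache.merge(name, 1, Integer::sum));
    blocksMapExecutor.execute(() -> {
      for (Long id : ids) {
        blocksMap.put(id, name);
      }
    });
  }

  /** Waits until both executors have processed everything that was queued. */
  public void finish() throws InterruptedException {
    nameCacheExecutor.shutdown();
    blocksMapExecutor.shutdown();
    while (!nameCacheExecutor.awaitTermination(1, TimeUnit.SECONDS)) {
      System.out.println("still waiting for name cache updates");
    }
    while (!blocksMapExecutor.awaitTermination(1, TimeUnit.SECONDS)) {
      System.out.println("still waiting for block map updates");
    }
  }

  public int nameCacheSize() {
    return nameCache.size();
  }

  public int blocksMapSize() {
    return blocksMap.size();
  }

  public static void main(String[] args) throws InterruptedException {
    TwoExecutorSketch loader = new TwoExecutorSketch();
    loader.addFile("a", Arrays.asList(1L, 2L));
    loader.addFile("b", Arrays.asList(3L));
    loader.finish();
    System.out.println(loader.nameCacheSize() + " names, " + loader.blocksMapSize() + " blocks");
  }
}
```

If the loading thread produces tasks faster than a single background thread can consume them, the queue grows and `finish()` has to wait for the backlog, which matches the 25 - 30 second shutdown wait reported for the parallel case.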
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165980#comment-17165980 ]

Stephen O'Donnell commented on HDFS-15493:

I tested this change with an image of 9GB, 86M inodes and 74M blocks.

My load time with parallel loading off and this new async loading off is about 384 seconds.

Turning on only the new async block map loading, the load time is reduced to about 337 seconds.

With parallel loading on - 4 threads and 12 sub-sections, and the async block map off, the load time is about 236 seconds.

Finally, turning on parallel loading and async block map, the load time increased to about 245 seconds.

Therefore in my tests, this change slows down the parallel load slightly, but it does provide about 13% speed up with serial loading.

When you tested, are you sure the parallel loading in HDFS-14617 was enabled correctly, by first saving the image to create the sub-sections in the image index? If it is working correctly, you should see log messages like:
{code}
2020-07-27 20:21:06,566 INFO namenode.FSImageFormatProtobuf: The fsimage will be loaded in parallel using 4 threads
2020-07-27 20:21:06,611 INFO namenode.FSImageFormatPBINode: Loading the INode section in parallel with 12 sub-sections
2020-07-27 20:21:06,613 INFO namenode.FSImageFormatPBINode: Loading 86398618 INodes.
2020-07-27 20:21:10,855 INFO util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3674ms
GC pool 'ParNew' had collection(s): count=1 time=4150ms
2020-07-27 20:22:49,827 INFO namenode.FSImageFormatPBINode: Completed loading all INode sections. Loaded 86398618 inodes.
2020-07-27 20:22:51,141 INFO namenode.FSImageFormatPBINode: Loading the INodeDirectory section in parallel with 12 sub-sections
2020-07-27 20:23:23,373 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
{code}
It would be very interesting to check the performance of my earlier suggestion with two single threaded executors and see how it performs.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165674#comment-17165674 ]

Stephen O'Donnell commented on HDFS-15493:

Hi [~smarthan]. Thanks for this patch. I think it is a good idea - I have some thoughts below on things we should try, which might improve things further.

Only one thread can update the Cache map at a time and one can update the Block Map due to locking. The calls to these methods already process a batch, so they can hold the lock for a relatively long time. With that in mind, I wonder if the default of 4 threads makes sense - only 2 can ever be active at any time, and I think it would be possible for all 4 threads to be attempting to update the cacheMap when none are updating the blockMap. That means 2 or 3 threads will always be blocked. I think it would be worth testing two single threaded executor pools - one for the cacheMap and one for BlockMap and see if that performs the same or better - what do you think?

I am not sure if waiting only 1ms before failing would give enough time for the executor to complete pending tasks. It may be possible for there to be a lot of queued requests which take a few seconds to finish processing:
{code}
if (blocksMapUpdateExecutor != null) {
  blocksMapUpdateExecutor.shutdown();
  try {
    while (!blocksMapUpdateExecutor.isTerminated()) {
      blocksMapUpdateExecutor.awaitTermination(1, TimeUnit.MILLISECONDS);
    }
  } catch (InterruptedException e) {
    LOG.error("Interrupted waiting for blocksMap update threads.", e);
    throw new IOException(e);
  }
}
{code}
We could wait 5 seconds, and if there is a timeout, log a warning, and then wait again, perhaps 10 times before failing? This would also let us know if the INodeDirectory section load is having to wait on the new background tasks before the next stage can start.

I would like to avoid the changes in FSImageFormatProtobuf.loadInternal() and passing all the null values to `inodeLoader.loadINodeDirectorySection(...)` if we can. I understand those changes are needed to shut down the new executor. Therefore, let's wait and see how two single threaded executors work, and whether we need to wait on the thread pool to shut down, as that may influence how we shut down the executors. If there is a delay in the threadpools shutting down, then we could consider moving the `blocksMapUpdateExecutor.shutdown()` call into a Loader.shutdownExecutors() method which we call after loading all sections. Don't make this change until we see what happens with the other experiments above.

Can I also ask:

1. Did you try HDFS-13693 and did it make any further speed improvement?
2. Could you try my suggestion with two single threaded executors and see what difference it makes to the runtime?
3. Would you be able to run a test with HDFS-14617 disabled to give us an idea of how much HDFS-14617 improves things on its own?
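The "wait 5 seconds, log a warning, retry up to 10 times before failing" idea from this review could look roughly like the sketch below. The class `BoundedShutdownWait`, the method `awaitOrFail` and the messages are hypothetical, `System.err` stands in for the logger, and calling `shutdownNow()` before giving up is an assumption of this sketch rather than something the review asks for.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownWait {

  /**
   * Hypothetical helper: requests shutdown and waits up to maxAttempts
   * times waitMillis for termination, printing a warning after every
   * timed-out wait. Returns the attempt on which the executor terminated,
   * or throws IOException if it never did.
   */
  public static int awaitOrFail(ExecutorService executor, long waitMillis,
      int maxAttempts) throws IOException {
    executor.shutdown();
    try {
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        if (executor.awaitTermination(waitMillis, TimeUnit.MILLISECONDS)) {
          return attempt;
        }
        System.err.println("Executor not terminated after wait " + attempt
            + " of " + maxAttempts);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted waiting for update threads.", e);
    }
    // Assumption of this sketch: stop the remaining work before failing.
    executor.shutdownNow();
    throw new IOException("Executor did not terminate after " + maxAttempts
        + " waits of " + waitMillis + " ms");
  }

  public static void main(String[] args) throws IOException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    // Simulated backlog of about 150 ms.
    executor.execute(() -> {
      try {
        Thread.sleep(150);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    // The values from the review would be awaitOrFail(executor, 5000, 10).
    int attempt = awaitOrFail(executor, 100, 10);
    System.out.println("Terminated on attempt " + attempt);
  }
}
```

The number of warnings printed gives a rough measure of how far the background updates lag behind the section load, which is the visibility the review is asking for.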
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165525#comment-17165525 ] Chengwei Wang commented on HDFS-15493: -- Hi [~sodonnell][~hexiaoqiao][~weichiu], can you help to review this patch?
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165305#comment-17165305 ] Hadoop QA commented on HDFS-15493: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 0s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 38s{color} | {color:red} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}170m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.server.datanode.TestBPOfferService | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.tools.TestHdfsConfigFields | | | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader | | | hadoop.hdfs.TestGetFileChecksum | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/1/artifact/out/Dockerfile | | JIRA Issue | HDFS-15493 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008339/HDFS-15493.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux de69aac05d90 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164356#comment-17164356 ] Chengwei Wang commented on HDFS-15493: -- Thanks Stephen O'Donnell for your info about HDFS-13693, I will try to apply and test it. I'd really appreciate it if you can help me review this patch.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164336#comment-17164336 ] Stephen O'Donnell commented on HDFS-15493: -- This looks like another good speed improvement. I will try to review this in the next day or two. For info, there is also HDFS-13693 which may give you some additional improvement.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164312#comment-17164312 ]

Hadoop QA commented on HDFS-15493:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 0m 28s | Docker failed to build yetus/hadoop:cce5a6f6094. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15493 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008339/HDFS-15493.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29551/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.
[ https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164307#comment-17164307 ]

Chengwei Wang commented on HDFS-15493:
--------------------------------------

submit patch v001.