[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267388#comment-14267388 ] Hadoop QA commented on YARN-2996: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690501/YARN-2996.003.patch against trunk revision 788ee35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6266//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6266//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6266//console This message is automatically generated. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch, YARN-2996.003.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267991#comment-14267991 ] Zhijie Shen commented on YARN-2996: --- Almost good to me. Just one nit: can we reuse getFileStatus() method here too? {code} FileStatus status; try { status = fs.getFileStatus(amrmTokenSecretManagerStateDataDir); assert status.isFile(); } catch (FileNotFoundException ex) { return; } {code} Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch, YARN-2996.003.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265835#comment-14265835 ] Yi Liu commented on YARN-2996: -- Test failure and findbugs are not related. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267065#comment-14267065 ] Yi Liu commented on YARN-2996: -- Yes, Zhijie {quote} Good catch! It seems that MemoryRMStateStore#storeOrUpdateAMRMTokenSecretManagerState needs to be fixed too. {quote} {{.002}} patch already includes this fix. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267322#comment-14267322 ] Zhijie Shen commented on YARN-2996: --- My bad. I mean MemoryRMStateStore#updateRMDelegationTokenState. It contains two other synchronized methods, but it's better to keep them atomic, and not interpolated by other operations. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266536#comment-14266536 ] Zhijie Shen commented on YARN-2996: --- bq. we missed synchronized for updateRMDelegationTokenState Good catch! It seems that MemoryRMStateStore#storeOrUpdateAMRMTokenSecretManagerState needs to be fixed too. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265760#comment-14265760 ] Hadoop QA commented on YARN-2996: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690261/YARN-2996.002.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6249//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6249//console This message is automatically generated. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265707#comment-14265707 ] Yi Liu commented on YARN-2996: -- Thanks [~zjshen] for review. You are right, for *.new* and *.tmp* file, the existing code uses them for some check. But actually the incompatible issue you mentioned is really rare and it's not a big issue. {{checkAndResumeUpdateOperation}} exists because we write state to *.tmp* file, then rename it to *.new* file, and finally rename to _output\_file_. If we remove step of renaming to *.new* file, we can remove this function too. Anyway, I will revert this modification. So in the new patch, I only keep the #1 described in description. I add two new fixes in the new patch: *1.* we missed *synchronized* for {{updateRMDelegationTokenState}} *2.* Add fix of YARN-3004 to this patch, since {{MemoryRMStateStore}} is only used in test and we can fix them in this patch too. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264901#comment-14264901 ] Zhijie Shen commented on YARN-2996: --- bq. we can merge them to save one RPC call It sounds a good idea bq. we can reduce one rename operation. This will affect the mechanism that we use *.new* file to recover the actual state file when recovering RM. It needs to be take care of, too. Perhaps we can't simply remove the logic to recover actual state file from *.new* file, and I can think of a rare incompatible issue. See the following procedure: 1. Old FS RMStateStore writes state file and fails after .new file is created. 2. RM stops. 3. RM is upgraded and so does FS RMStateStore. 4. RM starts again. 5. New FS RMStateStore will not recover *.new* file, and may mistake it as a normal file. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263743#comment-14263743 ] Yi Liu commented on YARN-2996: -- The 3 tests failures are not related. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260891#comment-14260891 ] Hadoop QA commented on YARN-2996: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689441/YARN-2996.001.patch against trunk revision 249cc90. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart org.apache.hadoop.yarn.server.resourcemanager.TestRM Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6213//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6213//console This message is automatically generated. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)