[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893127#comment-16893127 ] CR Hota commented on HDFS-12748: [~cheersyang] We deployed this change in our clusters and this helped resolve NN mem leak issue. I can work on the GETFILEBLOCKLOCATIONS issue on branch-2. Was a new ticket created? I can create one if not. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888445#comment-16888445 ] Weiwei Yang commented on HDFS-12748: Agree, we need to get those fixed too. We can track in another issue, I can help to review but I don't have bandwith to work on patches. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888442#comment-16888442 ] Greg Senia commented on HDFS-12748: --- This will cause a Denial of Service by causing NN to OOM just as with GETHOMEDIRECTORY on any version less than Hadoop 3.x #!/bin/bash while true do curl -s -i --negotiate -u : "http://nn1.tech.hdp.example.com:50070/webhdfs/v1/tmp?op=GETFILEBLOCKLOCATIONS; > /dev/null 2>&1 done > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888016#comment-16888016 ] Greg Senia commented on HDFS-12748: --- [~xkrogen] and [~cheersyang] are there plans to fix getTrashRoot in 2.9/3.x branches and GETFILEBLOCKLOCATIONS in the 2.x branches? case GETFILEBLOCKLOCATIONS: { final long offsetValue = offset.getValue(); final Long lengthValue = length.getValue(); FileSystem fs = FileSystem.get(conf != null ? conf : new Configuration()); BlockLocation[] locations = fs.getFileBlockLocations( new org.apache.hadoop.fs.Path(fullpath), offsetValue, lengthValue != null? lengthValue: Long.MAX_VALUE); final String js = JsonUtil.toJsonString("BlockLocations", JsonUtil.toJsonMap(locations)); return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); } private static String getTrashRoot(String fullPath, Configuration conf) throws IOException { FileSystem fs = FileSystem.get(conf != null ? conf : new Configuration()); return fs.getTrashRoot( new org.apache.hadoop.fs.Path(fullPath)).toUri().getPath(); } > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882594#comment-16882594 ] Weiwei Yang commented on HDFS-12748: Thanks [~xkrogen], I've committed this to branch-3.1, and cherry picked to branch-3.0, branch-2, branch-2.9 and branch-2.8. Now it is fixed on all major branches. Closing it. Thanks for the review [~xkrogen], [~hanishakoneru], [~daryn]. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882234#comment-16882234 ] Erik Krogen commented on HDFS-12748: +1 on branch-3.1 backport > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881715#comment-16881715 ] Weiwei Yang commented on HDFS-12748: Hi [~xkrogen] Can you help to review the patch for branch-3.1? Thanks > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880951#comment-16880951 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 47s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 39s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 41s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 38s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 54s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}204m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:080e9d0 | | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973998/HDFS-12748-branch-3.1.01.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9fded123dd9b 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880600#comment-16880600 ] Erik Krogen commented on HDFS-12748: Sorry for my late reply; I was on vacation last week. The changes seem fine to me. Thanks for handling this [~cheersyang]! I am happy to help with reviews for backports. Can we go at least back to {{branch-2}}? > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878698#comment-16878698 ] Weiwei Yang commented on HDFS-12748: Committed to trunk and cherry-picked to branch-3.2. Looks like it needs some refactoring for earlier branches. I'll pause here, once [~xkrogen] confirms this is all OK. I can create patches for backport. Hope that makes sense. Thanks > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878548#comment-16878548 ] Hudson commented on HDFS-12748: --- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16861 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16861/]) HDFS-12748. NameNode memory leak when accessing webhdfs (wwei: rev 729cb3aefe71d7f728c7edea78ce7f268a1fdecb) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878265#comment-16878265 ] Weiwei Yang commented on HDFS-12748: Thanks for the +1, [~hanishakoneru], I am gonna commit this if no further comments from others. [~xkrogen], please take a look once you have time, we can still revisit this afterward. Thanks [~hanishakoneru], [~xkrogen]. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878204#comment-16878204 ] Hanisha Koneru commented on HDFS-12748: --- Hi [~xkrogen], does patch v05 look good to you? > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877160#comment-16877160 ] Hanisha Koneru commented on HDFS-12748: --- Thank you for working on this [~cheersyang]. LGTM. +1. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877042#comment-16877042 ] Weiwei Yang commented on HDFS-12748: There are some existing issues on branch-3.0. I will create another issue to track, for the patch itself, it is good to go. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876637#comment-16876637 ] Weiwei Yang commented on HDFS-12748: Ping [~xkrogen], could you please help to review this? Thank you. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch, HDFS-12748.005.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875409#comment-16875409 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 29s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}153m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973229/HDFS-12748.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7f11c27406e2 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk /
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875377#comment-16875377 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 2s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 48s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 55s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client | | | Unread field:DistributedFileSystem.java:[line 135] | | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.TestReconstructStripedFile | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973225/HDFS-12748.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875343#comment-16875343 ] Weiwei Yang commented on HDFS-12748: Hi [~xkrogen] Thanks for helping to review this, and sorry about the late response. I got pinged internally and users are running into this issue too. Let's work together to get this fixed. Your comments make sense to me, I have fixed both of them in v4 patch except the first {quote}I think rather than having a possibility of a null configuration and thus requiring a null check, it would be simpler to just supply a default conf object like what is done now. {quote} Are you suggesting we should have the null check before calling {{DFSUtilClient#getHomeDirectory}}? Why that is simpler? Thanks > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch, HDFS-12748.004.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871776#comment-16871776 ] Erik Krogen commented on HDFS-12748: [~cheersyang] are you still interested in working on this? The approach of the v3 patch seems good, I just have a few comments: * I think rather than having a possibility of a null configuration and thus requiring a null check, it would be simpler to just supply a default conf object like what is done now. * Is there actually a possibility of {{ugi}} being null? The {{System.getProperty()}} here doesn't seem great * {{new Path}} instead of {{new org.apache.hadoop.fs.Path}} ? * I don't think we need the {{toUri().toPath()}}; we can just call {{path.toString()}} directly since {{getHomeDirectory()}} doesn't return a fully qualified path > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868665#comment-16868665 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} HDFS-12748 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12896338/HDFS-12748.003.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27018/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868657#comment-16868657 ] Greg Senia commented on HDFS-12748: --- Should this be considered a DOS like attack?? [~daryn] is there any movement to get this fixed? > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868649#comment-16868649 ] Axton Grams commented on HDFS-12748: We're running into this issue. We can bring down our namenode with the following simple shell script: {code:java} #!/bin/bash while true do curl -i --negotiate -u : "http://namenode.fqdn:50070/webhdfs/v1/?op=GETHOMEDIRECTORY; done {code} This will cause the heap to grow until all memory is consumed and GC cannot reclaim memory. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16255283#comment-16255283 ] Weiwei Yang commented on HDFS-12748: [~daryn] any comments? > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241621#comment-16241621 ] Weiwei Yang commented on HDFS-12748: Hi [~daryn] Does v3 patch look good to you? Please let me know, thanks. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch, > HDFS-12748.003.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241598#comment-16241598 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 35s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}123m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}192m 43s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | | | hadoop.hdfs.server.namenode.ha.TestHASafeMode | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12896338/HDFS-12748.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7df8d37fcc0d 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233761#comment-16233761 ] Weiwei Yang commented on HDFS-12748: Thanks [~daryn], your comment makes sense to me. Just uploaded v2 patch, this patch pulls some common methods out for re-use, and remove the FileSystem call for GETHOMEDIRECTORY, please help to review, thanks. Note, GETTRASHROOT has same issue, but it requires more refactor (related to EC) to make it work consistent in webhdfs and HDFS, I think we need a separate JIRA to fix. Please let me know if this makes sense, thanks. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: HDFS-12748.001.patch, HDFS-12748.002.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226859#comment-16226859 ] Daryn Sharp commented on HDFS-12748: No. Absolutely not. While it "fixes" the reported issue, it causes the filesystem to always be removed from the cache which is wrong in the general case. The real problem is the NN should _never_ be using a filesystem instance for itself. In this case, it's creating a filesystem just to return /user/blah. The GETHOMEDIRECTORY method can just construct the string w/o creating any filesystem instances. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang > Attachments: HDFS-12748.001.patch > > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226836#comment-16226836 ] Hadoop QA commented on HDFS-12748: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 6s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 1s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}181m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.net.TestClusterTopology | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.TestEncryptionZonesWithKMS | | | hadoop.hdfs.server.namenode.TestStartup | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12748 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894955/HDFS-12748.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ec35a5d4fb3a 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c02d2ba | | maven | version: Apache Maven 3.3.9 | | Default Java
[jira] [Commented] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226534#comment-16226534 ] Weiwei Yang commented on HDFS-12748: Hi [~yangjiandan] Thanks for reporting this issue, the leak must be fixed. I took a look at {{NamenodeWebHdfsMethods}}, there are currently 2 places calling FileSystem.get() without closing. Can we fix them both? bq. which contains 7,844,890 DistributedFileSystem First impression I was thinking the {{FileSystem#Cache}} is not working properly, but after looking into the code, I think it is working as expected. The cache stores FileSystem instances distinguished by {{FileSystem.Cache.Key}} which includes a UGI field. According to HADOOP-6670, UGI is distinguished by the {{Subject}}. In webhdfs, UGI is parsed by {{DataNodeUGIProvider}} per http request, this way {{Subject}} is an different instance every time so it will be new to the cache. That was why the cache has so many instances if not properly closing. > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang > > In our production environment, the standby NN often do fullgc, through mat we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem. > By view hierarchy of method FileSystem.get() , I found only > NamenodeWebHdfsMethods#get call FileSystem.get(). I don't know why creating > different DistributedFileSystem every time instead of get a FileSystem from > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close FileSystem when GETHOMEDIRECTORY, NN don't do fullgc. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org