[ https://issues.apache.org/jira/browse/HADOOP-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604697#comment-16604697 ]
Hadoop QA commented on HADOOP-15696:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} HADOOP-15696 does not apply to branch-3.1. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-15696 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938506/HADOOP-15696.branch-3.1.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15143/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> KMS performance regression due to too many open file descriptors after Jetty migration
> --------------------------------------------------------------------------------------
>
> Key: HADOOP-15696
> URL: https://issues.apache.org/jira/browse/HADOOP-15696
> Project: Hadoop Common
> Issue Type: Bug
> Components: kms
> Affects Versions: 3.0.0-alpha2
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: HADOOP-15696.001.patch, HADOOP-15696.002.patch, HADOOP-15696.003.patch, HADOOP-15696.branch-3.1.001.patch, Screen Shot 2018-08-22 at 11.36.16 AM.png, Screen Shot 2018-08-22 at 4.26.51 PM.png, Screen Shot 2018-08-22 at 4.26.51 PM.png, Screen Shot 2018-08-22 at 4.27.02 PM.png, Screen Shot 2018-08-22 at 4.30.32 PM.png, Screen Shot 2018-08-22 at 4.30.39 PM.png, Screen Shot 2018-08-24 at 7.08.16 PM.png
>
>
> We recently found that KMS performance regressed in Hadoop 3.0, possibly linked to the migration from Tomcat to Jetty in HADOOP-13597.
> Symptoms:
> # Under stress, the number of open file descriptors in Hadoop 3.x KMS quickly rises to more than 10 thousand and sometimes even exceeds 32K, which is the system limit, causing failures for any access to encryption zones. Our internal testing shows the open fd count was in the range of a few hundred in Hadoop 2.x, and it increases by almost 100x in Hadoop 3.
> # Hadoop 3.x KMS uses as much as twice the heap of Hadoop 2.x. A heap size that was sufficient in Hadoop 2.x can go OOM in Hadoop 3.x. JXray analysis suggests most of the extra heap consists of temporary byte arrays associated with open SSL connections.
> # Due to the heap usage, Hadoop 3.x KMS has more frequent GC activity, and we observed up to a 20% performance reduction due to GC.
>
> A possible solution is to reduce the idle timeout setting in HttpServer2. It is currently hard-coded to 10 seconds. By setting it to 1 second, open fds dropped from 20 thousand down to 3 thousand in my experiment.
> Filing this jira to invite open discussion of a solution.
> Credit: [~mi...@cloudera.com] for the proposed Jetty idle timeout remedy; [~xiaochen] for digging into this problem.
>
> Screenshots:
> CDH5 (Hadoop 2) KMS CPU utilization, resident memory and file descriptor chart.
> !Screen Shot 2018-08-22 at 4.30.39 PM.png!
> CDH6 (Hadoop 3) KMS CPU utilization, resident memory and file descriptor chart.
> !Screen Shot 2018-08-22 at 4.30.32 PM.png!
> CDH5 (Hadoop 2) GC activities on the KMS process
> !Screen Shot 2018-08-22 at 4.26.51 PM.png!
> CDH6 (Hadoop 3) GC activities on the KMS process
> !Screen Shot 2018-08-22 at 4.27.02 PM.png!
> JXray report
> !Screen Shot 2018-08-22 at 11.36.16 AM.png!
> Open fds drop from 20k down to 3k after the proposed change.
> !Screen Shot 2018-08-24 at 7.08.16 PM.png!
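
For anyone reasoning about why a shorter idle timeout should cut the open fd count so sharply, here is a rough back-of-the-envelope sketch, not code from Hadoop or Jetty. It applies Little's law (mean open connections = arrival rate x mean connection lifetime) under the assumption that each connection does a short burst of work and then sits idle until the server times it out, with no keep-alive reuse. The class name, method name, arrival rate and active time are all hypothetical and were picked only to land in the same order of magnitude as the figures in the description; they are not measurements.

```java
/** Illustrative model only; not part of Hadoop or Jetty. */
final class IdleTimeoutModel {
    private IdleTimeoutModel() {}

    /**
     * Little's law estimate of the mean number of simultaneously open connections.
     * Assumes each connection is active for activeSeconds and then stays open,
     * unused, until the server-side idle timeout closes it.
     */
    static double estimateOpenConnections(double newConnectionsPerSecond,
                                          double activeSeconds,
                                          double idleTimeoutSeconds) {
        if (newConnectionsPerSecond < 0 || activeSeconds < 0 || idleTimeoutSeconds < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        double meanLifetimeSeconds = activeSeconds + idleTimeoutSeconds;
        return newConnectionsPerSecond * meanLifetimeSeconds;
    }

    public static void main(String[] args) {
        double rate = 2000.0;  // hypothetical: new connections per second under stress
        double active = 0.5;   // hypothetical: seconds a connection is actually busy
        for (double idleTimeout : new double[] {10.0, 1.0}) {
            System.out.printf("idle timeout %.0fs -> about %.0f open connections%n",
                    idleTimeout, estimateOpenConnections(rate, active, idleTimeout));
        }
    }
}
```

Under these assumptions the idle timeout dominates the connection lifetime, so the estimate scales almost linearly with it. Real clients that reuse connections would break the model, so treat it as intuition rather than a prediction.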