[ https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518375#comment-16518375 ]
Todd Lipcon commented on HADOOP-15549:
--------------------------------------

I ran a simple program which just calls DefaultMetricsSystem.initialize against Hadoop 2.8.2 compared to 3.0.0 dist tarballs:

*2.8.2:*
{code}
        683.416696      task-clock (msec)         #    1.793 CPUs utilized            ( +-  2.32% )
             1,790      context-switches          #    0.003 M/sec                    ( +-  1.07% )
                54      cpu-migrations            #    0.080 K/sec                    ( +- 17.64% )
            13,688      page-faults               #    0.020 M/sec                    ( +-  0.54% )
     2,216,866,739      cycles                    #    3.244 GHz                      ( +-  1.62% )
     2,299,332,469      instructions              #    1.04  insn per cycle           ( +-  1.21% )
       431,487,977      branches                  #  631.369 M/sec                    ( +-  1.17% )
        19,346,551      branch-misses             #    4.48% of all branches          ( +-  1.07% )

       0.381138028 seconds time elapsed                                          ( +-  2.52% )
{code}

*3.0.0:*
{code}
        924.881803      task-clock (msec)         #    1.902 CPUs utilized            ( +-  2.05% )
             1,962      context-switches          #    0.002 M/sec                    ( +-  0.73% )
                44      cpu-migrations            #    0.047 K/sec                    ( +- 11.15% )
            20,593      page-faults               #    0.022 M/sec                    ( +-  0.55% )
     3,042,371,457      cycles                    #    3.289 GHz                      ( +-  1.67% )
     3,165,586,053      instructions              #    1.04  insn per cycle           ( +-  1.41% )
       592,945,118      branches                  #  641.104 M/sec                    ( +-  1.36% )
        25,735,278      branch-misses             #    4.34% of all branches          ( +-  1.30% )

       0.486354791 seconds time elapsed                                          ( +-  2.04% )
{code}

Not all of the regression is due to the metrics system initialization, but with a small patch that avoids the "builder" APIs, I can recover some of the regression.

{code}
        885.276567      task-clock (msec)         #    2.009 CPUs utilized            ( +-  1.45% )
             1,608      context-switches          #    0.002 M/sec                    ( +-  2.02% )
                48      cpu-migrations            #    0.055 K/sec                    ( +- 12.98% )
            18,949      page-faults               #    0.021 M/sec                    ( +-  0.88% )
     2,908,533,684      cycles                    #    3.285 GHz                      ( +-  0.46% )
     3,045,577,520      instructions              #    1.05  insn per cycle           ( +-  0.66% )
       566,661,963      branches                  #  640.096 M/sec                    ( +-  0.67% )
        24,309,912      branch-misses             #    4.29% of all branches          ( +-  0.77% )

       0.440731241 seconds time elapsed                                          ( +-  2.98% )
{code}

It also loads fewer classes (1651 vs 1768) by eliminating usage of 'beanutil' and a bunch of ancillary classes in commons-configuration.
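To put the three runs side by side, the following is a minimal sketch that computes how far task-clock and elapsed time regressed between 2.8.2 and 3.0.0, and how much of that gap the patched run wins back. The numbers are the means from the perf stat output above; the dictionary keys, the function names, and the reading of the third run as "3.0.0 plus the patch" are assumptions made for illustration, not part of the original comment.

```python
# Mean values copied from the perf stat output in the comment above.
# The labels, and treating the third run as "3.0.0 plus the patch",
# are assumptions for this sketch.
RUNS = {
    "2.8.2": {"task_clock_ms": 683.416696, "elapsed_s": 0.381138028},
    "3.0.0": {"task_clock_ms": 924.881803, "elapsed_s": 0.486354791},
    "patched": {"task_clock_ms": 885.276567, "elapsed_s": 0.440731241},
}


def regression(metric: str) -> float:
    """Absolute increase from 2.8.2 to 3.0.0 for one metric."""
    return RUNS["3.0.0"][metric] - RUNS["2.8.2"][metric]


def recovered_fraction(metric: str) -> float:
    """Share of the 2.8.2 -> 3.0.0 increase that the patched run wins back."""
    saved = RUNS["3.0.0"][metric] - RUNS["patched"][metric]
    return saved / regression(metric)


if __name__ == "__main__":
    for metric in ("task_clock_ms", "elapsed_s"):
        base = RUNS["2.8.2"][metric]
        print(
            f"{metric}: +{regression(metric):.3f} "
            f"({regression(metric) / base:.1%} over 2.8.2), "
            f"patch recovers {recovered_fraction(metric):.1%}"
        )
```

On these means, task-clock rises by roughly 241 ms (about 35%) from 2.8.2 to 3.0.0, and the patched run wins back roughly 40 ms of that (about 16%). Each run carries a reported variation of around 2%, so the recovered share should be read as approximate.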
> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> -------------------------------------------------------------------
>
>                 Key: HADOOP-15549
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15549
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 3.0.2
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x.
> commons-configuration is used when parsing the metrics configuration
> properties file. The new builder API used in the new version apparently makes
> use of a bunch of very bloated reflection and classloading nonsense to
> achieve the same goal, and this results in a regression of >100ms of CPU time
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might
> only run a few seconds on poorly-tuned jobs, this can be noticeable.