[ https://issues.apache.org/jira/browse/AMBARI-24758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647649#comment-16647649 ]
Hudson commented on AMBARI-24758:
---------------------------------

SUCCESS: Integrated in Jenkins build Ambari-trunk-Commit #10206 (See [https://builds.apache.org/job/Ambari-trunk-Commit/10206/])
AMBARI-24758. Ambari-agent takes up too many cpu of perf (aonishuk) (aonishuk: [https://gitbox.apache.org/repos/asf?p=ambari.git&a=commit&h=9103295a7a8a8da3efe0f1b96cf6d915bb8cbb1d])
* (edit) ambari-agent/src/main/python/ambari_agent/Facter.py
* (edit) ambari-agent/src/main/python/ambari_agent/HostCheckReportFileHandler.py
* (edit) ambari-agent/src/main/python/ambari_agent/HostInfo.py
* (edit) ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py
* (edit) ambari-agent/src/main/python/ambari_agent/AmbariAgent.py
* (edit) ambari-agent/src/main/python/ambari_agent/alerts/ams_alert.py
* (edit) ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py
* (edit) ambari-agent/src/main/python/ambari_agent/HostCleanup.py
* (edit) ambari-agent/src/main/python/ambari_agent/main.py
* (edit) ambari-agent/src/main/python/ambari_agent/alerts/script_alert.py

> Ambari-agent takes up too many cpu on perf
> ------------------------------------------
>
>                 Key: AMBARI-24758
>                 URL: https://issues.apache.org/jira/browse/AMBARI-24758
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.8.0
>
>         Attachments: AMBARI-24758.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>       14129 1426.122    0.101 1426.122    0.101 {time.sleep}
>           1    0.337    0.337 1426.769 1426.769 main.py:358(run_threads)
>         331    0.219    0.001    0.219    0.001 {method 'acquire' of 'thread.lock' objects}
>          11    0.181    0.016    0.181    0.016 {built-in method poll}
>           1    0.108    0.108    0.108    0.108 {method 'do_handshake' of '_ssl._SSLSocket' objects}
>       14151    0.042    0.000    0.042    0.000 threading.py:571(isSet)
>         125    0.028    0.000    0.028    0.000 {method 'flush' of 'file' objects}
>        5078    0.027    0.000    0.052    0.000 decoder.py:65(py_scanstring)
>          15    0.020    0.001    0.020    0.001 {posix.read}
>        5093    0.020    0.000    0.024    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
>           1    0.019    0.019    0.019    0.019 {method 'connect' of '_socket.socket' objects}
>       15365    0.018    0.000    0.018    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
> 55424/13131    0.016    0.000    0.057    0.000 encoder.py:332(_iterencode_dict)
>       38241    0.014    0.000    0.014    0.000 {isinstance}
>          21    0.013    0.001    0.022    0.001 collections.py:282(namedtuple)
>       473/5    0.012    0.000    0.073    0.015 decoder.py:148(JSONObject)
>        5078    0.009    0.000    0.034    0.000 encoder.py:43(py_encode_basestring_ascii)
>           5    0.006    0.001    0.070    0.014 __init__.py:122(dump)
>       13251    0.005    0.000    0.005    0.000 {method 'write' of 'file' objects}
>           7    0.004    0.001    0.004    0.001 {ambari_commons.libs.x86_64._posixsubprocess.fork_exec}
>        5638    0.004    0.000    0.004    0.000 encoder.py:49(replace)
>      3167/5    0.003    0.000    0.073    0.015 scanner.py:27(_scan_once)
>  13353/9909    0.003    0.000    0.030    0.000 encoder.py:279(_iterencode_list)
>          75    0.003    0.000    0.003    0.000 {method 'read' of '_ssl._SSLSocket' objects}
>      3177/8    0.003    0.000    0.008    0.001 Utils.py:124(make_immutable)
>       13131    0.003    0.000    0.060    0.000 encoder.py:409(_iterencode)
>       11128    0.003    0.000    0.003    0.000 {method 'groups' of '_sre.SRE_Match' objects}
>     2202/45    0.003    0.000    0.004    0.000 Utils.py:135(get_mutable_copy)
>      474/10    0.002    0.000    0.008    0.001 Utils.py:170(__init__)
>         238    0.002    0.000    0.002    0.000 {time.localtime}
>         119    0.002    0.000    0.004    0.000 __init__.py:242(__init__)
>        3759    0.002    0.000    0.002    0.000 {method 'isalnum' of 'str' objects}
>       14752    0.002    0.000    0.002    0.000 {method 'end' of '_sre.SRE_Match' objects}
>       16854    0.002    0.000    0.002    0.000 {method 'append' of 'list' objects}
>        5098    0.002    0.000    0.002    0.000 {method 'join' of 'unicode' objects}
>         238    0.002    0.000    0.007    0.000 __init__.py:451(format)
>          13    0.002    0.000    0.004    0.000 metric_alert.py:286(__init__)
>           7    0.001    0.000    0.026    0.004 subprocess32.py:1153(_execute_child)
>          41    0.001    0.000    0.001    0.000 {open}
>         616    0.001    0.000    0.001    0.000 {method 'format' of 'str' objects}
>         238    0.001    0.000    0.004    0.000 __init__.py:404(formatTime)
>          18    0.001    0.000    0.001    0.000 {posix.listdir}
>           5    0.001    0.000    0.072    0.014 ClusterCache.py:131(persist_cache)
>        4032    0.001    0.000    0.003    0.000 collections.py:323(<genexpr>)
>          18    0.001    0.000    0.001    0.000 {method 'sort' of 'list' objects}
>          90    0.001    0.000    0.001    0.000 {built-in method now}
>         238    0.001    0.000    0.001    0.000 {time.strftime}
>         119    0.001    0.000    0.001    0.000 __init__.py:1215(findCaller)
>        5706    0.001    0.000    0.001    0.000 {method 'group' of '_sre.SRE_Match' objects}
>         281    0.001    0.000    0.001    0.000 threading.py:146(acquire)
>         281    0.001    0.000    0.001    0.000 threading.py:186(release)
>         119    0.001    0.000    0.001    0.000 {method 'seek' of 'file' objects}
>           1    0.001    0.001    0.001    0.001 ClusterTopologyCache.py:58(on_cache_update)
>         119    0.001    0.000    0.005    0.000 handlers.py:144(shouldRollover)
>        50/5    0.001    0.000    0.035    0.007 decoder.py:223(JSONArray)
>
> Major CPU consumers:
> 1) Regexp operations:
> As we can see, a lot of time is taken by regexp operations. This happens
> because we use non-compiled regular expressions.
> 2) JSON operations:
> Another major CPU consumer is the json module, because _speedups.so is
> currently not compiled for python2.7. This is tackled by another issue.
> 3) Main thread waking up/sleeping too often:
> This seems to create quite a bit of CPU usage.
> The approach was implemented so the agent can check for SIGTERM
> (ambari-agent stop). A proper solution would be to use signal.pause()
> instead of sleep/wakeup.
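
For point 1, the usual remedy is to compile each pattern once at module level and reuse the pattern object. The sketch below is only an illustration: the pattern, the helper names and the sample line are hypothetical and are not taken from the ambari-agent sources. Note that the re module keeps its own cache of compiled patterns, so the saving comes mostly from skipping the per-call cache lookup; it is worth measuring before and after.

```python
import re
import timeit

# Hypothetical pattern, chosen only for illustration; the expressions
# actually used by ambari-agent are not shown in this issue.
KEY_VALUE_PATTERN = r"^(\w+)\s*=\s*(.*)$"
KEY_VALUE_RE = re.compile(KEY_VALUE_PATTERN)  # compiled once at import time


def parse_uncompiled(line):
    # re.match() looks the pattern string up in re's internal cache
    # on every call before it can start matching.
    match = re.match(KEY_VALUE_PATTERN, line)
    return match.groups() if match else None


def parse_compiled(line):
    # Reuses the pattern object directly and skips the per-call lookup.
    match = KEY_VALUE_RE.match(line)
    return match.groups() if match else None


if __name__ == "__main__":
    sample = "hostname = host1.example.com"  # hypothetical input
    for func in (parse_uncompiled, parse_compiled):
        elapsed = timeit.timeit(lambda: func(sample), number=100000)
        print("%s: %.3fs" % (func.__name__, elapsed))
```

Both helpers return the same result; only the per-call overhead differs.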
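
For point 3, a minimal sketch of the signal.pause() idea, assuming a Unix host and that the wait runs in the main thread (Python only lets the main thread install signal handlers). The names stop_event, handle_sigterm and wait_for_stop are hypothetical; this is not the code from the attached patch.

```python
import signal
import threading

# Hypothetical flag that worker threads could also poll to learn that
# a stop was requested.
stop_event = threading.Event()


def handle_sigterm(signum, frame):
    # Python runs signal handlers in the main thread.
    stop_event.set()


def wait_for_stop():
    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, handle_sigterm)
    while not stop_event.is_set():
        # Sleeps until any signal is delivered, so there are no periodic
        # wakeups. Caveat: a SIGTERM that arrives between the is_set()
        # check and pause() is only noticed when the next signal arrives.
        signal.pause()
```

Compared with a loop around time.sleep(), the main thread is woken only when a signal is actually delivered. If the small race noted in the comment matters, waiting with a long timeout instead of pausing indefinitely is one possible compromise.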