Repository: incubator-eagle

Updated Branches:
  refs/heads/master 327351b92 -> 2f4df34cf
[EAGLE-698] Collectd python plugin for gathering hadoop jmx information.

- hadooproleconfig.json is a configuration file for the hadoop roles to be collected.
- Manual_of_collectd_hadoop_plugin.md is a how-to document.
- https://issues.apache.org/jira/browse/EAGLE-698

Author: joe-hj <joe.h...@gmail.com>

Closes #584 from joe-hj/EAGLE-698.

Project: http://git-wip-us.apache.org/repos/asf/incubator-eagle/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-eagle/commit/2f4df34c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-eagle/tree/2f4df34c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-eagle/diff/2f4df34c

Branch: refs/heads/master
Commit: 2f4df34cfbcce2ab0cd72f52c45a7873ec689722
Parents: 327351b
Author: joe-hj <joe.h...@gmail.com>
Authored: Wed Nov 9 17:01:22 2016 +0800
Committer: Hao Chen <h...@apache.org>
Committed: Wed Nov 9 17:01:22 2016 +0800

----------------------------------------------------------------------
 eagle-external/hadoop_jmx_collectd/README.md   |  63 ++++++++
 eagle-external/hadoop_jmx_collectd/hadoop.py   | 167 +++++++++++++++++++
 .../hadoop_jmx_collectd/hadooproleconfig.json  |  57 +++++++
 3 files changed, 287 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/README.md
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/README.md b/eagle-external/hadoop_jmx_collectd/README.md
new file mode 100755
index 0000000..beab2c9
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/README.md
@@ -0,0 +1,63 @@

# **A manual for the hadoop plugin of collectd**

For collectd itself, refer to [collectd.org](https://collectd.org).

### Description
The plugin collects information from http://hostname:port/jmx according to the configured hadoop role. It supports the roles below:

    HDFS NameNode
    HDFS DataNode
    HDFS JournalNode
    HBase Master
    HBase RegionServer
    Yarn NodeManager
    Yarn ResourceManager
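To see what the plugin works with, you can query the endpoint by hand. A minimal sketch, assuming a DataNode web UI on port 50075 (host and port are placeholders); it uses the same urllib2/json calls as the plugin itself:

    import json
    import urllib2

    # Fetch the JMX servlet; every supported role exposes the same endpoint.
    beans = json.load(urllib2.urlopen("http://192.168.xxx.xxx:50075/jmx", timeout=5))["beans"]

    # The 'name' of each bean is what hadooproleconfig.json matches by prefix.
    for bean in beans:
        print(bean["name"])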
### Install
>1) Deploy hadoop.py in your collectd plugins path. In my environment this is /opt/collectd/lib/collectd/plugins/hadoop.py (this assumes collectd is installed under /opt/collectd); you may need to create the plugins directory first.

>2) Configure the python plugin in collectd.conf; the python plugin itself must be enabled first (LoadPlugin python).

>A snippet of collectd.conf showing the hadoop python plugin configuration (the option keys must match those parsed in hadoop.py):<br/>

    LoadPlugin python
    <Plugin "python">
      ModulePath "/opt/collectd/lib/collectd/plugins/"
      LogTraces true
      Import "hadoop"
      <Module "hadoop">
        HDFSDatanodeConfigHost "YourHostName"
        Port "50075"
        Verbose true
        Instance "192.168.xxx.xxx"
        JsonPath "/xxx/xxx/hadooproleconfig.json"
      </Module>
      <Module "hadoop">
        YarnResourceManagerHost "YourHostName"
        Port "8088"
        Verbose true
        Instance "192.168.xxx.xxx"
      </Module>
    </Plugin>

>**Note**:
>The Instance, Port, and host (role) fields must all be set.

>3) There are two ways to supply hadooproleconfig.json; either works:<br/>
>>a) Set its path via the JsonPath option inside a <Module "hadoop"> </Module> pair in collectd.conf.<br/>
>>b) Place the hadooproleconfig.json file in the BaseDir defined in collectd.conf.

>4) If you update collectd.conf or hadooproleconfig.json, restart collectd.

### Dependency
    collectd: tested with version 5.6.0.
    Hadoop: tested with version 2.6.0-cdh5.4.3.
    HBase: tested with version 1.0.0-cdh5.4.3.

### Testing
>hadoop.py can run as a single, independent python script for debugging jmx collection, or as a plugin inside collectd; the variable MyDebug is used as a switch between the two environments.
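For a quick standalone test, the `__main__` block of hadoop.py below configures a NodeManager role by hand (adjust ConfigMap to your own host, port, and role type); keep hadooproleconfig.json in the current directory and run:

    python hadoop.py

With MyDebug = 1 the collected name/value pairs are printed to stdout instead of being dispatched to collectd.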
http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/hadoop.py
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/hadoop.py b/eagle-external/hadoop_jmx_collectd/hadoop.py
new file mode 100755
index 0000000..8684e66
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/hadoop.py
@@ -0,0 +1,167 @@

#!/usr/bin/python

import urllib2
import json
import os

# Role type -> {metric key prefix: JMX bean name prefix}, loaded from hadooproleconfig.json.
CurRoleTypeInfo = {}
# One configuration map per <Module "hadoop"> block in collectd.conf.
HadoopConfigs = []
LogSwitch = False
JsonPath = None

def CfgCallback(conf):
    # collectd configuration callback: parses one <Module "hadoop"> block.
    global JsonPath
    CurRoleType = None
    Role = None
    Port = None
    Host = None
    OutputFlag = LogSwitch

    for EachRoleConfig in conf.children:
        collectd.info('hadoop plugin: %s' % EachRoleConfig.key)
        if EachRoleConfig.key == 'HDFSNamenodeConfigHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_NAMENODE"
        elif EachRoleConfig.key == 'HDFSDatanodeConfigHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_DATANODE"
        elif EachRoleConfig.key == 'YarnNodeManagerHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_NODEMANAGER"
        elif EachRoleConfig.key == 'YarnResourceManagerHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_RESOURCEMANAGER"
        elif EachRoleConfig.key == 'HbaseMasterHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_HBASE_MASTER"
        elif EachRoleConfig.key == 'HbaseRegionserverHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_HBASE_REGIONSERVER"
        elif EachRoleConfig.key == 'HDFSJournalnodeHost':
            Host = EachRoleConfig.values[0]
            CurRoleType = "ROLE_TYPE_HDFS_JOURNALNODE"
        elif EachRoleConfig.key == 'Port':
            Port = EachRoleConfig.values[0]
        elif EachRoleConfig.key == 'Instance':
            Role = EachRoleConfig.values[0]
        elif EachRoleConfig.key == 'Verbose':
            OutputFlag = bool(EachRoleConfig.values[0])
        elif EachRoleConfig.key == 'JsonPath':
            collectd.info('hadoop plugin cfg: %s' % EachRoleConfig.values[0])
            JsonPath = EachRoleConfig.values[0]
        else:
            collectd.warning('hadoop plugin: Unsupported key: %s.' % EachRoleConfig.key)
    if not Host or not Role or not CurRoleType or not Port:
        collectd.error('hadoop plugin error: Host, Port, Instance, and role type must all be set.')
    else:
        CurrentConfigMap = {
            'RoleInstance': Role,
            'port': Port,
            'host': Host,
            'RoleType': CurRoleType,
            'OutputFlag': OutputFlag
        }
        HadoopConfigs.append(CurrentConfigMap)

def GetdataCallback():
    # collectd read callback: fetch /jmx for each configured role and dispatch matching metrics.
    GetJsonConfig()
    for EachConfig in HadoopConfigs:
        Host = EachConfig['host']
        Port = EachConfig['port']
        RoleInstance = EachConfig['RoleInstance']
        RoleType = EachConfig['RoleType']
        OutputFlag = EachConfig['OutputFlag']

        if not str(Port).isdigit():
            MyLog("host port is not a number", True)

        JmxUrl = "http://" + Host + ":" + str(Port) + "/jmx"

        try:
            Contents = json.load(urllib2.urlopen(JmxUrl, timeout=5))
        except urllib2.URLError as e:
            if MyDebug == 1:
                print(JmxUrl, e)
            else:
                collectd.error('hadoop plugin: can not connect to %s - %r' % (JmxUrl, e))
            # Skip this role; otherwise Contents would be undefined below.
            continue

        if MyDebug == 1:
            print(RoleType)
        else:
            collectd.info('hadoop plugin: %s' % RoleType)

        for RoleInfo in Contents["beans"]:
            for RoleKey, RoleValue in CurRoleTypeInfo[RoleType].iteritems():
                if RoleInfo['name'].startswith(RoleValue):
                    for k, v in RoleInfo.iteritems():
                        # Due to the limits of the dispatch interface in collectd,
                        # only int and float values are sent.
                        if isinstance(v, int) or isinstance(v, float):
                            # 'gauge' is a value type defined in collectd.
                            Submit2Collectd('gauge', '.'.join((RoleKey, k)), v, RoleInstance, RoleType, OutputFlag)

def Submit2Collectd(type, name, value, instance, instance_type, OutputFlag):
    if value is None:
        MyLog('value is None for key %s' % name, True)
    else:
        plugin_instance = '.'.join((instance, instance_type))
        MyLog('%s [%s]: %s=%s' % (plugin_instance, type, name, value), OutputFlag)

        if MyDebug == 0:
            SendValue = collectd.Values(plugin='hadoop')
            SendValue.type = type
            SendValue.type_instance = name
            SendValue.values = [value]
            SendValue.plugin_instance = plugin_instance
            SendValue.meta = {'0': True}
            SendValue.dispatch()

def MyLog(msg, OutputFlag):
    if OutputFlag:
        if MyDebug == 1:
            print(msg)
        else:
            collectd.info('hadoop plugin output: %s' % msg)

def GetJsonConfig():
    # Load the role-to-bean mappings from hadooproleconfig.json; fall back to the
    # working directory (collectd's BaseDir) when no JsonPath is configured.
    MyLog("pwd: %s" % os.getcwd(), True)
    Path = JsonPath if JsonPath else "./hadooproleconfig.json"
    with open(Path, 'r') as f:
        data = json.load(f)
    for k, v in data.iteritems():
        if k.startswith("ROLE_TYPE"):
            CurRoleTypeInfo[k] = v

if __name__ == "__main__":
    MyDebug = 1

    # When running standalone, configure a role like the example below.
    ConfigMap = {
        'RoleInstance': "instance",
        'port': 8042,
        'host': "lujian",
        'RoleType': "ROLE_TYPE_NODEMANAGER",
        'OutputFlag': True
    }

    HadoopConfigs.append(ConfigMap)
    GetdataCallback()
else:
    import collectd
    MyDebug = 0
    collectd.register_config(CfgCallback)
    collectd.register_read(GetdataCallback)
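For orientation, a dispatched value is identified in collectd by plugin 'hadoop', plugin instance Instance.RoleType, type 'gauge', and type instance RoleKey.AttributeName. With the NodeManager module from the README and the JvmMetrics mapping, an identifier would look roughly like this (MemHeapUsedM is one JvmMetrics attribute; the exact set depends on your Hadoop version):

    yourhostname/hadoop-192.168.xxx.xxx.ROLE_TYPE_NODEMANAGER/gauge-JvmMetrics.MemHeapUsedM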
http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json b/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
new file mode 100644
index 0000000..59bc8e1
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
@@ -0,0 +1,57 @@

{
    "ROLE_TYPE_NAMENODE": {
        "FSNamesystem": "Hadoop:service=NameNode,name=FSNamesystemState",
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=NameNode,name=JvmMetrics"
    },
    "ROLE_TYPE_DATANODE": {
        "DatanodeActivity": "Hadoop:service=DataNode,name=DataNodeActivity",
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=DataNode,name=JvmMetrics"
    },
    "ROLE_TYPE_HDFS_JOURNALNODE": {
        "GCPSScavenge": "java.lang:type=GarbageCollector,name=PS Scavenge",
        "GCPSMarkSweep": "java.lang:type=GarbageCollector,name=PS MarkSweep",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=JournalNode,name=JvmMetrics"
    },
    "ROLE_TYPE_HBASE_MASTER": {
        "MasterBalancer": "Hadoop:service=HBase,name=Master,sub=Balancer",
        "MasterAssignmentManager": "Hadoop:service=HBase,name=Master,sub=AssignmentManger",
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "MasterServer": "Hadoop:service=HBase,name=Master,sub=Server",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=HBase,name=JvmMetrics"
    },
    "ROLE_TYPE_HBASE_REGIONSERVER": {
        "Regions": "Hadoop:service=HBase,name=RegionServer,sub=Regions",
        "Replication": "Hadoop:service=HBase,name=RegionServer,sub=Replication",
        "WAL": "Hadoop:service=HBase,name=RegionServer,sub=WAL",
        "Server": "Hadoop:service=HBase,name=RegionServer,sub=Server",
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=HBase,name=JvmMetrics"
    },
    "ROLE_TYPE_NODEMANAGER": {
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "NodeManagerMetrics": "Hadoop:service=NodeManager,name=NodeManagerMetrics",
        "NodeManagerShuffleMetrics": "Hadoop:service=NodeManager,name=ShuffleMetrics",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=NodeManager,name=JvmMetrics"
    },
    "ROLE_TYPE_RESOURCEMANAGER": {
        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
        "UgiMetrics": "Hadoop:service=ResourceManager,name=UgiMetrics",
        "ClusterMetrics": "Hadoop:service=ResourceManager,name=ClusterMetrics",
        "Threading": "java.lang:type=Threading",
        "JvmMetrics": "Hadoop:service=ResourceManager,name=JvmMetrics"
    }
}
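Since hadoop.py matches each value here as a prefix of a bean's name attribute (RoleInfo['name'].startswith(RoleValue)), new metrics can be collected without code changes by adding entries. A sketch for also collecting DataNode RPC metrics (the "RpcActivity" bean prefix is an assumption; confirm the exact name in your own /jmx output, and leave the remaining DataNode entries unchanged):

    "ROLE_TYPE_DATANODE": {
        "DatanodeActivity": "Hadoop:service=DataNode,name=DataNodeActivity",
        "RpcActivity": "Hadoop:service=DataNode,name=RpcActivity"
    }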