[
https://issues.apache.org/jira/browse/AMBARI-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507459#comment-13507459
]
Tom Beerbower commented on AMBARI-1044:
---------------------------------------
I don't see any related exceptions in the server log, which means that either
it's not attempting to get the metrics for this host or they are just not being
set on the host resource.
I think that I see what is happening. One of the arguments that can be
specified for the rrd query is the Ganglia cluster (HDPHBaseMaster,
HDPJobTracker, HDPNameNode or HDPSlaves). The question is, for a host level
query which Ganglia cluster should we specify?
It's hard to say since a host isn't necessarily associated with any of the
services related to those clusters... or maybe it is associated with more than
one. It turns out it doesn't really
matter. In this case I can see the system level rrd files that we use for host
level metrics for ip-10-224-42-108.ec2.internal under any of the Ganglia
cluster folders. For example ...
{code}
[root@ip-10-40-91-121 rrds]# ls ./HDPHBaseMaster/ip-10-224-42-108.ec2.internal
boottime.rrd bytes_out.rrd cpu_idle.rrd cpu_num.rrd cpu_system.rrd
cpu_wio.rrd disk_total.rrd load_five.rrd mem_buffers.rrd mem_free.rrd
mem_total.rrd pkts_in.rrd proc_run.rrd swap_free.rrd
bytes_in.rrd cpu_aidle.rrd cpu_nice.rrd cpu_speed.rrd cpu_user.rrd
disk_free.rrd load_fifteen.rrd load_one.rrd mem_cached.rrd mem_shared.rrd
part_max_used.rrd pkts_out.rrd proc_total.rrd swap_total.rrd
...
[root@ip-10-40-91-121 rrds]# ls HDPNameNode/ip-10-224-42-108.ec2.internal
boottime.rrd bytes_out.rrd cpu_idle.rrd cpu_num.rrd cpu_system.rrd
cpu_wio.rrd disk_total.rrd load_five.rrd mem_buffers.rrd mem_free.rrd
mem_total.rrd pkts_in.rrd proc_run.rrd swap_free.rrd
bytes_in.rrd cpu_aidle.rrd cpu_nice.rrd cpu_speed.rrd cpu_user.rrd
disk_free.rrd load_fifteen.rrd load_one.rrd mem_cached.rrd mem_shared.rrd
part_max_used.rrd pkts_out.rrd proc_total.rrd swap_total.rrd
{code}
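The same check can be scripted rather than done by hand with ls. This is only a rough sketch: it assumes the layout shown in the listing above (rrd root, then Ganglia cluster folder, then host folder, then the .rrd files), and the function name clusters_with_host_rrds and the rrd_root argument are made up for illustration, not anything in Ambari or Ganglia.

```python
from pathlib import Path

# Ganglia cluster names as listed earlier in this comment.
GANGLIA_CLUSTERS = ["HDPHBaseMaster", "HDPJobTracker", "HDPNameNode", "HDPSlaves"]


def clusters_with_host_rrds(rrd_root, host):
    """Return the Ganglia clusters whose folder holds at least one .rrd file for host.

    Assumes the layout <rrd_root>/<cluster>/<host>/<metric>.rrd (hypothetical helper).
    """
    found = []
    for cluster in GANGLIA_CLUSTERS:
        host_dir = Path(rrd_root) / cluster / host
        # A missing or empty host folder means no system level metrics there.
        if host_dir.is_dir() and any(host_dir.glob("*.rrd")):
            found.append(cluster)
    return found
```

Run from the rrds directory, something like clusters_with_host_rrds(".", "ip-10-224-42-108.ec2.internal") should list every cluster folder that has system level files for the host.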
The approach that I've been using is to look through the host components for
the host that we are interested in and try to map one of its component names
back to a Ganglia cluster. In this case it looks like the host with the missing
metrics is not associated with any component that would map back given the
mapping method that I am using.
Given what I am currently seeing with the system level metrics, I think that it
would be safe to simply use HDPSlaves as the Ganglia cluster for host level
queries.
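To make the difference concrete, here is a rough sketch of the two approaches. It is not the actual Ambari code: the component names and the mapping table are assumptions for illustration only, and the function names are hypothetical.

```python
# Hypothetical component-to-cluster table; the real mapping may differ.
COMPONENT_TO_GANGLIA_CLUSTER = {
    "NAMENODE": "HDPNameNode",
    "JOBTRACKER": "HDPJobTracker",
    "HBASE_MASTER": "HDPHBaseMaster",
    "DATANODE": "HDPSlaves",
    "TASKTRACKER": "HDPSlaves",
}

# Proposed fixed cluster for host level queries.
HOST_LEVEL_GANGLIA_CLUSTER = "HDPSlaves"


def cluster_by_component_mapping(host_components):
    """Mapping approach: first component that maps to a cluster, else None."""
    for component in host_components:
        cluster = COMPONENT_TO_GANGLIA_CLUSTER.get(component)
        if cluster is not None:
            return cluster
    # No component maps back, so no rrd query can be built for the host.
    return None


def cluster_for_host_query(host_components):
    """Proposed approach: ignore the components and always use HDPSlaves."""
    return HOST_LEVEL_GANGLIA_CLUSTER
```

With a host whose only component is, say, a client that is not in the table, cluster_by_component_mapping returns None, which would match the missing metrics, while cluster_for_host_query still returns HDPSlaves.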
> API is not returning Ganglia metrics for one of the hosts in the cluster
> ------------------------------------------------------------------------
>
> Key: AMBARI-1044
> URL: https://issues.apache.org/jira/browse/AMBARI-1044
> Project: Ambari
> Issue Type: Sub-task
> Reporter: Tom Beerbower
> Assignee: Tom Beerbower
>
> A cluster was deployed with 4 hosts, with Ambari Server running on a
> different host.
> Host graphs are showing for 3 of the hosts.
> For one of the hosts, API is not returning any temporal data.
> Ganglia is showing host-level metrics.
> UI:
> http://ec2-54-242-174-25.compute-1.amazonaws.com:8080/#/main/hosts/ip-10-224-42-108.ec2.internal/summary
> Ganglia UI:
> http://ec2-174-129-70-110.compute-1.amazonaws.com/ganglia/mobile_helper.php?show_host_metrics=1&h=ip-10-224-42-108.ec2.internal&c=HDPNameNode&r=hour&cs=&ce=
> API response:
> {
> "href" :
> "http://ec2-54-242-174-25.compute-1.amazonaws.com:8080/api/v1/clusters/C2/hosts/ip-10-224-42-108.ec2.internal?fields=metrics/cpu/cpu_user1354227417,1354231017,15,metrics/cpu/cpu_wio1354227417,1354231017,15,metrics/cpu/cpu_nice1354227417,1354231017,15,metrics/cpu/cpu_aidle1354227417,1354231017,15,metrics/cpu/cpu_system1354227417,1354231017,15,metrics/cpu/cpu_idle1354227417,1354231017,15",
> "Hosts" :
> { "cluster_name" : "C2", "host_name" : "ip-10-224-42-108.ec2.internal" }
> }
> We need to understand the root cause.