[jira] Commented: (WHIRR-238) Scaling Monitor/Coordinator

Andrei Savu (JIRA) Tue, 15 Feb 2011 06:32:24 -0800

    [ 
https://issues.apache.org/jira/browse/WHIRR-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994804#comment-12994804
 ]


Andrei Savu commented on WHIRR-238:
-----------------------------------

David, did you consider using Ganglia for monitoring? I vote for using an 
existing system even if the proposed architecture sounds cool. I don't see why 
we should write a new tool from scratch. Plus we can reuse existing monitoring 
scripts:

http://wiki.apache.org/hadoop/GangliaMetrics
https://github.com/andreisavu/zookeeper-monitoring/tree/master/ganglia


> Scaling Monitor/Coordinator
> ---------------------------
>
>                 Key: WHIRR-238
>                 URL: https://issues.apache.org/jira/browse/WHIRR-238
>             Project: Whirr
>          Issue Type: New Feature
>          Components: core
>            Reporter: David Alves
>
> From the mailing list:
> General idea:
> Add an elastic scaling monitor and coordinator, i.e. a whirr process that 
> would be running on some or all of the nodes that:
>       - would collect load metrics (both generic and specific to each 
> application)
>       - would feed them through an elastic decision making engine (also 
> specific to each application as it depends on the specific metrics)
>       - would then act on those decisions by either expanding or contracting 
> the cluster.
>       Some specifics:
>       - it need not be completely distributed, i.e. it can have a specific 
> assigned node that will monitor/coordinate, but this node must not be fixed, 
> i.e. it could/should change if the previous coordinator leaves the cluster.
>       - each application would define the set of metrics that it emits and 
> use a local monitor process to feed them to the coordinator.
>       - the monitor process should emit some standard metrics (Disk I/O, CPU 
> Load, Net I/O, memory)
>       - the coordinator would have a pluggable decision engine policy also 
> defined by the application that would consume metrics and make a decision.
>       - whirr would take care of requesting/releasing nodes and 
> adding/removing them from the relevant services.
>       Some implementation ideas:
>       - it could run on top of ZooKeeper. ZK is already a requirement for 
> several services and would allow coordinator state to be stored reliably so 
> that another node can pick up if the previous coordinator leaves the cluster.
>       - it could use Avro to serialize/deserialize metrics data 
>       - it should be optional, i.e. simply another service that the whirr cli 
> starts
>       - it would also be nice to have a monitor/coordinator web page that 
> would display metrics and cluster status in an aggregated view.
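
To make the proposal above concrete, here is a minimal sketch of the
"pluggable decision engine policy" idea: local monitors report samples, the
coordinator hands them to an application-defined policy, and the policy
returns expand/contract/hold. This is only an illustration under stated
assumptions, not anything that exists in Whirr. The names NodeMetrics,
DecisionPolicy, CpuThresholdPolicy and coordinate, and the threshold values,
are all hypothetical. Coordinator failover via ZooKeeper and Avro
serialization are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean
from typing import Protocol, Sequence


class Action(Enum):
    """Possible outcomes of one scaling decision."""
    EXPAND = "expand"
    CONTRACT = "contract"
    HOLD = "hold"


@dataclass(frozen=True)
class NodeMetrics:
    """One sample from a node's local monitor (hypothetical schema)."""
    node_id: str
    cpu_load: float  # fraction of CPU in use, 0.0 to 1.0


class DecisionPolicy(Protocol):
    """Pluggable policy interface; each application would supply its own."""

    def decide(self, samples: Sequence[NodeMetrics]) -> Action: ...


@dataclass
class CpuThresholdPolicy:
    """Example policy: scale on the cluster-wide mean CPU load.

    The thresholds are illustrative assumptions, not recommended values.
    """
    high: float = 0.80
    low: float = 0.20
    min_nodes: int = 1

    def decide(self, samples: Sequence[NodeMetrics]) -> Action:
        if not samples:
            # No data yet: do nothing rather than guess.
            return Action.HOLD
        avg = mean(s.cpu_load for s in samples)
        if avg > self.high:
            return Action.EXPAND
        # Never shrink below the configured minimum cluster size.
        if avg < self.low and len(samples) > self.min_nodes:
            return Action.CONTRACT
        return Action.HOLD


def coordinate(samples: Sequence[NodeMetrics], policy: DecisionPolicy) -> Action:
    """One coordinator step: feed the collected samples to the policy.

    Acting on the result (requesting or releasing nodes) would be
    Whirr's job and is out of scope for this sketch.
    """
    return policy.decide(samples)
```

For example, coordinate([NodeMetrics("n1", 0.9), NodeMetrics("n2", 0.95)],
CpuThresholdPolicy()) yields Action.EXPAND, since the mean load of 0.925 is
above the 0.80 threshold. An application with its own metrics would swap in
a different policy object without touching the coordinator step.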

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
