Hi all:

I don't have much to add at this stage, except that this is
worth pursuing.

Do other users have any thoughts/comments on this?

Thanks,

Bernard

On Friday, July 20, 2012, Peter Phaal wrote:

> I agree, the performance of the network fabric is a critical component
> of cluster performance and it would be great to figure out how to best
> include the data in Ganglia.
>
> A possible starting point would be to define <SWITCH> elements in the
> XML structure exported by gmond. A switch would contain multiple
> <INTERFACE> objects, each of which contains standard SNMP MIB-II metrics
> (ifInOctets, ifOutOctets, ifInErrors, ifOutErrors, ifInDiscards,
> ifOutDiscards, etc.). The problem is that this wouldn't be backward
> compatible with tools accessing the XML interface. Another option
> would be to have the network data appear as a separate XML document,
> accessed on a different TCP port.
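
A minimal sketch of what reading such a document might look like. The SWITCH and INTERFACE elements are the ones proposed above and do not exist in the XML that gmond exports today; the sample document, the nesting of METRIC inside INTERFACE, and the read_switches helper are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: SWITCH and INTERFACE are the proposed elements,
# not something gmond currently emits. Names and values are made up.
SAMPLE = """
<GANGLIA_XML>
  <CLUSTER NAME="example">
    <SWITCH NAME="switch-1">
      <INTERFACE NAME="port1">
        <METRIC NAME="ifInOctets" VAL="1000"/>
        <METRIC NAME="ifOutOctets" VAL="2000"/>
        <METRIC NAME="ifInErrors" VAL="3"/>
      </INTERFACE>
      <INTERFACE NAME="port2">
        <METRIC NAME="ifInOctets" VAL="500"/>
        <METRIC NAME="ifInDiscards" VAL="7"/>
      </INTERFACE>
    </SWITCH>
  </CLUSTER>
</GANGLIA_XML>
"""


def read_switches(xml_text):
    """Return {switch name: {interface name: {metric name: int value}}}."""
    root = ET.fromstring(xml_text)
    switches = {}
    for switch in root.iter("SWITCH"):
        interfaces = {}
        for interface in switch.findall("INTERFACE"):
            interfaces[interface.get("NAME")] = {
                metric.get("NAME"): int(metric.get("VAL"))
                for metric in interface.findall("METRIC")
            }
        switches[switch.get("NAME")] = interfaces
    return switches


if __name__ == "__main__":
    for name, interfaces in read_switches(SAMPLE).items():
        print(name, interfaces)
```

The same reader would work unchanged if the switch data were served as a separate document on its own port, since it only looks for SWITCH elements.
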
>
> The next challenge would be to figure out how to include this type of
> information in the Ganglia UI. Rolled-up errors and discards for the
> fabric would be a natural fit for the top level view, but to drill
> down, Ganglia would need to deal with the concept of multiple resource
> pools in the cluster (networking and computation). Extending the
> notion further, a storage resource pool might also be interesting. For
> virtual server pools, pooling the VMs and the hypervisors would also
> be useful.
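
As a rough illustration of the "rolled up errors and discards" idea, the sketch below sums the four MIB-II error and discard counters over every interface of every switch. The input layout (switch, then interface, then counter name) and the fabric_rollup name are assumptions, not anything Ganglia provides.

```python
# Counters that would feed a hypothetical top-level fabric view.
ROLLUP_COUNTERS = ("ifInErrors", "ifOutErrors", "ifInDiscards", "ifOutDiscards")


def fabric_rollup(switches):
    """Sum error and discard counters over all interfaces of all switches.

    `switches` is assumed to look like
    {switch name: {interface name: {counter name: value}}}.
    Counters missing from an interface are treated as zero.
    """
    totals = {name: 0 for name in ROLLUP_COUNTERS}
    for interfaces in switches.values():
        for counters in interfaces.values():
            for name in ROLLUP_COUNTERS:
                totals[name] += counters.get(name, 0)
    return totals


if __name__ == "__main__":
    example = {
        "switch-1": {
            "port1": {"ifInErrors": 3, "ifOutDiscards": 1},
            "port2": {"ifInDiscards": 7},
        },
        "switch-2": {
            "port1": {"ifInErrors": 2, "ifOutErrors": 4},
        },
    }
    print(fabric_rollup(example))
```

Since these MIB-II counters are cumulative, a real view would more likely sum the change between two polls than the raw values.
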
>
> Peter
>
> On Fri, Jul 20, 2012 at 9:31 AM, Andreas Pflug
> <pgad...@pse-consulting.de> wrote:
> > Well,
> >
> > for examining the overall health of a cluster the network fabric appears
> > equally important to me...
> > There seems to be no OS software that combines the two?
> >
> > Regards
> > Andreas
> >
> >
> > On 20.07.12 17:50, Peter Phaal wrote:
> >> The sFlow standard defines a wide range of metrics from switches,
> >> servers and applications. Each device only exports the metrics that
> >> are relevant to its normal operation, so switches will report network
> >> metrics, servers will report CPU, memory and disk statistics, and
> >> applications will report response times, URLs, etc.
> >>
> >> http://blog.sflow.com/2010/08/sflow-host-structures.html
> >>
> >> The Dell switch is exporting sFlow metrics relating to its operation
> >> as a switch. Since it isn't a server, it won't export the server
> >> metrics that gmond is looking for. Ganglia is designed to monitor
> >> clusters of servers and it expects to receive a core set of server
> >> metrics from each member of the cluster and will ignore sFlow metrics
> >> that don't relate to that function.
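
The behaviour described above can be sketched roughly as follows. This is not gmond's actual code: the classify_sample function is hypothetical, and the tag numbers are the counter record formats as I understand them from the sFlow version 5 and host structures specifications (1 for generic interface counters, 2 for Ethernet counters, 2000 for the host description).

```python
# Counter record tags, as understood from the sFlow specifications.
IF_COUNTERS = 1        # generic interface counters (switch ports)
ETHERNET_COUNTERS = 2  # Ethernet interface counters (switch ports)
HOST_DESCR = 2000      # host description, sent by server agents

SWITCH_TAGS = {IF_COUNTERS, ETHERNET_COUNTERS}


def classify_sample(tags):
    """Guess what kind of device sent a counter sample from its record tags.

    A collector that only cares about servers would keep "host" samples
    and drop everything else, which matches what is described above.
    """
    tags = set(tags)
    if HOST_DESCR in tags:
        return "host"
    if tags & SWITCH_TAGS:
        return "switch"
    return "unknown"


if __name__ == "__main__":
    print(classify_sample([2000, 2003, 2004]))  # sample from a server agent
    print(classify_sample([1, 2]))              # sample from a switch port
```
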
> >>
> >> There are a number of other sFlow analysis tools listed on sFlow.org
> >> that are focused on sFlow switch metrics:
> >>
> >> http://sflow.org/products/collectors.php
> >>
> >> The following article describes some things to consider when
> >> evaluating sFlow analyzers for monitoring switches:
> >>
> >> http://blog.sflow.com/2009/05/choosing-sflow-analyzer.html
> >>
> >> Peter
> >>
> >> On Fri, Jul 20, 2012 at 7:46 AM, Andreas Pflug
> >> <pgad...@pse-consulting.de> wrote:
> >>> I've configured some Dell switches (e.g. 6224, with recent 3.3.3.3
> >>> firmware) to emit SFLOW packets, and I see them happily arriving at my
> >>> gmond machine, but the switches aren't recognized.
> >>>
> >>> Digging into the sources, I found that the switch under investigation
> >>> never sends blocks tagged as SFLOW_COUNTERBLOCK_HOST_HID, only types 0
> >>> and 1. Consequently, all packets are dropped.
> >>>
> >>> Is this a Dell problem of incompletely implemented SFLOW, or is it a
> >>> gmond problem?
> >>>
> >>> Regards
> >>> Andreas
> >>>
> >>>
> >>> ------------------------------------------------------------------------------
> >>> Live Security Virtual Conference
> >>> Exclusive live event will cover all the ways today's security and
> >>> threat landscape has changed and how IT managers can respond. Discussions
> >>> will include endpoint security, mobile security and the latest in malware
> >>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> >>> _______________________________________________
> >>> Ganglia-general mailing list
> >>> Ganglia-general@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/ganglia-general
> >
>
>
