Re: [Ganglia-general] Adding hosts to cluster in gmetad

2011-12-16 Thread Seth Graham

On Dec 16, 2011, at 10:28 AM, Maciek Lasyk wrote:

 I've been trying to make a basic ganglia configuration: one gmetad
 getting data from 2 clusters (11 sources and 1 source) via unicast.
 Unfortunately with attached configuration I see only the first host
 from data_source


It appears you're using the same port for both data_source lines, which is why 
you're having issues. Ganglia uses the port number to differentiate between 
clusters.



 host1: gmetad.conf
 ==
 data_source SR1 192.168.0.23:8649 192.168.0.26:8649,
 192.168.0.17:8649, 192.168.0.44:8649, 192.168.0.6:8649,
 192.168.0.7:8649, 192.168.0.10:8649, 192.168.0.9:8649,
 192.168.0.8:8649, 192.168.0.3:8649 192.23.1.22:8649
 data_source SR2 129.253.128.112:8649
 
 gridname GD
 case_sensitive_hostnames 1
 ==
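
A quick way to check a gmetad.conf for the situation described in the reply above (two data_source lines sharing a port) is sketched here in Python. This is only an illustration, not part of Ganglia: the function names are made up, it assumes each data_source sits on a single line, and it treats a host entry without a port as using 8649.

```python
import shlex
from collections import defaultdict

DEFAULT_PORT = 8649  # assumed when a host entry gives no port


def parse_data_sources(conf_text):
    """Return {source_name: set_of_ports} for each data_source line."""
    sources = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("data_source"):
            continue
        tokens = shlex.split(line)[1:]  # shlex handles quoted names
        if not tokens:
            continue
        name, rest = tokens[0], tokens[1:]
        if rest and rest[0].isdigit():  # optional polling interval
            rest = rest[1:]
        ports = set()
        for host in rest:
            host = host.rstrip(",")  # tolerate stray commas
            _, sep, port = host.rpartition(":")
            ports.add(int(port) if sep else DEFAULT_PORT)
        sources[name] = ports
    return sources


def shared_ports(sources):
    """Return {port: sorted source names} for ports used by 2+ sources."""
    users = defaultdict(list)
    for name, ports in sources.items():
        for port in ports:
            users[port].append(name)
    return {p: sorted(n) for p, n in users.items() if len(n) > 1}


conf = """
data_source "SR1" 192.168.0.23:8649 192.168.0.26:8649
data_source "SR2" 129.253.128.112:8649
"""
print(shared_ports(parse_data_sources(conf)))  # {8649: ['SR1', 'SR2']}
```

An empty result means every data_source has ports of its own.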


--
Learn Windows Azure Live!  Tuesday, Dec 13, 2011
Microsoft is holding a special Learn Windows Azure training event for 
developers. It will provide a great way to learn Windows Azure and what it 
provides. You can attend the event by watching it streamed LIVE online.  
Learn more at http://p.sf.net/sfu/ms-windowsazure
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


Re: [Ganglia-general] O'Reilly eBook on Ganglia

2011-12-12 Thread Seth Graham

I'd be glad to have a reference for all the settings and variables that are 
being used in the various .json files. As far as I can tell, the only 
documentation is the php code itself.

Some more verbosity about making custom reports (the stuff that lives in 
graph.d) would be nice too. Reports at least have a wiki page and the php 
scripts are well commented, but the json files are devoid of info.



On Dec 9, 2011, at 6:51 PM, Matt Massie wrote:

 We're in the process of pulling together a team to write an O'Reilly eBook on 
 Ganglia.
 
 Here's a rough idea of some of the topics we could cover
   • Ganglia's components and overall architecture
   • Typical deployment configurations including simple steps for 
 verifying an installation (e.g. unicast/multicast, single cluster/multiple 
 distributed clusters/datacenter)
   • Navigating and using the new web interface
   • Tips for extending ganglia's functionality (e.g. gmetric, modules)
   • Common integration points (e.g. Hadoop metrics, Nagios)
   • A simple step-by-step checklist for debugging common ganglia issues 
 with pointers to our web site, mailing lists, irc channel, etc.
   • Supported platforms and core metrics (e.g. Ganglia on AIX, Linux 
 Power systems)
   • Scaling to clusters > 1000 nodes
   • Using Ganglia in mixed environments
   • Ganglia in the enterprise
   • Development of custom modules
 What are the things you would be most interested in?  Are there other topics 
 you'd like to see covered?
 
 -Matt


Re: [Ganglia-general] Scaling Ganglia

2011-11-04 Thread Seth Graham

On Nov 3, 2011, at 10:49 PM, Eytan Daniyalzade wrote:

 I am running a cluster with around 80 nodes, and ganglia-server is
 running on EC2 with 8G. Loading the main page or a host view on
 ganglia takes fairly long, ~20sec. It looks like this takes so long
 because the view loads all the graphs (images) sequentially, and the
 server takes longer than I would expect to respond to them. Could
 you advise any tuning to speed it up or possibly dive into what might
 be slowing things down? I am running the Ganglia 2.1.8 web ui, and serve
 all files from web root. I am not really familiar with tuning
 apache/php for better performance.
 

The typical bottleneck is the rrd files. Usually this is a problem when 
updating the files rather than when viewing the web page, but it might still be 
worth looking into.

The easy test is to move the rrds to a tmpfs and check for improvements. 
rrdcached is another common choice.

My installation uses tmpfs, and is able to load a page of 500 hosts in about 17 
seconds. This is without any performance-related tweaking to apache or php.




Re: [Ganglia-general] Ganglia Cluster aggregated graphs

2011-10-13 Thread Seth Graham

On Oct 13, 2011, at 11:52 AM, Aidan Wong wrote:

 This is my first post on this Ganglia list =).  I'm using the new Ganglia web 
 2.1.8.  Has anyone been able to create a graph that aggregates one common 
 metric for several hosts?

Try looking at the aggregate graphs tab on the web interface. It lets you use 
regular expressions to set up a graph showing many hosts at once. These graphs 
can also be added to views.
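
To get a feel for which hosts a given expression would pull into such a graph, the pattern can be tried against a list of host names first. The sketch below is plain Python, not gweb code; the host names are invented, and it assumes the pattern is matched anywhere in the name (unanchored) unless you anchor it yourself.

```python
import re


def select_hosts(pattern, hostnames):
    """Return the hostnames that the regular expression matches."""
    rx = re.compile(pattern)
    return [h for h in hostnames if rx.search(h)]


# Hypothetical host names, for illustration only.
hosts = ["web01.example.com", "web02.example.com", "db01.example.com"]
print(select_hosts(r"^web\d+", hosts))
```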




[Ganglia-general] Making y axis consistent?

2011-08-12 Thread Seth Graham

Has anyone come up with a clever hack for getting rrd graphs produced by 
ganglia to use the same axis clamp?

At this point, my main issue is with the new Views page, where I've 
configured a view showing the 15 minute load average for 6 hosts. Every single 
graph has a different Y axis scale, making it hard to quickly identify which 
nodes are most busy. 

There are upper-limit and lower-limit values available for custom 
graphs; would it be reasonable to put preferences into the view json file? 

Or maybe global Y axis ranges in conf.php on a per-metric basis?

I can't see a reasonable way to probe data ranges automatically.. this would 
put an unfair burden on the web frontend, I think.


Re: [Ganglia-general] Making y axis consistent?

2011-08-12 Thread Seth Graham

To answer my own question, yes this is possible.

Shortly after sending this email I discovered the graph.d folder and figured 
out how to use the report (otherwise known as custom graphs) feature, 
allowing me to make a custom graph with a fixed Y axis range.

Then I dug through the source for a bit to figure out how to get the views to 
use reports, which turns out to be fairly easy.

An entry like this in the view .json file:

{"hostname":"novagpvm01.fnal.gov","metric":"load_fifteen"},

Will become:

{"hostname":"novagpvm01.fnal.gov","graph":"my_report"},
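
If a view has many entries, the same substitution can be scripted. The following Python sketch is an illustration only: it assumes the view file is a JSON object whose entries live in an "items" list (the "view_name" and "view_type" keys in the sample are likewise assumptions), and the function name is made up.

```python
import json


def use_report(view, metric_name, report_name):
    """Return a copy of the view with matching metric items switched to a report."""
    items = []
    for item in view.get("items", []):
        if item.get("metric") == metric_name:
            # Drop the "metric" key and point the entry at the report instead.
            item = {k: v for k, v in item.items() if k != "metric"}
            item["graph"] = report_name
        items.append(item)
    return {**view, "items": items}


# Assumed file layout; only the item entries come from the message above.
view = json.loads("""
{"view_name": "load", "view_type": "standard",
 "items": [{"hostname": "novagpvm01.fnal.gov", "metric": "load_fifteen"}]}
""")
print(json.dumps(use_report(view, "load_fifteen", "my_report"), indent=1))
```

The input dictionary is left untouched, so the result can be compared with the original before anything is written back to disk.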


Sorry for the mail list noise.




Re: [Ganglia-general] [Ganglia-developers] Announcing Ganglia Web 2.0RC1

2011-06-23 Thread Seth Graham

On Jun 22, 2011, at 4:15 PM, Alex Dean wrote:
 That requires that the view name match the cluster name, right?

Yes, it requires the view and the cluster to match. 

 Could you post your changes somewhere so we could see what you did?


I attached a gzipped diff of the gweb-2.0 release against the tree I was 
hacking on. 




view_permissions.patch.gz
Description: GNU Zip compressed data


Creating views doesn't work properly; adding to views does (sort of.. it might 
be smarter if the dropdown box for adding to a view only listed valid options 
for the user).

 I still want to take a stab at this, I just haven't had the time.  Help me 
 understand your use case better.  You want to allow some non-admin users to 
 edit a single view, right?  Is there a case for limiting the visibility of a 
 view, or are we only concerned with who can change a view?
 
 More generally, what permissions do we need?
 - view a view
 - create a view
 - edit a view
 - delete a view

I personally only care about who can edit views.. anyone who can get to our web 
server is free to look at everything on it. But I suppose if effort is going to 
be put into editing views, might as well control who can view them too.

 I'd say sensible defaults are that admins can do all of these things for all 
 views, and anonymous users can view all views which haven't been specifically 
 hidden.

Same here.


Re: [Ganglia-general] [Ganglia-developers] Announcing Ganglia Web 2.0RC1

2011-06-22 Thread Seth Graham

On Jun 9, 2011, at 12:10 PM, Alex Dean wrote:
 I started off intending to allow per-view edit access, just like we allow 
 per-cluster edit access for optional graphs.  The complication is that each 
 resource (a view or a cluster) in the ACL is only identified by a simple 
 string.  Thus you can't have a cluster and a view which share the same name - 
 or, if you did you'd probably unwittingly be granting permissions you didn't 
 mean to.  I thought about introducing some kind of namespacing, and then just 
 decided to punt until it was actually needed.
 
 So... maybe that time is now? :)
 
 Something like this wouldn't be too hard to implement:
  $acl->allowView( 'username', 'view-name', GangliaAcl::EDIT );
  $acl->allowCluster( 'username', 'cluster-name', GangliaAcl::EDIT );
 
 Please suggest alternate APIs here.  That's just my initial brainstorm.

I finally got a chance to sit down and poke at this.

The good news is it's easy to implement a permissions system for adding graphs 
to an existing view. My method was to edit GangliaAcl.php to add an 'EDIT_VIEW' 
resource, and use the add() function along with a clustername to give a user 
view editing privileges. After updating the checkAccess() calls where 
appropriate in host_view.php and views.php, a user can add graphs to their view.

More complicated is the creation of the views themselves. Because views can 
have names without any relation to ganglia clusters, the ACL system won't work. 
I guess one could put in a restriction that a user can only create views with 
the same name as clusters they have edit permissions for, but that would limit 
them to owning a single view per cluster.
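
The name clash behind all this (a view and a cluster identified by the same bare string) can be illustrated with a toy model. The Python below is not the GangliaAcl API and does not reflect how gweb stores permissions; ToyAcl and its methods are invented purely to show what keying grants by (kind, name) would buy.

```python
class ToyAcl:
    """Toy permission table keyed by (kind, name) rather than a bare name."""

    def __init__(self):
        self._grants = set()

    def allow(self, user, kind, name, privilege):
        """Record that user holds privilege on the resource (kind, name)."""
        self._grants.add((user, kind, name, privilege))

    def is_allowed(self, user, kind, name, privilege):
        """Check for an exact grant; nothing is inherited or implied."""
        return (user, kind, name, privilege) in self._grants


acl = ToyAcl()
acl.allow("alice", "cluster", "nova", "edit")
print(acl.is_allowed("alice", "cluster", "nova", "edit"))  # True
print(acl.is_allowed("alice", "view", "nova", "edit"))     # False
```

With the kind in the key, a view named after a cluster no longer picks up that cluster's grants by accident, and a user could own any number of views.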

(as an aside, is it intended that once a view is created, it cannot be removed 
via the web interface?)

The more I look at it, the more inclined I am to leave the configuration as it 
is. Every idea I come up with limits the flexibility of the Views or requires 
more acl maintenance in conf.php.




Re: [Ganglia-general] [Ganglia-developers] Announcing Ganglia Web 2.0RC1

2011-06-09 Thread Seth Graham
On Jun 8, 2011, at 8:25 PM, Alex Dean wrote:

 Hi Seth.  I'm just back from a week off the grid, and trying to get caught up 
 on a mountain of electronic stuff.  Here's my quick response.  Please let me 
 know if more explanation is required.

Nope, the explanation makes sense. The only thing I was missing was detail 
about the philosophy behind the privileges system. 

 Editing views is not per-cluster permission because views can contain graphs 
 from many clusters.  Currently, we only support a single 'edit' permission 
 for all views.  (A user can either edit all views, or can edit none.)  You 
 can't selectively grant edit permission on a single view.  That restriction 
 could possibly be lifted in the future if there is demand for it.

It's my primary motivation for updating to the new interface, actually. 

I don't know how typical my environment is, but I'm taking care of machines 
belonging to many different experiments. Users like to have their resources on 
their own web page, and not see nodes they don't care about. Traditionally I've 
dealt with this in gmetad.conf, moving machines between clusters or making new 
clusters based on the whims of scientists. It works, but is kind of a pain.

Being able to set up admin accounts and let the users arrange things to taste 
via a web page would make everyone happy.. I don't have to babysit ganglia, and 
they don't have to wait for me to update ganglia.

Fortunately, it's pretty easy to modify the access checks to allow this 
behavior, so if I'm a minority case, I can patch where needed. I just wasn't 
sure if I was using the ACL system properly.


thanks




Re: [Ganglia-general] Announcing Ganglia Web 2.0RC1

2011-06-07 Thread Seth Graham

I'm having some issues getting the user roles working as expected.

The wiki instructs something like:

$acl->addRole( $username, GangliaAcl::GUEST );
$acl->allow( $username, $cluster, GangliaAcl::EDIT );

This does not result in the little blue + sign being drawn next to graphs. 

At line 71 in host_view.php, there is this line:

if(checkAccess(GangliaAcl::ALL_VIEWS, GangliaAcl::EDIT, $conf)) {

Changing it to:

if(checkAccess($clustername, GangliaAcl::EDIT, $conf)) {

Allows the check to succeed, but I run into the same problem in views.php.


What does the 'EDIT' role actually allow a user to edit, if not views? And is 
it possible to configure the interface to allow a user to only edit specific 
views? As configured now, it appears view editing is all or nothing.


thanks,



On Jun 1, 2011, at 10:08 AM, Vladimir Vuksan wrote:

 
 Announcing Ganglia Web 2.0 Release Candidate 1.
 
 http://ganglia.info/?p=373
 
 Vladimir
 


Re: [Ganglia-general] default auth settings

2011-04-22 Thread Seth Graham

On Apr 22, 2011, at 9:43 AM, Alex Dean wrote:

 I'd like to get some feedback on how we should configure gweb's default 
 access permissions.
 
 #1. $conf['auth_system']=false; will disable authorization, so no logins 
 are required and the system behaves like the current ganglia web frontend.  
 In this case, should editing of views be allowed or denied?  Do we want 
 disabling auth to mean 'read-only access' or 'anything goes'?

I think it should be read only at that point. I envision groups of users with 
competing ideas of what machines are important putting the web interface 
through a tug of war.

If administrators want to risk that, it should be something they have to 
consciously enable.

 #3. Should the default be to ship with authorization enabled or disabled?
 
 My preference is that 'read-only & no authorization required' is the default 
 configuration.

That's my preference too.


Re: [Ganglia-general] Need help configuring clusters to use separate multicast IP

2011-03-23 Thread Seth Graham

That might work, but I don't think anyone sets up their ganglia so that a 
single gmond is trying to aggregate all clusters. That's what the gmetad daemon 
is for. 

Also note that even though you have a separate multicast address for each 
cluster, the port still has to be unique. The port is what gmetad and the web 
frontend use to distinguish between clusters. You get really weird results if 
multiple data_source lines use the same port.


An ideal configuration might be:

Each of the 5 clusters has a unique gmond.conf, with its own multicast address 
and port number.

The gmetad host has 5 data_source lines to query one host from each of the 5 
clusters.




On Mar 23, 2011, at 9:52 AM, Ron Cavallo wrote:

 
 I need some help. I am trying to configure my gmetad to collect from
 different clusters on different IP's. I have 5 clusters. This is my
 gmetad collections server's local gmond.conf configuration:
 
 
 /* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
 udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
 }
 
 /* You can specify as many udp_recv_channels as you like as well. */
 udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
 }
 
 udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
 }
 
 udp_recv_channel {
  mcast_join = 239.2.11.73
  port = 8649
  bind = 239.2.11.73
 }
 
 udp_recv_channel {
  mcast_join = 239.2.11.74
  port = 8649
  bind = 239.2.11.74
 }
 
 udp_recv_channel {
  mcast_join = 239.2.11.75
  port = 8649
  bind = 239.2.11.75
 }
 
 udp_recv_channel {
  port = 8649
 }
 
 This is an excerpt from ONE OF THE CLUSTERS ABOVE (the .74 cluster)
 
 /* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
 udp_send_channel {
  mcast_join = 239.2.11.74
  port = 8649
  ttl = 1
 }
 
 /* You can specify as many udp_recv_channels as you like as well. */
 udp_recv_channel {
  mcast_join = 239.2.11.74
  port = 8649
  bind = 239.2.11.74
 }
 
 I configure only one server in a cluster to be polled from the gmetad
 since that server has all of the cluster members information in it
 anyway. Here is how I have it configured to talk to the one gmond shown
 directly above:
 
 data_source SaksGoldApps 45 sd1mzp01lx.saksdirect.com:8649
 


Re: [Ganglia-general] Need help configuring clusters to use separate multicast IP

2011-03-23 Thread Seth Graham

On Mar 23, 2011, at 10:12 AM, Ron Cavallo wrote:

 I see. So I need a separate IP AND A SEPARATE PORT. Got it.
 
 Also, I use a single gmond in each cluster to aggregate the single
 cluster. I configure the gmetad to talk to only one gmond from each cluster.
 Is that wrong?

No, your configuration is correct if the above is how you've set it up.

I interpreted your previous message as saying you had a gmond process with 
udp_recv_channels for every cluster.




Re: [Ganglia-general] Need help configuring clusters to use separate multicast IP

2011-03-23 Thread Seth Graham

On Mar 23, 2011, at 10:34 AM, Ron Cavallo wrote:

 Ahhh wait! I do!! On the AGGREGATION Server, I have both a gmetad.conf
 and a gmond.conf (I also monitor the server itself). 
 
 I configured RECEIVE channels in the gmond.conf on the aggregation
 server for every cluster, specifying the IP that the clusters will be
 sending on. Is that wrong?

It probably won't produce the desired results. So in that sense, yes it's 
wrong. 

But gmond will certainly let you do it; I'm just not sure what the resulting 
data will look like. Best case it would merge all clusters into a single 
cluster. Worst case, machines disappear and reappear randomly.




Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Seth Graham

On Mar 22, 2011, at 10:53 AM, Ron Cavallo wrote:
 
 I see other examples where I have to go hunting around for cluster members 
 that aren't reporting into the proper cluster.
 
 Any ideas?

Double check the ports in use in the gmond.conf on the machines that are 
misbehaving. 

Also note that machines tend to linger in an old cluster they were reporting 
to, even if their config file says otherwise. If you look at the XML dump from 
the gmetad, you may find that a given machine appears twice. The web frontend 
gives fairly random results when this happens.

These stale entries do eventually expire (default is 30 days I believe), but a 
restart of all gmond processes and gmetad will clean it up instantly.
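
Spotting such duplicates by eye in a large dump is tedious, so here is a small Python sketch that lists hosts appearing under more than one cluster. It assumes the dump nests HOST elements inside CLUSTER elements, each with a NAME attribute; the function name and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def hosts_in_multiple_clusters(xml_text):
    """Return {host: sorted cluster names} for hosts listed under 2+ clusters."""
    seen = defaultdict(set)
    root = ET.fromstring(xml_text)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            seen[host.get("NAME")].add(cluster.get("NAME"))
    return {h: sorted(c) for h, c in seen.items() if len(c) > 1}


# Made-up dump: n1 lingers in "old" after moving to "new".
sample = """<GANGLIA_XML><GRID NAME="g">
<CLUSTER NAME="old"><HOST NAME="n1"/></CLUSTER>
<CLUSTER NAME="new"><HOST NAME="n1"/><HOST NAME="n2"/></CLUSTER>
</GRID></GANGLIA_XML>"""
print(hosts_in_multiple_clusters(sample))  # {'n1': ['new', 'old']}
```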




Re: [Ganglia-general] Ganglia -- modify the source code

2011-03-11 Thread Seth Graham

On Mar 11, 2011, at 1:51 PM, Afef MDHAFFAR wrote:

 Hi all,
 
 I am trying to modify the source code of Ganglia in order to make ganglia 
 able to send monitored data via network connection to another component.
 I noticed that it sums the metric values of all nodes composing the cluster 
 (eg. it calculates the load of the cluster).
 Would you please help me to eliminate this aggregation and send values for 
 each node (for example: [Node1, Load, the value of the load for only this 
 node]).   

You shouldn't need to modify the ganglia source to do this. If you want the 
per-host value, parse the XML coming from gmond. Every host has an entry in 
this XML tree, and the values are not aggregated. 
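
A minimal Python sketch of that approach follows. It assumes gmond writes its XML to any client that connects to its TCP port (8649 here) and then closes the connection, and that each HOST element carries METRIC children with NAME and VAL attributes. The function names and sample data are made up.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port=8649, timeout=10.0):
    """Read the full XML dump that gmond writes to a TCP client, as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the peer closed the connection: dump complete
                break
            chunks.append(data)
    return b"".join(chunks)


def per_host_metric(xml_data, metric_name):
    """Return {hostname: value string} for one metric, one entry per host."""
    values = {}
    for host in ET.fromstring(xml_data).iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                values[host.get("NAME")] = metric.get("VAL")
    return values


# Made-up sample standing in for fetch_xml("some-gmond-host").
sample = b"""<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node1"><METRIC NAME="load_one" VAL="0.25"/></HOST>
<HOST NAME="node2"><METRIC NAME="load_one" VAL="1.50"/></HOST>
</CLUSTER></GANGLIA_XML>"""
print(per_host_metric(sample, "load_one"))
```

Values are returned as the strings found in the XML, so the caller decides how to convert them.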






Re: [Ganglia-general] Ganglia -- modify the source code

2011-03-11 Thread Seth Graham

On Mar 11, 2011, at 2:30 PM, Bernard Li wrote:

 Hi Seth:
 
 On Fri, Mar 11, 2011 at 12:26 PM, Seth Graham set...@fnal.gov wrote:
 
 You shouldn't need to modify the ganglia source to do this. If you want the 
 per-host value, parse the XML coming from gmond. Every host has an entry in 
 this XML tree, and the values are not aggregated.
 
 I was kind of surprised that no such library exists to do this -- are
 you aware of anything in the wild?  

No.. but I've never had a need to look for one.

I use php's xml parser to pull the bits of data I need, which was more or less 
a cut and paste from the ganglia.php that ships with the web frontend. :)

It's good code; it might be useful to turn it into a standalone library at some 
point.



Re: [Ganglia-general] Noobie questions re: Ganglia

2011-03-02 Thread Seth Graham

On Mar 1, 2011, at 11:26 AM, William Saxton wrote:

 Hi all (potential) new ganglia user here, with a couple quick questions 
 that I couldn't find the answers to via google.
 
 1) Where can I find how ganglia gathers information from a system?  

Well, it's an open source project, so you can find it by cracking open the 
source files. 

The stuff you're interested in is in the libmetrics directory.

 2) Does anyone have any experience with using ganglia, just as a backend 
 for storage of RRD data, but then using their own custom front-end?  

Ganglia is structured well enough that you can easily remove any piece you 
don't want. The web interface is completely optional.. if you can parse xml and 
run rrdtool, making your own frontend is trivial.
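
For the rrdtool half of a custom frontend, the sketch below only builds an rrdtool graph command line and prints it, leaving execution to the caller. The file paths are hypothetical, the helper name is made up, and the data source name "sum" is an assumption about how the RRD files were written; check yours with rrdtool info.

```python
import shlex


def rrdtool_graph_args(rrd_path, png_path, title, start="-1h",
                       lower=None, upper=None, ds_name="sum"):
    """Build the argument list for one single-line rrdtool graph."""
    args = ["rrdtool", "graph", png_path, "--start", start, "--title", title]
    if lower is not None:
        args += ["--lower-limit", str(lower)]
    if upper is not None:
        args += ["--upper-limit", str(upper)]
    if lower is not None or upper is not None:
        args.append("--rigid")  # keep the Y axis fixed to the given limits
    # Paths containing ':' would need escaping for DEF; not handled here.
    args += [f"DEF:v={rrd_path}:{ds_name}:AVERAGE", "LINE2:v#0000FF:value"]
    return args


# Hypothetical paths, for illustration only.
cmd = rrdtool_graph_args("/tmp/load_fifteen.rrd", "/tmp/load.png",
                         "load_fifteen", lower=0, upper=10)
print(shlex.join(cmd))
```

The list can be handed to subprocess.run as is; passing both limits gives every graph the same Y axis range.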

Likewise, if all you need is the xml, you can eliminate the gmetad portion and 
query gmond directly.

Last, gmond uses a module system for collecting system metrics, allowing you to 
strip out anything you don't want and build your data collection up from 
scratch (it is a little restrictive on payload size but other than that, the 
sky's the limit).




Re: [Ganglia-general] Multicast/Unicast Poll

2011-01-13 Thread Seth Graham

On Jan 12, 2011, at 4:22 PM, Bernard Li wrote:

 Hi Seth:
 
 On Wed, Jan 12, 2011 at 1:31 PM, Seth Graham set...@fnal.gov wrote:
 
 Migrating to unicast eliminated the firewall issues, means only a select few 
 machines have to keep metrics in memory, and no more cross talk with other 
 groups. I never saw any solid evidence that ganglia was putting an unfair 
 load on systems, but it was easier to reconfigure than fight it.
 
 Since you guys are in HPC and are using unicast -- what
 send_metadata_interval do you use?

It's currently set to 15 seconds.

However, only a third of our machines are migrated to 3.1.7.. everything else 
is still on 2.5. I chose 15 seconds because that was the number that popped 
up when I was searching for information on send_metadata_interval, and I 
haven't touched it since.

MRTG data for my collector nodes don't show anything to be alarmed about; 
whatever bandwidth ganglia is using is getting buried by user consumption. I 
think it will stay this way, as 1000 machines is our largest cluster and that's 
how many are currently using 3.1.7. I do plan on sending all 3000 of our 
machines to a single gmetad, so it'll be interesting to see how that holds up 
(but my understanding is that send_metadata_interval has no effect on gmetad).


 Would appreciate your input on the following thread over at 
 ganglia-developers:
 
 http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg05725.html

I don't have much in the way of comments, because I haven't had any problems. 

I do like the idea to set the default to something other than zero if unicast 
is enabled. A warning on startup could be useful too.. I know I glossed over the 
send_metadata_interval in the man page several times until a google search 
pointed it out to me. Printing the message only when -d is specified might be 
good enough.. -d 1 is usually the first thing I try when things aren't doing 
what I want.




Re: [Ganglia-general] Issue with gmetad

2011-01-12 Thread Seth Graham

On Jan 12, 2011, at 9:39 AM, John Williams wrote:

 I have also taken this one step further by installing our server on a brand 
 new Dell R710 with 6x240GB SSD (RAID5). Ganglia is the only thing running on 
 the server. I received the same errors after just a few minutes of running.
 
 I have also run xmllint against the output from port 8651 and it reports no 
 errors.
 
 Any help is appreciated.


Are all your data_source groups on separate subnets? Are you using multicast?

Your gmetad.conf has you giving no ports, which means everyone is using 8649, 
and if the various machines are on the same wire, their data is going to get 
piled together. 

I don't know if gmetad 3.0 or newer is better about this (because I always 
define ports these days), but in the 2.5 era it would get hopelessly confused 
if multiple data_sources were using the same port. It may be worth giving each 
data_source its own port and see if things improve.

The port is what ganglia uses to differentiate clusters. Cluster name, 
data_source, ip address.. doesn't matter. If two machines are using the same 
port to relay metrics, gmetad is going to think they're in the same cluster.
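
A quick way to spot this is to scan gmetad.conf for data_source lines that end up on the same port. What follows is only a minimal Python sketch, not anything shipped with ganglia: the function names and the sample config are made up for illustration, and it assumes the usual layout of data_source "name" [interval] host[:port] ..., with 8649 used when no port is given. It does not handle IPv6 addresses.

```python
import shlex
from collections import defaultdict

DEFAULT_PORT = 8649  # assumed default when a host has no ":port" suffix


def ports_by_source(conf_text):
    """Map each port to the set of data_source names that use it."""
    ports = defaultdict(set)
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("data_source"):
            continue
        fields = shlex.split(line)  # copes with the quoted source name
        if len(fields) < 3:
            continue
        name, rest = fields[1], fields[2:]
        # An optional polling interval (a bare integer) may precede the hosts.
        if rest and rest[0].isdigit():
            rest = rest[1:]
        for host in rest:
            host = host.rstrip(",")
            _, sep, port = host.rpartition(":")
            key = int(port) if sep and port.isdigit() else DEFAULT_PORT
            ports[key].add(name)
    return ports


def shared_ports(conf_text):
    """Return {port: sorted names} for ports used by more than one data_source."""
    return {p: sorted(n) for p, n in ports_by_source(conf_text).items() if len(n) > 1}


# Hypothetical config: "web" and "db" both fall on 8649, "batch" has its own port.
SAMPLE = '''
data_source "web" 10.0.0.1 10.0.0.2
data_source "batch" 60 10.0.1.1:8650
data_source "db" 10.0.2.1:8649
'''

if __name__ == "__main__":
    for port, names in shared_ports(SAMPLE).items():
        print(f"port {port} is shared by: {', '.join(names)}")
```

Any port reported here is a candidate for being split out, one port per data_source.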




Re: [Ganglia-general] Multicast/Unicast Poll

2011-01-12 Thread Seth Graham

On Jan 12, 2011, at 3:12 PM, Jesse Becker wrote:

 In light of the recent discussions over metadata and unicast vs.
 multicast, we (meaning Bernard) have created a poll on
 http://ganglia.info/ to try and gauge the use of each.  Please let us
 know if you use multicast, unicast, or both in your environments.
 
 If you have any comments about using one or the other, 

We used multicast for a long time because it's certainly easy, and ganglia is 
something multicast is well suited for.

But as the years rolled on, firewalls got involved, people became concerned 
about memory and network usage, and subnet privacy was eroding.  We started 
getting other departments' machines mixed in with our machines, and this caused 
all kinds of confusion on both sides.

Migrating to unicast eliminated the firewall issues, means only a select few 
machines have to keep metrics in memory, and no more cross talk with other 
groups. I never saw any solid evidence that ganglia was putting an unfair load 
on systems, but it was easier to reconfigure than fight it.

So the reasons to switch were mostly political.



Re: [Ganglia-general] tcp/ip instead of multi-cast ???

2011-01-10 Thread Seth Graham
On Jan 10, 2011, at 2:11 PM, Sayler, Steven (Contractor) wrote:

 Because of our network, multicast protocol will be a major problem. Is there 
 a way to run ganglia gmond/gmetad via tcp/ip?


Yes, look into the udp_send_channel option for gmond.conf.

If you specify a host and port gmond will switch to a unicast mode.





Re: [Ganglia-general] Ganglia for data collection, not storage/graphing?

2010-12-09 Thread Seth Graham

Yes, because all of the ganglia data is stored in an xml format. You telnet to 
a gmetad or gmond process and get a dump of everything that daemon knows about. 
Makes it easy to write additional tools because xml parsers are a dime a dozen.

It's fast enough to be used in a web page.. I use it for everything from 
monitoring kernel versions to system uptime for a tactical overview page that 
helps monitor around 3500 machines.
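
As a rough illustration of how little code such a tool needs, here is a minimal Python sketch. It assumes the daemon writes its whole XML dump and then closes the connection, and that metrics appear as METRIC elements (with NAME and VAL attributes) nested inside HOST elements. The function names, the sample document, and the host/port in the comment are placeholders, not part of ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port, timeout=10.0):
    """Read the full XML dump that a gmond/gmetad sends on connect."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the daemon closes the socket when the dump is done
                break
            chunks.append(data)
    return b"".join(chunks)


def metric_by_host(xml_bytes, metric_name):
    """Return {hostname: value} for one metric across every HOST element."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                result[host.get("NAME")] = metric.get("VAL")
    return result


# Cut-down, made-up sample of the XML layout, so the parser can be tried offline.
SAMPLE = (
    b'<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">'
    b'<CLUSTER NAME="demo">'
    b'<HOST NAME="node1" IP="10.0.0.1">'
    b'<METRIC NAME="os_release" VAL="2.6.18" TYPE="string"/></HOST>'
    b'<HOST NAME="node2" IP="10.0.0.2">'
    b'<METRIC NAME="os_release" VAL="2.6.32" TYPE="string"/></HOST>'
    b'</CLUSTER></GANGLIA_XML>'
)

if __name__ == "__main__":
    # Against a live daemon this would be something like:
    #   xml_bytes = fetch_xml("localhost", 8651)
    print(metric_by_host(SAMPLE, "os_release"))
```

Swapping the metric name is all it takes to build a different overview from the same dump.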



On Dec 9, 2010, at 11:57 AM, O G wrote:

 Hello,
 
 Is Ganglia written in a way that lets one use its data gathering 
 capabilities, but not data storage and graphing?
 
 For example, can one write some sort of Ganglia component or plugin that 
 takes data collected from any of its built-in or other components and sends 
 it somewhere else (a different file, a different database, out over the 
 network, etc.) instead of having the data being written by Ganglia to RRD 
 files and later graphed from there?
 
 Thanks,
 Otis
 
 
 
 
 


Re: [Ganglia-general] Archive Ganglia

2010-10-29 Thread Seth Graham

On Oct 29, 2010, at 5:24 AM, nigel.le...@uk.bnpparibas.com wrote:

 
 For various convoluted reasons, I would like to copy my rrd files to another 
 server, and view them as a point in time archive. In effect just have the 
 webfrontend running, and no gmetad or gmond processes. 
 
 Any ideas ? It seems that simply copying the /var/www/html/ganglia and 
 /data/rrds directory does not work, as the webfrontend requires new data 
 as input. 


Simply copying the files won't work because of the way rrd averages out data 
and expects new input.

The correct way to keep old data is to create the RRDs with an RRA that 
holds data for as long as you want it. An alternative (painful) method is to 
dump the rrds and use the xml output as your archive.


Another option to look into is the way rrdtool allows you to specify a time 
period to create a graph for. The ganglia frontend defaults to 'now' for the 
end time. So you could copy your rrds somewhere, and tweak the php scripts to 
change 'now' to the last time the rrds were updated.

This might not scale too well though, depending how many point in time 
archives you need to keep available.
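
To make the end-time idea concrete, here is a small Python sketch that asks rrdtool when a copied rrd was last updated and builds a graph command ending at that moment instead of now. It is a sketch under assumptions, not the frontend's actual code: the helper names are invented, the data source name "sum" is an assumption about how the rrds were created, and file paths containing a colon would need escaping in the DEF.

```python
import subprocess


def last_update(rrd_path):
    """Ask rrdtool for the timestamp of the most recent update in the file."""
    out = subprocess.run(["rrdtool", "last", rrd_path],
                         check=True, capture_output=True, text=True)
    return int(out.stdout.strip())


def graph_command(rrd_path, png_path, end, span_seconds=3600, ds_name="sum"):
    """Build an rrdtool graph command whose window ends at `end`, not at now."""
    return [
        "rrdtool", "graph", png_path,
        "--start", str(end - span_seconds),
        "--end", str(end),
        # ds_name is an assumption; check the real name with "rrdtool info".
        f"DEF:v={rrd_path}:{ds_name}:AVERAGE",
        "LINE1:v#0000FF:value",
    ]


if __name__ == "__main__":
    # Offline demonstration with a fixed timestamp; on a real archive use
    #   end = last_update("load_one.rrd")
    # and pass the resulting list to subprocess.run().
    print(" ".join(graph_command("load_one.rrd", "load_one.png", 1288000000, 7200)))
```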




Re: [Ganglia-general] Does Ganglia measure itself?

2010-09-21 Thread Seth Graham
On Sep 21, 2010, at 1:52 PM, Jesse Becker wrote:

 You can avoid this by using unicast to specifically designated
 collector gmonds (then having gmetad poll those for overall status).

Or by enabling 'deaf' on machines that you don't want collecting data and are 
stuck on multicast for whatever reason.

That said, gmond is probably the least burdensome metrics collection I've ever 
used.. an idle machine running gmond will still list a load average of zero. 
The hald that modern linux distros love to run consumes more processor.





Re: [Ganglia-general] Ganglia Web Forum?

2010-08-10 Thread Seth Graham

On Aug 10, 2010, at 2:15 PM, Bernard Li wrote:
 
 But I just want to clarify that we are *not* abandoning the
 mailing-list and IRC.  Web forums aren't really my cup of tea either
 but I just wanted to make sure that forum users have a place to get
 their questions answered about Ganglia if they prefer that over
 traditional mailing-lists.

This is the one big advantage of a forum.. once the google bot gets into it, 
finding answers to questions is ridiculously easy. The sourceforge mailing list 
search is passable, but the interface is crap and it's a huge hassle getting 
what you want.

 It would be great if forum/mailing-list can just be converged into one
 thing, perhaps we can look into using Nabble?
 
 http://www.nabble.com/


I've never used nabble, but I have used email-to-forum gateways before, and it 
always seems to turn into a mess.. sigs get into forum posts, quoting is all 
kinds of mess, fun stuff like that.

Maybe nabble is better. I wouldn't know.. but it's the sort of thing to keep an 
eye out for.




Re: [Ganglia-general] python module strings versus gmetric strings

2010-05-07 Thread Seth Graham
On 5/7/10 9:37 AM, Brad Nicholes wrote:

 This is the process which packages a metric into a very small packet which 
 can be passed between systems safely.

Apologies for barging into this discussion, but I've been working on 
getting used to the module features of ganglia this week and this 
caught my eye.


In the past I've used gmetric to get some auditing information from our 
systems. For example, space and inode usage on specific user-owned 
filesystems. My script collects this list of mounts, delimits it, then 
uses gmetric to dump it into the xml tree for collection at a central 
location.
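
For anyone wanting to picture that kind of script, a minimal Python sketch follows. The metric name, the delimiter, and the value layout are all made up for illustration; the only thing it relies on from gmetric is its --name, --value and --type options.

```python
def mounts_value(usage, sep=";"):
    """Join {mount: (space_pct, inode_pct)} into one delimited string."""
    return sep.join(f"{m}={s}/{i}" for m, (s, i) in sorted(usage.items()))


def gmetric_command(name, value):
    """Build a gmetric call that publishes `value` as a string metric."""
    return ["gmetric", "--name", name, "--value", value, "--type", "string"]


if __name__ == "__main__":
    # Hypothetical numbers: percent of space and of inodes used per mount.
    usage = {"/home": (71, 12), "/data": (40, 3)}
    cmd = gmetric_command("user_fs_usage", mounts_value(usage))
    print(cmd)  # hand this list to subprocess.run(cmd, check=True) to submit it
```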

I'm aware ganglia is intended to be performance monitoring software, but 
it's always been so good at shipping data around it's hard to resist 
using it for more than just metrics.


At any rate, as we prepare for our upgrade to ganglia 3, I discovered 
that submitting strings via the python module interface is limited to 32 
characters.. anything longer produces some odd behavior in the xml tree.

I was able to increase this maximum by adjusting the MAX_G_STRING_SIZE 
in gm_value.h, but your comment about small packets and safety makes me 
question whether this is wise.


Are there risks to this I am unaware of? If I have to pass arbitrary 
non-graphable data from my machines to a central host, should I continue 
to use gmetric?

If so, what does gmetric do differently that allows longer strings than 
gmond does?




Re: [Ganglia-general] Extending the format of gmetad.conf

2010-01-07 Thread Seth Graham
Daniel Pocock wrote:
 
 - is it important for users to maintain the files manually, or will the 
 focus shift to tools, web interface or config files generated from some 
 other enterprise data source?

I've been content with the existing file format for the 7 or so years 
I've been running ganglia. At this point if changes were going to be 
made, I think making it consistent with gmond's configuration format 
would be a noble effort.

Not a big fan of any kind of GUI interface to maintain a text file; many 
other software packages have made this mistake.. the problem it creates 
is a config file that reads like line noise, or hidden options in the 
config that never get GUI elements to control them.





Re: [Ganglia-general] Fw: No graph display on ganglia web page

2009-08-17 Thread Seth Graham

I would assume errors with using the rrd tool to make the graphs. Some 
kind of path issue perhaps?

Try checking your error logs and see if anything comes up. Or right 
click where the graph should be, copy the image location, and load it in 
a new tab. Sometimes you'll get some helpful text from this.


Thanachote Pothanant wrote:

 Hi Seth,

 Thank you for your reply, and sorry if this is a duplicate.
 I am just trying with only 1 node (client, server and web front end are all on 
 the same node).
 About 'memory_limit' in php.ini, it is already set to 128M.

 Please follow these links for my browser screenshots.

 http://img90.imageshack.us/img90/1203/ganglia1.jpg
 http://img194.imageshack.us/img194/4039/ganglia2.jpg

 Thank you very much for your help.

 Thanach








Re: [Ganglia-general] Fw: No graph display on ganglia web page

2009-08-14 Thread Seth Graham

How many machines are you monitoring? Are you getting the page headers 
at all?

php defaults for memory allowance are pretty small. I forget what the 
default is, but whenever a php script exceeds this limit the script will 
exit. In the case of ganglia, this usually means either no graphs, or 
only a few graphs, are shown.

Try doubling the 'memory_limit' in php.ini. I currently have mine set to 
128M, which is more than enough room for pages up to 1000 nodes.


Thanachote Pothanant wrote:

 Hi all,

 I'm pretty new to ganglia, so please help me with this problem.
 My problem is that on the ganglia web page, none of the graphs are displayed.

 I set $debug to 1 in graph.php and got the rrdtool command.
 When I tried executing the command and redirecting (>) the output to a file, 
 the graph was there.
 But on the web page there is nothing.

 My environment details are as follows:
 OS: AIX 6100-03-01-0921
 Processor type: PowerPC POWER4 processor
 Apache version: Apache/2.2.11 (Unix) PHP/5.2.9 with Suhosin-Patch 
 mod_ssl/2.2.11 OpenSSL/0.9.8j DAV/2 mod_chroot/0.5

 Below is the list of rpm packages:
 apr-1.3.3-2
 libconfuse-2.6-1
 expat-2.0.1-2
 ganglia-lib-3.1.2-1
 ganglia-gmond-3.1.2-1
 zlib-1.2.3-5
 freetype2-2.3.9-1
 libpng-1.2.38-1
 libart_lgpl-2.3.20-1
 rrdtool-1.2.30-2
 ganglia-gmetad-3.1.2-1
 ganglia-web-3.0.3-1
 fontconfig-2.7.0-1

 I got rpm packages except ganglia-web-3.0.3-1 from 
 http://www.perzl.org/ganglia/

 I'm using ganglia-web-3.0.3-1 because I got this error when I tried to 
 install ganglia-web-3.1.1-1

 lparaix61:root rpm -Uvh --ignoreos ganglia-web-3.1.1-1.noarch.rpm
 error: failed dependencies:
 php-gd is needed by ganglia-web-3.1.1-1

 Please help me out with this problem. Thank you very much.

 Thanach

 



Re: [Ganglia-general] how to preserve rrd data (as long as I want)

2009-03-04 Thread Seth Graham

jiangyouu wrote:

Hi!
I want to preserve the rrd database as long as possible, and review a specific 
day's (or hour's, or even minute's) data (like CPU utilization rate or 
network utilization rate), but the default value in Ganglia 
is one year.


The resolution of stored data is configured when the rrd is generated, 
which is done by gmetad. If you need higher resolution you'd have to 
hack up the gmetad source to generate the RRA's you desire. The wishlist 
for gmetad mentions allowing custom RRA's, but it's not in yet.


The penalty this generates is the rrds use more disk space. I have no 
experience trying to store a year (or more) of per-minute data, but I 
would imagine it carries a performance penalty as well.
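
For a feel of the numbers, here is a back-of-the-envelope Python sketch. It assumes a 15 second rrd step and roughly 8 bytes per stored value per data source (headers and any other RRAs ignored), so treat the result as an estimate only; both helper names are invented for this example.

```python
def rra_spec(step, resolution, retention, cf="AVERAGE", xff=0.5):
    """Return an RRA definition keeping `retention` seconds at `resolution`."""
    if resolution % step:
        raise ValueError("resolution must be a multiple of the rrd step")
    steps_per_row = resolution // step   # primary data points merged per row
    rows = retention // resolution       # rows needed to cover the retention
    return f"RRA:{cf}:{xff}:{steps_per_row}:{rows}"


def rra_bytes(rows, data_sources=1):
    """Approximate on-disk size: one 8-byte value per row per data source."""
    return rows * data_sources * 8


if __name__ == "__main__":
    year = 365 * 86400
    print(rra_spec(step=15, resolution=60, retention=year))
    print(rra_bytes(year // 60), "bytes per metric, roughly")
```

Under those assumptions a year of per-minute data is 525,600 rows, or about 4 MB per metric per host, which adds up quickly across a large cluster.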


The other option is to do an rrd dump of the data, and archive that.



Re: [Ganglia-general] gmond stops logging data

2008-09-15 Thread Seth Graham
timl wrote:
 I'm running ganglia version 3.1.0 and periodically gmond seems to stop
 collecting data.  I can see the incoming traffic to the server and the
 web pages show the hosts as being up, but no data is logged.
 Originally I started seeing this on 3.0.5 so I upgraded.. but the
 latest seems to have made the problem worse.  Sometimes the graphs
 start showing data on their own, but usually I have to restart gmond on
 the clients.
   
Where does the fresh data stop showing up? That is, if you dump out the XML, 
does the reported data stop updating there as well?

I've seen holes appear in rrd files when the gmetad server is having 
trouble keeping up with the I/O, and restarting gmond on clients 
wouldn't repair that. But that's where it seems like the problem is, 
especially if the web page never shows the machines being down. Try 
running gmetad in debug mode and see if it complains about writing to 
the rrds.

btw, surprise meeting you here. ;)



Re: [Ganglia-general] ganglia and job monarch

2008-07-23 Thread Seth Graham
Daniel Bourque wrote:
 Currently, I have this as the recv_channel on the gmond accepting info 
 from all the worker nodes:

 udp_recv_channel {
   port = 8666
   family = inet4
 }


 I don't see how this channel is associated with a particular cluster.
 If I add another udp_recv_channel, and tell job monarch to use 
 that channel, how will ganglia be able to separate that input from 
 the other worker nodes?
Ganglia groups machines based on which port number they communicate on. 
Hostname, ip address or text labels in gmetad.conf are irrelevant. It 
takes everyone a while to wrap their head around this, but it works well 
once you get used to it.

Every port you have a gmond chattering on will have a completely unique 
XML tree. So you could put your worker nodes on port 8666, put the batch 
server on 8665, and enter two data_source lines in your gmetad.conf to 
collect from those ports. When you bring up the ganglia web page, you'll 
notice that the view has changed a little bit and you'll see two 
clusters instead of the single one you did originally.

Finally, you point your jobmonarch tools at port 8665 so it can get the 
data it needs.

You may be able to skip putting the data_source line in gmetad.conf for 
the batch server. I don't have an opportunity right now to test all the 
possibilities so you're on your own there.



Re: [Ganglia-general] ganglia and job monarch

2008-07-22 Thread Seth Graham
Daniel Bourque wrote:
 Hi,

 my setup is as follows: 2 PBS head nodes running torque, moab, 
 ganglia and a group of compute nodes running pbs_mom. Ganglia's gmond is 
 running on the headnodes in mute mode.

 I'm trying to get rid of the localhost.localdomain node that now shows 
 up in ganglia because job monarch reports as localhost.localdomain.
   

I'm not sure what this means, because jobmonarch doesn't report as 
anything, instead it adds metrics to an existing host's entry in the xml 
tree (specifically,  your pbs server). If you telnet to your xml_port on 
the gmetad server and dump to a file, and search for 'MONARCH', you'll 
see that everything is included inside a pair of HOST tags. The only 
place the hostname is set is within that HOST tag.
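
If you'd rather not eyeball the dump, a few lines of Python can report which HOST the MONARCH metrics are sitting under. This is only a sketch: it assumes the dump has been saved to a file, that metrics are METRIC elements with a NAME attribute inside HOST elements, and the sample metric names below are invented for the demonstration.

```python
import xml.etree.ElementTree as ET


def hosts_with_metric(xml_bytes, needle="MONARCH"):
    """Return {hostname: [metric names]} for metrics whose NAME contains needle."""
    root = ET.fromstring(xml_bytes)
    found = {}
    for host in root.iter("HOST"):
        names = [m.get("NAME") for m in host.iter("METRIC")
                 if needle in (m.get("NAME") or "")]
        if names:
            found[host.get("NAME")] = names
    return found


# Made-up sample: one ordinary node and one host carrying a MONARCH metric.
SAMPLE = (
    b'<GANGLIA_XML VERSION="3.1.0" SOURCE="gmetad"><GRID NAME="demo">'
    b'<CLUSTER NAME="compute">'
    b'<HOST NAME="node1" IP="10.0.0.1">'
    b'<METRIC NAME="load_one" VAL="0.10" TYPE="float"/></HOST>'
    b'<HOST NAME="localhost.localdomain" IP="127.0.0.1">'
    b'<METRIC NAME="MONARCH-HEARTBEAT" VAL="1" TYPE="string"/></HOST>'
    b'</CLUSTER></GRID></GANGLIA_XML>'
)

if __name__ == "__main__":
    # With a saved dump: hosts_with_metric(open("dump.xml", "rb").read())
    print(hosts_with_metric(SAMPLE))
```

Whatever hostname comes back is the name the pbs server is reporting for itself.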

It seems to me your pbs server is confused about its own hostname. 
Either a bad entry in /etc/hosts, or assuming a redhat system, something 
in /etc/sysconfig is setting the machine's name to localhost.localdomain 
(which is a default).





Re: [Ganglia-general] ganglia and job monarch

2008-07-22 Thread Seth Graham
Daniel Bourque wrote:
 I don't want ganglia to report on the nodes running pbs_server. I only 
 care about the compute nodes. having non compute nodes  in ganglia 
 messes up the usage statistics.
The proper way to fix this is have your pbs server submit ganglia 
information on a different port than the worker nodes.
 Since Job Monarch must piggyback off an existing node, I must use 
 BATCH_HOST_TRANSLATE to map localhost.localdomain to one of my compute 
 nodes.  Correct?
I can't comment, because I've never used that feature of job monarch. 
 From what I can tell in the jobmond.conf file, that's not the intended 
purpose. So if it does work, great. If it doesn't, I'm not surprised. ;)




Re: [Ganglia-general] gmetad giving high TN values

2008-06-26 Thread Seth Graham
Bernard Li wrote:
 Hi Kirk:

 On Wed, Jun 25, 2008 at 1:53 PM, Kirk McDonald
 [EMAIL PROTECTED] wrote:

   
 gmetad runs on a certain host. Also on that host are a number of gmond
 instances, which are the gmond instances polled by gmetad. Each of
 these instances is reported to by a separate cluster, and they are
 each a separate data source for gmetad. All of the XML polling happens
 over localhost.
 

 I am curious why you are running multiple instances of gmond on the
 gmetad host.  Wouldn't it suffice to simply have gmetad poll gmonds
 running on your cluster directly?
   
I can't speak for Kirk,  but in one instance I do this to get around 
firewall restrictions. I can send packets out, but not in. So I had to 
jerry-rig some way to forward data to my gmetad. ;)

I wouldn't do it this way again, but I haven't gotten around to un-doing 
the decision.



Re: [Ganglia-general] Leveraging Ganglia XML output for more then monitoring -- the Thebes Consortium Project

2008-06-10 Thread Seth Graham
Jesse Becker wrote:
 Bernard Li wrote:

 While not exactly what you have in mind, but have you taken a look at
 the JobMonarch project?

 https://subtrac.sara.nl/oss/jobmonarch/

 AFAIK it does also work with SGE.

 Meh...not really.  It's under development, and doesn't work so well 
 with the 6.x versions.  I think it works with the old 5.3 series though.

job monarch reveals a flaw with PBS (what we use, I imagine this isn't a 
unique trait) in the sense that the worker nodes do not have the 
capacity to report job information. Job monarch can only run on the 
central server.. which makes ganglia, due to its distributed nature, a 
poor partner. 

If I'm understanding the goal of the Thebes project, they would try to 
get the authors of batch system software to adopt a more ganglia like 
approach to reporting statistics.. which I'd be happy to see.

The concern I have is overloading the metrics held in memory by a gmond 
process to the point it starts consuming noticeable amounts of 
resources. Ganglia's xml output is by far my favorite feature of the 
program, the xml is easy to parse and use for homegrown monitoring 
tools. I worry that if ganglia became a default dumping ground for 
service information the xml would become inconvenient to work with.

The other downside I see is the nature of the data itself. Job 
information from a batch system is not something you can stuff into an 
RRD.. you'd have to develop some other way to store job history 
information, a task that would put a greater load on gmetad and 
introduce additional scalability concerns. My second favorite feature of 
ganglia is how simple it is, and I don't know if I'd appreciate it the 
same way if an installation had a dependency tree as long as my arm.






Re: [Ganglia-general] Largest Ganglia installation?

2008-06-06 Thread Seth Graham
Bernard Li wrote:
 Dear Ganglia community:

 Was browsing our SourceForge website and found this description of the 
 project:

 Ganglia is a scalable distributed monitoring system for
 high-performance computing systems such as clusters and Grids. It is
 based on a hierarchical design targeted at federations of clusters.
 Supports clusters up to 2000 nodes in size.

 The part I want to focus on is Supports clusters up to 2000 nodes in size.

 I suppose this was probably written a while back and I would like to
 update it -- so if you have an installation monitoring more than 2000
 hosts, do let us know and we'll update the description with the
 largest installation from the community :)
   

We have ~3000 machines being monitored with ganglia, but the load is 
split between two separate machines. Probably not the statistic you were 
after. :)

Best machine I had available when setting up carried 4GB of memory, so I 
could have put everything on one machine but it wouldn't have left much 
room for growth. More memory than 4GB is pretty common now.. next time I 
upgrade our gmetad hosts I'll probably try to put all 3000 on one node, 
but for now it's two machines.


 So far I have yet to hear of anybody reaching a ceiling in terms of
 scaling Ganglia installations once the usual step of putting rrds in
 ramdisk/tmpfs has been taken -- do let us know if you learn otherwise!

 Thanks,

 Bernard





Re: [Ganglia-general] how to setup ganglia to run in Unicast mode

2008-06-06 Thread Seth Graham
Sai p Seshasayee wrote:

 Hi Team,

 I am a new user of ganglia. I have been trying to set up ganglia to run 
 in unicast mode. Please get back to me regarding the same.


Configuration for gmetad is identical for both unicast and multicast. 
The only difference is the gmond.conf on your machines.

Instead of mcast_join, you use something like:

udp_send_channel {
  host = 127.0.0.2
  port = 8000
}

When you specify a host option, gmond will use unicast.

On the machine that you specify as the destination for unicast packets, 
you'll need something like:

udp_recv_channel {
  bind = 127.0.0.2
  port = 8000
}


 Thanks and Regards
 Sai Prakash
 Poughkeepsie Unix Development Lab
 IBM Systems and Technology Group
 External: 845-435-4720
 email: [EMAIL PROTECTED]
 Notes: Sai p Seshasayee/Poughkeepsie/IBM
 





Re: [Ganglia-general] Largest Ganglia installation?

2008-06-06 Thread Seth Graham
Bernard Li wrote:
 Hi Seth:

 On Fri, Jun 6, 2008 at 7:45 AM, Seth Graham [EMAIL PROTECTED] wrote:

   
 We have ~3000 machines being monitored with ganglia, but the load is split
 between two separate machines. Probably not the statistic you were after. :)

 Best machine I had available when setting up carried 4GB of memory, so I
 could have put everything on one machine but it wouldn't have left much room
 for growth. More memory than 4GB is pretty common now.. next time I upgrade
 our gmetad hosts I'll probably try to put all 3000 on one node, but for now
 it's two machines.
 

 Thanks for the stats -- it's always good to know what the community is up to.

 How large is your rrd currently?

~550MB on one server, ~650MB on the other, using the default metrics 
setup. The value fluctuates due to machines getting moved between 
clusters, and old rrds not being deleted. Have approached 1GB in those 
situations, usually due to configuration errors. ;)
 Since you have 2 gmetad servers, do
 you have an additional gmetad server to aggregate the data from the
 federated servers?
   
Not currently but I've been considering a setup like that for including 
some new machines we're getting (to deal with a firewall).

Due to the way the departments here are set up, the clusters divide 
pretty cleanly and there was no need to aggregate the information. It's 
probably good that it is so.. a number of our users would start 
complaining if someone else's stats started appearing on their web page.

 It looks like rrdtool 1.3 is getting closer to being released.
   
I am looking forward to it too. Running out of the ram disk has never 
sat well with me, I dislike the threat of losing any amount of data if 
there's a crash.





Re: [Ganglia-general] Ganglia 3.0.7-1

2008-05-29 Thread Seth Graham
Owens, David L wrote:
 In the gmetad.conf file I have data_source called Non_Prod with four
 servers ie: hostname:8649. These four servers are on the same subnet. I
 want to add two machines that are on a different subnet. I have tried
 different ports but will not display under Non_Prod.  Any suggestions?
   
Ganglia uses the port to divide groups of machines, and provides no way 
of merging them. You have to find a way to get all the machines you want 
grouped together to chat on the same port.

Switching to unicast packets is the easiest way to do this, but you'll 
lose the redundancy that multicast provides.

I'm pretty sure you can mix multicast and unicast in a single gmond.conf 
but I've never actually done it so I could be talking gibberish.


 David 





Re: [Ganglia-general] Setup large clusters

2008-03-12 Thread Seth Graham
Martin Hicks wrote:

 The configuration of gmetad has been modified to store the rrds in
 /dev/shm, but this directory gets very large so I'd like to move away
 from that.

Using tmpfs is pretty much your only option. As you discovered, the disk 
I/O will bring most machines to their knees.

 Is there a way that I should be architecting the configuration files
 to make ganglia scale to work on this cluster?
 
 I think I want to run gmetad on each head node, and to use that RRD data 
 without
 regenerating it on the admin node.  Is that possible?


This is definitely possible, though I don't think it's necessary. I have 
  machines handling 1500 reporting nodes without problems, writing the 
rrds to a tmpfs.

The downside of setting up ganglia with head nodes is that you have to 
set up some way to make the rrds available to a central web server. 
Several ways to do that too, but they introduce their own headaches.





Re: [Ganglia-general] XML Parser for Gmetad

2007-07-17 Thread Seth Graham
Buccaneer for Hire. wrote:
 Hey All,
 
 Anyone have an XML parser or pointer for more
 information  for gmetad?  I am trying to get a
 notifier together. THX


There's one in ganglia.php in the web frontend. I chopped it up for use 
in some of my personal scripts and it works well, assuming you want 
to use PHP.
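
If PHP isn't an option, the same job is a few lines in most languages. Here is a minimal Python sketch, assuming the usual GANGLIA_XML / CLUSTER / HOST / METRIC element layout with NAME and VAL attributes. The function name and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_ganglia_xml(xml_text):
    """Return {cluster: {host: {metric: value}}} from a ganglia XML dump."""
    root = ET.fromstring(xml_text)
    clusters = {}
    # iter() finds CLUSTER elements whether or not gmetad wrapped them in a GRID.
    for cluster in root.iter("CLUSTER"):
        hosts = clusters.setdefault(cluster.get("NAME"), {})
        for host in cluster.iter("HOST"):
            metrics = hosts.setdefault(host.get("NAME"), {})
            for metric in host.iter("METRIC"):
                # Values are kept as strings; TYPE says how to convert them.
                metrics[metric.get("NAME")] = metric.get("VAL")
    return clusters


# Made-up sample, trimmed to the attributes the parser reads.
SAMPLE = """<GANGLIA_XML VERSION="3.0.4" SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node01" IP="192.0.2.1" TN="10" TMAX="20">
<METRIC NAME="load_one" VAL="0.25" TYPE="float" UNITS=""/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

print(parse_ganglia_xml(SAMPLE))
```

A notifier can then walk the nested dictionary and compare values against whatever thresholds it cares about.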



Re: [Ganglia-general] RRDs in memory

2007-07-13 Thread Seth Graham
Ben Hartshorne wrote:

 I created a ramdisk when my cluster grew beyond ~50 nodes (I report a
 lot of extra statistics).  I use an actual ramdisk instead of tmpfs
 (though I chose it out of ignorance when I first set it up, wikipedia[*]
 says that tmpfs might swap to disk whereas ramfs is just straight up in
 memory, nothing fancy).  


I initially used ramdisk as well, also out of ignorance. Ran into 
stability problems with it.. once I tried allocating more than 1GB to 
the disk I started getting system crashes and out of memory errors 
(system had 4GB physical memory).

Once I switched to tmpfs it became rock solid. tmpfs has the added 
advantage of being easier to configure.. no editing kernel boot 
arguments, just pass mount the options you want and it does it all for you.

Ramdisk is probably better on a busy system where you don't want to risk a 
bunch of swapping, but on a dedicated gmetad host I recommend tmpfs.



Re: [Ganglia-general] RRDs in memory

2007-07-11 Thread Seth Graham
Ofer Inbar wrote:
 gmetad is very write-intensive, because it updates hundreds of RRD
 files about every minute or two.  Has anyone tried running it with
 the rrd directory on a RAM disk (tmpfs) ?
 
 You'd need something to periodically copy the RRDs to a real disk,
 but that could happen much less frequently (maybe every 20 minutes).
 
 You'd also need a more complicated boot time startup procedure to set
 up the repository on RAM disk before starting gmetad.
 
 Have any of you tried anything like this?  What'd you do?  How'd it go?

We were forced to move our rrds to tmpfs a couple of years ago once the 
number of monitored machines grew into the thousands. It works as well 
as you would want it to, other than the risk of losing everything if the 
machine goes down.

A cron job was put into place to tar the rrds to physical disk once an 
hour, and an edit to rc.local untars them back into the tmpfs on boot. We 
still risk losing data, but ganglia will average out the gap after a while.
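
The backup and restore steps can be sketched in Python with the standard tarfile module. This is an illustration only, not the actual cron job described above: the function names are invented, and the directories are whatever your installation uses. The archive is written to a temporary name first so that a crash in the middle of a backup does not destroy the previous good copy.

```python
import os
import tarfile


def backup_rrds(rrd_dir, archive_path):
    """Tar everything under rrd_dir, then swap the new archive into place."""
    tmp_path = archive_path + ".tmp"
    with tarfile.open(tmp_path, "w") as tar:
        for entry in sorted(os.listdir(rrd_dir)):
            # arcname keeps paths relative, so restore works into any directory.
            tar.add(os.path.join(rrd_dir, entry), arcname=entry)
    os.replace(tmp_path, archive_path)  # replaces the old archive in one step


def restore_rrds(archive_path, rrd_dir):
    """Unpack a backup into rrd_dir (for example a freshly mounted tmpfs)."""
    os.makedirs(rrd_dir, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(rrd_dir)
```

backup_rrds would run from cron, and restore_rrds from the boot script before gmetad starts.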

Load on the machine drops nearly to zero once you move to tmpfs.



Re: [Ganglia-general] A survey of Ganglia users and usage.

2007-04-03 Thread Seth Graham

Buccaneer for Hire. wrote:

The simplicity is a major plus as well as the
integration w/ Globus.  With a little thinking you can
extend the reporting easily.  


The only thing I wish I had was notification.  I have
a large cluster and a number of smaller ones (128 nodes)
and it would make it easier for us to be proactive.
So I am writing something that will parse the xml and
notify.
  
This is something I've done, of a sort. It was a fun little project, 
mostly because of the way ganglia makes everything so easy to parse.


Eventually the email got unpopular, so the project ended up being a web 
page that tied in with a hardware and ticket database that the admin 
could view and get a quick idea of what's down and what's being worked on.
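
The core of a "what's down" check is small. Here is a rough Python sketch, assuming each HOST element carries TN (seconds since the host was last heard from) and TMAX (the longest expected gap between reports). Treating a host as down once TN exceeds four times TMAX is a rule of thumb assumed here, and the function name and sample data are invented.

```python
import xml.etree.ElementTree as ET


def find_down_hosts(xml_text, factor=4):
    """Return (cluster, host) pairs whose last report is older than factor * TMAX."""
    down = []
    root = ET.fromstring(xml_text)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            tn = int(host.get("TN", "0"))      # seconds since the last report
            tmax = int(host.get("TMAX", "0"))  # longest expected gap between reports
            if tmax > 0 and tn > factor * tmax:
                down.append((cluster.get("NAME"), host.get("NAME")))
    return down


# Made-up sample data: node02 has been silent for 500 seconds.
SAMPLE = """<GANGLIA_XML VERSION="3.0.4" SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node01" TN="10" TMAX="20"/>
<HOST NAME="node02" TN="500" TMAX="20"/>
</CLUSTER>
</GANGLIA_XML>"""

print(find_down_hosts(SAMPLE))
```

The resulting list can feed an email, a web page, or a lookup in a ticket database.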






Re: [Ganglia-general] A survey of Ganglia users and usage.

2007-04-02 Thread Seth Graham

[EMAIL PROTECTED] wrote:


Perhaps we could create a simple anonymous survey for Ganglia users?
Code authors could then be guided quantitively by what the community
is really doing - what kind of hosts they monitor - what they use
in Ganglia, and what they may need.

What do you (all) think?


I've been under the impression for a while ganglia wasn't getting a 
whole lot of development and was mostly in maintenance mode. It hasn't 
changed a whole lot in the few years I've been using it (except perhaps 
the config file format, a change that was much appreciated).


The software is already excellent, and most of the changes I could 
suggest would be philosophical "my way is better than your way" type things.




Re: [Ganglia-general] Gmetad and web frontend on different machines.

2007-03-29 Thread Seth Graham

Martin Knoblauch wrote:

Richard,

 depending on the cluster size, writing the RRDs via NFS might turn out
to be a huge bottleneck.



Writing them to local disk is sometimes bad enough.

Reading them over nfs may be okay though, depends how often users are 
hitting reload.




Cheers
Martin
--- [EMAIL PROTECTED] wrote:


Saundry,
 
It sort of looks like you can, but actually you can't.

gmetad writes to rrd databases as local files,
and the web and php read rrd databases as local
(actually it invokes rrdtool itself).
 
I imagine you could separate the two using NFS filesystems,

but I have not tried this.

kind regards,

Richard Grevis 
Production Architecture 
Barclays Capital, Canary Wharf, London, E14 4BB 
-Original Message-

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
saundrya mishra
Sent: 29 March 2007 14:30
To: ganglia-general@lists.sourceforge.net
Subject: [Ganglia-general] Gmetad and web frontend on different
machines.



Hi There,

I am new to Ganglia. Can we have gmetad and web frontend for a
cluster to be running on two different machines?? If yes, then how is it
possible since i read in the configuration file of the web frontend that
the RRDTool databases need to be local to be read?
	

Greetings,
Saundrya.














--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de






Re: [Ganglia-general] ganglia between two networks

2007-03-26 Thread Seth Graham

Jeremy Hansen wrote:
I've setup ganglia in the past and typically it's pretty straight forward.  
Now I have to deal with nodes being in two completely separate networks 
where it seems udp broadcast are most likely filtered.


Is there just a simple config option to have nodes contact a host 
directly?


I was playing around with 

udp_send_channel, udp_recv_channel and tcp_accept_channel but I'm coming 
up short.  Perhaps something with my gmetad.conf?  Does gmetad accept the 
incoming connections or is this handled by another gmond running on the 
reporting host?



udp_send_channel is the proper configuration option to send directly to 
a host. You do need a gmond on the other end with udp_recv_channel AND 
tcp_accept_channel set. Then have the data_source in gmetad.conf connect 
to whatever you set in tcp_accept_channel.
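
Put together, the three pieces look roughly like this. The address 192.0.2.10, port 8650 and the data source name are placeholders chosen for illustration, not values from any real setup:

```
# gmond.conf on each machine that sends (unicast to the collector)
udp_send_channel {
  host = 192.0.2.10
  port = 8650
}

# gmond.conf on the collecting host
udp_recv_channel {
  port = 8650
}
tcp_accept_channel {
  port = 8650
}

# gmetad.conf: poll the collector's tcp_accept_channel
data_source "firewalled" 192.0.2.10:8650
```

The UDP and TCP ports do not have to match. Using the same number for both just keeps the example short.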



I have this set up for a particular group of machines under my care, 
which have to punch through a firewall that the external gmetad cannot 
reach. I have a special gmond running on the gmetad host with its own 
configuration options, and gmetad talks to that.





Thanks for any pointers.

-jeremy







Re: [Ganglia-general] Using cluster name to differentiate clusters?

2007-02-26 Thread Seth Graham

Ben Hartshorne wrote:

On Mon, Feb 26, 2007 at 01:06:23PM -0600, Seth Graham wrote:

Ben Hartshorne wrote:

It seems to me that using the name to determine cluster membership would
simplify things for the people configuring ganglia.
It would, but when you have 3000+ machines all chattering on the same 
port that's a lot of data for a machine to deal with. Not only do the 
aggregating machines have to hold it all in memory, but the gmetad host 
has to dump all that info into the rrds.


Isn't the machine going to have to handle exactly the same amount of
data, regardless of whether it's on one port or two?


Not necessarily, because you can instruct a machine to not poll a port 
at all. This is an easy way to exploit the features of networking to 
limit the traffic a host has to parse.


If you break up clusters by the name, the machine will have to read in 
the data for everything that exists on a subnet and filter based on the 
data it captures.


 I would imagine

that by the time your network got to 3000+ hosts, things would be
segregated in their own right, independent of ganglia.  Such segregation
would make it easy (and more logical) to use head nodes as aggregators
and then pass data up the tree to your main web interface.  Multicast
networks can be broken up by subnet or VLAN, and the unicast nodes can
use ganglia's ability to only pass on summary info, etc.  


This is true. Assuming one keeps them reasonable the volume of data 
would not become a problem. But IP blocks always seem to be getting 
bigger and bigger when new ones are assigned, and people are always 
cramming more and more machines onto them. We haven't crossed the line 
of 1000 machines on a single vlan yet, but the IP space is there and I 
worry what happens then.


The main web interface is my concern. Gmetad sucks up memory like it's 
free, and the disk I/O created when rrds are updated quickly gets out of 
hand. Because of this we had to move the rrds to a ramdisk, which eats 
up even more memory.



Of course, I have not had the privilege of working with a cluster of
that size.  I've only got just over 100 hosts, so please forgive
anything that will become obvious as soon as I actually have to deal
with the problem...  ;)


I think your idea could work, it just seems (to me) to rely on a lot 
more components being configured in an ideal way. In my experience, I 
never get that. ;)





Re: [Ganglia-general] What are the rrdtool creation parameters for Ganglia Databases?

2007-01-26 Thread Seth Graham


The rrd creation values can be found in gmetad/rrd_helpers.c and 
gmetad/conf.c
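
To see what resolution a set of archives gives, multiply the step by the steps-per-row and the row count. The Python sketch below does that arithmetic. The 15-second step and the five AVERAGE archives are what 3.0-era gmetad is assumed to ship with, so check them against conf.c in your own version before relying on them. The function name is invented.

```python
# Assumed defaults; verify against gmetad/conf.c for your version.
STEP = 15
RRAS = [
    "RRA:AVERAGE:0.5:1:244",
    "RRA:AVERAGE:0.5:24:244",
    "RRA:AVERAGE:0.5:168:244",
    "RRA:AVERAGE:0.5:672:244",
    "RRA:AVERAGE:0.5:5760:374",
]


def rra_coverage(rra, step=STEP):
    """Return (seconds per stored point, total seconds covered) for one RRA spec."""
    _, _cf, _xff, steps, rows = rra.split(":")
    resolution = step * int(steps)
    return resolution, resolution * int(rows)


for rra in RRAS:
    resolution, span = rra_coverage(rra)
    print(f"{rra}: one point per {resolution}s, covers {span / 86400:.2f} days")
```

With those assumed values the finest archive keeps one point every 15 seconds for about an hour, and everything older is averaged into coarser rows. If you need 10-30 second resolution over a longer period, the first RRA needs more rows.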




Ian Wootten wrote:

Hi all,

I want to replicate ganglia's storage in Java, using a multicast 
listener, storing and manipulating using rrd4j. Firstly has anyone done 
anything similar? I'm struggling knowing what parameters to set for the 
database and getting an adequate resolution of the metrics captured from 
multicast (10-30s for the application I desire). Does anyone know what 
the datasource and archive creation commands would be/how many there are?


Secondly, and I think this is the main thing, the capture of information 
seems to take ages to be received in this way. I'm aware of the MonaLISA 
project and their java interfaces into ganglia, but a similar 
implementation by myself seems extremely slow. Currently packets seem to 
be retrieved at a rate of 1 a second, with each packet containing a 
single metric value - I'd like to have a complete set after 10 or so 
seconds. Would I be better off sticking to my current method of 
interfacing with ganglia's rrd databases directly and extracting content 
via the fetch command?


Thanks,

Ian







Re: [Ganglia-general] Obtaining Immediate Interval Data From Ganglia

2006-08-10 Thread Seth Graham

Ian Wootten wrote:

Hmm,

Apologies for the empty reply. Thanks for those suggestions...

I'm assuming we're talking kernel modules here,


No, we're not. The term 'module' is probably being misapplied here. The 
stuff being discussed is a module in the sense that it extends basic 
ganglia functionality, but it's not something that is loaded as part 
of ganglia.


Anything you can write that can telnet to a port can fetch the ganglia 
xml data, enabling you to store it however you want.
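
In practice "telnet to a port" means opening a TCP connection and reading until the daemon closes it. Here is a minimal Python sketch, assuming the daemon dumps its whole XML document on connect and then hangs up. The function name and the timeout are invented, and 8649 is simply the port used in examples on this list.

```python
import socket


def fetch_ganglia_xml(host, port=8649, timeout=10.0):
    """Connect, read until the remote side closes the socket, return raw bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # empty read: the daemon has sent everything and closed
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch_ganglia_xml("localhost") against a running gmond should hand back the document as bytes, ready for any XML parser. An empty result usually means the connection was accepted and immediately dropped.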





Re: [Ganglia-general] Obtaining Immediate Interval Data From Ganglia

2006-08-09 Thread Seth Graham

Ben Hartshorne wrote:

On Tue, Aug 08, 2006 at 04:22:41PM +0100, Ian Wootten wrote:
I am facing a problem in that I would like short-segment up to date 
information from ganglia in order to monitor services after invocation.


One method I have heard of that achieves something similar; write a
separate module that interprets the XML feed directly.


This works well, and I've done it in a single page of code so it's 
simple stuff. It becomes more of a chore when you want to actually store 
that data. For short periods it's fine, but if you've got a lot of machines 
and you don't cycle out data the database will get really big really fast.
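
One cheap way to cycle out old data is a fixed-size buffer per metric, so the oldest sample is dropped as each new one arrives. Here is a sketch using Python's collections.deque. The class name and the default limit are invented, and a real database would need a pruning job instead.

```python
from collections import defaultdict, deque


class MetricHistory:
    """Keep only the most recent max_samples readings per (host, metric)."""

    def __init__(self, max_samples=360):
        # deque(maxlen=...) silently discards the oldest item when full.
        self._series = defaultdict(lambda: deque(maxlen=max_samples))

    def add(self, host, metric, timestamp, value):
        self._series[(host, metric)].append((timestamp, value))

    def samples(self, host, metric):
        return list(self._series[(host, metric)])


history = MetricHistory(max_samples=3)
for t in range(5):
    history.add("node01", "load_one", t, t * 0.1)
print(history.samples("node01", "load_one"))  # only the last three remain
```

Memory use is then bounded by hosts times metrics times max_samples, whatever the polling rate.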


The other thing to consider is how often the xml updates. Clients 
running gmond only report at set intervals, so if you're trying to get 
information once a second you'll be unhappy with the results. This, in 
combination with gmetad's write intervals, is why the rrds have the 
'NaN' holes in them. Writing your own module may improve getting real 
actual numbers, but it won't improve the quality of the data.


Newer versions of ganglia allow you to customize update intervals, but 
on a big network setting the intervals too short will generate a lot of 
traffic, and probably slow the gmetad server to a crawl with all the rrd 
updates.





Re: [Ganglia-general] number of source problem in gmetad

2003-09-11 Thread Seth Graham
 [EMAIL PROTECTED] init.d]# telnet strauss01 8649
 Trying 192.168.1.110...
 Connected to strauss01.
 Escape character is '^]'.
 Connection closed by foreign host.
 
 Has anyone got any ideas?


When I started getting this, I had to add the server I was connecting from as 
one of the trusted_hosts on all the machines that were reporting data. I made 
this change in /etc/gmond.conf.

I don't know why this started happening; after a reboot of the cluster that was 
involved, all the machines started refusing connections from the machine that 
was collecting the data.

The above problem looks similar to the one I had.