Re: [Ganglia-developers] Ganglia gmetad thread stuck at TCP SYN SENT
On Fri, Jan 25, 2013 at 12:45:10PM +, Nicholas Satterly wrote:
> Does anyone have any ideas of how the connection could at least be timed out? Keep in mind that the gmetad is multi-threaded so I'm pretty sure that rules out the use of SIGALRM.
> [...]
> How could a 2 second timeout be enforced on this connect()?

Set O_NONBLOCK on the socket before the connect(), then run select() on the socket with a 2 second timeout. From there, if you have a connection (depending on whether select() hit the timeout and what getsockopt() returns for SO_ERROR), set the socket back to blocking.

Did you see any failures when the machine went away after the connect? I can't remember if we time out while we are reading data from the socket.

___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
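The sequence described above (non-blocking connect, select() with a timeout, SO_ERROR check, back to blocking) can be sketched as follows. This is a minimal illustration in Python rather than gmetad's C code; the function name connect_with_timeout and its parameters are made up for the example, and it assumes a POSIX-style platform.

```python
import errno
import os
import select
import socket


def connect_with_timeout(address, port, timeout=2.0):
    """Connect a TCP socket, giving up after `timeout` seconds.

    `address` should be a numeric IP: a hostname lookup would block
    outside the timeout. (Hypothetical helper, not part of gmetad.)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)  # the O_NONBLOCK step, before connect()
        rc = sock.connect_ex((address, port))
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            raise OSError(rc, os.strerror(rc))
        if rc != 0:
            # The socket becomes writable once the connect attempt has
            # finished, whether it succeeded or failed.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("connect() timed out")
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                raise OSError(err, os.strerror(err))
        sock.setblocking(True)  # back to blocking for normal I/O
        return sock
    except Exception:
        sock.close()
        raise
```

This only bounds the connect itself. Reads on the returned socket can still block indefinitely unless they get their own timeout, which is the open question at the end of the message.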
Re: [Ganglia-developers] [Ganglia-general] [SECURITY] [IMPORTANT] Security issue in Ganglia Web
On Thu, Aug 02, 2012 at 09:35:47PM +, Daniel Pocock wrote:
> I remember that logic - but that doesn't really reflect what the distributions do
>
> Just backporting/cherry-picking the most essential security fixes to an old branch shouldn't be a big pain though
>
> I believe Kostas has already pushed out patches for 3.1.7 to Fedora/EPEL so in terms of distributed binary packages I guess we should be fine?
>
> Debian 6 also has 3.1.x - when this was mentioned before, I thought Kostas was updating the 3.1 branch and then the Debian and Fedora packages could all be built from the same tarball
>
> Kostas, could you possibly commit what you did onto the 3.1 branch and then I'll release a tarball?

As you noticed there are no branches (or tags) for old releases :( If we can resurrect them I'll be happy to push the fixes.

Kostas
Re: [Ganglia-developers] Protocol Efficiency Ideas
On Sun, Jan 29, 2012 at 08:10:08AM -0500, Jeff Buchbinder wrote:
> On Sat, Jan 28, 2012 at 11:21 PM, Kostas Georgiou k.georg...@atreides.org.uk wrote:
>> On Fri, Jan 27, 2012 at 06:59:06AM -0800, Im Root wrote:
>>> I believe that adding json would be a mistake. The reason is that when users install the main package there would be now a dependency on having json installed. It just adds to the complexity and helps to perpetuate RPM hell. I've had to deal with installing json in the past and it's been awful. It may be nice for a developer but not so nice for the end users.
>>
>> There is no need for any extra dependencies to output json, my current testing code is basically a copy of the code that outputs xml. Parsing json is a completely different story though.
>
> That can be accomplished by embedding a copy of json-c, which wouldn't require any external dependencies. ( https://github.com/json-c/json-c )

This isn't going to help: the Fedora packaging guidelines forbid this, so I'll have to rip it out and use the system one for the Fedora packages anyway. Of course other distributions might be more forgiving, I have no idea.

Personally I can't say I care much about JSON. What I am aiming at is to make the input/output in gmond more modular, and having a second encoding helps me see more clearly what should be exposed to a plugin.

Kostas

PS I've pushed my WIP to my tests/json branch if anyone wants to have a look.
Re: [Ganglia-developers] Protocol Efficiency Ideas
On Fri, Jan 27, 2012 at 06:59:06AM -0800, Im Root wrote:
> I believe that adding json would be a mistake. The reason is that when users install the main package there would be now a dependency on having json installed. It just adds to the complexity and helps to perpetuate RPM hell. I've had to deal with installing json in the past and it's been awful. It may be nice for a developer but not so nice for the end users.

There is no need for any extra dependencies to output json, my current testing code is basically a copy of the code that outputs xml. Parsing json is a completely different story though.

Kostas
Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote:
> Hi Carlo:
>
> On Wed, Feb 23, 2011 at 9:42 AM, Bernard Li bern...@vanhpc.org wrote:
>> I tested under EL5 and EL6 and it wasn't able to get past the initial buffer size. I believe what I did was:
>
> Correction. It works on EL6, but not on EL5:
>
> [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5]
> read(3, "2.6.18-194.32.1.", 16) = 16
> read(3, "", 16) = 0
>
> [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64]
> read(3, "2.6.32-37.el6.x8", 16) = 16
> read(3, "6_64\n", 16) = 5
>
> The issue may be specific to files in /proc/sys, because I tried reading /proc/stat on CentOS 5.5 and it worked fine.

In any case the slurpfile resizable buffer doesn't really work :( slurpfile only resizes a buffer if it's NULL, but this is the case only at the first call for a metric.

    char *bp = NULL;
    slurpfile(file, bp, 32);
    ...next polling interval...
    /* the buffer isn't resizable any more since it isn't NULL */
    slurpfile(file, bp, 32);

Kostas
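One way to avoid both problems in this thread (a second read() returning 0 on EL5, and a buffer that is only sized on the first call) is to size the buffer on every call and always fetch the file with a single read(). The sketch below shows that idea in Python. It is not the slurpfile() code, the name slurp and its parameters are invented for the example, and it assumes small files where a short read means end of file.

```python
import os


def slurp(path, initial_size=32, max_size=1 << 20):
    """Return the contents of a small file using one read() per attempt.

    If the read fills the buffer completely the data may be truncated,
    so the file is reopened and read again with a buffer twice as large.
    (Hypothetical helper for illustration only.)
    """
    size = initial_size
    while True:
        fd = os.open(path, os.O_RDONLY)
        try:
            data = os.read(fd, size)
        finally:
            os.close(fd)
        # A short read is taken to mean we got everything. This holds for
        # small single-value files but is an assumption, not a guarantee.
        if len(data) < size or size >= max_size:
            return data
        size *= 2
```

Because the size is chosen inside the call, a value that grows between polling intervals is still read in full, at the cost of an occasional extra open().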
Re: [Ganglia-developers] sFlow counters in Ganglia
On Fri, Oct 08, 2010 at 11:52:29AM -0700, Peter Phaal wrote:
> Brad,
>
> Thanks for the information on spoofing in modules.
>
>> The way that I would envision an sflow module working would be similar to the spoofing example module that is currently checked into the Ganglia SVN repository. The spoofing module can be found at http://ganglia.svn.sourceforge.net/viewvc/ganglia/trunk/monitor-core/gmond/python_modules/example/spfexample.py?revision=1895&view=markup . Unfortunately it is a python module example rather than a C module but hopefully you can get the idea of what I am talking about from the code.
>>
>> One application of this kind of spoofing module would be to load it under a gmond instance running on a VM host. It would then query each VM running on the box and register a set of spoofed metrics for each VM. From that point on, the module just reports the metrics for each spoofed VM and returns them as if gmond were running on each of the VMs. I actually have another python module that does exactly that, but I haven't been able to release the source code for it yet. You can also look at the modpython.c module to get an idea of how to do the spoofing in C code. But then you guys have already worked with the spoofing code as part of the patch that you already did so you probably already know how that works.
>>
>> Basically an sflow module would be loaded like any other module and a collection interval would be set in the configuration file. In the sflow module itself, register a spoofed metric for each managed sflow monitored node. How you get the list of nodes to register is up to you. It could be from the gmond.conf file, some other configuration file or by listening to the sflow data packets themselves. The module would then start a thread that would read the XDR packets in exactly the same way that you are doing now in the gmond code. The thread would then store the current metric data temporarily for each monitored node that it knows about until the next collection cycle happens. Then when gmond requests specific metric data for a specific spoofed node, you just return the last metric read from your temporary storage.
>>
>> Hopefully this helps. The whole intent of the modular interface is to move all of the metric gathering out of gmond itself in order to make gmond more flexible and upgradable for the user without having to depend on all of us as Ganglia developers to get around to releasing a new version every time a bug is found or someone wants to collect a metric in a different way. By implementing sflow as a module rather than within gmond itself, it would help to make sure gmond remains flexible and upgradable for the user.
>
> I think that there is still a problem with this approach: resampling the counters from an sFlow module would create ugly artifacts. The sFlow sampling rates are set in the sFlow agents and there is no guarantee that they would mesh well with a polling interval set in the gmond.conf file. There is also no guarantee that the sFlow agents will all be using the same polling intervals. To get accurate results you need to asynchronously post sFlow data into the gmond repository as it arrives.
>
> Another advantage of the current approach is that it involves zero configuration. sFlow agents are automatically discovered and added to the database as soon as they start sending sFlow to gmond. The sFlow gateway function automatically detects and adapts to changes in sFlow agent polling intervals etc.

Why not add an extra plugin system instead of the existing metrics one? This way the sflow integration can be in its own .so lib and on the gmond side you select it with something like:

    tcp|udp_recv_channel {
      proto = sFlow
      port = 6343
    }

Kostas
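The module design quoted above (a background thread keeps the latest value per spoofed node, and the metric callback returns whatever was stored last) can be sketched like this. It is an illustration only: LatestValueStore and start_reader are invented names, and a plain iterable of tuples stands in for decoded sFlow datagrams.

```python
import threading


class LatestValueStore:
    """Holds the most recent value per (node, metric) pair."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def update(self, node, metric, value):
        with self._lock:
            self._values[(node, metric)] = value

    def latest(self, node, metric, default=None):
        # This is what a metric callback would return on each collection cycle.
        with self._lock:
            return self._values.get((node, metric), default)


def start_reader(samples, store):
    """Consume (node, metric, value) tuples on a background thread.

    `samples` is a stand-in for the packet source; any iterable works.
    """
    def run():
        for node, metric, value in samples:
            store.update(node, metric, value)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

Any sample that is overwritten before the next collection cycle is never reported, which is the resampling concern raised in the reply: the poller sees values at its own interval, not at the agents' intervals.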
Re: [Ganglia-developers] ganglia 3.1.2 files
On Wed, Feb 10, 2010 at 03:54:42PM -0700, Eric Shubert wrote:
> That's where they belong then. Thanks Michael.
>
> Michael Perzl wrote:
>> Hi Eric,
>>
>> in my opinion they should go into ganglia-gmond as they are the dynamically loaded modules by gmond and nobody else except gmond uses them.

At Fedora we have the modules in the ganglia package, but thinking about it ganglia-gmond is a better location.

BTW you should be able to use a recent Fedora spec file with RHEL5 without any problems. The only reason that it's not in EPEL5 is that the long term support doesn't allow us to easily drop 3.0. I am looking at the possibility of adding it as ganglia31 but the policy isn't clear yet.

Cheers,
Kostas Georgiou
Re: [Ganglia-developers] libmetrics - second configure script?
On Fri, Jan 15, 2010 at 09:01:26PM +, Daniel Pocock wrote:
> I'm not too sure why we have a second configure script for the libmetrics tree - could anybody give me some background information about this?
>
> Is libmetrics used anywhere else, or is it intended to be?

I think that the idea was to have libmetrics as a standalone library. AFAIK nobody is using it in this way at the moment. Luke Kanies did consider using it for facter (part of puppet) but gave up because of licensing issues[1].

Assuming that we would like other projects to use libmetrics, taking the library out of the ganglia tree and into its own project will be an advantage long term. On the other hand, dropping the configure script will be much easier to do.

Cheers,
Kostas Georgiou

[1] http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg02072.html
Re: [Ganglia-developers] bootstrapping for 3.1.X series and 3.2.X
On Thu, Jan 07, 2010 at 02:36:38PM +, Daniel Pocock wrote:
>> then you are going to need either 2 public resources for all release managers to use consistently or a coordinated release process where the package is generated and then independently binary packages are added to it before the announcement (which also means we have to agree on what is going to be used for building those RPM packages).
>
> Not necessarily - it may be sufficient to provide some scripts that release managers can use, as long as the hostnames can be easily configured somewhere. Then the release manager just needs to have the necessary machines available, but that is not so difficult either thanks to Xen, VMware, etc.

With mock[1] you probably don't need anything beyond one machine. mock basically runs rpmbuild inside a minimal chroot of the Fedora/CentOS release that you want (it might also be usable for SUSE), and it installs the chroot and the rpm dependencies of the package for you. Given that most developers will have a Fedora/CentOS/RHEL/Debian machine, building rpms shouldn't be a problem really.

Cheers,
Kostas

[1] http://fedoraproject.org/wiki/Projects/Mock
Re: [Ganglia-developers] Undefined variable: rrd_options in functions.php
On Thu, Sep 10, 2009 at 04:32:01PM +0100, Daniel Pocock wrote:
> Daniel Pocock wrote:
>> Jesse Becker wrote:
>>> I think that $rrd_options might be defined in the wrong spot? I'm getting lots of "Undefined variable: rrd_options in functions.php" error messages in trunk. It is used in functions.php in two places, as of r2049.
>>>
>>> Digging a bit, I see that $rrd_options is defined at the top of get_context.php (also from r2049). get_context.php includes functions.php, but not vice-versa. It seems that the logical place for this variable is actually in conf.php, and then functions.php could call 'include_once(conf.php)'.
>>>
>>> Does this sound okay, or am I missing something?
>>
>> The reason I didn't put $rrd_options in conf.php is because its value is meant to be derived from other variables, in other words, the administrator should not change $rrd_options itself.
>>
>> However, if get_context.php is not always read, then maybe there is another approach: I can create a `do not touch/advanced users only' section at the bottom of conf.php, and things like $rrd_options can go there.
>
> I haven't seen any other comments on this, so it's now at the bottom of conf.php.in

The only problem that I can think of is that, since conf.php doesn't get updated automatically by packaging tools (you don't want to overwrite the admin's changes), a minor update is now more complicated than it should be. I'd prefer to see variables that the admin isn't meant to change somewhere other than conf.php, if they are required.

Kostas
Re: [Ganglia-developers] Patch for multithread gmond
On Mon, Jul 13, 2009 at 04:49:17PM +0100, Daniel Pocock wrote:
> Brad Nicholes wrote:
>> On 7/13/2009 at 8:17 AM, in message e6ccb7f50907130717i2f3dfd5fi1c69dbd4124a7...@mail.gmail.com, utopia zh utopia...@gmail.com wrote:
>>> Hi,
>>>
>>> While trying to use gmond to monitor our applications, we found some issues:
>>> - Metric collecting may take a long time to finish; examples include collecting master/slave status from LDAP and parsing web pages to get statistics.
>>
>> This is definitely an issue for some metrics, but not all
>
> It would be nice to have threading support within the agent rather than requiring each person who implements a module to re-invent the wheel.
>
> However, it might be best to make it configurable, so that `start a thread' becomes a config option for each module or metric.
>
> There could be some flag in Ganglia_25metric with which a developer can tell gmond that a metric must be run in its own thread, and some configuration file option to allow the sysadmin to override the default setting of the flag.

As a lazy sysadmin I would prefer not to have to deal with individual metric settings, so how about letting gmond decide whether a thread is needed for a module depending on the module's response time, and keeping the config simple with something like max_threads = n, max_collection_delay = 4000ms?

Kostas
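A rough sketch of the "let gmond decide" idea: run a module inline until one collection takes longer than max_collection_delay, then move it to a bounded thread pool and serve its last known value while new collections run in the background. Everything here is hypothetical (gmond has no such AdaptiveCollector, and max_threads and max_collection_delay are only the option names proposed in the message); error handling for failing callbacks is left out.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class AdaptiveCollector:
    """Runs fast modules inline and demotes slow ones to a thread pool."""

    def __init__(self, max_threads=4, max_collection_delay=4.0):
        self.max_collection_delay = max_collection_delay  # seconds
        self.slow = set()      # names of modules that exceeded the delay
        self._pool = ThreadPoolExecutor(max_workers=max_threads)
        self._pending = {}     # name -> Future of the running collection
        self._last = {}        # name -> last collected value

    def collect(self, name, callback):
        if name in self.slow:
            pending = self._pending.get(name)
            if pending is not None and pending.done():
                self._last[name] = pending.result()
                pending = None
            if pending is None:
                # Start the next collection without blocking the caller.
                self._pending[name] = self._pool.submit(callback)
            return self._last.get(name)
        start = time.monotonic()
        value = callback()
        if time.monotonic() - start > self.max_collection_delay:
            self.slow.add(name)
        self._last[name] = value
        return value

    def close(self):
        self._pool.shutdown(wait=True)
```

The trade-off is that a demoted module reports values that are one collection behind, so the sysadmin gets a simple config at the cost of slightly stale data for slow metrics.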
Re: [Ganglia-developers] gmond python module interface
On Wed, Jan 28, 2009 at 06:09:48AM -0800, Gilad Raphaelli wrote:
> I think this is a well thought out email and I'm a little surprised at the lack of response to it. Is it because no one is actually using the gmond python module interface and hasn't had to make these types of decisions? I don't have a single gmetric script/module that only collects one metric and have used both the multi-threaded collector approach and the single-thread hopeful/naive approach (having written the mysql metric module you mention). I agree that the multi-threaded design seems less prone to problems but in practice I haven't had any problems either way. That being said, I will be trying some of your suggestions or moving to threads for that mysql module when time permits.

You could also modify gmond to run each plugin in a new thread. I'm not sure if it is easy/possible though, since I haven't looked at that part of the code yet.

> My frustration with the gmond python module interface is that it's not actually a complete replacement for gmetric scripts as I use them. Needing to know all of the metrics that a module will report before runtime makes for a lot of upfront work creating .pyconf files and doesn't allow for adding new metrics without restarting gmond. Being able to deploy one gmetric script that conditionally reports gmetrics based on what's running/hardware installed/etc on a box is a big advantage for a gmetric script over conditionally generating pyconf files and then still having to conditionally collect metrics in the actual gmond module. I expect most users just stick with the gmetric script in this case and handle scheduling themselves?

I agree here. My suggestion in the Wildcard Configuration thread was to add a metric_autoconf function in the plugin and get gmond to use that to get the collection_group from there. In any case, for the modules that I develop I call this function from main so I can do python module.py -t > module.pyconf with a similar effect.

This doesn't solve all problems though. I would like to be able to just start mysql on a host or add a new disk, for example, and have metrics for them without touching the configuration at all. Maybe extending the configuration so that in a .pyconf you can write something like collection_group { autoconf = 300 } to have gmond interrogate the plugin every 5 mins to get a new collection group will be enough. I need to think about it a bit more...

Kostas
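The autoconf = 300 idea amounts to asking the plugin for its metric list at a fixed interval and acting on the differences. A minimal sketch, assuming a hypothetical discover() callable supplied by the plugin (no such hook exists in gmond; the class name and parameters are invented for the example):

```python
import time


class AutoconfPoller:
    """Re-runs a plugin's discovery function at most once per interval."""

    def __init__(self, discover, interval=300, clock=time.monotonic):
        self._discover = discover    # returns an iterable of metric names
        self._interval = interval    # seconds, like autoconf = 300
        self._clock = clock
        self._next_due = clock()     # first poll is due immediately
        self.metrics = set()

    def poll(self):
        """Return (added, removed) metric names; both empty if not due yet."""
        now = self._clock()
        if now < self._next_due:
            return set(), set()
        self._next_due = now + self._interval
        current = set(self._discover())
        added = current - self.metrics
        removed = self.metrics - current
        self.metrics = current
        return added, removed
```

With something like this, starting mysql or adding a disk would show up as an "added" set at the next interval, and the caller could register the new metrics without a restart.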
Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port
On Thu, Jan 15, 2009 at 04:54:23PM -0700, Brad Nicholes wrote:
> Since it is limited by REQUESTLEN, I am OK with it. So it sounds like we just need a fix for the off-by-one and the check for NULL on the malloc.

What about this patch then?

Kostas

diff --git a/monitor-core/gmetad/server.c b/monitor-core/gmetad/server.c
index eb21449..52b22b3 100644
--- a/monitor-core/gmetad/server.c
+++ b/monitor-core/gmetad/server.c
@@ -371,8 +371,7 @@ tree_report(datum_t *key, datum_t *val, void *arg)
 /* sacerdoti: This function does a tree walk while respecting the filter path.
  * Will return valid XML even if we have chosen a subtree. Since tree depth is
- * bounded, this function guarantees O(1) search time. The recursive structure
- * does not require any memory allocations.
+ * bounded, this function guarantees O(1) search time.
  */
 static int
 process_path (client_t *client, char *path, datum_t *myroot, datum_t *key)
@@ -421,7 +420,9 @@ process_path (client_t *client, char *path, datum_t *myroot, datum_t *key)
       len = q-p;
       /* +1 not needed as q-p is already accounting for that */
-      element = malloc(len);
+      element = malloc(len + 1);
+      if ( element == NULL )
+         return 1;
       strncpy(element, p, len);
       element[len] = '\0';
Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port
On Thu, Jan 15, 2009 at 01:41:53PM -0700, Brad Nicholes wrote:
> On 1/15/2009 at 8:56 AM, in message 496efa2a02ac0003a...@lucius.provo.novell.com, Brad Nicholes bnicho...@novell.com wrote:
>
> After taking a little closer look at the patch, I think we are OK as far as the recursive call to process_path() is concerned since this case is an error condition and should stop processing rather than continuing in the recursive loop. The other two concerns are still there however. I still think that we are off-by-one in the malloc call. It should be len+1 and I still think that we should limit the malloc to 256 rather than allowing it to be unlimited.

I agree about the off-by-one, but I am not too worried about a malloc limit: from what I can tell it can only get as high as REQUESTLEN.

The malloc call needs to be checked for NULL, and the comment that "The recursive structure does not require any memory allocations" is false now if malloc replaces the stack allocation.

Kostas
Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port
On Wed, Jan 14, 2009 at 07:00:29PM -0500, Jesse Becker wrote:
> On Wed, Jan 14, 2009 at 18:45, Kostas Georgiou k.georg...@imperial.ac.uk wrote:
>> On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote:
>>> I suggest that once this is accepted, we release 3.1.2 ASAP.
>>
>> Any possibility for 3.0.8 as well?
>
> I don't see why not. Is that an offer to help test the patch? ;-)

I've built rpms for Fedora 9 and I'll be pushing them shortly after I've done some local testing. If anyone wants to have a look they are available in koji [1].

Kostas

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=78558
Re: [Ganglia-developers] Wildcard Configuration
On Tue, Aug 26, 2008 at 09:01:43AM -0600, Brad Nicholes wrote:
> Thanks Rich. This got me thinking a little about how the same thing might be done only in a more generic way rather than just sequentially numbered metrics. I am wondering if, rather than looping through numbers, we actually tried to do some pattern matching over the known metrics. For every known metric that matches a metric pattern found in a collection_group, the metric name is constructed and the metric is added using the add_metric_series() function. Of course it would also have to do some kind of substitution for the metric title as well. This could simplify some of the metric configurations.

+1

Another option would be to have each module able to print some usable configuration. I am attaching a patch from a quick hack that I wrote for the multidisk plugin to output a usable .conf file. Maybe extending the module API with a metric_autoconf function so you can do a gmond -t modulename > modulename.conf (or something similar) is a good idea. It can get you going for simple setups where you just need to collect all available metrics.

Kostas

--- multidisk.py-orig 2008-07-14 23:28:12.0 +0100
+++ multidisk.py 2008-08-28 17:20:00.0 +0100
@@ -32,6 +32,7 @@

 import statvfs
 import os
+import sys

 descriptors = list()

@@ -123,9 +124,46 @@
     '''Clean up the metric module.'''
     pass

+def metric_autoconf():
+    '''Prints example config.'''
+    print """
+modules {
+  module {
+    name = "multidisk"
+    language = "python"
+  }
+}"""
+    print """
+collection_group {
+  collect_every = 10
+  time_threshold = 50
+"""
+    for d in descriptors:
+        if d['name'].endswith('-disk_used'):
+            print '  metric {'
+            print '    name = "%s"' % (d['name'])
+            print "    value_threshold = 1.0\n  }"
+    print "}"
+
+    print """
+collection_group {
+  collect_once = yes
+  time_threshold = 20
+"""
+    for d in descriptors:
+        if d['name'].endswith('-disk_total'):
+            print '  metric {'
+            print '    name = "%s"' % (d['name'])
+            print "    value_threshold = \"1.0\"\n  }"
+    print "}"
+
 #This code is for debugging and unit testing
 if __name__ == '__main__':
     metric_init(None)
+    for o in sys.argv[1:]:
+        if o == "-t":
+            metric_autoconf()
+            sys.exit(0)
     for d in descriptors:
         v = d['call_back'](d['name'])
         print 'value for %s is %f' % (d['name'], v)
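The same autoconf idea can be written so that the configuration is returned as a string instead of printed, which makes it easier to test or to hand to gmond later. This is a sketch in current Python, not the attached patch: autoconf_text and metric_block are made-up names, and the quoting of values in the generated config is an assumption.

```python
def metric_block(name, value_threshold):
    """Return the lines of one metric { } section."""
    return [
        "  metric {",
        '    name = "%s"' % name,
        "    value_threshold = %s" % value_threshold,
        "  }",
    ]


def autoconf_text(descriptors, module_name="multidisk"):
    """Build an example .pyconf from a module's metric descriptors."""
    lines = [
        "modules {",
        "  module {",
        '    name = "%s"' % module_name,
        '    language = "python"',
        "  }",
        "}",
        "",
        "collection_group {",
        "  collect_every = 10",
        "  time_threshold = 50",
    ]
    # Usage changes over time, so it goes in the periodic group.
    for d in descriptors:
        if d["name"].endswith("-disk_used"):
            lines += metric_block(d["name"], "1.0")
    lines += ["}", "", "collection_group {", "  collect_once = yes", "  time_threshold = 20"]
    # Total size is static, so it is collected once.
    for d in descriptors:
        if d["name"].endswith("-disk_total"):
            lines += metric_block(d["name"], "1.0")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

A module's main block could then write the result to stdout when given -t, matching the workflow described in the message.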
[Ganglia-developers] Any reason why the init script starts gmetad as root?
Hi,

I just noticed that the default init scripts start gmetad as root and it then does a setuid to nobody. Is there a reason why gmetad needs extra privileges? It doesn't need to bind privileged ports or do anything else that requires root privileges as far as I can tell, so a daemon --user nobody $GMETAD should be enough.

Kostas
Re: [Ganglia-developers] [PATCH] remove version from libganglia package name
On Tue, Aug 12, 2008 at 07:38:15AM -0500, Martin Hicks wrote:
> On Mon, Aug 11, 2008 at 11:30:24PM +0200, Marcus Rueckert wrote:
>> this is not the package version. it is the soname mangled a bit. the base idea behind it is, that you can install multiple versions of the same library in parallel.
>
> Okay. I guess I just don't see this very often. Are we expecting to break library compatibility often?

Even if you break library compatibility there is no need for the soname to be encoded in the rpm name for the most current version. As it is now, during an upgrade libganglia-$soname will stay installed even if nothing requires it anymore.

The common practice in the rpm world is not to use the soname for the latest version and to have something like compat-ganglia-30 or libganglia30, for example, for the older versions (no need to encode the minor version since changes there don't break compatibility).

Kostas
Re: [Ganglia-developers] disabling python plugins
On Mon, Aug 11, 2008 at 12:08:29PM -0600, Brad Nicholes wrote:
> Actually this issue really isn't that hard to solve. The python capability is all implemented in the C module mod_python. When mod_python's init() function is called, it simply reads the python module directory from its configuration and then starts a readdir() loop on that location. Anything that it finds in the python module directory that has an extension of .py, mod_python loads and calls the module's metric_init() function.
>
> This behavior could be changed by simply adding a Disabled=yes or Status=disabled type of directive to the module{} section for the python module in its configuration. Since gmond has already read and parsed the entire gmond.conf including all 'included' .conf's, whether a python module is loaded or not should be a simple matter of looking for the module{} section for a specific module and checking to see if it is enabled or disabled.
>
> If mod_python finds a .py file for a module but cannot find a corresponding module{} section in the configuration, the enabled status is assumed to be 'disabled' automatically. Otherwise the enabled state is assumed to be 'enabled' unless otherwise indicated. If a python module is determined to be disabled, mod_python would not load it and obviously the python module's metric_init() would never be called.
>
> Does that work?

This will be perfect actually :)
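The load decision proposed above is small enough to spell out. The sketch below is illustrative Python, not mod_python's C code: the function name, the dictionary layout for the parsed module{} sections, and the "disabled" key are assumptions standing in for whatever directive would actually be chosen.

```python
import os


def modules_to_load(py_files, module_sections):
    """Pick which python metric modules should be loaded.

    py_files: file names found in the python module directory.
    module_sections: dict mapping module name -> dict of directives
        taken from the parsed module{} sections (hypothetical layout).
    """
    loadable = []
    for filename in sorted(py_files):
        name, ext = os.path.splitext(filename)
        if ext != ".py":
            continue
        section = module_sections.get(name)
        if section is None:
            # A .py file with no module{} section counts as disabled.
            continue
        if str(section.get("disabled", "no")).lower() in ("yes", "true", "1"):
            continue
        # A section with no directive means enabled by default.
        loadable.append(name)
    return loadable
```

For example, with tcpconn.py and multidisk.py on disk, a module{} section for multidisk, and a tcpconn section marked disabled, only multidisk would be loaded and tcpconn's metric_init() would never start its netstat thread.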
[Ganglia-developers] disabling python plugins
Hi, I've noticed that the python scripts are always initialized even if none of their metrics are collected. In most cases this isn't too bad (assuming metric_init doesn't do anything too invasive) but, for example, tcpconn.py starts a thread running netstat that will keep running even if nobody cares about the data. One solution is to rename the .py files, but this will not play well for people using packages since the files will be re-added in the next package update. Having a different package per plugin so they can be removed/added individually is also an option if no other way to disable a plugin is available. Kostas
Re: [Ganglia-developers] Auto configuration ideas
On Tue, Jul 29, 2008 at 09:51:58AM +0100, [EMAIL PROTECTED] wrote: I've been looking at how we currently deploy Ganglia configuration files in our organisation, and whether the process can be improved. Is anyone already working on any aspect of this issue:

You should look at one of the existing configuration management tools for this. Have a look at puppet/cfengine/bcfg2 and choose one that you like. They will take care of pushing the configuration, restarting the server and all other issues for you. For my systems I am using puppet with something like:

    $ cat ganglia.pp
    define gmondconfig( $port, $cluster ) {
        file { "/etc/gmond.conf":
            path    => "/etc/gmond.conf",
            mode    => 644,
            owner   => root,
            group   => root,
            ensure  => file,
            notify  => Service[gmond],
            content => template("apps/ganglia/gmondconfig.erb")
        }
        service { gmond:
            ensure    => running,
            enable    => true,
            hasstatus => true,
            require   => Package[ganglia-gmond]
        }
        package { ganglia-gmond:
            ensure => present
        }
    }

The template is a standard config file with the cluster name and send/receive ports as variables.

    ...
    udp_recv_channel {
      port = <%= port %>
    ...

Then for the different nodes I use definitions like the following to add the machine to the right cluster...

    gmondconfig { gmondconfig: port => 8690, cluster => "foo.webservers" }
    gmondconfig { gmondconfig: port => 8691, cluster => "foo.tests" }

Cheers, Kostas
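For readers who are not using puppet, the same idea (one standard config file with the port and cluster name filled in per node) can be sketched in Python. This is a rough illustration only: the trimmed template text, the GMOND_TEMPLATE name and the render_gmond_conf helper are assumptions for the example, and a real gmond.conf contains more sections and settings than are shown.

```python
from string import Template

# Hypothetical, heavily trimmed gmond.conf template. A real file has more
# sections, and a send channel also needs a destination (omitted here).
GMOND_TEMPLATE = Template("""\
cluster {
  name = "$cluster"
}
udp_send_channel {
  port = $port
}
udp_recv_channel {
  port = $port
}
""")


def render_gmond_conf(port, cluster):
    """Fill in the per-node values, as the ERB template does for puppet."""
    return GMOND_TEMPLATE.substitute(port=port, cluster=cluster)


if __name__ == "__main__":
    # Same values as the two node definitions in the message above.
    print(render_gmond_conf(8690, "foo.webservers"))
    print(render_gmond_conf(8691, "foo.tests"))
```

This covers only the rendering step. Pushing the file out and restarting gmond when it changes is what the configuration management tool adds on top.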
[Ganglia-developers] Nagios plugin
Hi, Recently I started using a small script that lets Nagios use the information Ganglia collects. I suspect other people have similar Nagios plugins, so adding one of them to contrib is probably a good idea. Any thoughts? I am attaching the plugin that I use; it's nothing special, so if someone has a more polished version I would love to know. Cheers, Kostas

    #!/usr/bin/python
    import sys
    import getopt
    import socket
    import xml.parsers.expat


    class GParser:
        def __init__(self, host, metric):
            self.inhost = 0
            self.inmetric = 0
            self.value = None
            self.host = host
            self.metric = metric

        def parse(self, file):
            p = xml.parsers.expat.ParserCreate()
            # Use this instance's handlers rather than a global parser object.
            p.StartElementHandler = self.start_element
            p.EndElementHandler = self.end_element
            p.ParseFile(file)
            if self.value is None:
                raise Exception('Host/value not found')
            return float(self.value)

        def start_element(self, name, attrs):
            if name == "HOST":
                if attrs["NAME"] == self.host:
                    self.inhost = 1
            elif self.inhost == 1 and name == "METRIC" and attrs["NAME"] == self.metric:
                self.value = attrs["VAL"]

        def end_element(self, name):
            if name == "HOST" and self.inhost == 1:
                self.inhost = 0


    def usage():
        print("Usage: check_ganglia "
              "-h|--host= -m|--metric= -w|--warning= "
              "-c|--critical= [-s|--server=] [-p|--port=]")
        sys.exit(3)


    if __name__ == "__main__":
        # Defaults
        ganglia_host = '127.0.0.1'
        ganglia_port = 8649
        host = None
        metric = None
        warning = None
        critical = None

        try:
            options, args = getopt.getopt(
                sys.argv[1:], "h:m:w:c:s:p:",
                ["host=", "metric=", "warning=", "critical=", "server=", "port="])
        except getopt.GetoptError as err:
            print("check_gmond: %s" % err)
            usage()
            sys.exit(3)

        for o, a in options:
            if o in ("-h", "--host"):
                host = a
            elif o in ("-m", "--metric"):
                metric = a
            elif o in ("-w", "--warning"):
                warning = float(a)
            elif o in ("-c", "--critical"):
                critical = float(a)
            elif o in ("-p", "--port"):
                ganglia_port = int(a)
            elif o in ("-s", "--server"):
                ganglia_host = a

        if critical is None or warning is None or metric is None or host is None:
            usage()
            sys.exit(3)

        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect((ganglia_host, ganglia_port))
            parser = GParser(host, metric)
            # Binary mode: expat's ParseFile needs read() to return bytes.
            value = parser.parse(s.makefile("rb"))
            s.close()
        except Exception as err:
            print("CHECKGANGLIA UNKNOWN: Error while getting value \"%s\"" % (err))
            sys.exit(3)

        # Alert when the value reaches or exceeds a threshold.
        if value >= critical:
            print("CHECKGANGLIA CRITICAL: %s is %.2f" % (metric, value))
            sys.exit(2)
        elif value >= warning:
            print("CHECKGANGLIA WARNING: %s is %.2f" % (metric, value))
            sys.exit(1)
        else:
            print("CHECKGANGLIA OK: %s is %.2f" % (metric, value))
            sys.exit(0)
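The host/metric lookup and the threshold check in the plugin can be tried without a running gmond. The sketch below is only an illustration under stated assumptions: the HOST and METRIC element names and the NAME and VAL attributes come from the plugin, but the sample document, the host names and the values are made up, xml.etree.ElementTree is swapped in for expat to keep it short, and thresholds are assumed to trigger when the value is greater than or equal to them.

```python
import xml.etree.ElementTree as ET

# Made-up sample: only HOST/METRIC and NAME/VAL are taken from the plugin.
SAMPLE_XML = """\
<GANGLIA_XML>
  <HOST NAME="web1">
    <METRIC NAME="load_one" VAL="0.42"/>
  </HOST>
  <HOST NAME="web2">
    <METRIC NAME="load_one" VAL="3.10"/>
  </HOST>
</GANGLIA_XML>
"""


def metric_value(xml_text, host, metric):
    """Return the metric's value for the given host as a float."""
    root = ET.fromstring(xml_text)
    for host_el in root.iter("HOST"):
        if host_el.get("NAME") != host:
            continue
        for metric_el in host_el.iter("METRIC"):
            if metric_el.get("NAME") == metric:
                return float(metric_el.get("VAL"))
    raise LookupError("Host/value not found")


def nagios_status(value, warning, critical):
    """Map a value to the exit code and label the plugin reports."""
    if value >= critical:
        return 2, "CRITICAL"
    if value >= warning:
        return 1, "WARNING"
    return 0, "OK"


if __name__ == "__main__":
    for name in ("web1", "web2"):
        value = metric_value(SAMPLE_XML, name, "load_one")
        code, label = nagios_status(value, warning=2.0, critical=3.0)
        print("CHECKGANGLIA %s: load_one is %.2f (exit %d)" % (label, value, code))
```

Keeping the lookup and the status mapping in separate functions makes each piece easy to check on its own before wiring it to a socket.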