Re: [Ganglia-developers] Ganglia gmetad thread stuck at TCP SYN SENT

2013-02-05 Thread Kostas Georgiou
On Fri, Jan 25, 2013 at 12:45:10PM +, Nicholas Satterly wrote:

 Does anyone have any ideas of how the connection could at least be
 timed out? Keep in mind that the gmetad is multi-threaded so I'm
 pretty sure that rules out the use of SIGALRM.
..,
 How could a 2 second timeout be enforced on this connect()?

Set O_NONBLOCK on the socket before calling connect(), then run select()
with a 2 second timeout on the socket. Whether you have a connection
depends on whether select() hit the timeout and on what getsockopt()
returns for SO_ERROR. If the connect succeeded, set the socket back to
blocking.
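
A minimal sketch of those steps, written in Python rather than gmetad's C so it can be tried standalone. The helper name connect_with_timeout is hypothetical and not part of Ganglia, and the sketch assumes POSIX select() semantics (the socket becomes writable when the connect attempt finishes, whether it succeeded or failed):

```python
import errno
import os
import select
import socket


def connect_with_timeout(host, port, timeout=2.0):
    """Return a connected, blocking TCP socket or raise OSError.

    Hypothetical helper for illustration; not gmetad code.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)             # same effect as setting O_NONBLOCK
        rc = sock.connect_ex((host, port))  # returns an errno instead of raising
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            raise OSError(rc, os.strerror(rc))
        if rc != 0:
            # Wait until the connect attempt has finished or the timeout hits.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise socket.timeout("connect timed out after %.1fs" % timeout)
            # SO_ERROR tells us whether the finished attempt succeeded.
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                raise OSError(err, os.strerror(err))
        sock.setblocking(True)              # back to blocking for normal reads
        return sock
    except Exception:
        sock.close()
        raise
```

Because the wait happens in select() on the calling thread's own socket, no signal such as SIGALRM is involved, which is why the approach fits a multi-threaded daemon.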

Did you see any failures when the machine went away after the connect?
I can't remember if we time out while we are reading data from the
socket.

___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] [Ganglia-general] [SECURITY] [IMPORTANT] Security issue in Ganglia Web

2012-08-03 Thread Kostas Georgiou
On Thu, Aug 02, 2012 at 09:35:47PM +, Daniel Pocock wrote:

 I remember that logic - but that doesn't really reflect what the
 distributions do
 
 Just backporting/cherry-picking the most essential security fixes to an
 old branch shouldn't be a big pain though
 
  I believe Kostas has already pushed out patches for 3.1.7 to
  Fedora/EPEL so in terms of distributed binary packages I guess we
  should be fine?
 
 Debian 6 also has 3.1.x - when this was mentioned before, I thought
 Kostas was updating the 3.1 branch and then the Debian and Fedora
 packages could all be built from the same tarball
 
 Kostas, could you possibly commit what you did onto the 3.1 branch and
 then I'll release a tarball?

As you noticed there are no branches (or tags) for old releases :(
If we can resurrect them I'll be happy to push the fixes.

Kostas



Re: [Ganglia-developers] Protocol Efficiency Ideas

2012-01-29 Thread Kostas Georgiou
On Sun, Jan 29, 2012 at 08:10:08AM -0500, Jeff Buchbinder wrote:

 On Sat, Jan 28, 2012 at 11:21 PM, Kostas Georgiou
 k.georg...@atreides.org.uk wrote:
  On Fri, Jan 27, 2012 at 06:59:06AM -0800, Im Root wrote:
 
  I believe that adding json would be a mistake. The reason is that when
  users install the main package there would be now a dependency on
  having json installed. It just adds to the complexity and helps to
  perpetuate RPM hell. I've had to deal with installing json in the past
  and it's been awful. It may be nice for a developer but not so nice
  for the end users.
 
  There is no need for any extra dependencies to output json, my current
  testing code is basically a copy of the code that outputs xml. Parsing
  json is a completely different story though.
 
 That can be accomplished by embedding a copy of json-c, which wouldn't
 require any external dependencies. ( https://github.com/json-c/json-c )

This isn't going to help: the Fedora packaging guidelines forbid bundling
a copy, so I'll have to rip it out and use the system one for the Fedora
packages anyway. Of course other distributions might be more forgiving, I
have no idea.

Personally I can't say I care much about JSON. What I am aiming at is
to make the input/output in gmond more modular, and having a second
encoding helps me see more clearly what should be exposed to a plugin.

Kostas

PS I've pushed my WIP to my tests/json branch if anyone wants to
have a look.





Re: [Ganglia-developers] Protocol Efficiency Ideas

2012-01-28 Thread Kostas Georgiou
On Fri, Jan 27, 2012 at 06:59:06AM -0800, Im Root wrote:

 I believe that adding json would be a mistake. The reason is that when
 users install the main package there would be now a dependency on
 having json installed. It just adds to the complexity and helps to
 perpetuate RPM hell. I've had to deal with installing json in the past
 and it's been awful. It may be nice for a developer but not so nice
 for the end users.

There is no need for any extra dependencies to output json, my current
testing code is basically a copy of the code that outputs xml. Parsing
json is a completely different story though.
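
To illustrate why emitting JSON needs no extra library, here is a small Python sketch (not the code in the tests/json branch, which is C). The function names and the JSON field names are hypothetical. Only the METRIC element with NAME, VAL and UNITS attributes follows Ganglia's XML output. The JSON writer has the same shape as the XML writer: format a template and escape the values.

```python
from xml.sax.saxutils import quoteattr


def json_string(text):
    """Return text as a JSON string literal, surrounding quotes included."""
    out = ['"']
    for ch in text:
        if ch == '"':
            out.append('\\"')
        elif ch == "\\":
            out.append("\\\\")
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif ord(ch) < 0x20:
            out.append("\\u%04x" % ord(ch))  # remaining control characters
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)


def metric_to_xml(name, val, units):
    """Emit one metric as an XML element (attribute values escaped)."""
    return "<METRIC NAME=%s VAL=%s UNITS=%s/>" % (
        quoteattr(name), quoteattr(val), quoteattr(units))


def metric_to_json(name, val, units):
    """Emit the same metric as a JSON object (field names are assumptions)."""
    return '{"name": %s, "val": %s, "units": %s}' % (
        json_string(name), json_string(val), json_string(units))
```

Parsing is the harder direction because it needs a tokenizer and error handling, which is where a library such as json-c would come in.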

Kostas



Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()

2011-02-23 Thread Kostas Georgiou
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote:

 Hi Carlo:
 
 On Wed, Feb 23, 2011 at 9:42 AM, Bernard Li bern...@vanhpc.org wrote:
 
  I tested under EL5 and EL6 and it wasn't able to get past the initial
  buffer size.  I believe what I did was:
 
 Correction.  It works on EL6, but not on EL5:
 
 [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5]
 
 read(3, "2.6.18-194.32.1.", 16) = 16
 read(3, "", 16) = 0
 
 [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64]
 
 read(3, "2.6.32-37.el6.x8", 16) = 16
 read(3, "6_64\n", 16)   = 5
 
 The issue may be specific to files in /proc/sys, because I tried
 reading /proc/stat on CentOS 5.5 and it worked fine.

In any case the slurpfile resizable buffer doesn't really work :(
slurpfile only resizes a buffer if it's NULL, but that is the case
only on the first call for a metric.

char *bp=NULL;
slurpfile(file, bp, 32);
...next polling interval...
/* the buffer isn't resizable any more since it isn't NULL */
slurpfile(file, bp, 32);
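
One way to avoid that is to make the grow decision on every call, based on whether the buffer filled up, rather than on whether it was ever allocated. A rough Python sketch of the idea follows. The Slurper class is a hypothetical stand-in for the C slurpfile() and is not Ganglia code:

```python
class Slurper:
    """Keeps one buffer for a metric file across polling intervals.

    Hypothetical Python stand-in for the C slurpfile(); not Ganglia code.
    """

    def __init__(self, initial_size=32):
        self.buf = bytearray(initial_size)

    def slurp(self, path):
        used = 0
        with open(path, "rb", buffering=0) as f:
            while True:
                if used == len(self.buf):
                    # Buffer is full: double it on ANY call, not only the first.
                    self.buf.extend(bytes(len(self.buf)))
                chunk = f.read(len(self.buf) - used)
                if not chunk:  # read() returned 0 bytes: end of file
                    break
                self.buf[used:used + len(chunk)] = chunk
                used += len(chunk)
        return bytes(self.buf[:used])
```

The buffer stays allocated between polling intervals, so the common case still costs no allocation. It only grows when a later read needs more room.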

Kostas



Re: [Ganglia-developers] sFlow counters in Ganglia

2010-10-08 Thread Kostas Georgiou
On Fri, Oct 08, 2010 at 11:52:29AM -0700, Peter Phaal wrote:

 Brad,
 
 Thanks for the information on spoofing in modules.
 
  The way that I would envision an sflow module working would be
  similar to the spoofing example module that is currently checked
  into the Ganglia SVN repository.  The spoofing module can be found
  at
  http://ganglia.svn.sourceforge.net/viewvc/ganglia/trunk/monitor-core/gmond/python_modules/example/spfexample.py?revision=1895&view=markup
  .  Unfortunately it is a python module example rather than a C
  module but hopefully you can get the idea of what I am talking about
  from the code.  One application of this kind of spoofing module
  would be to load it under a gmond instance running on a VM host.  It
  would then query each VM running on the box and register a set of
  spoofed metrics for each VM.  From that point on, the module just
  reports the metrics for each spoofed VM and returns them as if gmond
  were running on each of the VMs.  I actually have another python
  module that does exactly that, but I haven't been able to release
  the source code for it yet.  You can also look at the modpython.c
  module to get an idea of how to do the spoofing in C code.  But then
  you guys have already worked with the spoofing code as part of the
  patch that you already did so you probably already know how that
  works.
  
  Basically an sflow module would be loaded like any other module and
  a collection interval would be set in the configuration file.  In
  the sflow module itself, register a spoofed metric for each managed
  sflow monitored node.  How you get the list of nodes to register is
  up to you.  It could be from the gmond.conf file, some other
  configuration file or by listening to the sflow data packets
  themselves.  
  
  The module would then start a thread that would read the XDR packets
  in exactly the same way that you are doing now in the gmond code.
  The thread would then store the current metric data temporarily for
  each monitored node that it knows about until the next collection
  cycle happens.  Then when gmond requests specific metric data for a
  specific spoofed node, you just return the last metric read from
  your temporary storage.
  
  Hopefully this helps.  The whole intent of the modular interface is
  to move all of the metric gathering out of gmond itself in order to
  make gmond more flexible and upgradable for the user without having
  to depend on all of us as Ganglia developers to get around to
  releasing a new version every time a bug is found or someone wants
  to collect a metric in a different way.  By implementing sflow as a
  module rather than within gmond itself, it would help to make sure
  gmond remains flexible and upgradable for the user.
  
 
 I think that there is still a problem with this approach, resampling
 the counters from an sFlow module would create ugly artifacts. The
 sFlow sampling rates are set in the sFlow agents and there is no
 guarantee that they would mesh well with a polling interval set in the
 gmond.conf file. There is also no guarantee that the sFlow agents will
 all be using the same polling intervals. To get accurate results you
 need to asynchronously post sFlow data into the gmond repository as it
 arrives.
 
 Another advantage of the current approach is that it involves zero
 configuration. sFlow agents are automatically discovered and added to
 the database as soon as they start sending sFlow to gmond. The sFlow
 gateway function automatically detects and adapts to changes in sFlow
 agent polling intervals etc.

Why not add an extra plugin system instead of using the existing metrics
one? This way the sFlow integration can live in its own .so lib, and on
the gmond side you select it with something like:
  
tcp|udp_recv_channel {
  proto = sFlow
  port = 6343
}

Kostas 



Re: [Ganglia-developers] ganglia 3.1.2 files

2010-02-11 Thread Kostas Georgiou
On Wed, Feb 10, 2010 at 03:54:42PM -0700, Eric Shubert wrote:

 That's where they belong then. Thanks Michael.
 
 Michael Perzl wrote:
  Hi Eric,
  
  in my opinion they should go into ganglia-gmond as they are the 
  dynamically loaded modules by gmond and nobody else except gmond uses them.

In Fedora we have the modules in the ganglia package, but thinking about
it, ganglia-gmond is a better location.

BTW you should be able to use a recent Fedora spec file with RHEL5
without any problems. The only reason that it's not in EPEL5 is that the
long term support policy doesn't allow us to easily drop 3.0. I am looking
at the possibility of adding it as ganglia31 but the policy isn't clear yet.

Cheers,
Kostas Georgiou



Re: [Ganglia-developers] libmetrics - second configure script?

2010-02-08 Thread Kostas Georgiou
On Fri, Jan 15, 2010 at 09:01:26PM +, Daniel Pocock wrote:

 I'm not too sure why we have a second configure script for the 
 libmetrics tree - could anybody give me some background information 
 about this?
 
 Is libmetrics used anywhere else, or is it intended to be?

I think that the idea was to have libmetrics as a standalone library.
AFAIK nobody is using it in this way at the moment. Luke Kanies did
consider using it for facter (part of puppet) but gave up because of
licensing issues[1]. Assuming that we would like other projects to use
libmetrics, taking the library out of the ganglia tree and into its own
project will be an advantage long term. On the other hand, dropping the
configure script will be much easier to do.

Cheers,
Kostas Georgiou

[1] 
http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg02072.html



Re: [Ganglia-developers] bootstrapping for 3.1.X series and 3.2.X

2010-01-08 Thread Kostas Georgiou
On Thu, Jan 07, 2010 at 02:36:38PM +, Daniel Pocock wrote:

  then you are going to need either 2 public resources for all release 
  managers
  to use consistently or a coordinated release process where the package is
  generated and then independently binary packages are added to it before
  the announcement (which also means we have to agree on what is going to be
  used for building those RPM packages).
 

 Not necessarily - it may be sufficient to provide some scripts that 
 release managers can use, as long as the hostnames can be easily 
 configured somewhere.  Then the release manager just needs to have the 
 necessary machines available, but that is not so difficult either thanks 
 to Xen, VMware, etc.

With mock[1] you probably don't need anything beyond one machine. mock
basically runs rpmbuild inside a minimal chroot of the Fedora/CentOS
release that you want (it might also be usable for SUSE), and it installs
the chroot and the rpm dependencies of the package for you. Given that
most developers will have a Fedora/CentOS/RHEL/Debian machine, building
rpms shouldn't be a problem really.

Cheers,
Kostas

[1] http://fedoraproject.org/wiki/Projects/Mock



Re: [Ganglia-developers] Undefined variable: rrd_options in functions.php

2009-09-10 Thread Kostas Georgiou
On Thu, Sep 10, 2009 at 04:32:01PM +0100, Daniel Pocock wrote:

 Daniel Pocock wrote:
  Jesse Becker wrote:

  I think that $rrd_options might be defined in the wrong spot?
 
  I'm getting lots of Undefined variable: rrd_options in functions.php 
  error 
  messages in trunk.  It is used in functions.php in two places, as of 
  r2049. 
  Digging a bit, I see that $rrd_options is defined at the top of 
  get_context.php (also from r2049).  get_context.php includes 
  functions.php, 
  but not vice-versa.
 
  It seems that the logical place for this variable is actually in conf.php, 
  and 
  then functions.php could call 'include_once(conf.php)'.
 
  This sound okay, or am I missing something?

  
  The reason I didn't put $rrd_options in conf.php is because its value 
  is meant to be derived from other variables, in other words, the 
  administrator should not change $rrd_options itself.
 
  However, if get_context.php is not always read, then maybe there is 
  another approach: I can create a `do not touch/advanced users only' 
  section at the bottom of conf.php, and things like $rrd_options can go 
  there.

 I haven't seen any other comments on this, so it's now at the  bottom of 
 conf.php.in

The only problem that I can think of is that since conf.php doesn't get
updated automatically by packaging tools (you don't want to overwrite
the admin's changes), a minor update is now more complicated than it
should be. I'd prefer to see variables that the admin shouldn't change
live somewhere other than conf.php, if they are required.

Kostas



Re: [Ganglia-developers] Patch for multithread gmond

2009-07-17 Thread Kostas Georgiou
On Mon, Jul 13, 2009 at 04:49:17PM +0100, Daniel Pocock wrote:

 Brad Nicholes wrote:
  On 7/13/2009 at 8:17 AM, in message
  
  e6ccb7f50907130717i2f3dfd5fi1c69dbd4124a7...@mail.gmail.com, utopia zh
  utopia...@gmail.com wrote:

  Hi,
 
  While trying to use gmond to monitor our applications, we found some 
  issues:
  - Metric collecting may take long time to finish, such examples
  include collecting master/slave status from LDAP, parsing web pages to
  get statistics.
  
 
 This is definitely an issue for some metrics, but not all
 
 It would be nice to have threading support within the agent rather than 
 requiring each person who implements a module to re-invent the wheel.  
 However, it might be best to make it configurable, so that `start a 
 thread' becomes a config option for each module or metric.  There could 
 be some flag in Ganglia_25metric with which a developer can tell gmond 
 that a metric must be run in its own thread, and some configuration 
 file option to allow the sysadmin to override the default setting of the 
 flag.

As a lazy sysadmin I would prefer not to have to deal with individual
metric settings, so how about letting gmond decide if a thread is needed
or not for a module depending on the module's response time, and keeping
the config simple with something like max_threads = n,
max_collection_delay = 4000ms?
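
A rough Python sketch of how such a decision could look. This is only one possible reading of the proposal: it assumes that max_collection_delay is the response time above which a module earns its own thread, and that when more modules qualify than max_threads allows, the slowest ones win. ModuleStats, collect and assign_threads are hypothetical names, not gmond code:

```python
import time


class ModuleStats:
    """Tracks how long a module's collection callback took last time."""

    def __init__(self, callback):
        self.callback = callback
        self.last_delay_ms = 0.0
        self.threaded = False


def collect(stats, now=time.monotonic):
    """Run the callback and record its response time in milliseconds."""
    start = now()
    value = stats.callback()
    stats.last_delay_ms = (now() - start) * 1000.0
    return value


def assign_threads(modules, max_threads, max_collection_delay_ms):
    """Mark the slowest modules above the delay limit as threaded."""
    slow = [m for m in modules if m.last_delay_ms > max_collection_delay_ms]
    slow.sort(key=lambda m: m.last_delay_ms, reverse=True)
    chosen = slow[:max_threads]
    for m in modules:
        m.threaded = any(m is c for c in chosen)
    return chosen
```

With this shape the sysadmin only ever touches the two global settings, and a module that becomes fast again drops back to inline collection on the next assignment.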

Kostas
 



Re: [Ganglia-developers] gmond python module interface

2009-01-28 Thread Kostas Georgiou
On Wed, Jan 28, 2009 at 06:09:48AM -0800, Gilad Raphaelli wrote:

 
 I think this is a well thought out email and I'm a little surprised at
 the lack of response to it.  Is it because no one is actually using
 the gmond python module interface and hasn't had to make these types
 of decisions?  I don't have a single gmetric script/module that only
 collects one metric and have used both the multi-threaded collector
 approach and the single-thread hopeful/naive approach (having written
 the mysql metric module you mention).  I agree that the multi-threaded
 design seems less prone to problems but in practice I haven't had any
 problems either way.  That being said, I will be trying some of your
 suggestions or moving to threads for that mysql module when time
 permits.

You could also modify gmond to run each plugin in a new thread. I'm not
sure if it is easy/possible though, since I haven't looked at that part of
the code yet.

 My frustration with the gmond python module interface is that it's not
 actually a complete replacement for gmetric scripts as I use them.
 Needing to know all of the metrics that a module will report before
 runtime makes for a lot of upfront work creating .pyconf files and
 doesn't allow for adding new metrics without restarting gmond.  Being
 able to deploy one gmetric script that conditionally reports gmetrics
 based on what's running/hardware installed/etc on a box is a big
 advantage for a gmetric script over conditionally generating pyconf
 files and then still having to conditionally collect metrics in the
 actual gmond module.  I expect most users just stick with the gmetric
 script in this case and handle scheduling themselves?

I agree here, my suggestion in the Wildcard Configuration thread was to
add a metric_autoconf function to the plugin and have gmond use it to get
the collection_group. In any case, for the modules that I develop I call
this function from main so I can do
python module.py -t > module.pyconf with a similar effect.

This doesn't solve all problems though. I would like to be able to just
start mysql on a host or add a new disk, for example, and have metrics for
them without touching the configuration at all. Maybe extending the
configuration so that in a .pyconf you can write something like
collection_group {
  autoconf = 300
}
to have gmond interrogate the plugin every 5 mins to get a new collection
group will be enough, I need to think about it a bit more...
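
A small Python sketch of what the periodic interrogation could do on each pass, assuming the plugin exposes a callable that returns the metric names it can currently report. The names refresh_metrics and due are hypothetical, and the autoconf directive is only a proposal:

```python
def due(last_run, now, interval=300):
    """True when at least `interval` seconds have passed since the last pass."""
    return now - last_run >= interval


def refresh_metrics(known, autoconf):
    """Compare the plugin's current metric names with the known set.

    `known` is a set that is updated in place; `autoconf` is a callable
    returning an iterable of metric names (an assumed plugin interface).
    Returns (added, removed) as sorted lists.
    """
    current = set(autoconf())
    added = sorted(current - known)
    removed = sorted(known - current)
    known.clear()
    known.update(current)
    return added, removed
```

Only the difference needs to be registered or dropped, so a pass that finds nothing new costs one call into the plugin.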

Kostas




Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port

2009-01-16 Thread Kostas Georgiou
On Thu, Jan 15, 2009 at 04:54:23PM -0700, Brad Nicholes wrote:

 Since it is limited by REQUESTLEN, I am OK with it.  So it sounds like
 we just need a fix for the off-by-one and the check for NULL on the
 malloc.

What about this patch then?

Kostas 
diff --git a/monitor-core/gmetad/server.c b/monitor-core/gmetad/server.c
index eb21449..52b22b3 100644
--- a/monitor-core/gmetad/server.c
+++ b/monitor-core/gmetad/server.c
@@ -371,8 +371,7 @@ tree_report(datum_t *key, datum_t *val, void *arg)

 /* sacerdoti: This function does a tree walk while respecting the filter path.
  * Will return valid XML even if we have chosen a subtree. Since tree depth is
- * bounded, this function guarantees O(1) search time. The recursive structure 
- * does not require any memory allocations. 
+ * bounded, this function guarantees O(1) search time.
  */
 static int
 process_path (client_t *client, char *path, datum_t *myroot, datum_t *key)
@@ -421,7 +420,9 @@ process_path (client_t *client, char *path, datum_t *myroot, datum_t *key)
   
  len = q-p;
  /* +1 not needed as q-p is already accounting for that */
- element = malloc(len);
+ element = malloc(len + 1);
+ if ( element == NULL )
+ return 1;
  strncpy(element, p, len);
  element[len] = '\0';
   


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port

2009-01-15 Thread Kostas Georgiou
On Thu, Jan 15, 2009 at 01:41:53PM -0700, Brad Nicholes wrote:

  On 1/15/2009 at 8:56 AM, in message
 496efa2a02ac0003a...@lucius.provo.novell.com, Brad Nicholes
 bnicho...@novell.com wrote:
 
 After taking a little closer look at the patch, I think we are OK as
 far as the recursive call to process_path() is concerned since this
 case is an error condition and should stop processing rather than
 continuing in the recursive loop.  The other two concerns are still
 there however.  I still think that we are off-by-one in the malloc
 call.  It should be len+1 and I still think that we should limit the
 malloc to 256 rather than allowing it to be unlimited. 

I agree about the off-by-one but I am not too worried about a malloc
limit, from what I can tell it can only get as high as REQUESTLEN.

The malloc call needs to be checked for NULL, and the comment that
"The recursive structure does not require any memory allocations" is
false now if malloc replaces the stack allocation.

Kostas



Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port

2009-01-14 Thread Kostas Georgiou
On Wed, Jan 14, 2009 at 07:00:29PM -0500, Jesse Becker wrote:

 On Wed, Jan 14, 2009 at 18:45, Kostas Georgiou
 k.georg...@imperial.ac.uk wrote:
  On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote:
 
  I suggest that once this is accepted, we release 3.1.2 ASAP.
 
  Any possibility for 3.0.8 as well?
 
 I don't see why not.  Is that an offer to help test the patch?  ;-)

I've built rpms for Fedora 9 and I'll be pushing them shortly after I've
done some local testing. If anyone wants to have a look they are
available in koji [1].

Kostas

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=78558



Re: [Ganglia-developers] Wildcard Configuration

2008-08-29 Thread Kostas Georgiou
On Tue, Aug 26, 2008 at 09:01:43AM -0600, Brad Nicholes wrote:

   Thanks Rich.  This got me thinking a little about how the same thing
   might be done only in a more generic way rather than just
   sequentially numbered metrics.  I am wondering if, rather than
   looping through numbers, we actually tried to do some pattern
   matching over the known metrics.  For every known metric that
   matches a metric pattern found in a collection_group, the metric
   name is constructed and the metric is added using the
   add_metric_series() function.  Of course it would also have to do
   some kind of substitution for the metric title as well.  This could
   simplify some of the metric configurations.

+1

Another option would be to have each module able to print some usable
configuration. I am attaching a patch from a quick hack that I wrote
for the multidisk plugin to output a usable .conf file. Maybe extending
the module API with a metric_autoconf function so you can do a gmond -t
modulename > modulename.conf (or something similar) is a good idea.
It can get you going for simple setups where you just need to collect
all available metrics.

Kostas
--- multidisk.py-orig   2008-07-14 23:28:12.0 +0100
+++ multidisk.py2008-08-28 17:20:00.0 +0100
@@ -32,6 +32,7 @@
 
 import statvfs
 import os
+import sys
 
 descriptors = list()
 
@@ -123,9 +124,46 @@
     '''Clean up the metric module.'''
     pass
 
+def metric_autoconf():
+    '''Prints example config.'''
+    print """
+modules {
+  module {
+    name = "multidisk"
+    language = "python"
+  }
+}"""
+    print """
+collection_group {
+  collect_every = 10
+  time_threshold = 50
+"""
+    for d in descriptors:
+        if d['name'].endswith('-disk_used'):
+            print '  metric {'
+            print 'name = "%s"' % (d['name'])
+            print "value_threshold = 1.0\n  }"
+    print "}"
+
+    print """
+collection_group {
+  collect_once = yes
+  time_threshold = 20
+"""
+    for d in descriptors:
+        if d['name'].endswith('-disk_total'):
+            print '  metric {'
+            print 'name = "%s"' % (d['name'])
+            print "value_threshold = \"1.0\"\n  }"
+    print "}"
+
 #This code is for debugging and unit testing
 if __name__ == '__main__':
     metric_init(None)
+    for o in sys.argv[1:]:
+        if o == "-t":
+            metric_autoconf()
+            sys.exit(0)
     for d in descriptors:
         v = d['call_back'](d['name'])
         print 'value for %s is %f' % (d['name'],  v)


[Ganglia-developers] Any reason why the init script starts gmetad as root?

2008-08-13 Thread Kostas Georgiou
Hi,

I just noticed that the default init scripts start gmetad as root and it
then does a setuid to nobody. Is there a reason why gmetad needs extra
privileges? It doesn't need to bind privileged ports or do anything else
that requires root privileges as far as I can tell, so a
daemon --user nobody $GMETAD
should be enough.

Kostas



Re: [Ganglia-developers] [PATCH] remove version from libganglia package name

2008-08-12 Thread Kostas Georgiou
On Tue, Aug 12, 2008 at 07:38:15AM -0500, Martin Hicks wrote:

 
 On Mon, Aug 11, 2008 at 11:30:24PM +0200, Marcus Rueckert wrote:
  
  this is not the package version. it is the soname mangled a bit. the
  base idea behind it is, that you can install multiple version of the
  same library in parallel.
 
 Okay.  I guess I just don't see this very often.  Are we expecting to
 break library compatibility often?

Even if you break library compatibility, there is no need for the soname
to be encoded in the rpm name for the most current version. As it is now,
during an upgrade libganglia-$soname will stay installed even if nothing
requires it anymore.

The common practice in the rpm world is not to use the soname for the
latest version and to have something like compat-ganglia-30 or
libganglia30 for the older versions (no need to encode the minor version
since changes there don't break compatibility).

Kostas



Re: [Ganglia-developers] disabling python plugins

2008-08-12 Thread Kostas Georgiou
On Mon, Aug 11, 2008 at 12:08:29PM -0600, Brad Nicholes wrote:

 Actually this issue really isn't that hard to solve.  The python
 capability is all implemented in the C module mod_python.  When
 mod_python's init() function is called, it simply reads the python
 module directory from its configuration and then starts a readdir()
 loop on that location.  Anything that it finds in the python module
 directory that has an extension of .py, mod_python loads and calls the
 module's metric_init() function.  
 
 This behavior could be changed by simply adding a Disabled=yes or
 Status=disabled type of directive to the module{} section for the
 python module in its configuration.  Since gmond has already read and
 parsed the entire gmond.conf including all 'included' .conf's, whether
 a python module is loaded or not should be a simple matter of looking
 for the module{} section for a specific module and checking to see if
 it is enabled or disabled.  If mod_python finds a .py file for a
 module but can not find a corresponding module{} section in the
 configuration, the enabled status is assumed to be 'disabled'
 automatically.  Otherwise the enabled state is assumed to be 'enabled'
 unless otherwise indicated.  If a python module is determined to be
 disabled, mod_python would not load it and obviously the python
 module's metric_init() would never be called.
 
 Does that work?

This will be perfect actually :)
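
For illustration only, the rule proposed above can be sketched in Python. The real check would live in the C mod_python module; the function name, the dict layout for the parsed module{} sections, and the exact directive spelling ("Disabled") are assumptions made for this sketch, not existing gmond behaviour.

```python
import os


def modules_to_load(dir_listing, module_sections):
    """Return the names of the python modules that should be loaded.

    dir_listing:     file names found in the python module directory.
    module_sections: hypothetical dict mapping a module name to the
                     directives parsed from its module{} section.
    """
    selected = []
    for filename in sorted(dir_listing):
        name, ext = os.path.splitext(filename)
        if ext != ".py":
            continue  # only .py files are candidates
        section = module_sections.get(name)
        if section is None:
            continue  # no module{} section: assumed disabled
        if str(section.get("Disabled", "no")).lower() == "yes":
            continue  # explicitly disabled in the configuration
        selected.append(name)  # enabled unless otherwise indicated
    return selected


if __name__ == "__main__":
    listing = ["tcpconn.py", "example.py", "orphan.py", "README"]
    sections = {"tcpconn": {"Disabled": "yes"}, "example": {}}
    print(modules_to_load(listing, sections))
```

With this rule tcpconn.py stays on disk, so package updates do not interfere, but its metric_init() is never called.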



[Ganglia-developers] disabling python plugins

2008-08-04 Thread Kostas Georgiou
Hi,

I've noticed that the python scripts are always initialized even if they
don't have any metrics collected. In most cases this isn't too bad
(assuming module_init doesn't do anything too invasive) but for example
tcpconn.py starts a thread running netstat that will keep running even
if nobody cares about the data. 

One solution is to rename the .py files but this will not play well for
people using packages since they will be re-added in the next package
update. Having a different package per plugin so they can be
removed/added individually is also an option if no other way to disable
a plugin is available.

Kostas 



Re: [Ganglia-developers] Auto configuration ideas

2008-07-29 Thread Kostas Georgiou
On Tue, Jul 29, 2008 at 09:51:58AM +0100, [EMAIL PROTECTED] wrote:

 I've been looking at how we currently deploy Ganglia configuration files
 in our organisation, and whether the process can be improved.
 
 Is anyone already working on any aspect of this issue:


You should look at one of the existing configuration management tools
for this. Have a look at puppet/cfengine/bcfg2 and choose one that you
like. They will take care of pushing the configuration, restarting the
server and all other issues for you.

For my systems I am using puppet with something like:
$ cat ganglia.pp
define gmondconfig( $port, $cluster ) {
  file { "/etc/gmond.conf":
    path => "/etc/gmond.conf",
    mode => 644,
    owner => root,
    group => root,
    ensure => file,
    notify => Service["gmond"],
    content => template("apps/ganglia/gmondconfig.erb")
  }
  service { "gmond":
    ensure => running,
    enable => true,
    hasstatus => true,
    require => Package["ganglia-gmond"]
  }
  package { "ganglia-gmond":
    ensure => present
  }
}

The template is a standard config file with the cluster name and
send/receive ports as variables.
...
udp_recv_channel {
  port = <%= port %>
...
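
As a rough Python analogue of what the template step does (this is not the ERB template itself), the sketch below substitutes the two per-cluster values into a heavily trimmed gmond.conf fragment. The fragment and the function name are assumptions for illustration; a real gmond.conf contains many more sections.

```python
from string import Template

# Hypothetical, heavily trimmed stand-in for gmondconfig.erb.
GMOND_TEMPLATE = Template("""\
cluster {
  name = "$cluster"
}
udp_recv_channel {
  port = $port
}
""")


def render_gmond_conf(port, cluster):
    """Fill in the cluster name and receive port for one node."""
    return GMOND_TEMPLATE.substitute(port=port, cluster=cluster)


if __name__ == "__main__":
    print(render_gmond_conf(8690, "foo.webservers"))
```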

Then for the different nodes I use definitions like the following
to add the machine to the right cluster...
 gmondconfig { "gmondconfig": port => 8690, cluster => "foo.webservers" }
 gmondconfig { "gmondconfig": port => 8691, cluster => "foo.tests" }
 
Cheers,
Kostas



[Ganglia-developers] Nagios plugin

2008-07-29 Thread Kostas Georgiou
Hi,

Recently I started using a small script to make the information that
ganglia collects available to nagios. Other people have similar nagios plugins I
suspect so adding one of them in contrib is probably a good idea. Any
thoughts?

I am attaching the plugin that I use. It's nothing special, so if someone
has a more polished version I would love to know.

Cheers,
Kostas
#!/usr/bin/python

import sys
import getopt
import socket
import xml.parsers.expat

class GParser:
  def __init__(self, host, metric):
    self.inhost = 0
    self.inmetric = 0
    self.value = None
    self.host = host
    self.metric = metric

  def parse(self, file):
    p = xml.parsers.expat.ParserCreate()
    # Bind the handlers to this instance, not to a global parser object.
    p.StartElementHandler = self.start_element
    p.EndElementHandler = self.end_element
    p.ParseFile(file)
    if self.value is None:
      raise Exception('Host/value not found')
    return float(self.value)

  def start_element(self, name, attrs):
    if name == "HOST":
      if attrs["NAME"] == self.host:
        self.inhost = 1
    elif self.inhost == 1 and name == "METRIC" and attrs["NAME"] == self.metric:
      self.value = attrs["VAL"]

  def end_element(self, name):
    if name == "HOST" and self.inhost == 1:
      self.inhost = 0

def usage():
  print("Usage: check_ganglia "
        "-h|--host= -m|--metric= -w|--warning= "
        "-c|--critical= [-s|--server=] [-p|--port=]")
  sys.exit(3)

if __name__ == "__main__":
##
  ganglia_host = '127.0.0.1'
  ganglia_port = 8649
  host = None
  metric = None
  warning = None
  critical = None

  try:
    options, args = getopt.getopt(sys.argv[1:],
      "h:m:w:c:s:p:",
      ["host=", "metric=", "warning=", "critical=", "server=", "port="],
      )
  except getopt.GetoptError as err:
    print("check_gmond: %s" % str(err))
    usage()
    sys.exit(3)

  for o, a in options:
    if o in ("-h", "--host"):
      host = a
    elif o in ("-m", "--metric"):
      metric = a
    elif o in ("-w", "--warning"):
      warning = float(a)
    elif o in ("-c", "--critical"):
      critical = float(a)
    elif o in ("-p", "--port"):
      ganglia_port = int(a)
    elif o in ("-s", "--server"):
      ganglia_host = a

  if critical is None or warning is None or metric is None or host is None:
    usage()
    sys.exit(3)

  try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((ganglia_host, ganglia_port))
    parser = GParser(host, metric)
    # expat reads bytes, so open the socket file in binary mode.
    value = parser.parse(s.makefile("rb"))
    s.close()
  except Exception as err:
    print("CHECKGANGLIA UNKNOWN: Error while getting value \"%s\"" % (err))
    sys.exit(3)

  # Thresholds are treated as upper bounds: at or above them is a problem.
  if value >= critical:
    print("CHECKGANGLIA CRITICAL: %s is %.2f" % (metric, value))
    sys.exit(2)
  elif value >= warning:
    print("CHECKGANGLIA WARNING: %s is %.2f" % (metric, value))
    sys.exit(1)
  else:
    print("CHECKGANGLIA OK: %s is %.2f" % (metric, value))
    sys.exit(0)
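
For anyone who wants to try the parsing logic without a running gmond, here is a minimal self-contained sketch of the same idea. The sample XML is a simplified assumption of what gmond sends (only the HOST and METRIC elements with NAME and VAL attributes that the plugin relies on), and find_metric and nagios_status are hypothetical helper names, not part of the attached plugin.

```python
import xml.parsers.expat

# Simplified, assumed sample of a gmond XML dump.
SAMPLE = b"""<GANGLIA_XML>
 <CLUSTER NAME="foo.webservers">
  <HOST NAME="web1">
   <METRIC NAME="load_one" VAL="0.42"/>
  </HOST>
  <HOST NAME="web2">
   <METRIC NAME="load_one" VAL="3.10"/>
  </HOST>
 </CLUSTER>
</GANGLIA_XML>"""


def find_metric(xml_bytes, host, metric):
    """Return the metric value for host as a float, or None if absent."""
    state = {"inhost": False, "value": None}

    def start(name, attrs):
        if name == "HOST":
            state["inhost"] = attrs.get("NAME") == host
        elif state["inhost"] and name == "METRIC" and attrs.get("NAME") == metric:
            state["value"] = float(attrs["VAL"])

    def end(name):
        if name == "HOST":
            state["inhost"] = False

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start
    p.EndElementHandler = end
    p.Parse(xml_bytes, True)
    return state["value"]


def nagios_status(value, warning, critical):
    """Map a value to a Nagios exit code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if value >= critical:
        return 2
    if value >= warning:
        return 1
    return 0


if __name__ == "__main__":
    load = find_metric(SAMPLE, "web2", "load_one")
    print(load, nagios_status(load, warning=2.0, critical=3.0))
```

Treating the thresholds as upper bounds suits metrics such as load; a metric where low values are the problem (free disk space, for example) would need the comparisons reversed.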