Re: [Ganglia-developers] template-based metric definition with PCRE

2010-01-04 Thread Carlo Marcelo Arenas Belon
On Mon, Dec 28, 2009 at 08:47:35PM +, Daniel Pocock wrote:
 Jesse Becker wrote:
  On Sat, Nov 28, 2009 at 08:42, Daniel Pocock dan...@pocock.com.au wrote:

  For those following trunk, you may need to bootstrap again, and make
  sure you have pcre available.
 
  I've linked gmond with libpcre so that it can dynamically match the
  metric names
 
  E.g., for the multicpu module, this is the only metric definition that
  needs to be given to enable all metrics on all cores:
 
   metric {
 name_match = multicpu_([a-z]+)([0-9]+)
 value_threshold = 1.0
 title = CPU-\\2 \\1
   }
 
  Oh, that's cool. +1 for me.
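
For readers who want to see what the quoted metric block does, here is a minimal Python sketch of the same idea: one pattern with two capture groups, plus a title template that swaps them. It uses Python's `re` module rather than libpcre. The function name `title_for`, the sample metric names, and the choice to match the whole name are assumptions made for illustration; this is not gmond's code. The doubled backslashes in the conf file are written here as single backreferences in a raw string.

```python
import re

# Pattern and title template taken from the metric block quoted above.
NAME_MATCH = re.compile(r"multicpu_([a-z]+)([0-9]+)")
TITLE_TEMPLATE = r"CPU-\2 \1"


def title_for(metric_name):
    """Return the expanded title for a matching metric name, else None.

    Assumption: the pattern must cover the whole metric name.
    """
    match = NAME_MATCH.fullmatch(metric_name)
    if match is None:
        return None
    # \1 is the counter name (e.g. "user"), \2 is the core number.
    return match.expand(TITLE_TEMPLATE)


if __name__ == "__main__":
    for name in ("multicpu_user0", "multicpu_idle12", "load_one"):
        print(name, "->", title_for(name))
```

With this sketch a name such as `multicpu_user0` expands to the title `CPU-0 user`, while a name outside the pattern is simply not selected.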

 I've backported to 3.1,

that was a bad idea IMHO, not because the implementation is bad, but because
3.1.3^H4^H5^H6 has been delayed long enough that adding anything else to it
this late, and therefore resetting the testing cycle, would be unwise;
especially considering there are other fairly significant fixes/features
also waiting for backport.

there is also the fact that there was a valid comment (sort of, even if no
code was ever produced for it) on how this functionality should be made
optional (just like python is), and that was neither discussed further
(except in this email, after it was committed) nor addressed.

lastly, this code makes using multicpu so easy that it will be fairly obvious
the module never worked properly to begin with. it would therefore make more
sense to also backport the needed fixes in r2116 (still incomplete), and
maybe even the configuration cleanup patches in r2118, which are somewhat
related. we should also consider better ways to protect users of platforms
other than Linux and Cygwin from shooting themselves in the foot by trying to
get that module loaded, which is an even bigger issue.

 $ svn log -r2160
 
 r2160 | d_pocock | 2009-12-28 20:43:54 + (Mon, 28 Dec 2009) | 1 line
 
 Patch for PCRE support (backport r2112 and r2119)

you are also missing r2150 and r2156, plus some patches that do not exist
yet, so that the dependency is also in the RPM SPEC and documented in
the configuration man page and other needed places.

I would suggest reverting this backport for now instead.

  I'd be interested in any feedback on the PCRE dependency.  If necessary,
  the feature can be made into a compile time option so that gmond can
  build without it.
 
  Yes, an optional compile time option is the way to do this.  Use it if
  present, but continue on without it if not present.

 Is PCRE not available on any platform that we want to support for 3.1?  

most likely available everywhere (just like python), but since not having
it would most likely only mean that the corresponding configuration can't
be used, it really makes sense to treat it as optional.

 If not, then I'll leave the patch as it is, too many #ifdefs can make 
 the code look messy.  The current implementation tries default locations 
 for pcre, or lets you specify your own version:
 
 ./configure --with-libpcre=/opt/pcre

ideally all that should be needed is to also have an --enable-pcre or
equivalent flag to control whether support for this is disabled at compile
time, just as is possible for python (which has proven to be really useful
for Solaris users AFAIK)

being able to then use autoconf-style #defines to enable a dummy
implementation of the missing functionality should be all that is needed,
and shouldn't make the code that ugly (unless it needs refactoring anyway)
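
As a rough illustration of the "dummy implementation" idea, the sketch below uses a Python flag in place of an autoconf-generated #define. The names `HAVE_PCRE` and `select_metrics` are hypothetical and the behaviour is an assumption, not gmond's code: exact metric names keep working either way, and only the regex-based directive is refused when support is switched off.

```python
import re

# Hypothetical stand-in for a build-time #define produced by configure.
HAVE_PCRE = False


def select_metrics(available, name=None, name_match=None, have_pcre=HAVE_PCRE):
    """Return the metric names selected by one configuration entry."""
    if name_match is not None:
        if not have_pcre:
            # Dummy implementation: the directive is recognised but refused.
            raise ValueError("name_match needs a build with regex support")
        pattern = re.compile(name_match)
        return [m for m in available if pattern.fullmatch(m)]
    # Plain name lookups never touch the regex code path.
    return [m for m in available if m == name]


if __name__ == "__main__":
    metrics = ["load_one", "multicpu_user0", "multicpu_user1"]
    print(select_metrics(metrics, name="load_one"))
    print(select_metrics(metrics, name_match=r"multicpu_([a-z]+)([0-9]+)",
                         have_pcre=True))
```

The point of the design is that the check lives in one place, so the rest of the code does not need to be littered with conditionals.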

but I understand if you would rather get the feature initially released
without having this as a possibility.

Carlo

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] bootstrapping for 3.1.X series and 3.2.X

2010-01-04 Thread Carlo Marcelo Arenas Belon
On Mon, Dec 28, 2009 at 10:51:51PM +, Daniel Pocock wrote:
 Carlo Marcelo Arenas Belon wrote:
  On Sun, Dec 06, 2009 at 09:28:04AM +, Daniel Pocock wrote:

  Carlo Marcelo Arenas Belon wrote:
  
  On Wed, Nov 25, 2009 at 11:00:21AM +, Daniel Pocock wrote:

  b) should the choice of bootstrap environment be locked for all 
  3.1.X, and only changed when increasing the minor version number 
  (e.g. when we go from 3.1 to 3.2)?
  
  no, but since our build system is full of hacks and not completely reliable,
  it might be a good idea to test that no issues are introduced when looking at
  a new version.

  Ok, but if it is not locked down, let's consider some of the following:
 
  - document the version we expect
 
  agree, and that is what README.SVN is for, but first we have to decide which
  version to expect to begin with.

  - maybe add some check to configure that warns if a different version of  
  autotools is detected?
 
  configure doesn't depend on autotools, so that would be the wrong place to
  put any checks; configure.in does, and that is where bootstrapping should be
  aborted, using AC_PREREQ and friends, if the wrong versions are used.

 Ok, should we use AC_PREREQ for 3.1.6, are there any disadvantages?

only if the macros will definitely break with an older version of autoconf;
otherwise all we are doing is enforcing a recommendation and denying anyone
who might not have access to the newest version of autotools the
possibility of doing their own bootstrap (not much of an issue if we also
provide regular snapshots, though).

  d) Can anyone volunteer to provide a stable bootstrap environment 
  (e.g. a virtual server) just for Ganglia?  Two such environments may 
  be needed, one for trunk and one for the current release branch.
  
  Matt did offer an EC2 instance if we could agree on an OS version :
 

  http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg05271.html
 
  I suggested Debian 5.0 (more conservative) or Fedora 12 (to be updated more
  frequently), but as long as it is agreed, documented and reproducible,
  anything should work.

  I prefer Debian 5.0 (lenny), that is what I have on my laptop, home PC  
  and various other infrastructure that I use. Elsewhere I am using 
  RHEL3/4/5.
 
  Debian 5.0 is also what is being used for bugzilla AFAIK and so that might
  be a good option for consolidation.
 
 Who controls access to the Bugzilla server?  I wouldn't mind having use 
 of that as a bootstrap environment.

Matt would know, but I suspect that shell access might be hard to get, so
this would be problematic unless we are talking about some continuous build
system like cruisecontrol or hudson making snapshots.

to ease the use of one of those systems, r2144 (still incomplete) was
committed, but it would be nice to know which direction we are going anyway,
and for now it seems there is not much dialogue going on about the
alternatives.

  We also have access to the OpenCSW build farm, and they are willing to  
  consider applications for access by Ganglia developers, so we could look  
  at that as a bootstrap environment.
 
  Bootstrapping is done only once per package and so wouldn't make sense to
  also do bootstrapping in Solaris.

 No, I wasn't suggesting we bootstrap separately for Solaris.  I was just 
 suggesting that we use the OpenCSW machine to bootstrap for all platforms.
 
 However, we would be stuck with whatever version of autotools is current 
 in the OpenCSW environment, and any decision to change the version there 
 would be out of our control.
 
 I think Debian 5.0 (lenny) is the final decision then

Debian 5.0 (lenny) x86 (32-bit) right?

 any final objections/comments?

the only one I can think of is that we sometimes used to provide RPMs with
the releases, but that is IMHO not that important considering that the
fedora/EPEL package is what most people would use anyway. at least for
fedora, it used to be released fairly quickly after the source package
was posted on our site, as the fedora packagers are also actively involved
in the list.

debian/ubuntu is usually also well represented, and that shouldn't be an
issue for releases in debian 5 anyway.

 Should we
 
 a) after fixing the other showstopper (fork issue), do we tag 3.1.6 and 
 let people test a tarball from Debian 5 autotools?, or
 
 b) make another 3.1.5 tarball using Debian 5 autotools, and put it in a 
 separate location for people to test before we tag?

Using debian for this release will break Solaris (I have a fix ready but
not yet backported) and also AIX (which Michael is maintaining outside
our tree, with patches generated against the bootstrapping used for
3.1.2):

  http://www.perzl.org/ganglia/

As I said in the STATUS file for 3.1, it would be better IMHO to delay
this decision until 3.1.7 (which hopefully would also include support
for AIX 

Re: [Ganglia-developers] [RFC] two step gmond initialization

2010-01-04 Thread Carlo Marcelo Arenas Belon
On Mon, Dec 28, 2009 at 11:05:36PM +, Daniel Pocock wrote:
 Carlo Marcelo Arenas Belon wrote:
 On Fri, Dec 18, 2009 at 04:18:16PM +, Daniel Pocock wrote:
   
 Carlo Marcelo Arenas Belon wrote:
 
 On Sun, Dec 13, 2009 at 10:49:00AM +, Daniel Pocock wrote:
   
 I could accept Brooks' solution, because it means gmond would only fail
 for something like out-of-memory, while any configuration failure, port
 in use, etc. would cause it to fail before detaching.
 
 If gmond still fails silently in some cases, you have not accomplished the
 objective that you were trying to obtain with r2025 anyway.
 
 I agree - it doesn't completely meet my goal, but it does at least result
 in an error code for most types of bad configuration (or port in use)

 that part is OK, but you still have the added side effects of r2025, which
 would affect gmond in other interesting ways:

 * the metric (and module) initialization is now done by the parent and
   expected to be inherited by the child; this means, for example, that the
   parent will send (and receive) metric information (even before forking)
 * the suid is done by the parent and therefore the child isn't privileged
   (while the metric initialization was done as root); this would at least
   prevent anyone from binding gmond to privileged ports, but could also
   result in complicated permission issues for metric collection scripts.

 as I said before, I think the apr_poll issue with BSD should be taken as
 a warning of how the changes we were planning to do could have unintended
 side effects, and since moving the daemonization was only one way to solve
 the original problem, it makes more sense to instead revert this change and
 evaluate alternatives.
   
 It is this line of argument, rather than the concerns about APR, that  
 makes me think reverting the change completely might be the way to go  
 for now, although the reason for the change is still a legitimate issue  
 and can be tracked in bugzilla.

agree, and I have to admit I am surprised this (which was my main argument)
somehow wasn't made clear until now.

indeed, the proposed alternative implementation of a fix was published just
because I agree that this issue is a legitimate bug (even if there might not
be a bugzilla for it) which needed to be corrected anyway.
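
For readers following the trade-off, one common way to get an error code out of a daemon without initializing in the parent is to fork first, do all initialization in the child, and have the child report the result over a pipe. The Python sketch below shows that generic pattern only; it is not the patch discussed in this thread, and the names `launch`, `init` and `main_loop` are hypothetical.

```python
import os


def launch(init, main_loop=lambda: None):
    """Fork, initialize in the child, and return the child's init status."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this is the process that would become the daemon.
        status = 1
        try:
            os.close(read_fd)
            try:
                init()          # e.g. parse config, bind ports, load modules
                status = 0
            except Exception:
                status = 1
            os.write(write_fd, bytes([status]))  # report to the parent
            os.close(write_fd)
            if status == 0:
                main_loop()     # a real daemon would detach and loop here
        finally:
            os._exit(status)    # never fall back into the caller's code
    # Parent: block until the child reports, then turn that into a status.
    os.close(write_fd)
    report = os.read(read_fd, 1)
    os.close(read_fd)
    if not report or report[0] != 0:
        os.waitpid(pid, 0)      # reap the child that failed to start
        return 1
    return 0


if __name__ == "__main__":
    print("exit code:", launch(lambda: None))
```

Because everything is initialized in the process that keeps running, nothing has to be inherited across the fork. Note that `os.fork` is POSIX-only, so this particular pattern would not carry over to a native Windows build.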

 Maybe this type of disruptive change will have to come in 3.2, there we  
 can look at the various phases of initialisation more closely, prompt  
 people to review their modules, etc.

I was looking forward to 3.2 being the Windows-native version, and therefore
if the problem with the initialization is solved in a Windows-incompatible
way then we are going to be left with no other option than to do this
disruptive change there anyway.

Carlo



[Ganglia-developers] I Need a Ganglia Developer!

2010-01-04 Thread Elian Jones
For a 6 month rolling contract in London with a major Investment Bank.
Please give me a call on 0207 220 0800 or email me on
ejo...@ikasinternational.com if you are interested.
 
Regards,
Elian.
 
 