[Ganglia-developers] Change in default RRAs
Hi friends,

I just found out by chance that the default RRAs for "gmetad" were changed some time ago. What was the rationale for this? It is an almost 59x increase in database size. OK, disk is cheap, but it is still a factor, especially for large clusters.

Just curious.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] gemtric4j and spoofing
Hi Daniel,

so, what is the story? If I look at the code in ./gmetric4j/src/main/java/ganglia/gmetric/Protocolv31x.java, I find:

    public void announce( String name, String value, GMetricType type,
            String units, GMetricSlope slope, int tmax, int dmax,
            String groupName) throws Exception {
        Ganglia_metric_id metric_id = new Ganglia_metric_id();
        metric_id.host = InetAddress.getLocalHost().getHostName();
        metric_id.name = name;
        metric_id.spoof = false;
        if ( isTimeToSendMetadata( name ) ) {
            encodeGMetric( metric_id, name, value, type, units, slope, tmax, dmax, groupName );
            send(xdr.getXdrData(),xdr.getXdrLength());
        }
        encodeGValue( metric_id, value );
        send(xdr.getXdrData(),xdr.getXdrLength());
    }

which seems to indicate that spoofing is off by default for the V3.1 protocol ?!?

Thanks
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

> From: Martin Knoblauch
> To: "dan...@pocock.com.au"
> Cc: "ganglia-developers@lists.sourceforge.net"
> Sent: Friday, September 14, 2012 5:33 PM
> Subject: gemtric4j and spoofing
>
> Hi Daniel,
>
> seems you are the master of "gmetric4j". Short question: does it support
> spoofing?
>
> Cheers
> Martin
>
> --
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www: http://www.knobisoft.de
[Ganglia-developers] gemtric4j and spoofing
Hi Daniel,

seems you are the master of "gmetric4j". Short question: does it support spoofing?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] 3.3.5 tagged
> From: Bernard Li
> To: Daniel Pocock
> Cc: "ganglia-developers@lists.sourceforge.net"; Carlo Marcelo Arenas Belon
> Sent: Tuesday, March 27, 2012 10:24 PM
> Subject: Re: [Ganglia-developers] 3.3.5 tagged
>
> I thought the original idea was that the "web" component was going to be a
> separate entity and thus can be released at different cycles from the other
> components. If we are again releasing web at the same time as ganglia-core
> then this is back to how things were originally when the code is in SVN.
>
> Just my $0.02.
>
> Cheers,
>
> Bernard

Add my coins to that. I wanted to write the same. Let's just be progressive and split out the web part from the data collection part, and let them move at their own pace. That cross-project link in the repo is most confusing anyway.

Cheers
Martin

> On Tuesday, March 27, 2012, Daniel Pocock wrote:
>> On 27/03/2012 16:52, Carlo Marcelo Arenas Belon wrote:
>>> On Mon, Mar 26, 2012 at 04:50:18PM +0100, Daniel Pocock wrote:
>>>> Release 3.3.5
>>>>
>>>> The release has now been tagged in git
>>>>
>>>> commit = 9db9beea062c7ce5e5b4d10ed553c9b7cea7642e
>>>
>>> wrong bundle :
>>>
>>> carenas@dell ~/src/git/ganglia $ git describe --tags
>>> 3.3.5
>>> carenas@dell ~/src/git/ganglia $ cd web/
>>> carenas@dell ~/src/git/ganglia/web $ git describe --tags
>>> 3.3.2-3
>>>
>>> while web has since had a lot more fixes added as shown by :
>>>
>>> carenas@dell ~/src/git/ganglia-web $ git describe --tags
>>> 3.3.4-14-g7383ed8
>>> carenas@dell ~/src/git/ganglia-web $ git diff --stat 3.3.2-3.. | cat
>>> Makefile | 2 +-
>>> api/host.php | 9 ++---
>>> cluster_view.php | 4 ++--
>>> functions.php | 15 +++
>>> graph.php | 5 +++--
>>> header.php | 1 +
>>> inspect_graph.php | 4 ++--
>>> templates/default/views_view.tpl | 16
>>> 8 files changed, 42 insertions(+), 14 deletions(-)
>>
>> Does this result in any actual breakage: does 3.3.5 break anything that
>> was working in 3.3.1 or 3.3.0? If the answer to that question is `no',
>> then we ignore this issue and 3.3.5 continues to be the release candidate.
>>
>> Are these all fixes that belong in the 3.3.x release, or are some of
>> them features that belong in 3.4.x?
>>
>> I am not automatically increasing the pointer to the web submodule
>> because I think that only crucial things should be accepted in 3.3.x
>> releases from now on - the alternative is that
>> a) we freeze the web repo against all non-essential commits until the
>> release is finally finished
>> b) I update the pointer to the web repo on every 3.3.x release attempt
Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web interface ??
- Original Message -
> From: Jeff Buchbinder
> To: Bernard Li
> Cc: "ganglia-developers@lists.sourceforge.net"; Martin Knoblauch
> Sent: Thursday, March 1, 2012 5:22 PM
> Subject: Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web
> interface ??
>
> On Thu, Mar 1, 2012 at 1:36 AM, Bernard Li wrote:
>> Hi Vladimir:
>>
>> On Wed, Feb 29, 2012 at 7:03 AM, Vladimir Vuksan wrote:
>>
>>> If you'd like to rework the templates to reinstate the old behavior ie. call
>>> it legacy templates that would be fine.
>>
>> Hmmm... is it a simple template change or is it more involved? I
>> thought a whole bunch of the PHP files got changed so would it be
>> possible to have the old-style and new style GUI co-exist in the same
>> code tree?
>
> Besides templating and caching, the "look and feel" of the older could
> be accomplished through templating, but they probably couldn't exist
> in the same directory.
>
> That being said, nothing stops you from dropping a copy of the old UI
> code elsewhere, since it uses the same gmetad data source. I have run
> the old Ganglia web interface at the same time as the new one -- they
> just have to be in different directories.
>
> Jeff

That is simple. But IMHO we should keep the old code in the repository and maybe even build RPMs for it. What speaks against putting the code back as "legacy-web"? Frankly, I would have preferred that the new code was stored as "gweb-2" and the old code kept as "web". Or was there a really pressing reason for the reorganization?

Thanks
Martin
Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web interface ??
> From: Bernard Li
> To: Martin Knoblauch
> Cc: "ganglia-developers@lists.sourceforge.net"
> Sent: Wednesday, February 29, 2012 9:06 AM
> Subject: Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web
> interface ??
>
> Hi Martin:
>
> On Tue, Feb 28, 2012 at 2:41 AM, Martin Knoblauch wrote:
>
>> While I think it is an "interesting" development, I do not think it is
>> ready for general consumption (more later). Big question: is there a way to
>> configure back the old behaviour??
>
> You can just download the old source, build the web RPM and install
> that instead of what comes with the new version. It should just work.

Hi Bernard,

hey - while correct and obvious, this is not what I asked for :-) I just think it would be good to have a config option that brings the interface back to the old look/simplicity/speed.

> Cheers,
>
> Bernard

Martin
Re: [Ganglia-developers] [Ganglia-general] Ganglia gmond memory leak?
Hi,

after running my plain gmond for about 48 hours under valgrind control I get the following summary:

    ==5428== LEAK SUMMARY:
    ==5428==    definitely lost: 24,804 bytes in 13 blocks
    ==5428==    indirectly lost: 20 bytes in 4 blocks
    ==5428==      possibly lost: 352,725 bytes in 141 blocks
    ==5428==    still reachable: 1,342,402 bytes in 2,189 blocks
    ==5428==         suppressed: 0 bytes in 0 blocks
    ==5428== Reachable blocks (those to which a pointer was found) are not shown.
    ==5428== To see them, rerun with: --leak-check=full --show-reachable=yes

So we may have "lost" about 400 KB, plus 1.3 MB of stuff that we [likely] just fail to free on exit. Not too bad.

    ==5428==
    ==5428== For counts of detected and suppressed errors, rerun with: -v
    ==5428== Use --track-origins=yes to see where uninitialised values come from
    ==5428== ERROR SUMMARY: 97221 errors from 203 contexts (suppressed: 68 from 5)

Those errors are kind of interesting. Most of them just come out of Python, with no traces of ganglia code. Not sure how to track/document that in a useful way. I will rerun the valgrind tracing with the suggested options on a 3.3.1 gmond for some time.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

> From: Martin Knoblauch
> To: Aidan Wong; "Ave-Lallemant, Nathan P"; ganglia-general
> Sent: Monday, February 27, 2012 8:21 PM
> Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
>
> Hi Aidan,
>
> for what it is worth, I cannot reproduce the growing memory consumption on a
> small 3.2.0 grid using only standard metrics in unicast mode. Running now for
> a few hours. Will check again tomorrow.
>
> Cheers
> Martin
>
> --
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www: http://www.knobisoft.de
>
>> From: Aidan Wong
>> To: "Ave-Lallemant, Nathan P"; ganglia-general
>> Sent: Thursday, February 23, 2012 8:34 AM
>> Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
>>
>> I've restarted the gmond process and memory usage drops until gmond hogs
>> memory over time. Any Ganglia contributors who may want to chime in on this
>> memory leak issue? I'm on Ganglia 3.2.0. Are there any improvements on
>> version 3.3.1 addressing this issue?
>>
>> Thanks
>>
>> From: "Ave-Lallemant, Nathan P"
>> Date: Wed, 22 Feb 2012 16:31:58 -0600
>> To: Aidan Wong, ganglia-general
>> Subject: RE: Ganglia gmond memory leak?
>>
>> I have seen the same behavior in my environment but do not have a solution.
>>
>> Nathan
>>
>> From: Aidan Wong [mailto:aidanw...@attinteractive.com]
>> Sent: Wednesday, February 22, 2012 4:10 PM
>> To: ganglia-general
>> Subject: [Ganglia-general] Ganglia gmond memory leak?
>>
>> Hi it looks like my install of gmond version 3.2.0 is leaking memory. The
>> amount of resident used memory that the process uses, gets up pretty high and
>> keeps increasing.
>>
>> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
>> root 18647 0.0 9.9 2965464 1836268 ? Ss Jan14 11:24 /home/t/hadoop-ganglia-client/sbin/gmond -c /home/t/hadoop-ganglia-client/gmond.conf -p /home/t/hadoop-ganglia-client/logs/gmond.pid
>>
>> Is this a bug? Can anyone suggest a solution?
>>
>> Thank you
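The "about 400 KB plus 1.3 MB" estimate in the valgrind message of this thread can be checked by totalling the LEAK SUMMARY categories. The following is only a minimal sketch, assuming the summary lines have the layout quoted in the thread; the names `SUMMARY`, `LINE` and `parse_leak_summary` are made up for this illustration and are not part of valgrind or Ganglia.

```python
import re

# Leak summary lines as quoted in the thread (valgrind prefixes each line
# with the process id, here "==5428==").
SUMMARY = """\
==5428== LEAK SUMMARY:
==5428==    definitely lost: 24,804 bytes in 13 blocks
==5428==    indirectly lost: 20 bytes in 4 blocks
==5428==      possibly lost: 352,725 bytes in 141 blocks
==5428==    still reachable: 1,342,402 bytes in 2,189 blocks
==5428==         suppressed: 0 bytes in 0 blocks
"""

# Hypothetical helper pattern: "<category>: <bytes> bytes in <n> blocks".
LINE = re.compile(r"==\d+==\s+([a-z ]+): ([\d,]+) bytes in ([\d,]+) blocks")


def parse_leak_summary(text):
    """Return {category: bytes} for every category line of a LEAK SUMMARY."""
    result = {}
    for match in LINE.finditer(text):
        category = match.group(1).strip()
        result[category] = int(match.group(2).replace(",", ""))
    return result


leaks = parse_leak_summary(SUMMARY)
# "Lost" here means the three categories valgrind could not fully account for.
lost = leaks["definitely lost"] + leaks["indirectly lost"] + leaks["possibly lost"]
print(f"lost: {lost} bytes ({lost / 1024:.0f} KiB)")
print(f"still reachable: {leaks['still reachable']} bytes")
```

The three "lost" categories add up to 377,549 bytes, which is consistent with the rough 400 KB figure, while the still-reachable 1,342,402 bytes correspond to the 1.3 MB that is probably just not freed on exit.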
[Ganglia-developers] Ganglia-3.3.1: How to get back the old web interface ??
Hi,

in order to stay current with ganglia, I built SLES11 RPMs of 3.3.1 and installed them on one of my clusters. Only then did I realize that the new WEB interface is now standard. Too late ... :-(

While I think it is an "interesting" development, I do not think it is ready for general consumption (more later). Big question: is there a way to configure back the old behaviour??

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Ganglia 3.3.0 released
Hi Vladimir,

congratulations on the new release. Short questions on compatibility: will it work with a 3.1.7/3.2.0 RRD database? Will a 3.3.0 "gmetad" work with 3.1.7/3.2.0 gmonds?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

> From: Vladimir Vuksan
> To: ganglia-developers@lists.sourceforge.net;
> ganglia-us...@lists.sourceforge.net
> Sent: Wednesday, February 1, 2012 11:38 PM
> Subject: [Ganglia-developers] Ganglia 3.3.0 released
>
> This was gonna be the 4.0.0 release however we received feedback that
> making a major version bump may cause issues with various Linux
> distribution packaging policies e.g. Fedora. Therefore it's been rebranded
> as 3.3.0. Announcement is here
>
> http://ganglia.info/?p=489
>
> Enjoy,
>
> Vladimir
Re: [Ganglia-developers] Looking for 3.1.7 binaries/rpms for RHEL-5.x on IA64
> From: Daniel Pocock
> To: ganglia-developers@lists.sourceforge.net
> Sent: Wednesday, December 21, 2011 7:58 AM
> Subject: Re: [Ganglia-developers] Looking for 3.1.7 binaries/rpms for RHEL-5.x
> on IA64
>
>> someone have those available? Species on the extinction list - I know,
>> but a customer has a bunch of those.
>
> I believe I successfully built x86_64 RPMs using the spec file when
> testing the 3.1.7 release

Hi Daniel,

X86_64 != IA64 :-) IA64 really lacks support on all levels :-( RH does not even have it for EL6. Pity, because it is a great HPC CPU.

I fell back to just building "gmond", as this is all I need. That was difficult enough without some of the "-devel" packages the customer did not have. Back to building GNU stuff :-)

So, the urgency for me is over.

Cheers
Martin
[Ganglia-developers] Looking for 3.1.7 binaries/rpms for RHEL-5.x on IA64
Hi folks,

someone have those available? Species on the extinction list - I know, but a customer has a bunch of those.

Thanks in advance
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] [Ganglia-general] revisiting bogus spikes
Hi Cameron,

[adding the developers list]

OK:

1) we write the unmodified data in line 233 to capture the "raw" counters. That is what we are using in line 227 for the comparison
2) "ns" is created and returned by "hash_lookup"
3) The ULONG_MAX logic in line 231 is there because we need to ensure that the result is always positive. Needed because the variables are unsigned.
4) "update_ifdata" is called once by "metric_init" and then every time one of the byte/pkts_in/out collectors fires

Now this does not solve your problem ...

Question: do you see any of the debug messages that should be created by "update_ifdata" in case of something unusual? That should help to get an idea of what the interface counters on your machine(s) look like. Look in "/var/log/messages", or just start "gmond" noninteractive.

Hmm. Another question: do you compile "gmond" in 64-bit or 32-bit mode? The ULONG_MAX logic may/will fail in 32-bit mode, if the kernel is 64-bit. It could even be that the interface counters on 32-bit kernels are written as 64-bit values.

Hope this helps
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

> From: Cameron L. Spitzer
> To: "ganglia-gene...@lists.sourceforge.net"
> Sent: Thu, April 28, 2011 3:21:04 AM
> Subject: [Ganglia-general] revisiting bogus spikes
>
> Once again I've been asked to make Ganglia usable on Linux hosts with the
> Broadcom NIC with the 32-bit byte counters.
> E.g., HP "Proliant" 580 G5, a rather popular machine where Ganglia doesn't
> work out of the box.
>
> So I'm trying to understand ganglia-3.1.7/libmetrics/linux/metrics.c again.
>
> In update_ifdata(), we parse /proc/net/dev for the current bytes and packets
> in and out.
> There's a structure "ns" (declared where?) of type net_dev_stats, representing
> the previous sample?
> I'm not sure exactly what "ns" represents.
>
> There's a sanity check at line 227 "if ( rbi >= ns->rbi )" for whether the
> counter went up or down. If it went down, we assume the counter rolled
> around, and guess the value is negative, and invert it, line 231.
> " l_bytes_in += ULONG_MAX - ns->rbi + rbi;"
> (I don't understand how that is supposed to work.)
> Then, regardless of whether the sample passed or failed the sanity check, it's
> saved in the "ns" structure.
> Line 233, "ns->rpi = rpi;"
>
> After the parsing is all done, and the crazy value is in "ns", an optional
> reasonableness test (REMOVE_BOGUS_SPIKES)
> returns early if any of the numbers are extremely large. Otherwise it updates
> the static running counts and then returns.
> On our HP 580G5s, defining REMOVE_BOGUS_SPIKES had no effect. The network
> traffic graphs become useless within a minute of starting gmond.
>
> The part I don't understand is when the line 227 check fails, we put the
> known-bad data in "ns" anyway.
>
> I'd appreciate it if someone familiar with update_ifdata() could explain its
> logic. When is this routine called?
> (I can see modules/network/mod_net.c calls it via bytes_in_func(), but I
> haven't figured out when net_metric_handler()
> is called. Maybe that would explain how bogus data in "ns" doesn't matter.)
> Is there any way to keep way out-of-scale data out of these graphs?
> Thanks for any help.
>
> -Cameron in Los Gatos
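The wrap-around arithmetic discussed in this thread (`l_bytes_in += ULONG_MAX - ns->rbi + rbi` when the new sample is smaller than the previous one) can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from metrics.c: the function name `counter_delta`, the two `ULONG_MAX_*` constants and the sample values are invented for the example, and it assumes the counter wrapped exactly once between the two samples.

```python
ULONG_MAX_32 = 2**32 - 1   # ULONG_MAX when unsigned long is 32 bits wide
ULONG_MAX_64 = 2**64 - 1   # ULONG_MAX when unsigned long is 64 bits wide


def counter_delta(prev, cur, ulong_max):
    """Bytes counted between two samples, using the arithmetic quoted above.

    If the counter did not go down, the delta is a plain difference.
    Otherwise a single wrap is assumed and the delta is
    ulong_max - prev + cur, which is always positive.
    """
    if cur >= prev:
        return cur - prev
    return ulong_max - prev + cur


# Hypothetical samples from a 32-bit hardware counter that wrapped once.
prev, cur = 4_294_967_000, 500

# The wrap limit matches the real counter width: a small, plausible delta.
ok = counter_delta(prev, cur, ULONG_MAX_32)

# The wrap limit is 64 bits but the counter is only 32 bits: a huge delta.
bad = counter_delta(prev, cur, ULONG_MAX_64)

print(ok)
print(bad)
```

With a matching width the delta is 795 bytes. With a 64-bit limit applied to a counter that really wraps at 32 bits the same two samples yield a value on the order of 1.8e19, which is the kind of out-of-scale number that would show up as a spike in the graphs. Whether this particular mismatch is what happens on the machines in the thread is not established here; the sketch only shows why the width of the counter and the width of `unsigned long` have to agree for the formula to work.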
Re: [Ganglia-developers] 3.1 branch backport proposals
Hi Bernard,

for what it is worth... +1 for both.

Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Bernard Li
> To: ganglia-developers@lists.sourceforge.net
> Sent: Tue, January 25, 2011 9:29:52 PM
> Subject: [Ganglia-developers] 3.1 branch backport proposals
>
> Hi all:
>
> Could someone please vote on the following two backport proposals for 3.1?
>
> * build: Install manpages in appropriate locations when `make install` is run
>   http://sourceforge.net/apps/trac/ganglia/changeset/2299
>   http://sourceforge.net/apps/trac/ganglia/changeset/2301
>   +1: bernardli
>
> * build: Include BUGS file to distribution tarball
>   http://sourceforge.net/apps/trac/ganglia/changeset/2455
>   +1: bernardli
>   bernardli: depends on "Install manpages in appropriate locations
>   when `make install` is run"
>
> Thanks!
>
> Bernard
[Ganglia-developers] Fw: [Ganglia-general] How can gmetad be configured for 2 clusters?
really adding the developers ...

- Forwarded Message
> From: Martin Knoblauch
> To: David Birdsong; Whit Blauvelt
> Cc: ganglia-gene...@lists.sourceforge.net
> Sent: Sat, November 13, 2010 8:34:43 AM
> Subject: Re: [Ganglia-general] How can gmetad be configured for 2 clusters?
>
> - Original Message
> > From: David Birdsong
> > To: Whit Blauvelt
> > Cc: Martin Knoblauch; ganglia-gene...@lists.sourceforge.net
> > Sent: Fri, November 12, 2010 9:56:26 PM
> > Subject: Re: [Ganglia-general] How can gmetad be configured for 2 clusters?
> >
> > On Fri, Nov 12, 2010 at 9:19 AM, Whit Blauvelt wrote:
> > > On Fri, Nov 12, 2010 at 08:35:44AM -0800, Martin Knoblauch wrote:
> > >
> > >> In order to separate the two clusters, they need to run on different
> > >> ports.
> > >>
> > >> In addition: when you list more than one node on the data_source, this
> > >> does not define the cluster. It just adds failover capability. "gmetad"
> > >> will only talk to one of the hosts at a time. If that fails, it will try
> > >> the next on the list.
> > >
> > > Thanks Martin. That was the whole trick. I was making the assumption that
> > > gmetad, being "meta," would be the gatherer of data from the nodes.
> > > Understanding that the gmonds go ahead and consolidate that changes the
> > > picture entirely. As my five-year-old sometimes says, "Silly me."
> > >
> > > Whit
> >
> > While I can't argue against something that clearly fixed this for you,
> > this doesn't sound correct and it would be nice to hear this
> > clarified.
> >
> > Sure every host would have info about every other host, but each
> > host's xml tree should have all the nodes in a nested in their
> > corresponding cluster tags. Gmetad could hit any host and pick up
> > info about both clusters on any host, but it should know to distribute
> > the updates from the xml stream to the correct clusters and not 'cross
> > pollinate' the two.
>
> As far as I know, every gmond just puts all the information it has inside its
> own "cluster" tags. It does not care about the cluster tags it receives from
> other gmonds. It has always been the task of gmetad to build up the correct
> XML for the complete grid. Therefore it is vital that the gmond configuration
> for multiple clusters is "correct".
>
> One could argue that this behaviour of "gmond" needs improvement. One solution
> could be that it aggregates only data coming from the "cluster". On the other
> hand, the "cluster" tag is just optional. What should a gmond without such a
> tag do about data from tagged gmonds? I still favor correct configuration. In
> any case, I am adding ganglia developers to CC.
>
> But the confusion shows that documentation might be lacking ...
>
> Cheers
> Martin
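The failover behaviour of a multi-host data_source described in this thread (gmetad talks to one host at a time and only moves to the next entry if that one fails) can be modelled in a few lines. This is a toy sketch and not gmetad's implementation: `poll_data_source`, `fake_fetch`, the host names and the second port 8650 are all hypothetical, chosen only to show two clusters kept apart by port with two failover hosts each.

```python
def poll_data_source(hosts, fetch):
    """Try each host of one data_source in order; return the first reply.

    Listing several hosts does not merge them into a cluster. The hosts are
    alternatives that are expected to report the same cluster state.
    """
    for host in hosts:
        try:
            return host, fetch(host)
        except OSError:
            continue  # this host is down, fall through to the next one
    raise OSError("no host of this data_source answered")


# Hypothetical layout: two clusters separated by port, two hosts each.
DATA_SOURCES = {
    "cluster-a": ["a1:8649", "a2:8649"],
    "cluster-b": ["b1:8650", "b2:8650"],
}


def fake_fetch(host):
    """Stand-in for reading a gmond XML dump; pretends that a1 is down."""
    if host == "a1:8649":
        raise ConnectionRefusedError(host)
    return "XML from " + host


for name, hosts in DATA_SOURCES.items():
    answered_by, xml = poll_data_source(hosts, fake_fetch)
    print(name, "->", answered_by)
```

In this model "cluster-a" is answered by its second host because the first one refuses the connection, while "cluster-b" is answered by its first host. No host of one cluster is ever asked about the other cluster, which is the separation that the different ports are meant to provide.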
Re: [Ganglia-developers] Replaced TemplatePower with Dwoo in trunk web code
Hi,

my € 0.02 are:

+1 for Trunk
-1 for 3.1.x and 3.0.x.

Why am I opposed to the backports? Dwoo introduces considerable new infrastructure that *I* view as not suitable for the "stable" and "legacy" trees. Both of them are bug-fix only in my opinion. It is fine for trunk, no question.

Does the GPL licensing cause any real issues for end users? Just curious. The situation is pretty old by now anyway.

Regards
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Brad Nicholes
> To: Carlo Marcelo Arenas Belon; Bernard Li
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Thu, September 9, 2010 4:49:03 PM
> Subject: Re: [Ganglia-developers] Replaced TemplatePower with Dwoo in trunk
> web code
>
> >>> On 9/9/2010 at 7:30 AM, in message <20100909133028.gb31...@sajinet.com.pe>,
> Carlo Marcelo Arenas Belon wrote:
> > On Wed, Sep 08, 2010 at 05:00:28PM -0700, Bernard Li wrote:
> >>
> >> Just a quick note saying that I have replaced TemplatePower with Dwoo
> >> as our PHP templating engine in the trunk web code:
> >
> > why would we want to do that and throw away useful and time
> > tested code? why would we do this in trunk destabilizing the
> > development branch instead of in an independent branch which
> > could be tested and validated before it gets merged into trunk
> > if proven to at least be as usefull as the old code?, what is
> > the scope of the work that is required on the templates and
> > the rest of the PHP code to make this transition?
> >
> >> Please test and report any issues (especially security related). I'd
> >> like to get this backported to 3.1.x and 3.0.x branches soon.
> >
> > -1 in both accounts, they are both maintenance branches and shouldn't
> > have any major rearchitecture done in them.
> >
> >> Dwoo is modified/new BSD-licensed, which is the same as Ganglia.
> >
> > and this doesn't make a difference at all AFAIK with the fact that
> > templatepower is GPL and some of the PHP code LGPL (for a discusion
> > on that read old threads on this issue) specially considering that
> > the frontend code is not "linked" with anything else.
> >
>
> Bernard is talking about the Ganglia license as a whole which includes the
> front end code. The fact that templatepower is GPL has a direct affect on the
> front end code and therefore affects Ganglia as a project. It is true that the
> front end code does not link with any of the backend (ie. gmond, gmetad), but
> it does link with all of the PHP code. Therefore removing templatepower not
> only from trunk but from the 3.1 and 3.0 branches as well, relieves our end
> users from having to worry about licensing and any modifications that they
> make to their customized PHP code. Basically this move just brings Ganglia
> more inline with regards to licensing. I don't see any harm in replacing
> templatepower in trunk and then after a sufficient amount of testing,
> backporting this change to the 3.1 and 3.0 branches. That is exactly the
> purpose of trunk and complies with the guidelines that we have established on
> the wiki.
>
> BTW, the 3.1.x branch is not a maintenance branch. It is currently our
> release branch. Any new releases of Ganglia will be produced from the 3.1.x
> branch. In addition, trunk is our development branch which should allow for
> new contributions at any time. Therefore being able to move forward with a
> change like incorporating Dwoo rather than TemplatePower in trunk, for
> whatever reason, is the appropriate thing to do.
>
> Brad
Re: [Ganglia-developers] bogus spikes of network_report, is that a bug on the kernel?
Hi,

can you tell us which NIC you are using (/sbin/lspci) and which version of the driver? When I wrote that REMOVE_BOGUS_SPIKES hack, it was because of a HW/FW problem in certain Broadcom devices. It was supposed to be fixed after kernel 2.6.9.

The debug output from gmond suggests that the overflow comes from the bytes_out counter (BO). And you are right, just lowering the thresholds is not useful in general.

Cheers
Martin

> From: 左扬
> To: ganglia-developers@lists.sourceforge.net
> Sent: Wed, April 28, 2010 1:48:58 PM
> Subject: [Ganglia-developers] bogus spikes of network_report, is that a bug on the kernel?
>
> Hello all,
>
> we use Ganglia to generate the network traffic report, but because of bogus spikes of up to 400p I can see nothing (see the graph in the attachment; I modified graph.d/network_report.php to change the unit from bytes/s to bits/s).
>
> I read the code and then ran tests for several days.
>
> In libmetrics/linux/metrics.c, line 287, there is a switch, so I rebuilt Ganglia with CFLAGS=-DREMOVE_BOGUS_SPIKES and restarted gmond.
>
> After some days, I found there were still spikes (about 4T),
>
> so I had to change line 292 from
>
> if ((l_bin > 1.0e13) || (l_bout > 1.0e13) ||
>
> to
>
> if ((l_bin > 2.5e8) || (l_bout > 2.5e8) || /* 2 Gbps, there are 2 gigabit NICs in our server */
>
> to avoid the spikes.
>
> I think that is not a good idea, since others may use faster NICs, so I then added some code in update_ifdata() to log the contents of '/proc/net/dev' (value of proc_net_dev.buffer).
>
> Logs from /var/log/message:
> Apr 27 23:19:13 hostname /opt/ganglia/sbin/gmond[18465]: update_ifdata(BO) - Overflow in rbo: 304634803029227 -> 630666266 [1272381553]
> Apr 27 23:20:13 hostname /opt/ganglia/sbin/gmond[18465]: update_ifdata(BO) - Overflow in rbi: 10458900526801464705 -> 38016437180368 [1272381613]
> Apr 27 23:20:13 hostname /opt/ganglia/sbin/gmond[18465]: update_ifdata(BO) - Overflow in rpo: 219388676028 -> 219365592250 [1272381613]
>
> Logs for /proc/net/dev:
>
> -- 1272381433.117603 -
> Inter-| Receive | Transmit
> face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
> lo:3143390051 39831988000 0 0 0 3143390051 39831988000 0 0 0
> tunl0: 0 0000 0 0 00 0000 0 0 0
> eth0:38015520377153 1355870331350 85871160 0 0 6 304631801519418 219359254753000 0 0 0
> eth1: 0 0000 0 0 00 0000 0 0 0
>
> -- 1272381493.118502 -
> Inter-| Receive | Transmit
> face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
> lo:3143407797 39832216000 0 0 0 3143407797 39832216000 0 0 0
> tunl0: 0 0000 0 0 00 0000 0 0 0
> eth0:38015973907827 1355884370100 85871160 0 0 6 304634803029227 219361451245000 0 0 0
> eth1: 0 0000 0 0 00 0000 0 0 0
>
> -- 1272381553.121013 -
> Inter-| Receive | Transmit
> face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
> lo:3143407797 39832216000 0 0 0 3143407797 39832216000 0 0 0
> tunl0: 0 0000 0 0 00 0000 0 0 0
> eth0:10458900526801464705 1355646742930 85871160 0 0 219363599555 630666266 219388676028 772300 07723 0
> eth1: 0 0000 0 0 00 0000 0 0 0
>
> -- 1272381613.123535 -
> Inter-| Receive | Transmit
> face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
> lo:3143444605 39832676000 0 0 0 3143444605 39832676000 0 0 0
> tunl0: 0 0000 0 0 00 0000 0 0
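The trade-off discussed in this message (a fixed threshold versus one that fits the installed NICs) can be illustrated with a small sketch. This is not gmond's code: the function name, the 64-bit wrap assumption and the idea of passing the link capacity as a parameter are all assumptions made for illustration. The sample values are the bytes_out counters from the log and the /proc/net/dev dumps above, taken 60 seconds apart.

```python
# Hypothetical sketch, not the libmetrics implementation.
COUNTER_MAX = 2**64  # assumption: the interface counters are 64-bit


def rate_per_second(prev, curr, interval_s, max_rate):
    """Return bytes/s between two counter samples, or None if implausible."""
    delta = curr - prev
    if delta < 0:
        # The counter went backwards; first assume an ordinary wrap-around.
        delta += COUNTER_MAX
    rate = delta / interval_s
    # A rate above what the links can carry means the sample was bogus
    # (reset or corrupted counter), so it is dropped instead of graphed.
    return rate if rate <= max_rate else None


# Two gigabit NICs, as in the report: 2 Gbit/s = 2.5e8 bytes/s.
LINK_LIMIT = 2.5e8

# rbo: 304634803029227 -> 630666266 (the overflow reported by gmond)
print(rate_per_second(304634803029227, 630666266, 60, LINK_LIMIT))  # prints None

# Two consecutive good samples of the same counter, roughly 5.0e7 bytes/s
print(rate_per_second(304631801519418, 304634803029227, 60, LINK_LIMIT))
```

Taking the limit from the host's actual link speed instead of a compile-time constant would avoid the objection that other sites use faster NICs, but where that value would come from is left open here.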
Re: [Ganglia-developers] bogus spikes of network_report, is that a bug on the kernel?
Hi,

btw. this is the bug that REMOVE_BOGUS_SPIKES is/was supposed to fix:

https://bugzilla.redhat.com/show_bug.cgi?id=515274

Cheers
Martin

----- Original Message -----
> From: Martin Knoblauch
> To: 左扬; ganglia-developers@lists.sourceforge.net
> Sent: Wed, April 28, 2010 6:32:32 PM
> Subject: Re: [Ganglia-developers] bogus spikes of network_report, is that a bug on the kernel?
>
> Hi, can you tell us which NIC you are using (/sbin/lspci) and which
> version of the driver? When I wrote that REMOVE_BOGUS_SPIKES hack, it was
> because of a HW/FW problem in certain Broadcom devices. It was supposed to be
> fixed after kernel 2.6.9. The debug output from gmond suggests the
> overflow coming from the bytes_out counter (BO). And you are right, just
> lowering the thresholds is not useful in general.
>
> Cheers
> Martin
Re: [Ganglia-developers] [Ganglia-general] Ganglia 3.1.7 ready for testing
----- Original Message -----
> From: Daniel Pocock
> To: kn...@knobisoft.de
> Cc: ganglia-developers@lists.sourceforge.net; "ganglia-gene...@lists.sourceforge.net"
> Sent: Tue, March 2, 2010 12:23:32 PM
> Subject: Re: [Ganglia-developers] [Ganglia-general] Ganglia 3.1.7 ready for testing
>
> Thanks to those who provided feedback - any objections to making 3.1.7
> generally available? I would like to make it GA within the next 1-2
> days now.

Unless there is a [severe] regression compared to 3.1.2 - just let it escape. You know, the perfect is the enemy of the good.

Cheers
Martin

> Michael Perzl wrote:
> > I have successfully compiled and tested 3.1.7 on
> > - AIX 5.1 ML04
> > - AIX 5.3 ML00
> > - AIX 5.3 TL07
> > - AIX 6.1 TL03
> >
> > Regards,
> > Michael
> >
> > On 02/22/2010 12:15 PM, Daniel Pocock wrote:
> >> Just a reminder - any feedback is welcome, or feel free to discuss 3.1.7
> >> on IRC
> >>
> >> It would be good to have positive confirmation of which platforms this
> >> has been tested on, so far, I have tested
> >> - Debian lenny,
> >> - RHEL3/4/5,
> >> - CentOS 5,
> >> - Solaris 8 and
> >> - Cygwin.
> >>
> >> and Brad has done some testing on SLES10
> >>
> >> Regards,
> >>
> >> Daniel
> >>
> >> Daniel Pocock wrote:
> >>> I've tagged 3.1.7 and built a tarball:
> >>>
> >>> http://ganglia.info/testing/ganglia-3.1.7.tar.gz
> >>>
> >>> The md5sum for 3.1.7 is: 6aa5e2109c2cc8007a6def0799cf1b4c
> >>>
> >>> Since 3.1.6, only two things have changed and may need to be tested
> >>> again by those who tested 3.1.6:
> >>> - the build system (support for commas in CFLAGS)
> >>> - the multicpu module - percentages reported differently
> >>>
> >>> This is not confirmation that the release is in GA status - a further
> >>> notification will be sent when the testing period has elapsed without
> >>> any serious defect. Users are invited to test the tarball and submit
> >>> feedback.
> >>>
> >>> Please do not commit on branches/monitor-core-3.1 until after 3.1.7
> >>> goes GA, in case further tweaks are needed to facilitate a successful
> >>> release.
> >>>
> >>> Below are the release notes from the STATUS file. Other documentation
> >>> has also changed since 3.1.2 and should be reviewed:
> >>>
> >>> GANGLIA 3.1 STATUS: -*-text-*-
> >>> Last modified at [$Date: 2010-02-17 11:01:08 + (Wed, 17 Feb 2010) $]
> >>>
> >>> The current version of this file can be found at:
> >>>
> >>> * http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-core-3.1/STATUS
> >>>
> >>> Release history:
> >>>
> >>> 3.1.7           : Tagged: Feb 17, 2010
> >>> 3.1.6           : Tagged: Feb 4, 2010 (not released for GA)
> >>> 3.1.5(hargrave) : Tagged: Nov 24, 2009 (not released for GA)
> >>> 3.1.4(hargrave) : Tagged: Oct 26, 2009 (not released for GA)
> >>> 3.1.3(avenger)  : Tagged: Sep 19, 2009 (not released for GA)
> >>> 3.1.2(langley)  : Released: Feb 17, 2009
> >>> 3.1.1(wien)     : Released: Sep 10, 2008
> >>> 3.1.0(amelia)   : Released: Jul 30, 2008
> >>>
> >>> Contributors looking for a mission:
> >>>
> >>> * Just do an egrep on "TODO", "XXX" or "FIXME" in the source.
> >>> * Review the bug database at: http://bugzilla.ganglia.info/
> >>> * Open bugs in the bug database.
> >>> * Implement a feature from the wishlist at:
> >>>   http://sourceforge.net/apps/trac/ganglia/wiki/ganglia_wish-list
> >>>
> >>> CURRENT RELEASE NOTES:
> >>> (Please update this area with a brief description of bug fixes and
> >>> enhancements that have been backported for the current release)
> >>>
> >>> Note: 3.1.3, 3.1.4, 3.1.5 and 3.1.6 never became GA, therefore,
> >>> the release notes for all of them are combined below.
> >>>
> >>> 3.1.7:
> >>>
> >>> * Fix build support for RHEL5/issue with commas in CFLAGS
> >>> * multicpu module: show CPU utilization as a value between 0-100% for
> >>>   each core
> >>>
> >>> 3.1.6:
> >>>
> >>> * Merge commit 1966 from trunk to fix "contrib/removespikes.pl"
> >>> * Bootstrapping with Debian 5.0 (lenny) versions of autotools for
> >>>   this and future releases.
> >>>   http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg05352.html
> >>>   http://www.mail-archive.com/ganglia-gene...@lists.sourceforge.net/msg04688.html
> >>> * Require user to explicitly specify sysconfdir when building from
> >>>   source, due to the fact that the old behavior was not consistent
> >>>   with the documented behavior.
> >>> * Configuration files and scripts are now created during the install
> >>>   phase rather than during configure. This allows values such as
> >>>   @sysconfdir@ to be used in the template configuration files.
> >>> * Abolish the use of release names - only release numbers will be used
> >>>   to distinguish v
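For anyone testing the tarball, the md5sum quoted in the announcement can be checked before building. The following is only a sketch of one way to do that; the function names and the local file name are assumptions, and only the checksum value is taken from the announcement.

```python
import hashlib

# Checksum as quoted in the 3.1.7 announcement above.
EXPECTED_MD5 = "6aa5e2109c2cc8007a6def0799cf1b4c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_matches(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Usage would be along the lines of `tarball_matches("ganglia-3.1.7.tar.gz")`, assuming the tarball was saved under that name in the current directory.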
Re: [Ganglia-developers] versioning confusion
----- Original Message -----
> From: Brad Nicholes
> To: Martin Knoblauch; Ramon Bastiaans
> Cc: "ganglia-developers@lists.sourceforge.net"
> Sent: Thu, February 4, 2010 4:33:31 PM
> Subject: Re: [Ganglia-developers] versioning confusion
>
> >>> On 2/4/2010 at 6:50 AM, in message <4b6ad096.8030...@sara.nl>, Ramon Bastiaans wrote:
> > Ahh, I see.
> >
> > On 02/04/2010 12:11 PM, Martin Knoblauch wrote:
>
> If we were to make release candidates publicly available with a release number
> other than major.minor.revision (for example 3.1.3rc1), we would also be
> required to put this same release number in the source code itself to ensure
> that there is a differentiation between a release candidate and the official
> release, since both would be made public (one during the testing period and the
> other being an official release). In order to transition the release candidate,
> in this case to an official release, we would be required to explode the
> tarball, change the version number, retag SVN with the changed file and revision
> number, re-bootstrap the source code, recreate the tarball and then finally
> make the new tarball publicly available under the final release number. All
> of this leaves the final tarball open to potential problems. It just makes more
> sense from a testing and release perspective to release the tarball in the exact
> condition as it was tested. This leaves no possibility for errors or problems
> creeping into the final released tarball.

So, why not put the "rc" or "pre" tag into a GANGLIA_EXTRA_VERSION and embed that into the code? That way there would be no confusion about what is in the tarball. Then we could have as many testing releases as we like before the final one. SVN tags are cheap. What am I missing? I mean, now we are confusing people with skipped "releases".

> Another option would be to tag and tar the source code under the final release
> version number and make it available for testing. Then if bugs are found during
> testing, fix the bugs, retag and retar under the same version number. The
> problem with this is that we could end up with multiple different tarballs all
> with the same version number publicly available. The only way to tell which
> one was the real release would be by the date on the tarball rather than version
> number.

Much too convoluted and confusing. Agreed.

> Anyway, you can read more about this process on the Ganglia wiki page at
> http://sourceforge.net/apps/trac/ganglia/wiki/how_project_works This release
> process was basically patterned after the way that the Apache httpd project
> produces testing and official tarballs.

As I said in the past, that process may work for Apache. I do not see many skipped releases there. Maybe they have stricter project management. Personally I think Ganglia is too small for that. Watching the discussions here, I see us spend more time on "process" than on "progress". But I may be burnt by my day job. There I am forced to follow a lot of completely bogus (technically) processes, just to make some beancounters and process engineers happy (no, I don't like either).

Cheers
Martin
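The GANGLIA_EXTRA_VERSION idea from this message can be made concrete with a small sketch. Only the name GANGLIA_EXTRA_VERSION and the "rc"/"pre" tags come from the thread; the string format, the parsing and the ordering rule (candidates sort before the final release of the same number) are assumptions for illustration, not something the project adopted.

```python
import re

# Hypothetical format: "major.minor.revision" plus an optional extra tag
# ("rc1", "pre2", ...), i.e. what a GANGLIA_EXTRA_VERSION could hold.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(rc|pre)(\d+))?$")


def version_key(version):
    """Sort key that places test releases before the final release."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    major, minor, revision, tag, tag_number = match.groups()
    if tag is None:
        extra = (1, 0)                # final release: after all candidates
    else:
        extra = (0, int(tag_number))  # "rc" and "pre" are not ranked against each other
    return (int(major), int(minor), int(revision)) + extra


releases = ["3.1.3", "3.1.2", "3.1.3rc2", "3.1.3rc1"]
print(sorted(releases, key=version_key))
# prints ['3.1.2', '3.1.3rc1', '3.1.3rc2', '3.1.3']
```

With a scheme like this, several test tarballs could be published for 3.1.3 without consuming the numbers 3.1.4 and 3.1.5, which is the point Martin raises.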
Re: [Ganglia-developers] versioning confusion
----- Original Message -----
> From: Ramon Bastiaans
> To: ganglia-developers@lists.sourceforge.net
> Sent: Thu, February 4, 2010 10:19:03 AM
> Subject: [Ganglia-developers] versioning confusion
>
> Hi,
>
> I haven't been following all the discussions lately, but I'm getting a bit
> confused on what the latest Ganglia 3.1 release is. I see communications about
> 3.1.6 on the developer list, while the latest downloadable version on
> www.ganglia.info is still 3.1.2.
>
> What happened with version 3.1.3, 3.1.4 and 3.1.5?
>
> Skipping versions is highly confusing to me and I don't really understand the
> reasoning behind it.
>
> Or is the website simply not updated?

3.1.3 .. 3.1.5 were canned during testing. Apparently our process does not allow for fixing bugs/regressions between tagging and final release, so it was decided to never publish the intermediates. One of the reasons might be lack of good beta testing (which I am guilty of myself :-( ), but I do not really understand why we couldn't just keep 3.1.3 as the name of the release.

Cheers
Martin
Re: [Ganglia-developers] Policy on updating files in 3.1.x/contrib
----- Original Message -----
> From: Daniel Pocock
> To: Martin Knoblauch
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Wed, February 3, 2010 4:47:23 PM
> Subject: Re: [Ganglia-developers] Policy on updating files in 3.1.x/contrib
>
> Martin Knoblauch wrote:
> > ----- Original Message -----
> >> From: Daniel Pocock
> >> To: Martin Knoblauch
> >> Cc: ganglia-developers@lists.sourceforge.net
> >> Sent: Wed, February 3, 2010 1:04:51 PM
> >> Subject: Re: [Ganglia-developers] Policy on updating files in 3.1.x/contrib
> >>
> >>> what is the policy for updating files in the "contrib" directory of 3.0.x and
> >>> 3.1.x? Do I need to do the backport approval dance (*)? Or can I just go ahead.
> >>> The "removespikes.pl" file needs an update in the 3.1.x branch.
> >>
> >> Any updates to 3.1 require co-ordination from the release manager (myself) when
> >> a release is imminent (as it is now). Generally, let me know the commit
> >> number(s) on trunk and then I will let you know if you can backport it on 3.1.6
> >> or wait for 3.1.7. According to the policies, the release manager has the final
> >> say, but I am open to consider anyone who has an opinion for/against a
> >> particular patch.
> >>
> >> To backport something for 3.0, it needs to meet two criteria:
> >>
> >> - formal approval (vote)
> >>
> >> - it must have already been backported to 3.1
> >
> > Hi Daniel,
> >
> > please consider r1966 for inclusion into 3.1.x. Being in "contrib", it has no
> > (zero) impact on the core functionality. The current commit in 3.1 (r1699) is
> > plain broken.
>
> Ok, you can go ahead and apply this patch on monitor-core-3.1
>
> Please include a note about it in the STATUS file as part of the commit

Done. Unfortunately I botched the commit message, but the commit itself is OK.

Cheers
Martin
Re: [Ganglia-developers] Policy on updating files in 3.1.x/contrib
----- Original Message -----
> From: Daniel Pocock
> To: Martin Knoblauch
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Wed, February 3, 2010 1:04:51 PM
> Subject: Re: [Ganglia-developers] Policy on updating files in 3.1.x/contrib
>
> > what is the policy for updating files in the "contrib" directory of 3.0.x and
> > 3.1.x? Do I need to do the backport approval dance (*)? Or can I just go ahead.
> > The "removespikes.pl" file needs an update in the 3.1.x branch.
>
> Any updates to 3.1 require co-ordination from the release manager (myself) when
> a release is imminent (as it is now). Generally, let me know the commit
> number(s) on trunk and then I will let you know if you can backport it on 3.1.6
> or wait for 3.1.7. According to the policies, the release manager has the final
> say, but I am open to consider anyone who has an opinion for/against a
> particular patch.
>
> To backport something for 3.0, it needs to meet two criteria:
>
> - formal approval (vote)
>
> - it must have already been backported to 3.1

Hi Daniel,

please consider r1966 for inclusion into 3.1.x. Being in "contrib", it has no (zero) impact on the core functionality. The current commit in 3.1 (r1699) is plain broken.

It is no issue for 3.0.x, as the file does not exist there. In order to avoid the process and because 3.0.x should only get critical fixes, I will not request inclusion.

Cheers
Martin
[Ganglia-developers] Policy on updating files in 3.1.x/contrib
Hi,

what is the policy for updating files in the "contrib" directory of 3.0.x and 3.1.x? Do I need to do the backport approval dance (*)? Or can I just go ahead. The "removespikes.pl" file needs an update in the 3.1.x branch.

Cheers
Martin

(*) No, I never liked the process ...
----------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] [RFC] status update for removing ganglia release names from the code
Hi folks,

my comment to that thread still stays [1], and I do not think that coming up with a name is so difficult that it actually can block a release (seems 3.1.x has bigger problems than that :-). But it really seems to be used nowhere (I thought the name was displayed on the web page?), so let's come to a closure on this. So my proposals are:

a) display the name on the web page to make it non-dead
or
b) nuke it

a) of course preferred.

Cheers
Martin

[1] http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg04698.html
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Carlo Marcelo Arenas Belon
> To: Jesse Becker
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Thu, December 3, 2009 1:44:19 PM
> Subject: [Ganglia-developers] [RFC] status update for removing ganglia release names from the code
>
> Jesse
>
> There is a backport request for 3.1 labeled "build: remove ganglia release
> name from the code" and that has a veto from you which I would like to see
> reconsidered.
>
> your objection refers to a thread[1] that includes the explanation of why
> this backport proposal is consistent with the consensus at that time (and
> which has since changed[2]) as it only removes the name from the web
> frontend configuration where it wasn't being used (dead code):
>
> http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg04719.html
>
> It is important to note that since the proposal has been stalled for a long
> time it won't be able to cleanly be backported from trunk and so to simplify
> the reviewing process a conflict free version of it is attached to this
> email.
>
> Carlo
>
> [1] http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg04697.html
> [2] http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg05246.html
Re: [Ganglia-developers] Question on the Ganglia RRD Database
Hi Michael,

yes, your reply is very helpful as it corrects my understanding of how the RRD database is defined. Makes all a lot of sense now. So I am basically defining my RRAs as follows, with a 4 sec polling interval:

RRAs "RRA:AVERAGE:0.5:1:315" \
     "RRA:AVERAGE:0.5:3:315" \
     "RRA:AVERAGE:0.5:24:315" \
     "RRA:AVERAGE:0.5:72:315" \
     "RRA:AVERAGE:0.5:504:315" \
     "RRA:AVERAGE:0.5:1008:315" \
     "RRA:AVERAGE:0.5:2016:315" \
     "RRA:AVERAGE:0.5:6048:315" \
     "RRA:AVERAGE:0.5:12096:315" \
     "RRA:AVERAGE:0.5:21600:370"

With 5% margin from 20-minutes to 6-month and 370 days for the year.

Cheers
Martin

----- Original Message -----
> From: Michael Perzl
> To: ganglia-developers@lists.sourceforge.net
> Sent: Wed, November 25, 2009 4:55:19 PM
> Subject: Re: [Ganglia-developers] Question on the Ganglia RRD Database
>
> Hi Martin,
>
> I think this is how the default monitoring intervals have to be interpreted:
>
> RRAs \
>      "RRA:AVERAGE:0.5:1:240" \
>      "RRA:AVERAGE:0.5:24:240" \
>      "RRA:AVERAGE:0.5:168:240" \
>      "RRA:AVERAGE:0.5:672:240" \
>      "RRA:AVERAGE:0.5:5760:370"
>
>                                                                   used for display of
> Take 240 samples at 15 seconds intervals                            hour
> Take 240 samples at 24 × 15 seconds (= 6 minutes) intervals         day
> Take 240 samples at 168 × 15 seconds (= 42 minutes) intervals       week
> Take 240 samples at 672 × 15 seconds (= 168 minutes) intervals      month
> Take 370 samples at 5760 × 15 seconds (= 24 hours) intervals        year
>
> So I think for your case you have to decide "how many" samples of the
> chosen sampling rate (20 minutes, 8 hours etc.) you want to collect
> which then determines the overall time interval covered by this specific
> sampling rate.
>
> The main question is: How granular do you want the sampling rate to be
> for a given time interval?
>
> This then determines:
> a) the number of multiples of 15 seconds (to get the sampling rate)
> b) the total number of samples required ("number of samples" x "sampling
>    rate" = "time interval")
>
> Hope that helps.
>
> Regards,
> Michael
>
> On 11/25/2009 02:24 PM, Martin Knoblauch wrote:
> > Hi folks,
> >
> > currently I am setting up monitoring for a cluster, where the demand is to
> > have additional monitoring intervals. We want to see stuff like "20-minutes",
> > "8-hours", "2-weeks", "3-month" and "6-month". Doing so seems easy, but I have a
> > question on the RRA definitions.
> >
> > The default setup seems to be (assuming a 15 second polling interval):
> >
> > hour  -> "RRA:AVERAGE:0.5:1:244"
> > day   -> "RRA:AVERAGE:0.5:24:244"
> > week  -> "RRA:AVERAGE:0.5:168:244"
> > month -> "RRA:AVERAGE:0.5:672:244" (more like 4-weeks :-)
> > year  -> "RRA:AVERAGE:0.5:5760:374" (367.86 days)
> >
> > So from hour to month we have 244 datapoints with nicely increasing steps
> > (1,24*1,7*24*1,4*7*24*1). So why are we doing it differently for the year? I
> > would have expected the "year" RRA to be "RRA:AVERAGE:0.5:8784:244" (366 days).
> > Any particular reasons for this?
> >
> > Cheers
> > Martin
> > --
> > Martin Knoblauch
> > email: k n o b i AT knobisoft DOT de
> > www: http://www.knobisoft.de
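The rule Michael describes ("number of samples" x "sampling rate" = "time interval") is easy to check mechanically. The sketch below is only an illustration; the helper name is made up, and it assumes the usual `RRA:CF:xff:steps:rows` layout of the definitions quoted in this thread.

```python
def rra_coverage_seconds(rra, polling_interval_s):
    """Time span covered by one 'RRA:CF:xff:steps:rows' definition."""
    _keyword, _cf, _xff, steps, rows = rra.split(":")
    return int(steps) * int(rows) * polling_interval_s


DAY = 86400

# Defaults as quoted by Michael, 15 second polling interval.
defaults = [
    "RRA:AVERAGE:0.5:1:240",
    "RRA:AVERAGE:0.5:24:240",
    "RRA:AVERAGE:0.5:168:240",
    "RRA:AVERAGE:0.5:672:240",
    "RRA:AVERAGE:0.5:5760:370",
]
for rra in defaults:
    print(rra, rra_coverage_seconds(rra, 15) / DAY, "days")

# First and last entry of Martin's set, 4 second polling interval.
print(rra_coverage_seconds("RRA:AVERAGE:0.5:1:315", 4) / 60, "minutes")   # 21.0 minutes
print(rra_coverage_seconds("RRA:AVERAGE:0.5:21600:370", 4) / DAY, "days")  # 370.0 days
```

The 21 minutes for the first archive is the 20-minute window plus the 5% margin Martin mentions (315 rows instead of 300).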
[Ganglia-developers] Question on the Ganglia RRD Database
Hi folks,

currently I am setting up monitoring for a cluster, where the demand is to have additional monitoring intervals. We want to see stuff like "20-minutes", "8-hours", "2-weeks", "3-month" and "6-month". Doing so seems easy, but I have a question on the RRA definitions.

The default setup seems to be (assuming a 15 second polling interval):

hour  -> "RRA:AVERAGE:0.5:1:244"
day   -> "RRA:AVERAGE:0.5:24:244"
week  -> "RRA:AVERAGE:0.5:168:244"
month -> "RRA:AVERAGE:0.5:672:244" (more like 4-weeks :-)
year  -> "RRA:AVERAGE:0.5:5760:374" (367.86 days)

So from hour to month we have 244 datapoints with nicely increasing steps (1,24*1,7*24*1,4*7*24*1). So why are we doing it differently for the year? I would have expected the "year" RRA to be "RRA:AVERAGE:0.5:8784:244" (366 days). Any particular reasons for this?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
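The step values in the question (1, 24, 168, 672, and the expected 8784) can be derived from the target window if one assumes a fixed row count per archive. The sketch below is only an illustration: the helper name is made up, and it assumes 240 rows (the figure that makes the windows come out exact) rather than the 244 quoted above.

```python
def steps_for_window(window_s, rows, polling_interval_s):
    """Smallest step count so that `rows` rows cover at least `window_s`."""
    span_per_step = rows * polling_interval_s
    return -(-window_s // span_per_step)  # integer ceiling division


HOUR, DAY = 3600, 86400
windows = {
    "hour": HOUR,
    "day": DAY,
    "week": 7 * DAY,
    "month (4 weeks)": 28 * DAY,
    "year (366 days)": 366 * DAY,
}
for name, window_s in windows.items():
    print(name, steps_for_window(window_s, rows=240, polling_interval_s=15))
# prints steps 1, 24, 168, 672 and 8784
```

Under this reading, the default "year" archive is simply built differently: it keeps one row per day (5760 steps x 15 s = 24 hours) and stores more rows instead of stretching the step.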
Re: [Ganglia-developers] Avg Utilization
----- Original Message -----
> From: "Witham, Timothy D" <[EMAIL PROTECTED]>
> To: Brad Nicholes <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "ganglia-developers@lists.sourceforge.net"
> Sent: Monday, December 8, 2008 5:48:31 PM
> Subject: Re: [Ganglia-developers] Avg Utilization
>
> >> Windows users are looking at the figure and thinking that `Avg
> >> Utilization' refers to CPU utilization (from the cpu_report graph).
> >>
> >> Maybe both are needed:
> >>
> >> cluster_util_load: displayed as `Avg Utilization (Load)'
> >>
> >> cluster_util_cpu: displayed as `Avg Utilization (CPU)'
> >>
> >> Can anyone suggest a better way to name these figures, or would this be
> >> an acceptable patch?
>
> I would remove all that and put the numbers on the graphs only, like done in
> trunk. I think it is clearer to have the % numbers on all of the graphs. That
> way the user sees average values for all metrics plotted, right there in the
> graph legend. So they can look at CPU or Load or Memory or anything, instead of
> wondering what the number off to the side is.
>
> I have voted this way in the 3.1 STATUS file.

I can't vote at the moment, but I really like the display of the average(s) on the overview pages. My vote is "+1". IMO the graphs are already too cluttered for a quick overview, and I have ideas for adding even more clutter (that matters to me, like the date the graph was produced :-).

But if clutter doesn't matter, why not do both? I do not believe that the additional overhead is such a problem.

Cheers
Martin
Re: [Ganglia-developers] [PROPOSAL] Development procedure change...
----- Original Message -----
> From: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> To: Brad Nicholes <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Tuesday, November 25, 2008 10:26:33 AM
> Subject: Re: [Ganglia-developers] [PROPOSAL] Development procedure change...
>
> On Thu, Nov 20, 2008 at 11:32:02AM -0700, Brad Nicholes wrote:
> >
> > I have put together a procedure change proposal and posted it on the
> > Ganglia wiki site:
> >
> > http://ganglia.wiki.sourceforge.net/Proposed+Release+Policy+Change)
>
> +1 +1
>
> on one of those patches that can't be committed to trunk first to be
> backported will then soon present a backport proposal as an exercise.
>
> do we have an ETA from when we could start stabilizing for the 3.1.2
> release under these new rules if approved?
>
> > Please review this procedure change proposal and provide us with any
> > feedback that you have, either positive or negative.
>
> something that is not explicitly mentioned in the wiki page is that
> since the proposals will be emailed, it is also expected that any
> discussion about them, if needed, will be done in the list.

hmm. As far as I see, it is mentioned: "All new features or enhancement should be presented/discussed on the ganglia-developers mailing list." :-)

Cheers
Martin
Re: [Ganglia-developers] printing output with rrdtool
----- Original Message -----
> From: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> To: Jesse Becker <[EMAIL PROTECTED]>
> Cc: ganglia-developers
> Sent: Saturday, September 13, 2008 8:46:30 PM
> Subject: Re: [Ganglia-developers] printing output with rrdtool
>
> On Fri, Sep 12, 2008 at 03:57:37PM -0400, Jesse Becker wrote:
> > On Fri, Sep 12, 2008 at 13:33, Carlo Marcelo Arenas Belon
> > >
> > > the following commit (r1754 in ganglia's svn) seems to be patching the fix
> > > proposed by Jason as part of BUG37 and that was committed in r1595 and has
> > > been left inconsistent (as not all uses of this feature have been converted
> > > to use /dev/null).
> >
> > Then the other instances should be converted, IMO.
>
> Committed revision 1760.

I have seen the commit and the remark about Windows. Please excuse my ignorance, but wouldn't "NUL:" serve the purpose on the evil OS?

Cheers
Martin
Re: [Ganglia-developers] [Ganglia-general] Anyone experience petabyte peaks in network metric in ganglia 3.x.y ?
----- Original Message -----
> From: "Escobio, Roger " <[EMAIL PROTECTED]>
> To: Bernard Li <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Sent: Wednesday, September 10, 2008 9:19:47 PM
> Subject: Re: [Ganglia-general] Anyone experience petabyte peaks in network metric in ganglia 3.x.y ?
>
> > -----Original Message-----
> > From: Bernard Li [mailto:[EMAIL PROTECTED]
> > Sent: September 10, 2008 3:05 PM
> > To: Escobio, Roger [CMB-IT]
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [Ganglia-general] Anyone experience petabyte
> > peaks in network metric in ganglia 3.x.y ?
> >
> > Hi Roger:
> >
> > On Wed, Sep 10, 2008 at 11:52 AM, Martin Knoblauch wrote:
> >
> > >> I created a patch against linux/metrics.c (3.1.1 version) to add the
> > >> counterdiff function found in *bsd/metrics.c
> > >> Are you interested in it? Just let me know and I'll send it to the list
> > >>
> > >
> > > Yes please. I would definitely like to have a look at your patch.
> >
> > In case the patch is too large to be sent to the mailing-list, you
> > could also file a bug and upload the patch via bugzilla.ganglia.info.
>
> Well, it is not big.
> As I said, it is just a copy and paste from the *bsd code, so no big deal :-)
>
> But at least it compiles and does not coredump gmond on our Linux :-)

Roger,

[changed Mailing List to ganglia-developers]

just to understand things right, your patch is only a code cleanup and you still need the "#ifdef REMOVE_BOGUS_SPIKES" to get rid of the spikes. Correct?

Some comments on the patch:

>--- libmetrics/linux/metrics.c-ori 2008-09-09 18:54:40.0 +
>+++ libmetrics/linux/metrics.c 2008-09-09 19:09:44.0 +
>@@ -222,40 +222,20 @@
> if ( !ns ) return;
>
> rbi = strtoul( p, &p ,10);
>-if ( rbi >= ns->rbi ) {
>- l_bytes_in += rbi - ns->rbi;
>-} else {
>- debug_msg("update_ifdata(%s) - Overflow in rbi: %lu -> %lu",caller,ns->rbi,rbi);
>- l_bytes_in += ULONG_MAX - ns->rbi + rbi;
>-}
>+l_bytes_in = counterdiff(rbi,ns->rbi,ULONG_MAX, 0);

Shouldn't that be "+= counterdiff(...)"? l_bytes_in is accumulated over all NICs.

> ns->rbi = rbi;
>
> rpi = strtoul( p, &p ,10);
>-if ( rpi >= ns->rpi ) {
>- l_pkts_in += rpi - ns->rpi;
>-} else {
>- debug_msg("updata_ifdata(%s) - Overflow in rpi: %lu -> %lu",caller,ns->rpi,rpi);
>- l_pkts_in += ULONG_MAX - ns->rpi + rpi;
>-}
>+l_pkts_in = counterdiff(rpi,ns->rpi,ULONG_MAX, 0);

ditto

> ns->rpi = rpi;
>
> for (i = 0; i < 6; i++) strtol(p, &p, 10);
> rbo = strtoul( p, &p ,10);
>-if ( rbo >= ns->rbo ) {
>- l_bytes_out += rbo - ns->rbo;
>-} else {
>- debug_msg("update_ifdata(%s) - Overflow in rbo: %lu -> %lu",caller,ns->rbo,rbo);
>- l_bytes_out += ULONG_MAX - ns->rbo + rbo;
>-}
>+l_bytes_out = counterdiff(rbo,ns->rbo,ULONG_MAX, 0);

ditto

> ns->rbo = rbo;
>
> rpo = strtoul( p, &p ,10);
>-if ( rpo >= ns->rpo ) {
>- l_pkts_out += rpo - ns->rpo;
>-} else {
>- debug_msg("update_ifdata(%s) - Overflow in rpo: %lu -> %lu",caller,ns->rpo,rpo);
>- l_pkts_out += ULONG_MAX - ns->rpo + rpo;
>-}
>+l_pkts_out = counterdiff(rpo,ns->rpo,ULONG_MAX, 0);

ditto

> ns->rpo = rpo;
> }
> p = index (p, '\n') + 1;// skips a line
>@@ -1305,3 +1285,40 @@
>val.f = most_full;
>return val;
> }
>+
>+static unsigned long
>+counterdiff(unsigned long oldval, unsigned long newval, unsigned long maxval, unsigned long maxdiff)
>+{
>+ unsigned long diff;
>+
>+ if (maxdiff == 0)
>+ maxdiff = maxval;
>+
>+ /* Paranoia */
>+ if (oldval > maxval || newval > maxval)
>+ return 0
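
A minimal sketch of the point raised in the review comments above: the per-interface delta has to be added to the running total, not assigned to it, or only the last NIC is counted. The names `nic_sample`, `counter_delta` and `sum_deltas` are hypothetical and are not taken from libmetrics; the wrap-around arithmetic simply mirrors the removed `ULONG_MAX - old + new` lines of the quoted patch. Note also that the quoted calls pass the new reading first while the quoted signature names `oldval` first; whether that matters depends on the part of the function body that is cut off here.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-interface state kept in metrics.c. */
struct nic_sample {
    unsigned long prev; /* counter value seen on the previous pass */
    unsigned long now;  /* counter value read on this pass */
};

/* Difference between two readings of a counter that wraps at maxval. */
static unsigned long
counter_delta(unsigned long oldval, unsigned long newval, unsigned long maxval)
{
    if (newval >= oldval)
        return newval - oldval;
    /* Wrapped: same arithmetic as the lines removed by the patch. */
    return maxval - oldval + newval;
}

/* Accumulate (+=) the deltas of all interfaces into one total. */
static unsigned long
sum_deltas(struct nic_sample *nics, size_t n)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += counter_delta(nics[i].prev, nics[i].now, ULONG_MAX);
        nics[i].prev = nics[i].now; /* remember for the next pass */
    }
    return total;
}
```

With `=` in place of `+=` inside the loop, the total would only reflect the interface parsed last, which is the concern behind each "ditto".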
Re: [Ganglia-developers] Updated patches available for trunk
----- Original Message -----
> From: "Witham, Timothy D" <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Sent: Wednesday, August 27, 2008 6:20:21 PM
> Subject: RE: [Ganglia-developers] Updated patches available for trunk
>
> > please see my comment on #193. Your proposal to add averages to the
> > graphs is great, but goes beyond the original request in #193, which also is
> > great.
>
> Ok, I can submit it separately. But IMHO, if the number is displayed on the
> graph, then the same number doesn't need to be displayed on the HTML. Seems
> redundant, which is why I put it in that existing bug as an alternate way to
> reach the goal.

Hi Timothy,

the original #193 proposal makes the number "stand out" on its own, which I really like. Is the extra call to rrdtool really that expensive? So, having the average on the load graphs as well is fine. But I kind of fear that the graphs may get cluttered. And if asked, personally I would love even more to see a "timestamp", at least on the "enlarged" versions of the graphs. Good for reporting/documentation purposes.

Cheers
Martin
Re: [Ganglia-developers] Updated patches available for trunk
Timothy,

please see my comment on #193. Your proposal to add averages to the graphs is great, but goes beyond the original request in #193, which also is great.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: "Witham, Timothy D" <[EMAIL PROTECTED]>
> To: ganglia-developers
> Sent: Tuesday, August 26, 2008 11:55:28 PM
> Subject: [Ganglia-developers] Updated patches available for trunk
>
> Hi guys,
>
> I have updated a few web frontend patches to apply cleanly on current trunk for
> your consideration / discussion. They are on bugzilla.ganglia.info:
>
> bz#176 custom time range with optional calendar widget
> bz#184 show and zoom 4 reports on the current grid
> bz#193 put averages on all graphs and min/max on metric graphs
>
> -twitham
Re: [Ganglia-developers] Ganglia release name
----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>; ganglia-developers
> Sent: Friday, August 22, 2008 2:23:13 AM
> Subject: [Ganglia-developers] Ganglia release name
>
> Hi Carlo:
>
> It looks like in this commit, you have removed the release name for Ganglia:
>
> http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=1703
>
> I didn't see you re-add the name some place else, so I assume your
> proposal is to get rid of release name for future releases completely?

if true, I would find it very sad. Even if it has no technical use, it somehow belongs to Ganglia. Some people still wonder what the names stand for :-)

Just my 0.02 €
Martin
Re: [Ganglia-developers] Bugzilla Bug 193: Avg Load percentages and overall cluster utilization.
----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: ganglia-developers
> Sent: Wednesday, August 20, 2008 2:33:02 AM
> Subject: [Ganglia-developers] Bugzilla Bug 193: Avg Load percentages and overall cluster utilization.
>
> Dear all:

Hi Bernard,

> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=193
>
> This patch has not been tested with meta-sources (i.e. gmetad
> aggregating gmetad) and thus the average utilization numbers are
> incorrect for grid of grids.

thanks for spotting. It just shows that the number of deployment scenarios is just too big for the patch/feature developers. And that we cannot assume that a release will have been completely tested for all scenarios.

> I currently have an incomplete fix, but I need to get consensus as to
> what average utilization really means for grid of grids: should
> "average utilization" for a grid be load average divided by the number
> of cpus for the *entire* meta-grid or just over the grid in question?
>
> Alternatively, we can rollback this backport and punt it until 3.1.2.

No real strong feelings.

> On a related note, I think we should distinguish between a "Grid" and
> a "Meta-Grid" (i.e. a grid of grids) in the Front End -- do people
> care?

Definitely a good idea, as it seems to be a more and more common case.

Cheers
Martin
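
A small sketch of the two readings of "average utilization" that Bernard asks about, assuming the figure is simply the load average divided by the number of CPUs, expressed in percent. The struct and function names are hypothetical and are not taken from gmetad or the web frontend.

```c
#include <stddef.h>

/* Hypothetical per-grid summary: summed load average and CPU count. */
struct grid_summary {
    double load_sum;      /* sum of the load averages of all hosts */
    unsigned int cpu_num; /* total number of CPUs in the grid */
};

/* Utilization of a single grid, in percent. */
static double
grid_utilization(const struct grid_summary *g)
{
    if (g->cpu_num == 0)
        return 0.0; /* avoid dividing by zero for an empty grid */
    return 100.0 * g->load_sum / g->cpu_num;
}

/* Utilization over the entire meta-grid: pool loads and CPUs first. */
static double
meta_grid_utilization(const struct grid_summary *grids, size_t n)
{
    struct grid_summary total = { 0.0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        total.load_sum += grids[i].load_sum;
        total.cpu_num += grids[i].cpu_num;
    }
    return grid_utilization(&total);
}
```

The two readings differ as soon as the child grids have different sizes, which is why the question needs a consensus and not just a bug fix.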
Re: [Ganglia-developers] [Ganglia-svn] SF.net SVN: ganglia:[1538]trunk/monitor-core/Makefile.am
----- Original Message -----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Jesse Becker <[EMAIL PROTECTED]>; ganglia-developers
> Sent: Thursday, July 10, 2008 9:48:20 PM
> Subject: Re: [Ganglia-developers] [Ganglia-svn] SF.net SVN: ganglia:[1538]trunk/monitor-core/Makefile.am
>
> >>> On 7/10/2008 at 12:52 PM, "Jesse Becker" wrote:
> > On Thu, Jul 10, 2008 at 13:15, Brad Nicholes wrote:
> >> I'm OK with it either way. If we add contrib/ to the package, then we
> >> should still have someplace where we put stuff that we like and think is
> >> valuable, but haven't approved yet. Does that make sense? However a
> >> download page on the wiki or some other kind of web directory listing might
> >> make it easier to reference for the user.
> >
> > This is exactly what "contrib" directories are for: things that are
> > useful and worth distributing as a courtesy, but are *not* directly
> > supported by the main development team. If something is ever
> > promoted/taken over by main developers, then it gets removed from
> > contrib/, and added into the "proper" location elsewhere in the
> > project.
>
> So what does that mean? Should contrib/ be part of the tarballs, snapshots,
> releases or just an SVN repository location for misc. stuff?

I personally would put them into any archive that gets into the hands of developers: tarballs, src-RPMs, ... They don't necessarily have to be part of the binary packages. On the other hand, would it hurt? What about "/usr/share/ganglia/contrib/" ?

Cheers
Martin
Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2
Hi Carlo,

v2/v2+ is fine with me. Nice and clear, almost understandable to a human being (as opposed to a lawyer). Btw. what is the overall licensing status?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> To: ganglia-developers@lists.sourceforge.net
> Sent: Saturday, April 19, 2008 9:14:19 AM
> Subject: [Ganglia-developers] relicensing the web frontend as GNU GPL v2
>
> most likely just a formality, as the web frontend templating system was based
> on the GPLv2+ TemplatePower class from the very beginning (at least as
> shown from the history in svn).
>
> a quick line count from the files involved says the contributors that will
> need to consent will be (including number of lines committed from all files in
> the web directory including non php files which could be as well discarded as
> an alternative) :
>
>   38 bnicholes
>   87 carenas
>  410 knobi1
>  426 bernardli
>  686 hawson
>  830 sacerdoti
> 3940 massie
>
> the web/COPYING file will need to be updated after that so that the use of
> class.TemplatePower.inc.php is consistent with the rest of the frontend code.
>
> as stated in the title, GPLv2 only will be my suggestion, but I am also ok
> with GPLv2+ or GPLv3 if someone has a really good argument for it.
>
> Carlo
Re: [Ganglia-developers] Memory leak with gmetad r1224
----- Original Message -----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Kumar vaibhav <[EMAIL PROTECTED]>; Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Saturday, April 12, 2008 5:20:54 PM
> Subject: Re: [Ganglia-developers] Memory leak with gmetad r1224
>
> This has been fixed already. Check out r1229

r1229 is for gmetad. Kumar complains about gmond. Right?

Martin

> Brad
>
> >>> On 4/12/2008 at 3:09 AM, in message <[EMAIL PROTECTED]>, Martin Knoblauch wrote:
> > Hi Kumar,
> >
> > any chance to get valgrind snapshots for various runtimes to distinguish
> > between one-time allocations that stay for the entire lifetime and stuff that
> > actually grows?
> >
> > Cheers
> > Martin
> > --
> > Martin Knoblauch
> > email: k n o b i AT knobisoft DOT de
> > www: http://www.knobisoft.de
> >
> > ----- Original Message -----
> >> From: Kumar vaibhav
> >> To: Brad Nicholes
> >> Cc: Ganglia Developers
> >> Sent: Saturday, April 12, 2008 9:09:59 AM
> >> Subject: Re: [Ganglia-developers] Memory leak with gmetad r1224
> >>
> >> Hi,
> >>
> >> I am seeing memory leaks in gmond also. Its memory footprint is
> >> growing with time
> >>
> >> Vaibhav
> >>
> >> Brad Nicholes wrote:
> >>
> >> >>>> On 4/10/2008 at 5:13 PM, "Bernard Li" wrote:
> >> >
> >> >> Hi guys:
> >> >>
> >> >> Looks like we might have introduced memory leak in gmetad recently. I
> >> >> don't have the exact numbers, but the memory usage is definitely
> >> >> growing. I left my gmetad running for 2 days, and it was consuming
> >> >> ~500MB and there is only one host.
> >> >
> >> > You're right, I am seeing it also. I will take a look to see if I can spot
> >> > what might have caused this.
> >> >
> >> > Brad
Re: [Ganglia-developers] [RFC] Backport #76 "gmetad summary dataincorrect during summarization" fixes to 3.0.x
----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: "Witham, Timothy D" <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Saturday, April 12, 2008 12:19:20 AM
> Subject: Re: [Ganglia-developers] [RFC] Backport #76 "gmetad summary dataincorrect during summarization" fixes to 3.0.x
>
> On Fri, Apr 11, 2008 at 2:49 PM, Witham, Timothy D wrote:
>
> > Yes, I believe the patch is complete now and will resolve bz#76 for
> > 3.0.x.
>
> Checked into 3.0.x branch r1232.
>
> I would like to request that we start rolling 3.0.8 betas for an
> imminent release -- unless of course there are other bugfixes planned.

There is a fix in linux/metrics.c (r1029) for bz#180. This should go into 3.0.8. The fix is relatively nice in trunk, but uglier in 3.0.x due to the code duplication in the networking metrics. I see two possible ways:

a) we backport only r1029, even if it is ugly
b) we take the current state of linux/metrics.c minus r1010 (float conversion for memory metrics)

Also, I am a bit worried about the reported memory leak in "gmond". If Kumar can give us a pointer, maybe we could get that fixed too.

> Also, looks like we can have one more release before 3.1.0 (or 3.1.1,
> whatever we are calling the release now...), so we'd better start
> cracking :-)
>
> Actually, I wonder if we should call it 3.1.x any more... since we've
> had so many 3.0.x releases, it might make 3.1.x not seem like a major
> upgrade which it is. Maybe call it 3.5 or 4.0?

Maybe having a larger offset in Minor might be good. I do not really see that it warrants an upgrade of Major. We can do that once the Python faction has eliminated the last piece of C code :-)

Cheers
Martin
Re: [Ganglia-developers] Memory leak with gmetad r1224
Hi Kumar,

any chance to get valgrind snapshots for various runtimes to distinguish between one-time allocations that stay for the entire lifetime and stuff that actually grows?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Kumar vaibhav <[EMAIL PROTECTED]>
> To: Brad Nicholes <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Saturday, April 12, 2008 9:09:59 AM
> Subject: Re: [Ganglia-developers] Memory leak with gmetad r1224
>
> Hi,
>
> I am seeing memory leaks in gmond also. Its memory footprint is
> growing with time
>
> Vaibhav
>
> Brad Nicholes wrote:
>
> >>>> On 4/10/2008 at 5:13 PM, "Bernard Li" wrote:
> >
> >> Hi guys:
> >>
> >> Looks like we might have introduced memory leak in gmetad recently. I
> >> don't have the exact numbers, but the memory usage is definitely
> >> growing. I left my gmetad running for 2 days, and it was consuming
> >> ~500MB and there is only one host.
> >
> > You're right, I am seeing it also. I will take a look to see if I can spot
> > what might have caused this.
> >
> > Brad
Re: [Ganglia-developers] Install locations of gmetric and gstat
----- Original Message -----
> From: Jesse Becker <[EMAIL PROTECTED]>
> To: Bernard Li <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Wednesday, April 2, 2008 9:17:09 PM
> Subject: Re: [Ganglia-developers] Install locations of gmetric and gstat
>
> On Wed, Apr 2, 2008 at 3:01 PM, Bernard Li wrote:
> >
> > On Wed, Apr 2, 2008 at 11:48 AM, Jesse Becker wrote:
> > Gmetric injects metrics to the collection framework which gmond/gmetad
> > belongs to, so to quote Martin, "by logic", they should belong in the
> > same location.
>
> Well, both ssh and sshd are part of a secure communications framework.
> Would you put ssh in /usr/sbin? :-)

Now, "ssh" is the *user* tool that is needed to use the ssh service. It needs to be in the standard user PATH. "gmetric", on the other hand, is a tool that does not belong (or need to be) in the hands of common users. Usually only administrators define what metrics should go into the ganglia stream. Therefore its place should be both near gmond and out of the standard user PATH.

And of course "gstat" is a user tool again. It only has read access to the data stream, so it cannot do any harm.

> I'll quote the FHS:
>
> /usr/sbin : Non-essential standard system binaries
> /usr/bin : Most user commands
>
> Based on that, I'll buy the "gmetric in /usr/sbin" argument.

Actually to me ".../sbin" always stood for "stuff that the dirty masses should not see by default" :-) But then I have never been known for my political correctness :-))

Cheers
Martin
Re: [Ganglia-developers] Install locations of gmetric and gstat
Hi Bernard,

by logic "gmetric" definitely belongs to ".../sbin". Personally I think "gstat" and "gexec" belong to ".../bin". They are user commands and they are not really part of the collection framework.

Oh, it is also good to see that "gstat" moved out of "gmond". That always irritated me.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Ganglia Developers
> Sent: Wednesday, April 2, 2008 8:37:30 PM
> Subject: [Ganglia-developers] Install locations of gmetric and gstat
>
> Currently gmetric and gstat are installed in /usr/bin, whereas gmond
> and gmetad are installed in /usr/sbin. IMHO I think all binaries
> should be installed to /usr/sbin. One might argue that maybe gstat
> should be made available to users, but I think gmetric should
> definitely be confined to /usr/sbin.
>
> Thoughts?
>
> Cheers,
>
> Bernard
[Ganglia-developers] Commit bugfix for bz #76 into 3.0.X
Hi folks,

any objections to committing the attached patch for a longstanding gmetad problem into 3.0.X? I already put it into trunk a few days ago. The fix was developed by Timothy on top of 3.0.6 and is in production use.

Please vote.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

Index: server.c
===================================================================
--- server.c (revision 1102)
+++ server.c (working copy)
@@ -285,8 +285,18 @@
 return 0;
 case SUMMARY:
- return source_summary((Source_t*) node, client);
+/* use the mutex to avoid reporting incomplete sums -twitham (bug#76) */
+ if (((Source_t*)node)->sum_finished)
+ pthread_mutex_lock(((Source_t*)node)->sum_finished);
+
+ int i = source_summary((Source_t*) node, client);
+
+ if (((Source_t*)node)->sum_finished)
+ pthread_mutex_unlock(((Source_t*)node)->sum_finished);
+
+ return i;
+
 default:
 break;
 }
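
The idea behind the patch can be sketched in isolation as follows: the reader takes the same mutex that the summarizing side holds, so it never sees a half-updated sum. This is a simplified illustration under stated assumptions, not the gmetad code: `struct source`, `source_add` and `source_report` are hypothetical names, and only the optional `sum_finished` mutex pointer is modelled on the `Source_t` member used in the diff.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for gmetad's Source_t. */
struct source {
    pthread_mutex_t *sum_finished; /* may be NULL, as checked in the patch */
    long sum;                      /* running sum of a metric */
    long num;                      /* number of values in the sum */
};

static void
source_lock(struct source *s)
{
    if (s->sum_finished)
        pthread_mutex_lock(s->sum_finished);
}

static void
source_unlock(struct source *s)
{
    if (s->sum_finished)
        pthread_mutex_unlock(s->sum_finished);
}

/* Writer side: update both fields under the mutex. */
static void
source_add(struct source *s, long value)
{
    source_lock(s);
    s->sum += value;
    s->num += 1;
    source_unlock(s);
}

/* Reader side: copy a consistent (sum, num) pair under the same mutex. */
static int
source_report(struct source *s, long *sum_out, long *num_out)
{
    source_lock(s);
    *sum_out = s->sum;
    *num_out = s->num;
    source_unlock(s);
    return 0;
}
```

As in the patch, the result is captured while the lock is held and returned only after the unlock, so no return path leaves the mutex locked.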
Re: [Ganglia-developers] Memory leak in gmond
----- Original Message -----
> From: Kumar Vaibhav <[EMAIL PROTECTED]>
> To: Jesse Becker <[EMAIL PROTECTED]>
> Cc: Martin Knoblauch <[EMAIL PROTECTED]>; Ganglia Developers; Bernard Li <[EMAIL PROTECTED]>
> Sent: Friday, March 21, 2008 8:16:42 AM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Hi All,
>
> I am still seeing some memory leak in the nodes
> Now the problem is not in the deaf mode but in the mute mode. To reduce the
> debugging complexity I am running the 3.0.7
> on 2 nodes one in deaf mode and other in mute mode. The deaf mode is working
> fine and the node in mute mode is giving
> memory leak. Here is the output of valgrind for the node with mute mode.

Hi Kumar,

while I almost assume that some/most of the "leaks" that you are seeing are one-time allocations that just live until process end, I am at least confused about the ones from "hash_lookup". This is part of a metrics sampling function which should not be called at all in "mute" mode, unless I am completely wrong.

Could you do the valgrind runs twice, with different total run-times? Just to see which of the "leaks" accumulate.

> ==21588==
> ==21588== Process terminating with default action of signal 2 (SIGINT)
> ==21588==    at 0x3F810C485F: poll (in /lib64/libc-2.5.so)
> ==21588==    by 0x41D7B1: apr_pollset_poll (poll.c:504)
> ==21588==    by 0x405846: main (gmond.c:1269)
> --21588-- Discarding syms at 0x4D41000-0x4F4C000 in /lib64/libnss_files-2.5.so due to munmap()
> ==21588==
> ==21588== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 1)
> --21588--
> --21588-- supp:5 Fedora-Core-6-hack3-ld25
> ==21588== malloc/free: in use at exit: 740,602 bytes in 1,190 blocks.
> ==21588== malloc/free: 2,574 allocs, 1,384 frees, 946,209 bytes allocated.
> ==21588==
> ==21588== searching for pointers to 1,190 not-freed blocks.
> ==21588== checked 479,904 bytes.
> ==21588==
> ==21588== 5 bytes in 1 blocks are still reachable in loss record 1 of 16
> ==21588==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
> ==21588==    by 0x4111FF: cfg_init (confuse.c:1087)
> ==21588==    by 0x40EB7C: Ganglia_gmond_config_create (libgmond.c:523)
> ==21588==    by 0x405529: process_configuration_file (gmond.c:180)
> ==21588==    by 0x405627: main (gmond.c:1815)
> ==21588==

I think this is a one-time alloc from reading the config file.

> ==21588==
> ==21588== 19 bytes in 4 blocks are still reachable in loss record 2 of 16
> ==21588==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
> ==21588==    by 0x3F810750E1: strndup (in /lib64/libc-2.5.so)
> ==21588==    by 0x40806A: hash_lookup (metrics.c:151)
> ==21588==    by 0x408D75: bytes_out_func (metrics.c:425)
> ==21588==    by 0x40418C: Ganglia_collection_group_collect (gmond.c:1540)
> ==21588==    by 0x404FC8: process_collection_groups (gmond.c:1662)
> ==21588==    by 0x40600E: main (gmond.c:1913)
> ==21588==

Now, this one is from bytes_out_func. Likely a one-time allocation. How many network interfaces has that system got? What are they named? And I wonder why it is called at all in mute mode.

> ==21588==
> ==21588== 22 bytes in 2 blocks are still reachable in loss record 3 of 16
> ==21588==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
> ==21588==    by 0x406740: gengetopt_strdup (cmdline.c:64)
> ==21588==    by 0x40689E: cmdline_parser (cmdline.c:100)
> ==21588==    by 0x4055BD: main (gmond.c:1780)
> ==21588==

One-time allocation.

> ==21588==
> ==21588== 56 bytes in 1 blocks are still reachable in loss record 4 of 16
> ==21588==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
> ==21588==    by 0x4111D2: cfg_init (confuse.c:1083)
> ==21588==    by 0x40EB7C: Ganglia_gmond_config_create (libgmond.c:523)
> ==21588==    by 0x405529: process_configuration_file (gmond.c:180)
> ==21588==    by 0x405627: main (gmond.c:1815)
> ==21588==

One-time allocation.

> ==21588==
> ==21588== 192 bytes in 4 blocks are still reachable in loss record 5 of 16
> ==21588==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
> ==21588==    by 0x408057: hash_lookup (metrics.c:144)
> ==21588==    by 0x408D75: bytes_out_func (metrics.c:425)
> ==21588==    by 0x40418C: Ganglia_collection_group_collect (gmond.c:1540)
> ==21588==    by 0x404FC8: process_collection_groups (gmond.c:1662)
> ==21588==    by 0x40600E: main (gmond.c:1913)
> ==21588==

See my comment above. That looks like 4 net_dev_stats structures. Likely one-time allocations. But this should not happen at all in "mute" mode. Are you running in 32-bit or 64-bit mode? Seems we can save 8 bytes per struct by better sorting the members.

> ==21588==
> ==21588== 192 bytes in 1 blocks are still r
Re: [Ganglia-developers] Problem with the module configuration directives...
- Original Message > From: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]> > To: Brad Nicholes <[EMAIL PROTECTED]> > Cc: ganglia-developers@lists.sourceforge.net > Sent: Thursday, March 20, 2008 7:22:37 PM > Subject: Re: [Ganglia-developers] Problem with the module configuration > directives... > > On Thu, Mar 20, 2008 at 10:26:31AM -0600, Brad Nicholes wrote: > > > > 1. Remove the "pymodule" configuration directive > > +1 > > > 2. Add an optional type name to a module section to designate the interface > type. > >module { } - #No designation, the default is C > >module Python {} - #Python type module interfaced through mod_python > >module Perl {} - #Perl type module interfaced through (what could be) > mod_perl > >etc > > I was thinking about doing an optional token instead : > > module { > lang = python > .. > } > +1 Cheers Martin - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
[Ganglia-developers] Sanitized version of linux/metrics.c for 3.1.x
Hi, I just checked in a sanitized version of linux/metrics.c for the 3.1.x stable branch. The code that removes bogus spikes in the network metrics is now wrapped in #ifdef, as it relies on assumptions that are local to certain setups. I also added some FIXME comments on things that should be rewritten in the future: - per-interface network metrics - consolidate the functions reading /proc/meminfo and /proc/stat Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de
Re: [Ganglia-developers] Patch for modular graph.php.
For whatever it is worth, I think it is a good idea. But I am too late anyway :-) Cheers Martin - Original Message From: Matt Massie <[EMAIL PROTECTED]> To: Jesse Becker <[EMAIL PROTECTED]> Cc: Ganglia Developers; Ramon Bastiaans <[EMAIL PROTECTED]> Sent: Wednesday, March 12, 2008 4:38:55 PM Subject: Re: [Ganglia-developers] Patch for modular graph.php. i like the idea too. anything that makes the frontend more modular and extendable is a good thing. -matt On 3/12/08, Jesse Becker <[EMAIL PROTECTED]> wrote: On Wed, Mar 12, 2008 at 5:23 AM, Ramon Bastiaans <[EMAIL PROTECTED]> wrote: > I like this setup a lot, has anyone considered this? Well, I have, but that doesn't count. > Doesn't seem to have made its way into svn and I saw no more replies on > this topic. I don't think that it applies cleanly to trunk anymore, since there have been some interim fixes for a few things. I'll fix it up, and re-post when I get a chance (crazy busy recently). -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] [patch] change privateclusters auth header to include clustername
Hi, what was the exact process? Do we need +2 for check-ins into both trunk and 3.0.x, or just 3.0.x? For now I will abstain from checking Ramon's patch into trunk. Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de - Original Message > From: Brad Nicholes <[EMAIL PROTECTED]> > To: Martin Knoblauch <[EMAIL PROTECTED]>; > "ganglia-developers@lists.sourceforge.net"; Ramon Bastiaans <[EMAIL PROTECTED]> > Sent: Thursday, March 6, 2008 4:57:39 PM > Subject: Re: [Ganglia-developers] [patch] change privateclusters auth > header to include clustername > > -1 for now. The concern that I have is that injecting the name of the > cluster as it is pulled from the query string seems a little dangerous. This > would allow the realm to be altered in any way by just modifying the query > string. Not sure if that is a real issue or not, but it seems dangerous. Can > anybody else clarify this more? > > Brad > > >>> On 3/6/2008 at 5:28 AM, in message > <[EMAIL PROTECTED]>, Martin Knoblauch > wrote: > > Hi Ramon, > > > > looks harmless enough. Could you make a similar patch against trunk please? > > > > From my side "+1" for both trunk and 3.0.X > > > > Cheers > > Martin > > -- > > Martin Knoblauch > > email: k n o b i AT knobisoft DOT de > > www: http://www.knobisoft.de > > > > > > - Original Message > >> From: Ramon Bastiaans > >> To: "ganglia-developers@lists.sourceforge.net" > >> Sent: Thursday, March 6, 2008 11:59:36 AM > >> Subject: [Ganglia-developers] [patch] change privateclusters auth header > >> to include clustername > >> > >> Hi, > >> > >> I've made a little patch to the webfrontend of 3.0.7. > >> > >> The problem is that Ganglia always says "Ganglia Private Cluster", for > >> ALL private clusters in the authentication header. > >> This way you can't let Firefox or Internet Explorer remember a different > >> password for each cluster. 
> >> > >> Since the Firefox password manager for example associates the password > >> with the string in the authentication header, you will have to keep on > >> entering your individual private cluster password again and again. > >> > >> I have now changed it to include the cluster name in the authentication > >> header. > >> This way you can now let your browser save/remember/cache different > >> passwords for each individual cluster. > >> > >> Cheers, > >> - Ramon. > >> > >> -- > >> ing. R. Bastiaans > >> > >> Systems Programmer / High Performance Computing & Visualisation / > >> SARA Computing and Networking Services > >> Kruislaan 415 PO Box 194613 > >> 1098 SJ Amsterdam 1090 GP Amsterdam > >> P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 > >> --- > >> There are really only three types of people: > >> > >> Those who make things happen, those who watch things happen > >> and those who say, "What happened?" > >> > >> > >> > >> > >> -Inline Attachment Follows- > >> > >> --- auth.php.org 2008-03-06 11:56:09.542153567 +0100 > >> +++ auth.php 2008-03-06 11:54:27.261229406 +0100 > >> @@ -30,7 +30,11 @@ > >> > #--- > >> function authenticate() > >> { > >> - header("WWW-authenticate: basic realm=\"Ganglia Private Cluster\""); > >> + global $clustername; > >> + > >> + $auth_header = "WWW-authenticate: basic realm=\"Private Ganglia cluster: > >> " > >> . $clustername . "\""; > >> + > >> + header( $auth_header ); > >> header("HTTP/1.0 401 Unauthorized"); > >> #print "> URL=\"../?c=\">"; > >> print " > > You are unauthorized to view the details of this Cluster > > "; > >> > >> > >> > > > > - > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ > ___ > Ganglia-developers mailing list > Ganglia-developers@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/ganglia-developers - This SF.net email is sponsored by: Microsoft Defy all challenges. 
Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
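Brad's concern about a realm built from the query string can be made concrete with a small sketch. The patch itself is PHP; the Python below only illustrates the idea of restricting an untrusted value before it is placed into the header. The names `safe_realm` and `auth_header`, the allowed character set and the length cap are assumptions made for illustration and are not part of Ramon's patch or of Ganglia.

```python
import re

# Hypothetical helper, not part of Ganglia: restrict an untrusted cluster
# name to a conservative character set before it is used in a header value.
_DISALLOWED = re.compile(r"[^A-Za-z0-9 _.\-]")


def safe_realm(clustername: str, max_len: int = 64) -> str:
    """Return a realm string built from an untrusted cluster name."""
    # Drop anything that could break out of the quoted realm or start a
    # new header line (quotes, backslashes, CR/LF, control characters).
    cleaned = _DISALLOWED.sub("", clustername)[:max_len].strip()
    if not cleaned:
        return "Ganglia Private Cluster"  # fall back to the old fixed realm
    return "Private Ganglia cluster: " + cleaned


def auth_header(clustername: str) -> str:
    """Build the WWW-Authenticate header line for a 401 response."""
    return 'WWW-Authenticate: Basic realm="%s"' % safe_realm(clustername)
```

With a filter of this kind the realm still differs per cluster, so the browser can store one password per cluster, while a crafted query string cannot add quotes or line breaks to the header.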
Re: [Ganglia-developers] [patch] change privateclusters auth header to include clustername
Hi Ramon, unless someone beats me to it, I will check it into trunk later today. For 3.0.X we need more votes :-) Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de - Original Message > From: Ramon Bastiaans <[EMAIL PROTECTED]> > To: Martin Knoblauch <[EMAIL PROTECTED]> > Cc: "ganglia-developers@lists.sourceforge.net" > Sent: Thursday, March 6, 2008 1:38:38 PM > Subject: Re: [Ganglia-developers] [patch] change privateclusters auth header > to include clustername > > Hi Martin, > > The patch should also work with trunk (just tested), seems that code > hasn't changed much.. ;) > > - Ramon. > > Martin Knoblauch wrote: > > Hi Ramon, > > > > looks harmless enough. Could you make a similar patch against trunk please? > > > > From my side "+1" for both trunk and 3.0.X > > > > Cheers > > Martin > > -- > > Martin Knoblauch > > email: k n o b i AT knobisoft DOT de > > www: http://www.knobisoft.de > > > > > > - Original Message > > > >> From: Ramon Bastiaans > >> To: "ganglia-developers@lists.sourceforge.net" > >> Sent: Thursday, March 6, 2008 11:59:36 AM > >> Subject: [Ganglia-developers] [patch] change privateclusters auth header > >> to include clustername > >> > >> Hi, > >> > >> I've made a little patch to the webfrontend of 3.0.7. > >> > >> The problem is that Ganglia always says "Ganglia Private Cluster", for > >> ALL private clusters in the authentication header. > >> This way you can't let Firefox or Internet Explorer remember a different > >> password for each cluster. > >> > >> Since the Firefox password manager for example associates the password > >> with the string in the authentication header, you will have to keep on > >> entering your individual private cluster password again and again. > >> > >> I have now changed it to include the cluster name in the authentication > >> header. > >> This way you can now let your browser save/remember/cache different > >> passwords for each individual cluster. 
> >> > >> Cheers, > >> - Ramon. > >> > >> -- > >> ing. R. Bastiaans > >> > >> Systems Programmer / High Performance Computing & Visualisation / > >> SARA Computing and Networking Services > >> Kruislaan 415 PO Box 194613 > >> 1098 SJ Amsterdam 1090 GP Amsterdam > >> P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 > >> --- > >> There are really only three types of people: > >> > >> Those who make things happen, those who watch things happen > >> and those who say, "What happened?" > >> > >> > >> > >> > >> -Inline Attachment Follows- > >> > >> --- auth.php.org 2008-03-06 11:56:09.542153567 +0100 > >> +++ auth.php 2008-03-06 11:54:27.261229406 +0100 > >> @@ -30,7 +30,11 @@ > >> > #--- > >> function authenticate() > >> { > >> - header("WWW-authenticate: basic realm=\"Ganglia Private Cluster\""); > >> + global $clustername; > >> + > >> + $auth_header = "WWW-authenticate: basic realm=\"Private Ganglia cluster: > >> " > >> . $clustername . "\""; > >> + > >> + header( $auth_header ); > >> header("HTTP/1.0 401 Unauthorized"); > >> #print "> URL=\"../?c=\">"; > >> print " > >> > > You are unauthorized to view the details of this Cluster > > "; > > > >> > >> -Inline Attachment Follows- > >> > >> - > >> This SF.net email is sponsored by: Microsoft > >> Defy all challenges. Microsoft(R) Visual Studio 2008. > >> http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ > >> > >> > >> -Inline Attachment Follows- > >> > >> ___ > >> Ganglia-developers mailing list > >> Ganglia-developers@lists.sourceforge.net > >> https://lists.sourceforge.net/lists/listinfo/ganglia-developers > >>
Re: [Ganglia-developers] [patch] change privateclusters auth header to include clustername
Hi Ramon, looks harmless enough. Could you make a similar patch against trunk please? From my side "+1" for both trunk and 3.0.X Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de - Original Message > From: Ramon Bastiaans <[EMAIL PROTECTED]> > To: "ganglia-developers@lists.sourceforge.net" > Sent: Thursday, March 6, 2008 11:59:36 AM > Subject: [Ganglia-developers] [patch] change privateclusters auth header to > include clustername > > Hi, > > I've made a little patch to the webfrontend of 3.0.7. > > The problem is that Ganglia always says "Ganglia Private Cluster", for > ALL private clusters in the authentication header. > This way you can't let Firefox or Internet Explorer remember a different > password for each cluster. > > Since the Firefox password manager for example associates the password > with the string in the authentication header, you will have to keep on > entering your individual private cluster password again and again. > > I have now changed it to include the cluster name in the authentication > header. > This way you can now let your browser save/remember/cache different > passwords for each individual cluster. > > Cheers, > - Ramon. > > -- > ing. R. Bastiaans > > Systems Programmer / High Performance Computing & Visualisation / > SARA Computing and Networking Services > Kruislaan 415 PO Box 194613 > 1098 SJ Amsterdam 1090 GP Amsterdam > P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 > --- > There are really only three types of people: > > Those who make things happen, those who watch things happen > and those who say, "What happened?" 
> > > > > -Inline Attachment Follows- > > --- auth.php.org 2008-03-06 11:56:09.542153567 +0100 > +++ auth.php 2008-03-06 11:54:27.261229406 +0100 > @@ -30,7 +30,11 @@ > #--- > function authenticate() > { > - header("WWW-authenticate: basic realm=\"Ganglia Private Cluster\""); > + global $clustername; > + > + $auth_header = "WWW-authenticate: basic realm=\"Private Ganglia cluster: " > . $clustername . "\""; > + > + header( $auth_header ); > header("HTTP/1.0 401 Unauthorized"); > #print "> URL=\"../?c=\">"; > print " You are unauthorized to view the details of this Cluster "; > > > > -Inline Attachment Follows- > > - > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ > > > -Inline Attachment Follows- > > ___ > Ganglia-developers mailing list > Ganglia-developers@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/ganglia-developers - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Ganglia 3.1 wish list...
Original Message > From: Jesse Becker <[EMAIL PROTECTED]> > To: Ramon Bastiaans <[EMAIL PROTECTED]> > Cc: "ganglia-developers@lists.sourceforge.net" > Sent: Friday, February 29, 2008 7:53:20 PM > Subject: Re: [Ganglia-developers] Ganglia 3.1 wish list... > > On Fri, Feb 29, 2008 at 8:21 AM, Ramon Bastiaans > wrote: > > > * PHP5 as new requirement > > Are there any particular requirements to move to PHP5? Right now, the > existing code works with PHP4 and PHP5. Dropping support for PHP4 > would also mean dropping native support for distributions of moderate > age (RHEL4, CentOS4, et al). > The main reason would, IMO, be to prepare for PHP6, which will finally remove some long-deprecated PHP4 features (like the global arrays). Cheers Martin
Re: [Ganglia-developers] Ganglia 3.1 wish list...
Hi, wouldn't it be time to fork off "Ganglia WEB Frontend TNG(tm)" and put the old stuff into maintenance? Maybe for 3.1.1? It seems there is a lot of cool stuff that can be done, but it will likely destabilize the frontend for a while. Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de - Original Message > From: Ramon Bastiaans <[EMAIL PROTECTED]> > To: Brad Nicholes <[EMAIL PROTECTED]> > Cc: "ganglia-developers@lists.sourceforge.net" > Sent: Friday, February 29, 2008 2:30:36 PM > Subject: Re: [Ganglia-developers] Ganglia 3.1 wish list... > > Oh and: > > web) > * session usage for storing settings, filters etc - instead of all the > ugly GETing and parsing of the URL > * more advanced addon/plugin capabilities (executing custom .php code > from within Ganglia's default templates/pages) > > - Ramon. > > Ramon Bastiaans wrote: > > Unfortunately I can't be there, would be fun to meet some of you. > > > > I would like to suggest the following for the wishlist though: > > * License sorting of all the components > > > > Since the Debian packages for example are no longer maintained because > > of licensing conflicts among the different components. > > > > For the web interface: > > * More fancy DHTML and Javascript stuff, we could make it look pretty ;) > > * Ajax - Could only reload graphs etc when really needed, improving > > performance > > * Could for example only reload host metric graphs when metric type > > is changed, leaving the rest, etc > > * PHP5 as new requirement > > > > Hope you guys have fun and someone takes pictures of the meeting. ;) > > > > Cheers, > > - Ramon. > > > > Brad Nicholes wrote: > > > >> Here is the latest Ganglia 3.1 wish list. We will be discussing this list > >> during the Ganglia meeting. > >> > >> Brad > >> > >> > >> > > > > > > -- > > ing. R. 
Bastiaans > > > > Systems Programmer / High Performance Computing & Visualisation / > > SARA Computing and Networking Services > > Kruislaan 415PO Box 194613 > > 1098 SJ Amsterdam1090 GP Amsterdam > > P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 > > --- > > There are really only three types of people: > > > > Those who make things happen, those who watch things happen > > and those who say, "What happened?" > > > > > > - > > This SF.net email is sponsored by: Microsoft > > Defy all challenges. Microsoft(R) Visual Studio 2008. > > http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ > > ___ > > Ganglia-developers mailing list > > Ganglia-developers@lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/ganglia-developers > > > > > -- > ing. R. Bastiaans > > Systems Programmer / High Performance Computing & Visualisation / > SARA Computing and Networking Services > Kruislaan 415PO Box 194613 > 1098 SJ Amsterdam1090 GP Amsterdam > P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 > --- > There are really only three types of people: > > Those who make things happen, those who watch things happen > and those who say, "What happened?" > > > - > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ > ___ > Ganglia-developers mailing list > Ganglia-developers@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/ganglia-developers > > - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] [Ganglia-general] Need a script to remove "spikes" from network RRDs
- Original Message > From: Martin Knoblauch <[EMAIL PROTECTED]> > To: john allspaw <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net > Cc: ganglia general <[EMAIL PROTECTED]> > Sent: Wednesday, February 27, 2008 8:55:26 AM > Subject: Re: [Ganglia-general] Need a script to remove "spikes" from network > RRDs > > Original Message > > From: john allspaw > > To: Martin Knoblauch; ganglia-developers@lists.sourceforge.net > > Cc: ganglia general > > Sent: Tuesday, February 26, 2008 7:38:07 PM > > Subject: Re: [Ganglia-general] Need a script to remove "spikes" from > > network RRDs > > > > Here is what comes with rrdtool, I've used it with some success... > > > > http://oss.oetiker.ch/rrdtool/pub/contrib/removespikes.tar.gz > > > > -john > > cool. Almost what I need. It seems to be a bit too smart for my purpose, but > making things stupid is easy :-) > Hi John, after adding an option/mode to remove spikes based on value instead of bin distribution, the tool did exactly what I needed. I have pushed my changes back to the rrd people. Thanks a lot. For the meeting: should we contact the author and ask whether we can put the script into the distribution under "cool-stuff"? Cheers Martin
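The value-based mode mentioned above amounts to a fixed cut-off: anything above a limit is treated as a spike. The following is a minimal sketch of that idea on a plain list of samples. It is not the removespikes script and not the change that was sent upstream; the function name `remove_spikes` and the `limit` parameter are hypothetical, and a real RRD file would first have to be exported with `rrdtool dump` and re-imported with `rrdtool restore`.

```python
import math


def remove_spikes(samples, limit):
    """Replace every sample above `limit` with NaN (an unknown value)."""
    cleaned = []
    for value in samples:
        if value is None or math.isnan(value):
            cleaned.append(math.nan)   # keep unknowns unknown
        elif value > limit:
            cleaned.append(math.nan)   # spike: blank it out
        else:
            cleaned.append(value)
    return cleaned
```

For a network metric, a plausible limit would be the line rate of the interface, for example 1.25e8 bytes/sec for a 1 Gbit NIC; values in the PB/sec range are then dropped while ordinary traffic is left untouched.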
Re: [Ganglia-developers] Ganglia 3.1 wish list...
- Original Message > From: john allspaw <[EMAIL PROTECTED]> > To: Martin Knoblauch <[EMAIL PROTECTED]>; Brad Nicholes <[EMAIL PROTECTED]>; > Peter Mui <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net > Sent: Friday, February 29, 2008 1:39:17 AM > Subject: Re: [Ganglia-developers] Ganglia 3.1 wish list... > > My sincerest apologies for not making it over today, I will be showing my mug > there tomorrow. > > My vote on this: > > " - Add some event notification mechanism if metrics go over a limit. But do > we want to implement another Nagios?" > > No, no, no, no and no. :) > > All opinion here, but: > > I think to add event notification would be a major mistake, and would pull > attention off of what makes ganglia awesome, > which is the non-judgemental recording of system metrics. There already exist > a lot of ways to get ganglia's metrics into Nagios, > which has all of the bits that you'd want for a notification system. > > There are so many more cool/good/appropriate things to get into ganglia than > event notification. > -- john allspaw > flickr.com > Hi John, I am actually pretty much of the same opinion. That's why I put the "But, ..." into my proposal. I have seen the request before and always thought "do we really need another Nagios". Similarly with gexec/authd: do they really belong in the monitoring core? How does "integrating" them help record the metrics? Cheers Martin
Re: [Ganglia-developers] Ganglia 3.1 wish list...
Hi folks, before I turn off the light, just one or two comments below. See/hear you tomorrow Martin > From: Brad Nicholes <[EMAIL PROTECTED]> > To: Peter Mui <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net > Sent: Thursday, February 28, 2008 11:43:33 PM > Subject: [Ganglia-developers] Ganglia 3.1 wish list... > > Here is the latest Ganglia 3.1 wish list. We will be discussing this list > during the Ganglia meeting. > > Brad > > > > -Inline Attachment Follows- > > Done > -- > - C module interface as DSO > - mod_python Python module interface > - Dynamically link libraries like expat, apr, libconfuse > - Add TITLE attribute to the XDR data to communicate a human readable name > - Add a GROUP attribute to the XDR data > This would allow metrics to declare the category that they belong to. The > category should be added at the metric definition level and not in the > .conf file. > - Reimplement the built in metrics as C interface modules > - A cleaner XDR encoding: > The current encoding scheme embeds too much information about which > metrics gmond collects. The encoding scheme should treat all metrics the same: as > just "a metric". The encoding should not care if the metric is > metric_cpu_speed, metric_swap_total or a user-defined "gmetric" one. > - Flexible method of adding extra metric metadata. > We could include extra metadata, not just "alias"/"title". For example, > some metrics have a natural minimum and maximum value. Perhaps coming up with > an extendable way of encoding metric metadata so future changes can be > included without losing backwards compatibility. > - Re-organization of RPM packages (libganglia, gmond-python ?) > > > GMond To Do > > - Gmond module repository > - Implement a perl module interface > - Implement a PHP module interface > - Implement a Ruby module interface > - Metric packing: > Simply that a UDP packet can contain multiple metrics (using the usual XDR > stream decoding) up to the size of a UDP packet. 
This would help reduce > the overheads when sending many metric updates concurrently. It also > preserves the current gmond behaviour where it sends metric updates in > a single UDP packet. > - Support for counters (metrics with +ve slope) > This shouldn't require much work (from memory, make sure the slope-type > information is preserved and patch gmetad to create RRD files with the > correct options). Currently Ganglia doesn't actually support custom > counter metrics, which is an awkward limitation. > - gmond switching to a non-blocking IO model. > If there's a large number of metric updates then gmond must process them > "quickly" or they will be lost. If this happens whilst gmond is sending > XML data to gmetad there may be a delay, increasing the risk of metric > update messages being lost. Switching to a non-blocking IO model would > allow gmond to respond preferentially to the incoming UDP messages. > -* Remove the 4T limit on ganglia metric results > -* Modify all byte-count metrics to 8-byte ints > > GMetad To Do > -- > - Support for new RRDTool which allows graphs to have dynamic sizes > - Gilad's stacked graphs > - Changing the units of default metrics to their base > (For example disk_free's base unit should be bytes, not GB, as rrdtool will > automatically append G,M,K etc.) > - Better support for bigger less frequent updates > one packet every 20 seconds per host for all data? > - Multi PB disk limit > - Better on disk RRD perf (tmpfs is an OK workaround) > -* Name RRD directories based on UUID generated by client gmond > hash of MAC address? something else? So that renaming hosts, updating DNS > or hosts files don't result in history for the physical gmond client being > lost. > - Integration of gexec/authd ? - Could be interesting as some kind of lightweight queueing system. > - Expand gstat nodelist parameter query options (i.e. return all hosts > with <10% iowait, etc.) - Add some event notification mechanism if metrics go over a limit. 
But do we want to implement another Nagios? > - Interface stats in bits? Self awareness of interface capability for % > util stats for network. - Link utilization would be a great metric. - I am not sure about the bit-stats. For the stuff I do, throughput in bytes/sec makes more sense than a bit-rate. But I can see that the comms people have a different view. - The network stats should be per interface. > - Something like a unique per-gmond instance identifier > To help with multi-homing and DNS issues and so the IP address is no > longer the index key. There was discussion of this under the subject > "Overriding hostname" on the Ganglia-general list. > - Give some metrics priority and have them updated more frequently in their > RRDs > tha
[Ganglia-developers] Fix for bogus overflows in linux/metrics.c
Hi, I just checked in a fix to handle bogus overflow events on certain BCM NICs using the bnx2 driver. The fix is to drop any sample where an overflow is detected on any of the four counters. This will work fine in 64-bit mode, as genuine overflow events are extremely rare there (once in > 5000 years on a fully saturated 1Gbit NIC). But in 32-bit mode they may happen a lot more frequently (like every 40 seconds), so my fix may actually drop too many valid samples. To mitigate this, one could sample at higher rates, like every 5 or 10 seconds. This definitely needs review and discussion. Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de
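The approach described above, discarding a whole sample as soon as any counter appears to have gone backwards, can be sketched roughly as follows. The real fix lives in the C code of linux/metrics.c; this Python version is only an illustration, and the function name `rates` as well as the counter names used in the example are assumptions.

```python
def rates(prev, curr, interval):
    """Return per-second rates for each counter, or None to drop the sample.

    `prev` and `curr` are dicts of raw counter readings taken `interval`
    seconds apart (hypothetical layout, e.g. bytes and packets in and out).
    """
    if interval <= 0:
        return None
    # A counter that went backwards means an overflow, real or bogus:
    # discard the whole sample instead of trying to correct it.
    if any(curr[name] < prev[name] for name in prev):
        return None
    return {name: (curr[name] - prev[name]) / interval for name in prev}
```

The caller would simply skip publishing when `None` comes back and keep `curr` as the new baseline. The price is one lost sample per overflow, which is why the 32-bit case with its frequent wraps is the one that needs discussion.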
Re: [Ganglia-developers] [Ganglia-general] Need a script to remove "spikes" from network RRDs
- Original Message > From: aurbain <[EMAIL PROTECTED]> > To: Martin Knoblauch <[EMAIL PROTECTED]> > Cc: ganglia-developers@lists.sourceforge.net; ganglia general <[EMAIL PROTECTED]> > Sent: Wednesday, February 27, 2008 5:11:48 PM > Subject: Re: [Ganglia-general] Need a script to remove "spikes" from network > RRDs > > Thanks for the info Martin. So it's not a rollover issue after all. > By the way, this issue also lives in rhel4u4 32 bit with bnx2 version > 1.4.43f > Interesting. From my reading only the 64-bit version was affected. Anyway, I have a fix which just throws away any sample where an overflow, correct or bogus, occurs. That is definitely fine in 64-bit land. Even at full speed, a 1Gbit NIC would overflow only after >5000 years. Nothing that I worry about much :-) Even 5 years for a future 1Tbit NIC is not that bad... But in 32-bit, a 1Gbit NIC could overflow every 40 seconds. And that is very short. Cheers Martin > Martin Knoblauch wrote: > > - Original Message > >> From: aurbain > >> To: Martin Knoblauch > >> Cc: ganglia-developers@lists.sourceforge.net; ganglia general > >> Sent: Tuesday, February 26, 2008 8:25:13 PM > >> Subject: Re: [Ganglia-general] Need a script to remove "spikes" from > >> network RRDs > >> > > > Happens only on 64-bit systems. Now, my fix kills the generation of the > spikes, but my RRD database is now tainted for another 12 months.
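The overflow intervals quoted above can be checked with a back-of-the-envelope calculation. The sketch below assumes a byte counter, decimal units (1 Gbit/s = 1e9 bits/s) and a link saturated in one direction; `wrap_seconds` is a hypothetical helper name. Under these assumptions the results come out at roughly 34 seconds for a 32-bit counter, about 4,700 years for a 64-bit counter at 1 Gbit/s and a little under 5 years at 1 Tbit/s, which is the same order of magnitude as the figures in the message.

```python
def wrap_seconds(counter_bits, link_bits_per_sec):
    """Seconds until a byte counter of the given width wraps at line rate."""
    bytes_per_sec = link_bits_per_sec / 8
    return 2 ** counter_bits / bytes_per_sec


SECONDS_PER_YEAR = 365.25 * 24 * 3600

print(wrap_seconds(32, 1e9))                      # 32-bit counter, 1 Gbit/s
print(wrap_seconds(64, 1e9) / SECONDS_PER_YEAR)   # 64-bit counter, 1 Gbit/s
print(wrap_seconds(64, 1e12) / SECONDS_PER_YEAR)  # 64-bit counter, 1 Tbit/s
```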
Re: [Ganglia-developers] [Ganglia-general] Need a script to remove "spikes" from network RRDs
Original Message > From: john allspaw <[EMAIL PROTECTED]> > To: Martin Knoblauch <[EMAIL PROTECTED]>; > ganglia-developers@lists.sourceforge.net > Cc: ganglia general <[EMAIL PROTECTED]> > Sent: Tuesday, February 26, 2008 7:38:07 PM > Subject: Re: [Ganglia-general] Need a script to remove "spikes" from network > RRDs > > Here is what comes with rrdtool, I've used it with some success... > > http://oss.oetiker.ch/rrdtool/pub/contrib/removespikes.tar.gz > > -john Cool. Almost what I need. It seems to be a bit too smart for my purpose, but making things stupid is easy :-) Cheers Martin > - Original Message > > From: Martin Knoblauch > > To: ganglia-developers@lists.sourceforge.net > > Cc: ganglia general > > Sent: Tuesday, February 26, 2008 10:02:20 AM > > Subject: [Ganglia-general] Need a script to remove "spikes" from network > > RRDs > > > > Hi, > > > > one of my clusters has, due to a flaky hw/driver combination, spikes in the > > PB/sec range in the network metrics. This makes viewing the larger > > timescales pretty much useless (for the next week, month, year). Does anybody have a > > script to "repair" such rrds? Which of the fields need to be touched? > > > > Cheers > > Martin > > -- > > Martin Knoblauch > > email: k n o b i AT knobisoft DOT de > > www: http://www.knobisoft.de
Re: [Ganglia-developers] [Ganglia-general] Need a script to remove "spikes" from network RRDs
- Original Message
> From: aurbain <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net; ganglia general <[EMAIL PROTECTED]>
> Sent: Tuesday, February 26, 2008 8:25:13 PM
> Subject: Re: [Ganglia-general] Need a script to remove "spikes" from network RRDs
>
> I'm getting these spikes in multiple boxes, specifically the ones which do a lot of network traffic. RHEL4u[4,6], ganglia 2.0.6
>
> Perhaps a rollover bug in the network code in gmond?

That was the first thing I suspected, and I have since modified the overflow mechanism in my development version to just ignore samples when the overflow happens. Instrumentation then showed that the data in /proc/net/dev itself was bogus. This is due to this:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg59062.html

It happens only on 64-bit systems. Now, my fix kills the generation of the spikes, but my RRD database is now tainted for another 12 months.

Cheers
Martin

> Martin Knoblauch wrote:
> > Hi,
> >
> > one of my clusters has, due to flakey hw/driver combination, spikes in the PB/sec range in the network metrics. This makes viewing the larger timescales pretty much useless (for the next week, month, year). Does anybody have a script to "repair" such rrds? Which of the fields need to be touched?
> >
> > Cheers
> > Martin
> > --
> > Martin Knoblauch
> > email: k n o b i AT knobisoft DOT de
> > www: http://www.knobisoft.de
[Ganglia-developers] Need a script to remove "spikes" from network RRDs
Hi,

one of my clusters has, due to flakey hw/driver combination, spikes in the PB/sec range in the network metrics. This makes viewing the larger timescales pretty much useless (for the next week, month, year). Does anybody have a script to "repair" such rrds? Which of the fields need to be touched?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] 3.0.7?
Hi Bernard,

as I said, all my stuff can wait for 3.0.8. As for the ACKs - ACK ACK ACK ACK :-)

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Monday, February 25, 2008 8:06:45 PM
> Subject: Re: 3.0.7?
>
> Hi Martin:
>
> On 2/25/08, Martin Knoblauch wrote:
>
> > what are your plans for 3.0.7? Any time now ? :-) If not, I would like to commit a small patch to enable syslogging error messages for "gmond". But it can wait for 3.0.8.
>
> To be honest I am waiting for more ACKs. But either way it will get released either tomorrow or Wednesday so please wait until then to check in the patch for syslogging.
>
> Thanks,
>
> Bernard
[Ganglia-developers] 3.0.7?
Hi Bernard,

what are your plans for 3.0.7? Any time now ? :-) If not, I would like to commit a small patch to enable syslogging error messages for "gmond". But it can wait for 3.0.8.

diff -up ~/ganglia-3.0.6.200802141157/gmond/gmond.c gmond/
--- /home/ftt5aa7/ganglia-3.0.6.200802141157/gmond/gmond.c Thu Feb 14 20:58:58 2008
+++ gmond/gmond.c Mon Feb 25 13:26:43 2008
@@ -27,6 +27,7 @@
 #include "dtd.h" /* the DTD definition for our XML */
 #include "g25_config.h" /* for converting old file formats to new */
 #include "daemon_init.h"
+#include <syslog.h>
 
 /* When this gmond was started */
 apr_time_t started;
@@ -191,6 +192,7 @@ process_configuration_file(void)
   cleanup_threshold = cfg_getint( tmp, "cleanup_threshold");
 }
 
+extern int daemon_proc; /* defined in error.c */
 static void
 daemonize_if_necessary( char *argv[] )
 {
@@ -213,6 +215,8 @@ daemonize_if_necessary( char *argv[] )
   if(!args_info.foreground_flag && should_daemonize && !debug_level)
     {
       apr_proc_detach(1);
+      openlog(argv[0],LOG_PID,LOG_DAEMON);
+      daemon_proc = 1;
     }
 }

Also for 3.0.8, I would like to drop in the trunk version of libmetrics/linux/metrics.c. It [will soon] contain a fix for a nasty overflow problem in some Broadcom NICs (BCM5708, bnx2 driver) that leads to spurious petabyte spikes in the network metrics. The problem is fixed in later driver releases, but is present in some popular "enterprise" distros like RHEL4. The risk is minimal and I have been running it for more than a week, but it is definitely not for 3.0.7.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
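To illustrate what the patch above is after, here is a minimal standalone sketch of the pattern: once the process has detached, open syslog and set a flag so that error messages go to syslog instead of a stderr nobody is watching. The openlog() and syslog() calls are standard POSIX; everything else (the local `daemon_proc`, `should_detach`, `enable_syslog`, `report_error`) is a hypothetical stand-in, not the actual code in gmond.c or error.c.

```c
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical local stand-in for the flag the patch sets;
 * in gmond the real variable is said to live in error.c. */
static int daemon_proc = 0;

/* Mirrors the condition in the patch: detach only when not running in
 * the foreground, daemonizing is wanted, and debugging is off. */
int should_detach(int foreground_flag, int should_daemonize, int debug_level)
{
    return !foreground_flag && should_daemonize && !debug_level;
}

/* Call once after detaching. `ident` must stay valid while logging,
 * because openlog() may keep the pointer rather than copy the string. */
void enable_syslog(const char *ident)
{
    openlog(ident, LOG_PID, LOG_DAEMON);
    daemon_proc = 1;
}

/* Format a message, then send it to syslog when daemonized and to
 * stderr otherwise. Returns the vsnprintf() result. */
int report_error(const char *fmt, ...)
{
    char buf[512];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (daemon_proc)
        syslog(LOG_ERR, "%s", buf);
    else
        fprintf(stderr, "%s\n", buf);
    return n;
}
```

Passing the formatted text through "%s" rather than using it as the syslog format string avoids surprises if a message happens to contain a percent sign.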
[Ganglia-developers] Duplicated code in lib/ and libmetrics/
Hi,

is there a reason why we have error.* (trunk and 3.0.x) and debug_msg.* (3.0.x) in both "lib" and "libmetrics"? They seem to be almost identical. Should one go? Which?

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] gmond Spoof memory leak fix
btw. the fix does not apply to trunk. The code looks quite different there. Someone familiar with the spoofing stuff may want to check whether the leak exists in trunk and needs fixing.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Martin Knoblauch <[EMAIL PROTECTED]>
> To: Martin Hicks <[EMAIL PROTECTED]>; Bernard Li <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Wednesday, February 20, 2008 7:58:20 PM
> Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
>
> Bernard, all,
>
> I just committed the fix for the spoofing leak from Martin Hicks. Can you run a [final] snapshot for 3.0.7? I have something brewing to fix the petabyte/sec spikes that one of our customers is seeing, but that needs more testing and can wait for 3.0.8.
>
> Cheers
> Martin
>
> - Original Message
> > From: Martin Knoblauch
> > To: Martin Hicks
> > Cc: Ganglia Developers
> > Sent: Wednesday, February 20, 2008 7:44:04 PM
> > Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
> >
> > - Original Message
> > > From: Martin Hicks
> > > To: Martin Knoblauch
> > > Cc: Ganglia Developers
> > > Sent: Wednesday, February 20, 2008 7:33:32 PM
> > > Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
> > >
> > > On Wed, Feb 20, 2008 at 10:27:33AM -0800, Martin Knoblauch wrote:
> > > > Hi,
> > > >
> > > > if you resend it as an attachment, I would apply the fix.
> > >
> > > You can apply it with my blabbering at the beginning. :)
> > > patch ignores the stuff before the ---
> > >
> > > The patch is attached for your convenience.
> >
> > My problem is, that my MUA just garbles the white space. So, I prefer inlined patches.
> >
> > > > Cheers
> > > > Martin
> > > > PS: How is life at SGI nowadays?
> > >
> > > Seems okay. I just got here recently. :)
> >
> > I left about 10 years ago. Different place at that time, I think.
Re: [Ganglia-developers] gmond Spoof memory leak fix
Bernard, all,

I just committed the fix for the spoofing leak from Martin Hicks. Can you run a [final] snapshot for 3.0.7? I have something brewing to fix the petabyte/sec spikes that one of our customers is seeing, but that needs more testing and can wait for 3.0.8.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Martin Knoblauch <[EMAIL PROTECTED]>
> To: Martin Hicks <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Wednesday, February 20, 2008 7:44:04 PM
> Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
>
> - Original Message
> > From: Martin Hicks
> > To: Martin Knoblauch
> > Cc: Ganglia Developers
> > Sent: Wednesday, February 20, 2008 7:33:32 PM
> > Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
> >
> > On Wed, Feb 20, 2008 at 10:27:33AM -0800, Martin Knoblauch wrote:
> > > Hi,
> > >
> > > if you resend it as an attachment, I would apply the fix.
> >
> > You can apply it with my blabbering at the beginning. :)
> > patch ignores the stuff before the ---
> >
> > The patch is attached for your convenience.
>
> My problem is, that my MUA just garbles the white space. So, I prefer inlined patches.
>
> > > Cheers
> > > Martin
> > > PS: How is life at SGI nowadays?
> >
> > Seems okay. I just got here recently. :)
>
> I left about 10 years ago. Different place at that time, I think.
Re: [Ganglia-developers] Committing to the maintenance branch (was:Re: 3.0.7 release)
Hi Brad,

you are right. 3.0.X should only take [critical] bug fixes by now. Maybe some obvious optimization. New functionality belongs in trunk.

Rules for the web-interface might be more relaxed, as changes there do not endanger the monitoring-core framework. But that is my personal feeling.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Ulf Lange <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net
> Sent: Wednesday, February 20, 2008 4:46:36 PM
> Subject: [Ganglia-developers] Committing to the maintenance branch (was:Re: 3.0.7 release)
>
> Forgive me if I have missed something here, but are these patches intended for the 3.0.x branch or for trunk? As per Bernard's response below, the 3.0.x branch is in maintenance mode only. All new features should be directed at trunk and submitted as unified diffs rather than modified files. If a patch is determined to be a critical bug fix for a previous version, it will be backported to the maintenance branch at that point. Since I am unable to view the bug in Bugzilla (due to some kind of bugzilla issue), I am not exactly sure what these patches are trying to accomplish. So again, forgive me if I have missed something.
>
> Brad
>
> >>> On 2/19/2008 at 11:26 PM, in message <[EMAIL PROTECTED]>, Ulf Lange wrote:
> > Hi,
> >
> > here are the patches for the 3.0.x snapshot from last week.
> >
> > It would be okay to apply the patches at 3.0.8. I'm monitoring a lot of AIX servers and they seem to work well with the patch from Michael.
> >
> > Part 1/2
> >
> > Regards
> > Ulf
> >
> > Jesse Becker wrote:
> >> Any chance you could re-post them as .gz or .zip files, instead of .rar?
> >>
> >> On Feb 19, 2008 2:31 PM, Ulf Lange wrote:
> >>
> >>> Hi,
> >>>
> >>> I don't want to get on your nerves, but can somebody check in the patches from Micheal (bugid 146)?
> >>> I included the patched files in my last two mails. > >>> > >>> Regards, > >>> Ulf > >>> > >>> Ulf Lange schrieb: > >>> > >>> > >>>> Hi, > >>>> > >>>> I' ve patched the current release from > >>>> http://therealms.org/oss/ganglia/testing/ with the patches from > >>>> Micheal Perzl. > >>>> > >>>> Up to now, I was not able to test them (no time) as for AIX. The > >>>> problem is that the AIX rpcgen is buggy (see > >>>> http://www.perzl.org/ganglia/ganglia-p5metrics-v3.0.5.html), so you > >>>> need to generate protocol_xdr.c and protocol.h manualy. > >>>> > >>>> One thing I' ve not applied from the patch was the #define SLEEP_TIME > >>>> 1 in test-metrics.c. > >>>> > >>>> The patched files should work on AIX, as far as the protocol_xdr.c and > >>>> protocol.h are created. > >>>> Maybe you can already work with the patch. > >>>> Compiled with: gcc -v > >>>> Reading specs from > >>>> /opt/freeware/lib/gcc-lib/powerpc-ibm-aix5.3.0.0/3.3.2/specs > >>>> Configured with: ../configure --with-as=/usr/bin/as > >>>> --with-ld=/usr/bin/ld --disable-nls --enable-languages=c,c++ > >>>> --prefix=/opt/freeware --enable-threads > >>>> --enable-version-specific-runtime-libs --host=powerpc-ibm-aix5.3.0.0 > >>>> Thread model: aix > >>>> gcc version 3.3.2 > >>>> > >>>> # ./configure --disable-shared --enable-static > >>>> > >>>> Part 1/2 > >>>> > >>>> Regards, > >>>> Ulf > >>>> Bernard Li schrieb: > >>>> > >>>>> Hi Ulf: > >>>>> > >>>>> On 2/13/08, Ulf wrote: > >>>>> > >>>>> > >>>>> > >>>>>> you know, my never ending wish is the integration of > >>>>>> http://wtf.ath.cx/ganglia-dev/custom_graph_addon.tar.gz . The > >>>>>> integration with 3.0.6 still works fine. > >>>>>> > >>>>>> > >>>>> The 3.0.x bran
Re: [Ganglia-developers] gmond Spoof memory leak fix
- Original Message
> From: Martin Hicks <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Wednesday, February 20, 2008 7:33:32 PM
> Subject: Re: [Ganglia-developers] gmond Spoof memory leak fix
>
> On Wed, Feb 20, 2008 at 10:27:33AM -0800, Martin Knoblauch wrote:
> > Hi,
> >
> > if you resend it as an attachment, I would apply the fix.
>
> You can apply it with my blabbering at the beginning. :)
> patch ignores the stuff before the ---
>
> The patch is attached for your convenience.

My problem is, that my MUA just garbles the white space. So, I prefer inlined patches.

> > Cheers
> > Martin
> > PS: How is life at SGI nowadays?
>
> Seems okay. I just got here recently. :)

I left about 10 years ago. Different place at that time, I think.
Re: [Ganglia-developers] Memory leak in gmond
- Original Message
> From: Jesse Becker <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: Ganglia Developers
> Sent: Wednesday, February 20, 2008 4:01:26 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> On Feb 19, 2008 7:39 PM, Martin Knoblauch wrote:
> > - Original Message
> > > From: Jesse Becker
> > > To: Ganglia Developers
> > > Sent: Tuesday, February 19, 2008 11:25:54 PM
> > > Subject: Re: [Ganglia-developers] Memory leak in gmond
> > >
> > > I'm not sure if this is right--I've only taken a really quick look at libmetrics/linux/metrics.c, and my C-fu is rusty.
> > >
> > > It looks like strndup() is called in linux/metrics.c:hash_lookup (about line 131) to duplicate an interface name, which is included in the stats structure as stats->name. The net_dev_stats function will return this struct.
> > >
> > > The function is called in a number of places: pkts_in_func, pkts_out_func, bytes_out_func and bytes_in_func. The variable "*ns" is assigned the output of hash_lookup (i.e. the struct). Since the 'name' element is malloc()ed, but not explicitly freed, it will not go away when *ns goes out of scope. This is the leak, isn't it? All four of these functions are very similar, and need to be fixed if this is the case.
> > >
> > > Or did I miss something obvious? :)
> >
> > Lines 137, 148 and 159 ? :-)
>
> I saw those. :-P I meant after the struct has been returned, outside the function, the memory is never freed. Inside that function, it's okay.

We actually had a memory leak in that area. The four networking functions would allocate and then leak the device names. But that has been fixed in both trunk and 3.0.X about 10 days ago.

> > The memory allocated in line 151 is never freed, indeed. But it is only allocated once per interface and stays alive for the entire lifetime of the gmond process. So, it is not leaked.
>
> Ah, that makes more sense, especially if those variables exist for the lifetime of the program.

Yup. It is really important to know the lifetime of those structures. We actually might have a problem when network cards are hot-unplugged. But I guess that the resulting "leak" might be tolerable :-)

> So, I've just run gmond under valgrind and duma (a fork of the old Electric Fence memory debugger), and I can't seem to reproduce the problem now. Neither one of them is showing any obvious leaks, at least not in the 15 minute tests I've run. The test system(s) are CentOS4.6 boxes.

These things happen.

Cheers
Martin
Re: [Ganglia-developers] gmond Spoof memory leak fix
Hi,

if you resend it as an attachment, I would apply the fix.

Cheers
Martin
PS: How is life at SGI nowadays?
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Martin Hicks <[EMAIL PROTECTED]>
> To: Ganglia Developers
> Sent: Wednesday, February 20, 2008 7:06:41 PM
> Subject: [Ganglia-developers] gmond Spoof memory leak fix
>
> Hi,
>
> Here's a patch against ganglia-3.0.6.200802141157 that fixes a memory leak when using user defined metrics with spoofing.
>
> The problem was that the spmetric was being copied out, ignoring the spheader. The strings that were allocated inside the spheader were dropped.
>
> mh
>
> --- ganglia-3.0.6.200802141157/gmond/gmond.c 2008-02-14 14:58:58.0 -0500
> +++ ganglia-3.0.6.200802141157.mod/gmond/gmond.c 2008-02-20 11:46:23.0 -0500
> @@ -831,11 +831,13 @@ Ganglia_message_save( Ganglia_host *host
>    /* Copy in the data */
>    // Yemi
>    if(message->id == spoof_metric){
> -    // Store data as regular gmetric in hash table!!
> +      /* Store data as regular gmetric in hash table!!
> +       * Free the Spoof-related strings.
> +       */
>  
> -      metric->message.id = metric_user_defined;
> +      metric->message.id = metric_user_defined;
>        metric->message.Ganglia_message_u.gmetric = message->Ganglia_message_u.spmetric.gmetric;
> -
> +      xdr_free(xdr_Ganglia_spoof_header, &message->Ganglia_message_u.spmetric.spheader);
>  
>    }else{
>        memcpy(&(metric->message), message, sizeof(Ganglia_message));
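The ownership problem described in the patch mail can be shown without any XDR at all. The sketch below uses simplified, hypothetical stand-in types (not the rpcgen-generated Ganglia structures): a message carries a header and a payload, both owning heap strings. Copying only the payload out and forgetting the header loses the last pointers to the header strings; freeing the header at the moment of the copy, as the patch does with xdr_free(), closes the leak.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the XDR-generated types. */
typedef struct { char *spoof_name; char *spoof_ip; } spoof_header;
typedef struct { char *name; char *value; } gmetric_msg;
typedef struct { spoof_header spheader; gmetric_msg gmetric; } spoof_msg;

/* Small strdup() replacement so the sketch only needs ISO C. */
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Release the strings owned by the header. */
void free_spoof_header(spoof_header *h)
{
    free(h->spoof_name);
    free(h->spoof_ip);
    h->spoof_name = NULL;
    h->spoof_ip = NULL;
}

/* Move the payload out of a spoof message. The returned struct takes
 * over the payload strings; the header strings are freed here because
 * nobody else will ever see them again. */
gmetric_msg take_gmetric(spoof_msg *m)
{
    gmetric_msg out = m->gmetric;      /* shallow copy, ownership moves */
    m->gmetric.name = NULL;
    m->gmetric.value = NULL;
    free_spoof_header(&m->spheader);   /* omit this and the strings leak */
    return out;
}
```

Whether trunk needs the equivalent change is a separate question; this only illustrates why a shallow copy of one member is not enough when a sibling member owns memory.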
Re: [Ganglia-developers] Memory leak in gmond
- Original Message
> From: Jesse Becker <[EMAIL PROTECTED]>
> To: Ganglia Developers
> Sent: Tuesday, February 19, 2008 11:25:54 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> I'm not sure if this is right--I've only taken a really quick look at libmetrics/linux/metrics.c, and my C-fu is rusty.
>
> It looks like strndup() is called in linux/metrics.c:hash_lookup (about line 131) to duplicate an interface name, which is included in the stats structure as stats->name. The net_dev_stats function will return this struct.
>
> The function is called in a number of places: pkts_in_func, pkts_out_func, bytes_out_func and bytes_in_func. The variable "*ns" is assigned the output of hash_lookup (i.e. the struct). Since the 'name' element is malloc()ed, but not explicitly freed, it will not go away when *ns goes out of scope. This is the leak, isn't it? All four of these functions are very similar, and need to be fixed if this is the case.
>
> Or did I miss something obvious? :)

Lines 137, 148 and 159 ? :-)

The memory allocated in line 151 is never freed, indeed. But it is only allocated once per interface and stays alive for the entire lifetime of the gmond process. So, it is not leaked.

Cheers
Martin
[Ganglia-developers] Move to gettimeofday precision for slurpfile (Linux)
Hi,

just checked in a change that allows microsecond precision for slurpfile. Besides potentially allowing subsecond sampling, it gives a bit better precision for the networking rates.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
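Why microsecond timestamps help the network rates can be seen in a small sketch. It is not the slurpfile code itself; `elapsed_seconds` and `rate_per_second` are hypothetical helpers that only assume the two sample times come from gettimeofday().

```c
#include <sys/time.h>

/* Elapsed time between two gettimeofday() samples, in seconds,
 * keeping the microsecond part. */
double elapsed_seconds(const struct timeval *prev, const struct timeval *now)
{
    return (double)(now->tv_sec - prev->tv_sec)
         + (double)(now->tv_usec - prev->tv_usec) / 1e6;
}

/* Rate for a counter delta over the sampled interval.
 * Returns 0 when no time has passed, to avoid dividing by zero. */
double rate_per_second(unsigned long long delta,
                       const struct timeval *prev,
                       const struct timeval *now)
{
    double dt = elapsed_seconds(prev, now);
    if (dt <= 0.0)
        return 0.0;
    return (double)delta / dt;
}
```

With samples taken at 10.9 s and 11.4 s, whole-second timestamps would report an interval of 1 s and halve the computed rate; with the microsecond part the interval is the correct 0.5 s.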
[Ganglia-developers] Consolidation of network metrics functions for Linux
Hi,

I just checked into trunk a first cut on removing duplicate code in the linux/metrics.c file. I started working on the network functions, because I am also trying to track down a problem where we are seeing petabyte/sec spikes every few hours, which I attribute to some problem in the overflow handling for the counters. Observed on x86_64 in 64-bit mode. The first measure is to do all important math in integers only; next I may decide to just drop samples where counters are overflowing.

I also checked in two small fixes to the "test-metrics.c" code. A missing "," and a logically wrong "#ifdef CYGWIN".

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
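The "drop the sample when the counter overflows" idea from the message above could look roughly like the following. This is a sketch under assumptions, not the code that went into metrics.c: `counter_state` and `counter_delta` are names invented here, and the rule is simply that a counter moving backwards (real wrap or bogus driver data) yields no delta for that interval.

```c
#include <stdint.h>

/* Per-counter bookkeeping (hypothetical helper type). */
typedef struct {
    uint64_t last_count;
    int      have_last;   /* 0 until the first sample has been seen */
} counter_state;

/* Feed one raw counter reading. Returns 1 and stores the increase in
 * *delta when the sample is usable. Returns 0 when the sample should
 * be dropped: on the very first reading, or when the counter went
 * backwards. All math stays in unsigned integers. */
int counter_delta(counter_state *st, uint64_t now, uint64_t *delta)
{
    int usable = st->have_last && now >= st->last_count;

    if (usable)
        *delta = now - st->last_count;

    /* Always resynchronize, so the next interval starts from here. */
    st->last_count = now;
    st->have_last = 1;
    return usable;
}
```

Dropping one interval costs a single data point, whereas trying to "correct" a bogus wrap is what produces the petabyte-sized deltas in the first place.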
Re: [Ganglia-developers] Memory leak in gmond
Hi folks,

ACK from my side too.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Kumar Vaibhav <[EMAIL PROTECTED]>
> To: Bernard Li <[EMAIL PROTECTED]>
> Cc: Brad Nicholes <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>; Martin Knoblauch <[EMAIL PROTECTED]>
> Sent: Monday, February 18, 2008 5:07:49 AM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Hi Bernard,
>
> I think the problem is solved. I don't see any rise in memory of gmond for the last three days. Thanks for the fix. I will be waiting for 3.0.7 with this patch.
>
> Once again thanks a lot.
> Vaibhav
>
> Bernard Li wrote:
> > Hi Vaibhav:
> >
> > On 2/15/08, Kumar Vaibhav wrote:
> >
> >> I am testing the new release on my systems. Initial results are encouraging. I can tell the final words after weekend since I am keeping it for the test over the weekend.
> >
> > Sure, please update us after the weekend, we'll likely release 3.0.7 then.
> >
> > Cheers,
> >
> > Bernard
Re: [Ganglia-developers] Memory leak in gmond
Hi Bernard,

as explained elsewhere, the leak depends only indirectly on the configuration. Any non-mute config will call the network functions and leak the "devname" memory.

Just an update: over the last twenty hours an unpatched 3.0.4 gmond has grown by 350 KB, while a patched one is still at its initial size (it is even 1 page smaller than the unpatched gmond initially was :-)

This is a really nasty one and I am surprised that it was only detected lately. As far as I know, that bug has been in the Linux metrics code since day one.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net; Brad Nicholes <[EMAIL PROTECTED]>; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Sent: Thursday, February 14, 2008 11:01:46 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Hi all:
>
> On 2/14/08, Martin Knoblauch wrote:
>
> > thanks. My tests are still running. The new binaries do not grow anymore. Or at least a lot slower than the original 3.0.4
>
> Since I can't reproduce this, can someone please explain to me what configuration triggers this? I'll put this in the release notes.
>
> Thanks,
>
> Bernard
Re: [Ganglia-developers] Memory leak in gmond
Hi Brad,

thanks. My tests are still running. The new binaries do not grow anymore. Or at least a lot slower than the original 3.0.4

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Bernard Li <[EMAIL PROTECTED]>
> Cc: Martin Knoblauch <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Sent: Thursday, February 14, 2008 8:54:15 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> I reverted the patch that I had originally in trunk and have committed Martin's patch to both trunk and the monitor-core-3.0 branch. There has been some testing done already, but it would probably be good to test a bit more.
>
> Brad
>
> >>> On 2/14/2008 at 11:50 AM, in message , "Bernard Li" wrote:
> > Hi guys:
> >
> > On 2/14/08, Brad Nicholes wrote:
> >
> >> It doesn't really matter to me which patch we use. The most important thing is consistency. If you feel like your patch is more complete, I would suggest that you drop it into trunk first and then backport it into the 3.0.x branch. We probably need to look at the other platforms as well and cross port the patch if required. I think your other suggestion about migrating duplicate code into common functions is a good one as well.
> >
> > I would like to release 3.0.7 soonish, so if you guys could decide which patch to use and check them into the branch and trunk, I'd appreciate it.
> >
> > I'll roll a snapshot and have folks test before doing the release.
> >
> > Thanks,
> >
> > Bernard
Re: [Ganglia-developers] Memory leak in gmond
Brad,

definitely, one of the two patches should go into 3.0.X. Both seem to do the same. See other comments elsewhere.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

- Original Message
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Kumar Vaibhav <[EMAIL PROTECTED]>; Martin Knoblauch <[EMAIL PROTECTED]>; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Thursday, February 14, 2008 4:40:10 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> This was already fixed in trunk about a week ago along with several other memory leaks that were more specific to 3.1 rather than 3.0. We should probably just backport the trunk patch to 3.0.7 to maintain consistency.
>
> Brad
>
> >>> On 2/14/2008 at 6:29 AM, in message <[EMAIL PROTECTED]>, Martin Knoblauch wrote:
> > Hi,
> >
> > maybe attached patch (based on 3.0.4) can fix the leak. The daemon runs and reports metrics. It is of course too early to say.
> >
> > When looking at the linux metrics file, I just realized how much code duplication there is. Basically all function groups that grok the same /proc/xxx files should be rewritten to use common code. This is true for cpu, load and network. Maybe others.
> >
> > Cheers
> > Martin
> > --
> > Martin Knoblauch
> > email: k n o b i AT knobisoft DOT de
> > www: http://www.knobisoft.de
> >
> > - Original Message
> >> From: Martin Knoblauch
> >> To: Kumar Vaibhav ; Carlo Marcelo Arenas Belon
> >> Cc: ganglia-developers@lists.sourceforge.net
> >> Sent: Thursday, February 14, 2008 11:36:37 AM
> >> Subject: Re: [Ganglia-developers] Memory leak in gmond
> >>
> >> Hi,
> >>
> >> after looking at one of my employer's customers' installations, it definitely seems that metrics-collecting/non-mute "gmond"s are growing (substantially) over time. Pure listeners seem to be unaffected.
> >>
> >> If I remember correctly, Kumar's valgrind traces found that "strndup" might allocate memory that is later leaked. If I look at the 3.0.4 libmetrics/linux/metrics.c I have the strong feeling that all four network functions are careless about the memory allocated by strndup:
> >>
> >> 217: char *devname, *src;
> >> 228: devname = strndup(src, n);
> >> 238: net_dev_stats *ns = hash_lookup(devname, 1,
> >>
> >> 305: char *devname, *src;
> >> 316: devname = strndup(src, n);
> >> 326: net_dev_stats *ns = hash_lookup(devname, 1,
> >>
> >> 393: char *devname, *src;
> >> 404: devname = strndup(src, n);
> >> 414: net_dev_stats *ns = hash_lookup(devname, 1,
> >>
> >> 481: char *devname, *src;
> >> 492: devname = strndup(src, n);
> >> 502: net_dev_stats *ns = hash_lookup(devname, 1,
> >>
> >> Have to look at it some more.
> >>
> >> Cheers
> >> Martin
> >> --
> >> Martin Knoblauch
> >> email: k n o b i AT knobisoft DOT de
> >> www: http://www.knobisoft.de
> >>
> >> - Original Message
> >> > From: Kumar Vaibhav
> >> > To: Carlo Marcelo Arenas Belon
> >> > Cc: ganglia-developers@lists.sourceforge.net
> >> > Sent: Saturday, February 9, 2008 8:59:18 AM
> >> > Subject: Re: [Ganglia-developers] Memory leak in gmond
> >> >
> >> > Carlo Marcelo Arenas Belon wrote:
> >> > > On Tue, Jan 22, 2008 at 04:17:07PM +0530, Kumar Vaibhav wrote:
> >> > >> I am using ganglia-3.0.5 on a woodcrest processor cluster. and I see that after running for weeks the memory consumption of the gmond process is something about 400 MB.
> >> > >
> >> > > did you check what was the size 1 hour after all gmond proceses in your cluster were started?, if you are using multicast and h
Re: [Ganglia-developers] Memory leak in gmond
Hi Brad,

as far as I can see, both patches achieve the same result. I like mine a bit more, because it concentrates the whole allocation/freeing of the temporary "devname" inside the hash_lookup routine. Less code, less chance to forget it again. But it is the result that counts. I can now attest that my version fixes the leak in 3.0.x.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Kumar Vaibhav <[EMAIL PROTECTED]>; Martin Knoblauch <[EMAIL PROTECTED]>; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Thursday, February 14, 2008 4:44:23 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Here is the patch diff in SVN:
>
> http://ganglia.svn.sourceforge.net/viewvc/ganglia/trunk/monitor-core/libmetrics/linux/metrics.c?r1=860&r2=933
>
> I haven't looked at any of the other platforms besides linux. Do we have the same problem there?
>
> Brad
Re: [Ganglia-developers] Memory leak in gmond
Hi,

maybe the attached patch (based on 3.0.4) can fix the leak. The daemon runs and reports metrics. It is of course too early to say.

When looking at the linux metrics file, I just realized how much code duplication there is. Basically all function groups that grok the same /proc/xxx files should be rewritten to use common code. This is true for cpu, load and network. Maybe others.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Memory leak in gmond
Hi,

after looking at one of my employer's customers' installations, it definitely seems that metrics-collecting/non-mute "gmond"s are growing (substantially) over time. Pure listeners seem to be unaffected.

If I remember correctly, Kumar's valgrind traces found that "strndup" might allocate memory that is later leaked. If I look at the 3.0.4 libmetrics/linux/metrics.c, I have the strong feeling that all four network functions are careless about the memory allocated by strndup:

217: char *devname, *src;
228: devname = strndup(src, n);
238: net_dev_stats *ns = hash_lookup(devname, 1,

305: char *devname, *src;
316: devname = strndup(src, n);
326: net_dev_stats *ns = hash_lookup(devname, 1,

393: char *devname, *src;
404: devname = strndup(src, n);
414: net_dev_stats *ns = hash_lookup(devname, 1,

481: char *devname, *src;
492: devname = strndup(src, n);
502: net_dev_stats *ns = hash_lookup(devname, 1,

Have to look at it some more.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Kumar Vaibhav <[EMAIL PROTECTED]>
> To: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Saturday, February 9, 2008 8:59:18 AM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Carlo Marcelo Arenas Belon wrote:
> > On Tue, Jan 22, 2008 at 04:17:07PM +0530, Kumar Vaibhav wrote:
> >> I am using ganglia-3.0.5 on a woodcrest processor cluster, and I see that after running for weeks the memory consumption of the gmond process is about 400 MB.
> >
> > Did you check what the size was 1 hour after all gmond processes in your cluster were started? If you are using multicast and have a large number of nodes/metrics, then that is most likely the amount of memory needed to hold all those metrics from all nodes.
>
> I checked it. The memory size increases with time. I tried ps -eo cmd,rss and can see that the size of gmond increases with time.
>
> >> ==2381== LEAK SUMMARY:
> >> ==2381==    definitely lost: 69 bytes in 16 blocks.
> >> ==2381==      possibly lost: 0 bytes in 0 blocks.
> >
> > that means there is no memory leak (except for 69 bytes)
>
> This is because I had run it for only a few minutes.
>
> >> ==2381==    still reachable: 1,446,276 bytes in 1,463 blocks.
> >
> > that is the RSS of your process
>
> By memory I mean RSS only.
>
> Here are some new tests I have done.
>
> I isolated two nodes of the cluster by changing their multicast address. On one I ran gmond in mute mode and on the other in deaf mode. The RSS of gmond on the deaf node continues to increase, but the RSS of gmond in mute mode stabilises after some time and did not increase for a week.
>
> Hope this will help you to solve the problem.
>
> > Carlo
>
> Vaibhav
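The leak pattern described in this message (a name duplicated with strndup on every sample, handed to a lookup routine, and never freed) can be sketched in isolation. This is only a minimal illustration, not the actual libmetrics code or either of the patches: `dev_stats`, `lookup_dev` and `update_rx_bytes` are hypothetical names, and the lookup is assumed to copy the name rather than take ownership of it.

```c
#define _POSIX_C_SOURCE 200809L  /* makes strndup visible under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-device counters kept by the metrics code. */
typedef struct {
    char name[32];
    unsigned long rx_bytes;
} dev_stats;

#define MAX_DEVS 8
dev_stats table[MAX_DEVS];
int n_devs = 0;

/* Toy lookup: returns the entry for `name`, creating it on first use.
 * It copies the name into its own storage, so the caller keeps
 * ownership of the string it passed in. */
dev_stats *lookup_dev(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < n_devs; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    if (n_devs == MAX_DEVS)
        return NULL;
    dev_stats *ds = &table[n_devs++];
    strncpy(ds->name, name, sizeof ds->name - 1);
    ds->name[sizeof ds->name - 1] = '\0';
    ds->rx_bytes = 0;
    return ds;
}

/* Called once per sample with a /proc/net/dev style line such as
 * "  eth0: 1234 ...". Returns 0 on success, -1 on a malformed line. */
int update_rx_bytes(const char *line)
{
    const char *src = line;
    while (*src == ' ')
        src++;
    const char *colon = strchr(src, ':');
    if (colon == NULL)
        return -1;

    char *devname = strndup(src, (size_t)(colon - src)); /* heap copy */
    if (devname == NULL)
        return -1;

    dev_stats *ds = lookup_dev(devname);
    free(devname);   /* without this, every sample leaks one small block */
    if (ds == NULL)
        return -1;

    ds->rx_bytes = strtoul(colon + 1, NULL, 10);
    return 0;
}
```

Because the sampling function runs periodically for the lifetime of the daemon, a missing `free` of even a few bytes per call adds up over weeks, which would be consistent with the slow RSS growth reported above for nodes that collect metrics.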
Re: [Ganglia-developers] ganglia-web-3.0.6-1 on SLES10 SP1. Missing Requirement
--- [EMAIL PROTECTED] wrote:

> Quoting Martin Knoblauch <[EMAIL PROTECTED]>:
>
> > Hi Bernard,
> >
> > just by chance I had to install 3.0.6 on Sles10sp1 this week. I got the same problem and installing the ctype package for php5 solved the issue.
> >
> > Cheers
> > Martin
>
> Is the presence of the php-ctype package in the RPM database enough to confirm that the ctype_* functions are available to PHP?
>
> I'm not familiar with SuSE, but on Red Hat the PHP sub-packages drop bits of configuration in /etc/php.d when they are installed. These contain the needed configuration lines for PHP to install the new modules provided by the sub-package. You could have a situation where a sub-package is installed, but the configuration file has been removed, so it's present in the RPM database but not loaded into PHP.
>
> Is there a similar situation in SuSE?
>
> alex

Alex,

in my case the RPM package for php5-ctype was just missing. No broken setup. Installing the RPM solved the issue. As Bernard wrote, we cannot foresee all possible breakages.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] ganglia-web-3.0.6-1 on SLES10 SP1. Missing Requirement
Hi Bernard,

just by chance I had to install 3.0.6 on Sles10sp1 this week. I got the same problem and installing the ctype package for php5 solved the issue.

Cheers
Martin

--- Bernard Li <[EMAIL PROTECTED]> wrote:

> Hi guys:
>
> On 2/4/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > > I've just installed ganglia-web-3.0.6-1 (created the package from source).
> > > My OS: SLES 10 SP1
> > > But no graphs will be generated:
> > > [Mon Feb 04 13:28:38 2008] [error] [client 127.0.0.1] PHP Fatal error: Call to undefined function ctype_digit() in /srv/www/htdocs/ganglia/functions.php on line 428
> > >
> > > To fix the problem you have to install the package php5-ctype-5.1.2-29.50.x86_64.rpm.
> > >
> > > Is it possible to check it in the spec file? Probably too complicated.
> > > An alternative would be to write a small pre script for the rpm?
> >
> > Hi. I chose the ctype extension for input validations since it's been enabled by default in PHP since 4.2.0, and it tends to be faster than other methods of validating input (regular expressions or is_* functions). Knowing that ctype is a sub-package on SuSE makes me wonder if there's a way to get the same effect which will always be enabled.
> >
> > Using is_numeric() rather than ctype_digit() would be an option. That would always be available, as it's part of the core language.
>
> Ulf: If you can file a bug in bugzilla, I will take care of it.
> Alex: Can you please work with Ulf to confirm that having the package installed means the extension is enabled?
>
> Thanks,
>
> Bernard

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Moving all built-in metrics to metric modules...
Hi Brad,

that seems to be a pretty useful move. Seems it is time that I really start looking closely at 3.1.x

Cheers
Martin

Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: ganglia-developers@lists.sourceforge.net; [EMAIL PROTECTED]
> Sent: Tuesday, December 18, 2007 11:44:45 PM
> Subject: [Ganglia-developers] Moving all built-in metrics to metric modules...
>
> I just committed a rather substantial patch to Ganglia 3.1.0 trunk which will affect the way that gmond 3.1.x is deployed. I am posting this to both the developer list and the general list so that all will be aware of the changes and why they are important. The primary purpose for the patch was to remove all of the built in metrics out of the gmond binary and allow them to be built as loadable modules. The following is a more detailed list of what has changed. Hopefully from a user perspective, gmond will continue to work as it has in the past. But going forward, it will be much more flexible with regards to the core set of metrics.
>
> * All built-in metrics have been removed from the gmond binary
> - A new set of core metric modules have been created that represent the same set metrics that gmond has always gathered. These new core modules are mod_cpu.so, mod_disk.so, mod_load.so, mod_mem.so, mod_net.so, mod_proc.so and mod_sys.so. Each of these modules is basically a wrapper around the metric functions that exist in libmetrics. Being wrappers, they still make the same metric function calls as have always been made. And since libmetrics contains all of the platform specific metric code, the metric function calls made by the core modules will continue to do the right thing for all of the platforms that have been previously supported.
> - There is also an extra module called core_metrics which contains the heartbeat, location and gexec metrics. Even though this module could be dynamically loaded in the same manner as the others, it is always statically linked simply because gmond would not be able to function properly without these metrics so there is no real reason to allow these metrics to be dynamically loaded.
> - Some additional configuration has been added to the gmond.conf file. Because the core metrics are now implemented as modules, this requires a module configuration block that instructs gmond to load each module. A set of module blocks has been added to the default gmond.conf file.
>
> * All metric specific metadata definitions have been removed from protocol.x
> - With the refactoring of the XDR data and removal of the builtin metrics, there is no longer any need for XDR to have intimate knowledge of the core metrics. Therefore the metric structure array and enum have been removed and are now part of the core metric modules themselves.
>
> * --enable-static-build statically links the core metric modules
> - Building gmond statically will statically link not only APR, expat and libconfuse, it will also statically link all of the core metric modules into the gmond binary. The result should be a gmond binary that looks and feels just like the old 3.0.x statically linked gmond binary. The one exception is that a module statement is still required in the gmond.conf file. The difference between the module configuration block for dynamically loaded modules and the module blocks for statically linked modules is whether or not a path to the .so is included. The configure script and makefiles have been modified to detect --enable-static-build and build the default gmond.conf file appropriately.
>
> * --enable-static-build + --enable-python statically links the python module
> - One of the downsides of building gmond 3.1.x statically was that doing so would disable all of the dynamically loadable module capability. The reason for this is the need for both gmond and the pluggable modules to dynamically link with libapr1. However, if both --enable-static-build and --enable-python are specified during configure, a gmond binary will be built with mod_python statically linked. This provides gmond with the ability to continue to load and run python metric modules in the same manner as the non-static build. In o
Re: [Ganglia-developers] [Ganglia-general] Ganglia 3.0.6 (Foss) released
Bernard,

great job from you and the team.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Ganglia General <[EMAIL PROTECTED]>; Ganglia Developers; [EMAIL PROTECTED]
> Sent: Monday, December 17, 2007 7:35:09 AM
> Subject: [Ganglia-general] Ganglia 3.0.6 (Foss) released
>
> The Ganglia development team is pleased to release Ganglia 3.0.6 (Foss) which is available for immediate download from:
>
> http://sourceforge.net/project/showfiles.php?group_id=43021&package_id=35280
>
> This release includes a security fix for web frontend cross-scripting vulnerability.
>
> All Ganglia web frontend users are strongly recommended to upgrade to this version. In most cases the version of the frontend does not need to match the version of gmetad and/or gmond -- if problem arises, please drop us a note at [EMAIL PROTECTED]
>
> Special thanks to Romain Wartel at CERN for discovering the vulnerability and reporting it to us and to Alex Dean for stepping up with the fix so quickly.
>
> Bernard,
>
> on behalf of the Ganglia Development Team
Re: [Ganglia-developers] web front-end cross-scripting vulnerability
----- Original Message -----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: Matt Massie <[EMAIL PROTECTED]>; Bernard Li <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Wednesday, December 5, 2007 10:59:42 PM
> Subject: Re: [Ganglia-developers] web front-end cross-scripting vulnerability
>
> >>> On 12/5/2007 at 12:22 PM, "Bernard Li" wrote:
> > Hi guys:
> >
> > On 12/5/07, Matt Massie wrote:
> >
> >> outstanding!
> >>
> >> i'll send all the details to you in a separate email. thanks for stepping up!
> >
> > I guess we should re-open the 3.0.x branch, backport the fixes from trunk and release 3.0.6 as a security bugfix release?
>
> Absolutely, as soon as Alex is finished with his review and fixes we should plan on releasing 3.0.6. I would suggest that it be just a security release and that we don't try to push in anything else.
>
> Brad

Hi Folks,

I tend to agree. Unless there is a critical functional bug in 3.0.5, we should just do a security release.

Cheers
Martin
Re: [Ganglia-developers] Patch to graph.php for bits/sec in network graphs
Hi Bernard,

as far as I remember, there has been no more discussion on the topic. Making the units configurable would definitely be an option, but I think that is 3.1 material.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Martin Knoblauch <[EMAIL PROTECTED]>
> Cc: Caleb Epstein <[EMAIL PROTECTED]>; ganglia-developers@lists.sourceforge.net
> Sent: Saturday, November 17, 2007 3:22:12 AM
> Subject: Re: [Ganglia-developers] Patch to graph.php for bits/sec in network graphs
>
> Caleb, Martin:
>
> Any more discussions regarding this? If not, I would probably just leave it as is and close the ticket (unless there is a strong reason to switch).
>
> P.S. How about having a configuration parameter to switch between the two?
>
> Cheers,
>
> Bernard
Re: [Ganglia-developers] Patch to graph.php for bits/sec in network graphs
Hi,

not sure here. I personally view bytes_in/_out as data throughput, where Bytes/sec makes more sense.

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message -----
From: Caleb Epstein <[EMAIL PROTECTED]>
To: ganglia-developers@lists.sourceforge.net
Sent: Friday, November 2, 2007 9:52:18 PM
Subject: [Ganglia-developers] Patch to graph.php for bits/sec in network graphs

Attached patch to ganglia-3.0.5 causes the network graphs to be rendered as bits/sec instead of bytes/sec.

Seeing as network capacities are usually measured in bits/sec, this seems like a sensible default.

--
Caleb Epstein

-----Inline Attachment Follows-----

diff -ur ganglia-3.0.5/web/graph.php /pub/www/monitor/ganglia/graph.php
--- ganglia-3.0.5/web/graph.php 2007-10-03 00:48:43.0 -0400
+++ /pub/www/monitor/ganglia/graph.php 2007-11-02 16:26:50.88059 -0400
@@ -217,12 +217,14 @@
 
     $lower_limit = "--lower-limit 0 --rigid";
     $extras = "--base 1024";
-    $vertical_label = "--vertical-label 'Bytes/sec'";
+    $vertical_label = "--vertical-label 'bits/sec'";
 
     $series = "DEF:'bytes_in'='${rrd_dir}/bytes_in.rrd':'sum':AVERAGE "
        ."DEF:'bytes_out'='${rrd_dir}/bytes_out.rrd':'sum':AVERAGE "
-       ."LINE2:'bytes_in'#$mem_cached_color:'In' "
-       ."LINE2:'bytes_out'#$mem_used_color:'Out' ";
+       ."CDEF:'bits_in'='bytes_in',8,* "
+       ."CDEF:'bits_out'='bytes_out',8,* "
+       ."LINE2:'bits_in'#$mem_cached_color:'In' "
+       ."LINE2:'bits_out'#$mem_used_color:'Out' ";
     }
   else if ($graph == "packet_report")
     {
@@ -285,6 +287,18 @@
     $rrd_file = "$rrd_dir/$metricname.rrd";
     $series = "DEF:'sum'='$rrd_file':'sum':AVERAGE "
        ."AREA:'sum'#$default_metric_color:'$subtitle' ";
+
+    // Make network graphs bits/sec
+    if ($metricname == "bytes_in" or $metricname == "bytes_out")
+      {
+        $series = "DEF:'sum'='$rrd_file':'sum':AVERAGE "
+           ."CDEF:'bits'='sum',8,* "
+           ."AREA:'bits'#$default_metric_color:'$subtitle' ";
+
+        $metricname = "network" . substr ($metricname, 5);
+        $vertical_label = "--vertical-label 'bits/sec'";
+      }
+
     if ($jobstart)
       $series .= "VRULE:$jobstart#$jobstart_color ";
   }
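The CDEF lines in the patch above use RRDtool's reverse Polish notation: `'bytes_in',8,*` pushes the sample, pushes 8, and multiplies the two, which turns bytes/sec into bits/sec. The sketch below is only a toy evaluator written to illustrate that stack discipline. It is not RRDtool code; `eval_rpn` and the placeholder token `value` are hypothetical names, and only literals plus the four basic operators are handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Evaluate a tiny subset of RRDtool-style RPN: numeric literals, the
 * placeholder name "value", and the operators + - * /.
 * Returns 0 on success and stores the result in *out, -1 on error. */
int eval_rpn(const char *expr, double value, double *out)
{
    double stack[16];
    int depth = 0;
    char buf[128];

    if (strlen(expr) >= sizeof buf)
        return -1;
    strcpy(buf, expr);          /* strtok modifies its input, so work on a copy */

    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strlen(tok) == 1 && strchr("+-*/", tok[0]) != NULL) {
            /* Operator: pop two operands, push the result. */
            if (depth < 2)
                return -1;
            double b = stack[--depth];
            double a = stack[--depth];
            switch (tok[0]) {
            case '+': stack[depth++] = a + b; break;
            case '-': stack[depth++] = a - b; break;
            case '*': stack[depth++] = a * b; break;
            default:  stack[depth++] = a / b; break;
            }
        } else {
            /* Operand: either the sample placeholder or a number. */
            if (depth == 16)
                return -1;
            if (strcmp(tok, "value") == 0) {
                stack[depth++] = value;
            } else {
                char *end;
                double num = strtod(tok, &end);
                if (end == tok || *end != '\0')
                    return -1;
                stack[depth++] = num;
            }
        }
    }
    if (depth != 1)             /* a well-formed expression leaves one value */
        return -1;
    *out = stack[0];
    return 0;
}
```

For example, calling `eval_rpn("value,8,*", 125000.0, &r)` stores 1000000.0 in `r`: a sample of 125000 bytes/sec is plotted as 1000000 bits/sec.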
Re: [Ganglia-developers] Moving on with Ganglia 3.1...
----- Original Message -----
> From: Bernard Li <[EMAIL PROTECTED]>
> To: Brad Nicholes <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Wednesday, October 31, 2007 11:53:46 PM
> Subject: Re: [Ganglia-developers] Moving on with Ganglia 3.1...
>
> Hi Brad:
>
> Good job in compiling the list!
>
> I would like to complete my updates to the spec file before you do any
> massive check-ins (or modifications to the spec file). So as a group,
> can we answer the following questions:
>
> 1) Do we want to allow multiple versions of libganglia to be installed
> on the same server

Yes.

> 2) All versions of libganglia 3.1.x (eg.) should be compatible with
> each other, i.e. 3.1.0 is compatible with 3.1.1 but not 3.2.x

I am not sure. What happens if we have to fix a severe bug in 3.1.X+1 that involves changing some of the APIs exposed by libganglia-3.1.X? Would that force us to do a 3.2.0 release? But it would definitely be *desirable* that version 3.1.X can use libganglia version 3.1.X+

> 3) Do we want to name libganglia package like libganglia_3_1-3.1.0
> according to Novell's packaging rules

I have no opinion on this one.

> 4) Split python related DSO modules to ganglia-gmond-python -- and
> hopefully in the future we'll have ganglia-gmond-perl

I am not sure whether a language split is needed or useful. Implementation languages are all the same for me. What I would do is split along the lines of basic-framework vs. core-modules vs. special-modules.

Cheers
Martin

> Let's try to wrap this up within the week, thanks all!
>
> Cheers,
>
> Bernard
>
> On 10/31/07, Brad Nicholes wrote:
> > I took a quick look over the wish-list items that were proposed on
> > the mailing list and tried to determine which items would break
> > compatibility and therefore must be completed before we release 3.1.0.
> > I have identified three tasks for which I am planning on completing
> > and committing the code to trunk over the next few weeks. These tasks
> > include:
> >
> > 1-* Add TITLE attribute to the XDR data to communicate a human
> > readable name
> >    There is another task on the wish list which makes this more
> >    general which is:
> >    -* Flexible method of adding extra metric metadata.
> >       We could include extra metadata, not just "alias"/"title".
> >       For example, some metrics have a natural minimum and maximum
> >       value. Perhaps coming up with an extendable way of encoding
> >       metric metadata so future changes can be included without
> >       losing backward compatibility.
> >    I would rather implement the more flexible method of adding extra
> >    metric metadata but I am not really sure how to do that with XDR.
> >    If somebody has a good idea of how that could be done with XDR,
> >    please let me know. Otherwise I will probably just add the
> >    attribute to the existing set of attributes.
> >
> > 2-* Add a GROUP attribute (comma delimited) to the XDR data
> >    This would allow metrics to declare the category that they belong
> >    to. The category should be added at the metric definition level
> >    within the metric module rather than a directive in the .conf file.
> >    Again if there were a more flexible way to add extra metric
> >    metadata to the XDR package, that would be the preferred method.
> >    Short of that, I just plan to add an attribute that would hold a
> >    comma delimited list of group names that a metric can belong to.
> >
> > 3-* Modify all byte count metric to 8 byte integers
> >    At this point I am assuming that this is one of the issues that is
> >    causing the 4T limit problem. For now this is just a temporary
> >    fix. The real fix would be to move all of the built in metrics out
> >    of gmond itself and implement them as C interface modules which
> >    define the correct counter size. If somebody wants to tackle
> >    porting the built in metrics rather than applying the temporary
> >    fix now, please feel free and let me know that you are doing it.
> >    Otherwise, I will try to take care of at least getting the sizing
> >    right and then port the metrics sometime later.
> >
> > I have attached a rough compilation of the tasks that were identified
> > through the wish list. This list is not very detailed and should
> > probably be used as a jumping off point for adding all of these
> > enhancements into bugzilla. Once in bugzilla, more detail should be
> > added to each enhancement so that we can have a good discussion about
> > each one, prioritize them and get them implemented.
> >
> > Brad
Re: [Ganglia-developers] Lets discuss the wish-list and make 3.1.0 happen (was:Re: [Ganglia-general] 4T limit on memory?)
Hi Matt,

please do not hold back the meeting due to my schedule. Together with my job priorities I now have a personal matter that makes it more or less impossible for me to do any travel planning.

Cheers
Martin

----- Original Message -----
From: Matt Massie <[EMAIL PROTECTED]>
To: Bernard Li <[EMAIL PROTECTED]>
Cc: Ganglia Developers ; Brad Nicholes <[EMAIL PROTECTED]>
Sent: Wednesday, October 31, 2007 1:18:04 AM
Subject: Re: [Ganglia-developers] Lets discuss the wish-list and make 3.1.0 happen (was:Re: [Ganglia-general] 4T limit on memory?)

On 10/30/07, Bernard Li <[EMAIL PROTECTED]> wrote:
> Matt mentioned that GroundWork Open Source has some monies that could be
> used to fly some developers to the Bay Area and host a meetup -- I wonder
> if that offer is still on the plate (Matt?) --

as far as i know, the offer still stands.

> I am somewhat busy for the next two months (SuperComputing, etc.) so I
> think the earliest I can attend a meeting would be January. However, if
> the schedule is right, I could potentially fit it in November/December
> (the meeting will probably be a day or two I would think).

i think a day or two is what i was thinking as well. it looks like february will be the earliest we could do it given martin's schedule.

-matt
Re: [Ganglia-developers] Ganglia spec file cleanup
Hi Brad, - Original Message > From: Brad Nicholes <[EMAIL PROTECTED]> > To: Marcus Rueckert <[EMAIL PROTECTED]> > Cc: Ganglia Developers ; Alex > <[EMAIL PROTECTED]> > Sent: Wednesday, October 17, 2007 5:52:57 PM > Subject: Re: [Ganglia-developers] Ganglia spec file cleanup > > >>> On 10/17/2007 at 5:31 AM, in > message > <[EMAIL PROTECTED]>, Marcus > Rueckert wrote: > > On 2007-10-16 17:06:47 -0600, Brad Nicholes wrote: > >> >>> On 10/11/2007 at 4:11 PM, in message > >> > , > "Bernard Li" > >> wrote: > >> > Hi Alex: > >> > > >> > On 10/11/07, Alex wrote: > >> > > >> >> > - new subpackage modules-python which contains all > the > DSO/python > >> >> > modules (not really happy with the naming, so > suggestions > welcome!) > >> >> > > >> >> How about extensions-python? > >> > > >> > Actually I guess I also have concerns about the division of > the > files, > >> > since gmond contains the C modules and the modules-python contains > >> > just the python modules -- I wonder if this division is > necessary. > I > >> > guess I'll wait for some feedback from Brad since he's the one who > >> > came up with the code. > >> > > >> > >> I would rather have all of the metric modules (both C and python) > >> installed with the gmond package. But not all of them have to be > >> enabled. > > > > but that would still require to install python although i might > not > even > > use it. for the moment i would propose using a subpackage for it. > > > >> My vision moving forward (just my 2 cents) would be that the > >> ganglia community embrace the python interface as the preferred > way > to > >> extend gmond with new metric types. To promote this, installing and > >> configuring mod_python by default would encourage the use of the > >> python interface. 
I've mentioned this idea before on this list, I > >> would also like to see a python metric module repository as part of > >> the ganglia project that would allow the ganglia community to upload > >> and share metric modules similar to the gmetric repository. > > > > if an use wants a python based metric type he can easily install the > > package. > > > >> In our own internal RPM builds, we have been installing disabled > >> python modules to an "extra" directory. In other words, a disabled > >> python module .pyconf file would be installed to > >> /etc/ganglia/conf.d/extra and the corresponding .py module > file > would > >> be installed to /usr/lib/ganglia/python_modules/extra. This allows > >> the user to simply move the .pyconf from extra to conf.d and the .py > >> module from extra to python_modules. Then restart gmond and new > >> metrics appear. Another option would be to install the .pyconf as > >> .pyconf.off and the .py to the python_modules directory. With the > >> config file named .pyconf.off, the gmond configuration file parser > >> will ignore it during startup. The downside of this is that the .py > >> module will always be loaded just because it exists in the > >> python_modules directory, even if it isn't being used or > referenced > by > >> a configuration file. Of course without a corresponding > >> configuration, even if the .py module is loaded, it's metrics > won't > be > >> produced or appear in the -m metric list. > > > > you can/should do that even with the python module splitted out. > as > the > > user might not want all python metrics enabled. > > > >> Now after having said all of that, there is an option that could be > >> adopted later. If myself or anybody else enabled gmond with other > >> scripting language modules such as perl, PHP, TCL, etc., then > it > might > >> make more sense to split the different enabling modules with their > >> associated metric plugins, into separate RPM packages. 
But for now, > >> including the python enabling module along with the python metric > >> modules with gmond, seems more convenient. > > > > from a packager/dependency point of view it makes sense to split it out > > to give the user the choice if they want python or not. > > > > darix > > > Actually the scenario that I am proposing would eliminate all of the > built in hardcoded metrics and move them out of gmond as python modules. I definitely agree that the notion of "core metrics" should go in 3.1. At least they should no longer be hardcoded, but loadable. What I do not agree with (and you probably didn't mean it that way) is replacing everything with Python code. C modules should still be allowed and be first-class citizens :-) > This would allow gmond to be just the collection and transport daemon > as it should be. Then the user would have full control over which > metrics they want to allow in their system, which version of a metric they > want to use, allow them the ability to easily tweak a metric for their > particular platform if necessary without having to get into the guts > of gmond to do it. Absolutely. > It would also elim
Re: [Ganglia-developers] ganglia-webfrontend package hidden
Hi Bernard,

--- Bernard Li <[EMAIL PROTECTED]> wrote:
> Dear all:
>
> Just FYI I went ahead and "hid" the deprecated ganglia-webfrontend
> package on SF.net:
>
> http://sourceforge.net/project/showfiles.php?group_id=43021
>
> The ganglia-web component has been part of the ganglia monitoring core
> package since 3.0.0 and is integrated in the ganglia-.tar.gz tarball.

Very good.

> Any objections if I go ahead and rename "ganglia monitoring core" to
> just "ganglia"?

Fine with me.

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Release notes for 3.0.5
Bernard,

good job. Hope this will be a worthy "last" 3.0.x release.

Cheers
Martin

--- Bernard Li <[EMAIL PROTECTED]> wrote:
> Hi guys:
>
> Ganglia 3.0.5 is ready, I have prepared the release notes here:
>
> ---
> The Ganglia development team is proud to release version 3.0.5 (Louis)
> of the popular Ganglia monitoring software. Ganglia is a scalable
> distributed monitoring system for high-performance computing systems
> such as clusters and Grids.
>
> The following is a summary of changes in this release. For detailed
> changelog please refer to the ChangeLog file in the release
> distribution tarball:
>
> - [gmetad] Fixed a bug where messages are being discarded in MacOSX
>   and thus causing data from clients not being consistently and
>   accurately saved to the rrd files (Mike Walker)
> - [win32] Include documentation (README.WIN) for building under Windows
> - [webfrontend] Enlarge graphs by clicking on them (Ulf)
> - [webfrontend] Include RRDTool version in frontend footer (Matthew
>   Chambers)
> - [webfrontend] Only set the grid stack cookie if it hasn't been set
>   before (Matt Ryan)
> - [webfrontend] New feature to allow sorting by hosts up and hosts
>   down in meta context (Bernard Li, Eli Stair, Timothy D Witham)
> - [gstat] New option "-n" to show numeric addresses instead of
>   hostname (Bernard Li)
> - Builds under Yellow Dog Linux on Sony PlayStation 3 ppc64 (Bernard Li)
> - Do not automatically start services (gmond, gmetad) after RPM
>   installation (Bernard Li)
> - Add y-labels for some metrics. Needed to fix width of RRD images.
>   (Martin Knoblauch)
> - Build system (Autotools) enhancements (Carlo Marcelo Arenas Belon)
> - Misc bug fixes
>
> Work is underway for the next (3.1.0) release of Ganglia which will
> allow metrics to be dynamically loaded via DSO. These metrics can be
> written either in C or in Python making it extremely easy to create
> plugins for monitoring metrics not already present by default. Apr,
> expat and libconfuse will be built dynamically in the new release
> which will make packaging for distributions easier.
> ---
>
> I will be releasing this to SourceForge shortly, please let me know if
> you see any issues with the above wording.
>
> Thanks!
>
> Bernard

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] web site and ganglia 3.1.0
--- Matt Massie <[EMAIL PROTECTED]> wrote:
> guys-
>
> i hope all of you in the united states enjoyed a non-laborious labor
> day. for our peers in the rest of the world, i hope your day of the
> moon was a good one.
>
> as you might have noticed, we have an updated web site now
> (http://ganglia.info/). i plan to add the wiki and make some updates
> (thanks to feedback from bernard) soon. please feel free to let me
> know what changes you'd like to see to the site. my hope is to make it
> easier for people to find the information they need. thanks again to
> bernard for the mail-archive idea.
>
> lastly, i spoke with groundwork open source and they suggested we talk
> about having a ganglia 3.1.0 ganglia get-together. they offered to
> help with transportation costs for some of our group (e.g. martin in
> germany). we should get together and work to push 3.1.0 out. would you
> guys like to gather in san francisco to meet and release the 3.1.0
> release of ganglia? let me know what you think about it.

Hi Matt,

first of all, the new web site looks very good. Good job.

As for a 3.1.x meeting, I believe that it is a great idea. Some brainstorming on what should happen in 3.1.x is really needed. And if your company helps people with traveling it is even better.

As for me, it really depends on the when. I am not 100% master of my time. My day-job employer has some say about it and it might be difficult for me to go away for a week before next February.

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] gmetad segfaults upgrading from 3.0.3 to 3.0.4
--- Andrea Capriotti <[EMAIL PROTECTED]> wrote:
> On Wed, 18/07/2007 at 11.11 -0700, Bernard Li wrote:
> > Well, looks like Charles no longer works for Oracle (his email
> > address bounced).
> >
> > Anyways, another thing I would like you to try is to run gmetad in
> > debug mode and see if it gives us any hints to why it segfaulted.
>
> # ./gmetad -d 5
> Going to run as user nobody
> Sources are ...
> Source: [Cray_XD1_Linux_Cluster, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> Source: [Front_End_Cluster, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> Source: [GNU_Linux_Cluster, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> Source: [BCX_Linux_Cluster, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> Source: [SP5, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> Source: [BCC_Linux_Cluster, step 25] has 1 sources
> xxx.xxx.xxx.xxx
> xml listening on port 8651
> interactive xml listening on port 8652
> Data thread 1090386864 is monitoring [Cray_XD1_Linux_Cluster] data source
> Data thread 1092488112 is monitoring [Front_End_Cluster] data source
> xxx.xxx.xxx.xxx
> Data thread 1094589360 is monitoring [GNU_Linux_Cluster] data source
> xxx.xxx.xxx.xxx
> xxx.xxx.xxx.xxx
> Data thread 1096690608 is monitoring [BCX_Linux_Cluster] data source
> Data thread 1099959216 is monitoring [SP5] data source
> xxx.xxx.xxx.xxx
> xxx.xxx.xxx.xxx
> Data thread 1102060464 is monitoring [BCC_Linux_Cluster] data source
> xxx.xxx.xxx.xxx
> cleanup thread has been started
> [Front_End_Cluster] is a 2.5 or later data stream
> hash_create size = 1024
> hash->size is 1031
> hash_create size = 50
> hash->size is 53
> hash_create size = 50
> hash->size is 53
> Updating host node01.fec.cineca.it, metric disk_free
> Updating host node01.fec.cineca.it, metric bytes_out
> Updating host node01.fec.cineca.it, metric proc_total
> [..]
> Writing Summary data for source Front_End_Cluster, metric swap_total
> Updating host node057.clx.cineca.it, metric cpu_idle
> Updating host node028, metric cpu_idle
> Updating host sp062, metric cpu_num
> Writing Summary data for source Front_End_Cluster, metric part_max_used
> Updating host node057.clx.cineca.it, metric cpu_user
> Updating host ch476-n5.xd1.cineca.it, metric mem_total
> Updating host sp062, metric load_fifteen
> Updating host node028, metric cpu_user
> Updating host node057.clx.cineca.it, metric swap_free
> Segmentation fault
>
> If I try again it segfaults in a different point:
>
> # ./gmetad -d5
> [..]
> Updating host ch472-n3.xd1.cineca.it, metric cpu_nice
> hash_create size = 50
> hash->size is 53
> Updating host node0964.bcx.cineca.it, metric disk_free
> Updating host sp061, metric mem_cached
> Writing Summary data for source Front_End_Cluster, metric swap_total
> Updating host node380.clx.cineca.it, metric mem_cached
> Updating host sp061, metric load_five
> Writing Summary data for source Front_End_Cluster, metric part_max_used
> Updating host node038, metric pkts_out
> Updating host sp061, metric cpu_num
> Updating host ch472-n3.xd1.cineca.it, metric cpu_speed
> Segmentation fault
>
> Let me know if you need the whole log.
>
> Best Regards
> --
> Andrea Capriotti
> System Management Group - Cineca - www.cineca.it
> [EMAIL PROTECTED] - Tel +39 051 6171890

Andrea,

do you have a chance to run gmetad under control of a debugger to see where exactly the segfault happens? Apparently the pointer that is NULLified by the patch for bz#56 gets referenced later on, leading to the problem.

Thanks
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Web frontend graph sizes
Matt,

cool. This would make the size calculations for the web frontend much much easier. I always wondered why RRDTOOL does not have a way to specify the size of the whole image.

Cheers
Martin

--- Matthew Chambers <[EMAIL PROTECTED]> wrote:
> I have made changes to the development version of rrdtool's graph mode
> that introduces a "--full-size-mode" option which, when enabled, will
> cause the -width and -height parameters to refer to the actual size of
> the output image. The main graph area is automatically adjusted based
> on the space necessary for the legend, border spacing, graph title,
> axis labels, and the y axis name (and also the pie chart, but does
> anybody use that?). Without --full-size-mode, the -width and -height
> parameters control the main graph area's dimensions as usual.
>
> RRDTOOL folks, is this contribution something that would be useful in
> a future release?
>
> HTH,
>
> Matt Chambers

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Removing the static dependancy on APR from Ganglia...
Folks,

I tend to agree with Nick. If we move to use apr-1.2.x we can just upgrade the static code. When we moved from 0.9.2 (or .4) to 0.9.7, I considered going 1.2.x instead, but did not have the bandwidth to make the necessary changes. But now the code changes are done anyway for trunk.

Cheers
Martin

--- Nick Galbreath <[EMAIL PROTECTED]> wrote:
> Hi Brad...
>
> RE: Apr 0.9X vs Apr 1.2.X
>
> I guess I'm a bit confused. I like the configure switch, but why not
> nuke the 0.9.7 and put in the 1.2.X in srclib, then no ifdefs are
> needed and everyone knows what version to use. To make a patch now, I
> have to pull two copies of APR and compare differences. Even if we
> defer linking to dynamic libraries it seems like using the new apr
> bits (statically) is still a good step. Or what am I missing?
>
> thanks!
>
> --nickg
>
> On 4/25/07, Brad Nicholes <[EMAIL PROTECTED]> wrote:
> >
> > I have committed the patches to add --with-libapr to configure.in
> > which allows the project to build against the distro version of
> > libapr 1.2.x or to specify an alternate 1.2.x build. If
> > --with-libapr= is specified, it will build and link with the libapr
> > found in the specified path. For now if --with-libapr is not
> > specified at configure time, it will still build and statically link
> > against the 0.9.7 version found in the srclib/apr directory. In
> > order to move from apr 0.9.7 to 1.2.x, I had to add some #ifdef's in
> > gmond.c and apr_net.c to handle the differences. Once we decide to
> > remove apr 0.9.7 completely and only link dynamically to apr 1.2.x,
> > these #ifdef's can be removed.
> >
> > Now that this move to APR 1.2.x has been done, this should pave the
> > way for several things:
> >
> > - allow any plugable metrics module to use APR functions as well
> > - eliminate libexpat and use the expat functions from APR-Util
> > - replace the multicast functions in apr_net.c with the APR
> >   multicast functions
> >
> > I plan to work on these tasks as I find time, but if somebody else
> > wants to tackle them, please speak up and go ahead.
> >
> > Brad
> >
> > >>> On 4/24/2007 at 8:44 AM, in message <[EMAIL PROTECTED]>,
> > "Brad Nicholes" <[EMAIL PROTECTED]> wrote:
> > > FYI, I am working on removing the static dependancy on APR from
> > > GMOND and other ganglia binaries. In the process I am also moving
> > > Ganglia from APR 0.9.7 to APR 1.2.x. This first pass will add a
> > > --with-libapr option to configure which will be interpreted as
> > > linking dynamically to the distro's version of APR rather than the
> > > internal static APR library. In follow on patches, I would like to
> > > see the static version of APR removed completely and allow the
> > > --with-libapr to specify which APR library to link with if you
> > > would rather link with your own built version of APR or use the
> > > distro's version. The main reasoning behind this move is so that
> > > the metrics modules that are plugged into gmond, can also take
> > > advantage of APR. Thinking further ahead, I would also like to see
> > > libexpat removed in favor of using the expat functionality built
> > > into APR-Util.
> > >
> > > Comments?
> > >
> > > Brad

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] Removing the static dependancy on APR from Ganglia...
Hi Brad, GO GO GO :-) Really great that you are looking into it. This has been a complaint from several people for some time now. Cheers Martin --- Brad Nicholes <[EMAIL PROTECTED]> wrote: > FYI, I am working on removing the static dependancy on APR from > GMOND and other ganglia binaries. In the process I am also moving > Ganglia from APR 0.9.7 to APR 1.2.x. This first pass will add a > --with-libapr option to configure which will be interpreted as > linking dynamically to the distro's version of APR rather than the > internal static APR library. In follow on patches, I would like to > see the static version of APR removed completely and allow the > --with-libapr to specify which APR library to link with if you would > rather link with your own built version of APR or use the distro's > version. The main reasoning behind this move is so that the metrics > modules that are plugged into gmond, can also take advantage of APR. > Thinking further ahead, I would also like to see libexpat removed in > favor of using the expat functionality built into APR-Util. > > Comments? > > Brad > > > - > This SF.net email is sponsored by DB2 Express > Download DB2 Express C - the FREE version of DB2 express and take > control of your XML. No limits. Just data. Click to get it now. > http://sourceforge.net/powerbar/db2/ > ___ > Ganglia-developers mailing list > Ganglia-developers@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/ganglia-developers > > -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de
Re: [Ganglia-developers] Updates to ganglia.spec.in (/etc/ganglia/conf.d)
Hi Folks,

let's just say that the next "trunk" release will be 3.1.0. We can break compatibility of config files (and locations) in this move. We just need to keep compatibility of the datastreams.

And I think moving the main config files to "/etc/ganglia" (is this configurable?) makes sense. Although distributions may think differently :-)

Cheers
Martin

--- Brad Nicholes <[EMAIL PROTECTED]> wrote:
> >>> On 4/11/2007 at 12:01 PM, in message <[EMAIL PROTECTED]>,
> "Bernard Li" <[EMAIL PROTECTED]> wrote:
> > Hi Brad:
> >
> > Thanks for checking in the changes for accommodating
> > /etc/ganglia/conf.d.
> >
> > In the future, when you update ganglia.spec.in, can you please make
> > sure you add a blurb to %changelog specifying what you have changed?
> > I know it is sort of redundant since the spec file is under revision
> > control, however, for RPM users, they will at least be able to see
> > what has changed using the rpm --changelog option.
>
> Sorry, I meant to do that and then forgot just before I checked in
> the changes.
>
> > While we're doing a bit of re-organization, do we want to move
> > gmond.conf and gmetad.conf to /etc/ganglia as well?
>
> I think that would be a good idea. I didn't do it up front because of
> the impact that moving the .conf file would have on backward
> compatibility. However, I do think that it would be cleaner to have
> all of the configuration files under an /etc/ganglia subdirectory.
>
> Brad

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
Re: [Ganglia-developers] AIX Consolidation
Hi Michael,

I guess the POWER5 extensions would be good candidates for dynamic loading into the gmond stream. In any case, I see no reason not to keep them in the core code, even if they are not enabled by default.

One thing that I like more with the current code are the "combined" functions for retrieving related metrics (get all cpu and network stats) at one point in time. They reduce syscall overhead and keep metrics together (important for CPU usage).

Cheers
Martin

--- Michael Perzl <[EMAIL PROTECTED]> wrote:
> Hi Martin,
>
> if possible I would like to somehow take my version (after some
> reviewing) :-), as it contains all the new POWER5 stuff already.
>
> My understanding is - as it would require some changes to protocol.x -
> that my changes won't have a chance to get into the core Ganglia
> source code until version 3.1 comes along.
>
> This code and everything else (RPMs) can be found on my website
> http://www.perzl.org/ganglia/.
>
> This stuff is actually in use at quite many customer sites already
> (runs on AIX 4.3.3, 5.1, 5.2 and 5.3) so I would like to keep that
> POWER5-stuff in if possible. Actually, an AIX gmond implementation
> without the POWER5-stuff based on my implementation could be done very
> easy (just stripping off the POWER5-addons).
>
> Regards,
> Michael
>
> Martin Knoblauch wrote:
> > Michael, Andreas,
> >
> > any chance that you could consolidate the two versions of the AIX
> > metrics that seem to be around? Seem you are the ones who have
> > worked most recently on the AIX implementation.
> >
> > Cheers
> > Martin
> >
> > --- Michael Perzl <[EMAIL PROTECTED]> wrote:
> >
> >> Andreas,
> >>
> >> thank you for taking the blame but you are off the hook here. ;-)
> >>
> >> If I understood David correctly, he is using my AIX Ganglia RPM
> >> packages with POWER5 extensions. Here most if not all
> >> implementation of how the metrics are collected under AIX have been
> >> changed. Everything is documented on my homepage
> >> (http://www.perzl.org/ganglia/) though.
> >> So everything what goes wrong here is entirely my fault :-[
> >>
> >> After some investigating and some discussions with Nigel I have
> >> come to terms with the following facts regarding the
> >> bytes_in/bytes_out problem:
> >> - libperfstat (the library on AIX which obtains all the system
> >>   performance data) uses u_longlong_t data types (these are
> >>   definitely 64-bit large).
> >> - The AIX kernel internally, though, may probably not be using
> >>   64-bit data types - more realistic is probably unsigned 32-bit -
> >>   in order not to break compatibility (my personal opinion)
> >> - The consequence now is that integer overrun may occur much easier
> >>   with 32-bit data types than with 64-bit data types (we all
> >>   probably don't live long enough to see that happen).
> >>
> >> Please take a look at my implementation of the bytes_in metric
> >> (the bytes_out implementation is accordingly):
> >>
> >> 01 g_val_t
> >> 02 bytes_in_func( void )
> >> 03 {
> >> 04    g_val_t val;
> >> 05    perfstat_netinterface_total_t n;
> >> 06    static u_longlong_t last_bytes_in = 0, bytes_in;
> >> 07    static double last_time = 0.0;
> >> 08    double now, delta_t;
> >> 09    struct timeval timeValue;
> >> 10    struct timezone timeZone;
> >> 11
> >> 12    gettimeofday( &timeValue, &timeZone );
> >> 13
> >> 14    now = (double) (timeValue.tv_sec - boottime) + (timeValue.tv_usec / 1000000.0);
> >> 15
> >> 16    if (perfstat_netinterface_total( NULL, &n, sizeof( perfstat_netinterface_total_t ), 1 ) == -1)
> >> 17       val.f = 0.0;
> >> 18    else
> >> 19    {
> >> 20       bytes_in = n.ibytes;
> >> 21
> >> 22       delta_t = now - last_time;
> >> 23
> >> 24       if ( delta_t )
> >> 25          val.f = (double) (bytes_in - last_bytes_in) / delta_t;
> >> 26       else
> >> 27          val.f = 0.0;
> >> 28
> >> 29       last_bytes_in = bytes_in;
> >> 30    }
> >> 31
> >> 32    last_time = now;
> >> 33
> >> 34    return( val );
> >> 35 }
> >>
[Ganglia-developers] AIX Consolidation (was: Re: [Ganglia-general] Help! I have a petabyte/s network (Martin Knoblauch))
Michael, Andreas,

any chance that you could consolidate the two versions of the AIX metrics that seem to be around? Seems you are the ones who have worked most recently on the AIX implementation.

Cheers
Martin

--- Michael Perzl <[EMAIL PROTECTED]> wrote:

> Andreas,
>
> thank you for taking the blame but you are off the hook here. ;-)
>
> If I understood David correctly, he is using my AIX Ganglia RPM
> packages with POWER5 extensions. Here most if not all implementations
> of how the metrics are collected under AIX have been changed.
> Everything is documented on my homepage
> (http://www.perzl.org/ganglia/) though.
> So everything that goes wrong here is entirely my fault :-[
>
> After some investigating and some discussions with Nigel I have come
> to terms with the following facts regarding the bytes_in/bytes_out
> problem:
> - libperfstat (the library on AIX which obtains all the system
>   performance data) uses u_longlong_t data types (these are definitely
>   64 bits wide).
> - The AIX kernel internally, though, may probably not be using 64-bit
>   data types - more realistic is probably unsigned 32-bit - in order
>   not to break compatibility (my personal opinion)
> - The consequence now is that integer overrun may occur much more
>   easily with 32-bit data types than with 64-bit data types (we all
>   probably don't live long enough to see that happen).
>
> Please take a look at my implementation of the bytes_in metric (the
> bytes_out implementation is analogous):
>
> 01 g_val_t
> 02 bytes_in_func( void )
> 03 {
> 04    g_val_t val;
> 05    perfstat_netinterface_total_t n;
> 06    static u_longlong_t last_bytes_in = 0, bytes_in;
> 07    static double last_time = 0.0;
> 08    double now, delta_t;
> 09    struct timeval timeValue;
> 10    struct timezone timeZone;
> 11
> 12    gettimeofday( &timeValue, &timeZone );
> 13
> 14    now = (double) (timeValue.tv_sec - boottime) + (timeValue.tv_usec / 100.0);
> 15
> 16    if (perfstat_netinterface_total( NULL, &n, sizeof( perfstat_netinterface_total_t ), 1 ) == -1)
> 17       val.f = 0.0;
> 18    else
> 19    {
> 20       bytes_in = n.ibytes;
> 21
> 22       delta_t = now - last_time;
> 23
> 24       if ( delta_t )
> 25          val.f = (double) (bytes_in - last_bytes_in) / delta_t;
> 26       else
> 27          val.f = 0.0;
> 28
> 29       last_bytes_in = bytes_in;
> 30    }
> 31
> 32    last_time = now;
> 33
> 34    return( val );
> 35 }
>
> In my opinion the overrun occurs in line #25 when "bytes_in <
> last_bytes_in".
> In my naivety I had assumed, as both are of type u_longlong_t, that an
> integer overrun might never happen.
>
> So to solve the overrun a check for "bytes_in < last_bytes_in" must be
> introduced, something like:
>
>    u_longlong_t d;
>    d = bytes_in - last_bytes_in;
>    if (d < 0) d += ULONG_MAX;
>
> and line #25 would essentially become
> 25          val.f = (double) d / delta_t;
>
> Comments ?
>
> Regards,
> Michael
>
> PS: David, the reason why you don't see it happen with pkts_in and
> pkts_out is that probably no overrun so far has occurred but at some
> point it will also happen.
>
> PPS: David, if this is a solution (I want some comments on that
> before, though) then I would be building new RPMs with the then
> hopefully correct code.
>
> Andreas Schoenfeld wrote:
> > Hi David and Martin,
> >
> > I suppose the network code is still the code I wrote, so there are
> > two problems I know of:
> > 1. yes there is a problem with overflows
> > 2. the shown network traffic is the sum of all network interfaces
> >    including local loopback devices (lo0...).
> >
> > Both problems could lead to astonishing data transfer rates in
> > ganglia.
> >
> > Sorry I had promised to fix the problems, but there was too much
> > other work ...
> >
> > Best regards
> >    Andreas
> >
> >> Date: Thu, 29 Mar 2007 08:21:38 -0700 (PDT)
> >> From: Martin Knoblauch <[EMAIL PROTECTED]>
> >> Subject: Re: [Ganglia-general] Help! I have a petabyte/s network
> >> To: David Wong <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
> >>     [EMAIL PROTECTED]
> >> Message-ID: <[EMAIL PROTECTED]>
> >> Content-Type: text/plain; charset=iso-8859-1
> >>
> >> David,
> >>
> >> good catch. I will have to look at it for a bit.
> >>
> >> Cheers
> >> Martin
> >> --- David Wong <[EMAIL
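Editorial note on the fix proposed for line #25: since d is declared as u_longlong_t, which is unsigned, the test "d < 0" can never be true, so the correction would not fire as written. On a platform where unsigned long is 32 bits, adding ULONG_MAX rather than ULONG_MAX + 1 would also be off by one. The sketch below shows one common way to handle a wrapping counter, under the assumption made in the thread that the underlying counter is 32 bits wide. The names COUNTER_BITS, counter_delta, counter_rate and timeval_to_seconds are hypothetical and are not part of Ganglia or libperfstat. The last helper reflects that tv_usec counts microseconds, so the divisor of 100.0 in line 14 of the listing looks like it was meant to be 1000000.0.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Assumption from the thread: the kernel counter behind the 64-bit
 * libperfstat field is really only 32 bits wide. For a true 64-bit
 * counter, plain unsigned subtraction is already wrap-safe and no
 * masking is needed. */
#define COUNTER_BITS 32

/* Bytes counted between two samples of a counter that wraps at
 * 2^COUNTER_BITS. Unsigned subtraction is modular, so masking the
 * result to the counter width is correct across a single wrap. */
uint64_t counter_delta(uint64_t now, uint64_t last)
{
    const uint64_t mask = (UINT64_C(1) << COUNTER_BITS) - 1;
    return (now - last) & mask;
}

/* Rate in bytes per second; 0 when no time has elapsed or the clock
 * appears to have gone backwards. */
double counter_rate(uint64_t now, uint64_t last, double delta_t)
{
    if (delta_t <= 0.0)
        return 0.0;
    return (double) counter_delta(now, last) / delta_t;
}

/* Convert a struct timeval to seconds; tv_usec is in microseconds. */
double timeval_to_seconds(const struct timeval *tv)
{
    return (double) tv->tv_sec + (double) tv->tv_usec / 1000000.0;
}
```

In the listing, line #25 would then read something like val.f = counter_rate( bytes_in, last_bytes_in, delta_t ). This only covers a single wrap between two samples; if the counter can wrap more than once per sampling interval, the lost wraps cannot be recovered from the two values alone.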
Re: [Ganglia-developers] Adding extensibility to gmond...
Hi Matt,

--- Matt Massie <[EMAIL PROTECTED]> wrote:

> i just added brad to the list of developers with write access to our
> subversion repository. welcome to the team, brad!

let me join you in welcoming Brad to the crowd. His work looks exciting and needs to go in.

> i've created a branch called "monitor-core-3.0-beta" for 3.0.x
> development from what is currently in trunk. we might not need quite
> as rigorous a process as apache but i think we need to start using
> branches for development.

That is definitely true. Let's not overdo it, but some kind of structuring of the repository is needed.

> all 3.0.x code development should be out
> of ./tags/monitor-core-3.0-beta instead of trunk. just as before,
> with each new release we'll create a branch called
> monitor-core-3.0.n.

makes sense to me. 3.0.X is in a kind of maintenance mode in my opinion, so putting development of cool stuff into trunk is OK.

> we should keep the trunk open for more creative and radical play. i
> recommend that brad put his updates into the trunk for members to
> start testing. the sooner we do this the better because it
> will help us decide where to put the new code once it's fleshed out
> (monitor-core-3.0-beta, 3.1-beta or elsewhere).

see above.

> does that seem reasonable martin et al? i don't want to hold up
> brad's new code and now our current 3.0 branch is safely isolated.
> since we don't have a formal voting mechanism, i suggest that if we
> don't hear objections by wednesday of this week.. the trunk is open
> for brad to drop in his enhancements. i really hope though that
> others chime in on their thoughts on this.

No objections from me. Brad should go ahead as he sees fit.

Martin

> -matt
>
> On Mon, 2007-03-26 at 11:21 -0600, Brad Nicholes wrote:
> > >>> On 3/26/2007 at 9:38 AM, in message
> > <[EMAIL PROTECTED]>, Martin Knoblauch
> > <[EMAIL PROTECTED]> wrote:
> > > Hi Brad,
> > >
> > > for the extensibility stuff, I believe we have not yet decided
> > > how to proceed:
> > >
> > > - put it in 3.0 - only possible if it does not break
> > >   compatibility with existing gmond datastreams. We are very
> > >   careful here.
> > > - branch of 3.1 - needed when the core metrics array is changed
> > > - branch of 4.0 - needed on major architectural change
> > >
> > > How would you personally rate the extensibility functions.
> > >
> > > Cheers
> > > Martin
> > >
> >
> > The gmond extensibility is backward compatible. I have actually
> > run both the enhanced gmond and the original 3.0.4 version of
> > gmond in the same cluster without any problem. Since the modular
> > metrics show up as SOURCE="GMETRICS", a 3.0.4 version of gmond
> > doesn't know the difference. The only issue would be
> > cross-platform compatibility in terms of the actual source code.
> > I don't believe that I have introduced any problems but since I
> > can't test on all platforms that are supported by Ganglia, I can't
> > be sure.
> >
> > Since I haven't been around the Ganglia project for long, I am not
> > sure what all of your policies are with regard to new development
> > and how to integrate it with existing code. However, I can
> > describe to you the way we do it in the Apache HTTPD project,
> > which seems to work very well. You may already have policies in
> > place to allow for new development to happen in the same tree. If
> > anything that I describe here works for the Ganglia project, it
> > could probably be adapted into your existing policies.
> >
> > The Apache HTTPD project takes advantage of both trunk and
> > branches. Trunk is always considered to be unstable and
> > considered to be the next major version of the source code. In
> > other words, trunk for the Ganglia project would be considered
> > version 4.0. At the time when a major or minor version release is
> > considered by the community (i.e. 3.0-beta, 3.2-beta, 4.0-beta,
> > 4.2-beta, etc.), a stable branch is created with the base code
> > being the rolled alpha or beta release. A STATUS file is also
> > created and placed in the branch. I will explain what the STATUS
> > file is for a little later.
> >
> > All development, including enhancements and bug fixes, continues
> > to be checked into trunk. However the concepts of
> > Commit-Then-Review (CTR) and Review-T
Re: [Ganglia-developers] Adding extensibility to gmond...
Hi Brad,

for the extensibility stuff, I believe we have not yet decided how to proceed:

- put it in 3.0 - only possible if it does not break compatibility with existing gmond datastreams. We are very careful here.
- branch of 3.1 - needed when the core metrics array is changed
- branch of 4.0 - needed on major architectural change

How would you personally rate the extensibility functions?

Cheers
Martin

--- Brad Nicholes <[EMAIL PROTECTED]> wrote:

> >>> On 3/1/2007 at 1:49 PM, in message
> <[EMAIL PROTECTED]>, "Brad Nicholes" <[EMAIL PROTECTED]> wrote:
> > All,
> >    I have just added an enhancement request to bugzilla (#129) for
> > adding modular metric extensibility to gmond. I have also attached
> > a patch file and example module to the bug report that add this
> > functionality. Hopefully you will find this enhancement useful and
> > commit it to the ganglia SVN repository. Let me know if you have
> > any questions or issues with the patches.
> >
> > Brad
> >
>
> It's been several weeks since I submitted the patch to add
> extensibility to gmond through metrics modules. I haven't seen much
> feedback nor have I seen the code committed to the repository. Has
> the community had a chance to review the extensibility code? I have
> also submitted other patches through bugzilla which I haven't seen
> any feedback on either. I also have additional patches which I
> would like to see committed to the repository that I believe will
> enhance the functionality of Ganglia. It would be much easier to
> submit these new patches if the patches that I have already proposed
> could be committed. I would be happy to commit the patches myself if
> I were given commit rights.
>
> thanks,
> Brad

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de