Re: [Ganglia-general] error building in FreeBSD like system: " don't know how to make all"
On Fri, 11 Apr 2014 20:30:43 -0400 Rita Rita wrote:

Hi Rita,

> what patch did you apply?

IIRC I had to define the OneFS OS in config.guess so it knows that it's a BSD system.

> did you get it to work on OneFS?
> > That worked and installed a new ganglia (3.4).
> > unfortunately, this new version still reports wrong values (negative
> > memory and no network information at all)...

Also, gmond was using 99% of cpu... :-(

Cheers,
Arnau

___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
[Ganglia-general] custom log file for gmetad
Hi,

is there any way to make gmetad write to a custom log file? It's using messages by default. I've not found any reference to syslog in the docs or in gmetad.conf.

TIA,
Arnau
Re: [Ganglia-general] Combining metrics from several RRD files
On Fri, 31 Jan 2014 10:37:19 +0100 Martin Knoblauch wrote:

> Hi friends,

Hi,

> hope somebody already had this problem and solved it. So I have a
> cluster where we monitor the status (size, used, free) for several
> filesystems using Ganglia. Looks all great in the browser, but now
> the customer wants to have those data sets combined into one. In
> order to not lose the data we have, I want to combine those into one
> RRD. All the "source" RRDs have identical structure (RRAs) and
> timestamps.
> Any solution? Ideas?

If I've understood you properly:

1.-) use the "Aggregate Graphs" from ganglia's web.
2.-) create a custom graph and add it to one host. Quick google search:
     http://sourceforge.net/mailarchive/forum.php?thread_name=503E2A47.6020705%40gmail.com&forum_name=ganglia-general
3.-) as they are RRDs you can mix them using your own script (bash, perl, python)

HTH,

> Cheers
> Martin

Arnau
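Option 3 above can be sketched in Python. This is only a rough illustration, not something from the thread: it builds an `rrdtool xport` command line that sums the same data source across several RRD files with a CDEF. The function name `build_xport_args`, the data source name "sum" and the AVERAGE consolidation function are assumptions (check yours with `rrdtool info`), and it exports the combined series rather than writing a new RRD.

```python
from typing import List


def build_xport_args(rrd_files: List[str], ds_name: str = "sum",
                     cf: str = "AVERAGE", start: str = "-1d") -> List[str]:
    """Build an `rrdtool xport` argument list that adds several RRDs together.

    ds_name and cf are assumptions about how the RRDs were created;
    verify them with `rrdtool info <file>` before relying on this.
    """
    if not rrd_files:
        raise ValueError("need at least one RRD file")

    args = ["rrdtool", "xport", "--start", start]
    names = []
    for i, path in enumerate(rrd_files):
        name = f"v{i}"
        names.append(name)
        # One DEF per source file: DEF:<vname>=<rrdfile>:<ds-name>:<CF>
        args.append(f"DEF:{name}={path}:{ds_name}:{cf}")

    # RPN expression for the sum: v0,v1,+,v2,+,...
    rpn = names[0] + "".join(f",{n},+" for n in names[1:])
    args.append(f"CDEF:total={rpn}")
    args.append("XPORT:total:combined")
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` on a host where rrdtool is installed. Because the source RRDs share structure and timestamps, the summed series lines up without resampling.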
Re: [Ganglia-general] error building in FreeBSD like system: " don't know how to make all"
On Tue, 7 Jan 2014 12:22:32 + Nicholas Satterly wrote:

> Hi,

Hi!

> Did you run "./bootstrap" and "./configure" before you tried "make
> all-recursive"?

Nope... :-)

I had to run it in libmetric and fix , add a patch to build/config.guess, run autoreconf --force --install (problems with libtool), and then, from the root directory:

./configure --with-libapr=/usr/local/apache2/bin/apr-1-config --with-libconfuse=/usr/local/
make all-recursive
make install

That worked and installed a new ganglia (3.4).

Unfortunately, this new version still reports wrong values (negative memory and no network information at all)... I'll try to debug and/or open a new thread for that.

> --Nick.

Many thanks for your help!
Arnau
[Ganglia-general] error building in FreeBSD like system: " don't know how to make all"
Hi all,

We're running "default" ganglia on a OneFS 6.5 system (FreeBSD like). The problem is that it is quite old:

# pkg_info|grep ganglia
ganglia-monitor-core-3.1.1_3 Ganglia cluster monitor, monitoring daemon

and the plots are incorrect (i.e. no network traffic and negative memory).

So, I wanted to recompile a newer version: 3.4 (3.6 is too new). After adding the new OS to build/config.guess (it now reports as freebsd):

echo x86_64-unknown-freebsd
exit ;;

I did the make and it fails with the error:

make all-recursive
Making all in lib
Making all in libmetrics
make: don't know how to make all. Stop
*** Error code 1
Stop in /usr/local/crg/ganglia-3.4.0.
*** Error code 1
Stop in /usr/local/crg/ganglia-3.4.0.

I don't know what it means or how to continue at this point. Could any other user or a developer give me a hand on this?

TIA,
Arnau
[Ganglia-general] dwoo dir was [Re: Missing host stacked graphs]
Hi again,

I think I've found the source of the problem but I don't understand what's happening. It was related to the custom file:

/var/lib/ganglia-web/conf/host/host_node.domain.json

I don't know why, but that file (which had worked for a long time) was creating problems.

HTH other people.

Cheers,
Arnau
[Ganglia-general] Missing host stacked graphs
Hi all,

I can't say since when, but my hosts are no longer showing stacked per-host graphs (it works on a per-group basis)... I see no errors in http nor messages. I'm running ganglia-web-3.5.4-1.noarch.

Has anyone seen this before? Any hint on where to start looking? I know this is a vague question, but I don't know how to start debugging this...

TIA,
Arnau
Re: [Ganglia-general] Ganglia network monitoring information problems
On Wed, 5 Dec 2012 23:10:36 +0800 Hong Wayne wrote:

Hi,

[...]
> On the other hand, the question about second problem.
>
> What's the difference between byte_in.rrd and pkts_in.rrd?

number of bytes / number of packets?

> Wayne.

Arnau
Re: [Ganglia-general] Ganglia network monitoring information problems
On Tue, 4 Dec 2012 23:15:44 +0800 Hong Wayne wrote:

> Dear all:

Hi Wayne,

> I already install Ganglia in Ubuntu 12.04 LTS, but I have some
> problems to ask.
>
> The first problem is when I browse ganglia website which the IP
> address is IP_Address/ganglia
>
> I got the information about
> ''Forbidden You don't have permission to access /ganglia on this
> server.''

this is an Apache-related question. Check its logs and ganglia's conf file:

DocumentRoot "/var/www/html/ganglia"
ServerName ganglia.domain
DirectoryIndex index.php
Options Indexes FollowSymLinks
Order deny,allow
Deny from all
Allow from all
ServerAdmin $MAIL
ErrorLog /etc/httpd/logs/ganglia_error_log
CustomLog /etc/httpd/logs/ganglia_access_log combined

> It looks like SELinux not disabled. So I commanded the following
> lines:
>
> sudo /etc/init.d/apparmor stop
> sudo /etc/init.d/apparmor teardown
> sudo update-rc.d -f apparmor remove
>
> It still not work!! So what's the problem?
>
> And the second problem is whether the file bytes_in.rrd is the network
> monitoring information from Ganglia or not?

that file should contain the network 'in' traffic from one of your hosts. It is an RRD file; it has nothing to do with configuration.

> If it's not, then what's the correct network monitoring information
> for Ganglia?

What are you looking for? I don't understand your question.

> Thanks for helping.
>
>
> Wayne.

Cheers,
Arnau
Re: [Ganglia-general] ganglia 3.4.0 network traffic wrong
Hi,

I've found the source of the problem and it's in the driver. From time to time those network cards give an error like:

Oct 18 02:58:50 dc016 kernel: do_IRQ: 10.66 No irq handler for vector (irq -1)
Oct 18 02:58:56 dc016 kernel: [ cut here ]
Oct 18 02:58:56 dc016 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Oct 18 02:58:56 dc016 kernel: Hardware name: ProLiant BL460c G6
Oct 18 02:58:56 dc016 kernel: NETDEV WATCHDOG: eth0 (bnx2x): transmit queue 6 timed out
Oct 18 02:58:56 dc016 kernel: Modules linked in: ipmi_devintf nfs lockd fscache nfs_acl auth_rpcgss sunrpc pcc_cpufreq ipv6 xfs exportfs tcp_htcp power_meter ipmi_si ipmi_msghandler hpwdt hpilo bnx2x libcrc32c mdio microcode serio_raw iTCO_wdt iTCO_vendor_support sg i7core_edac edac_core shpchp dm_round_robin ext3 jbd mbcache sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Oct 18 02:58:56 dc016 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.5.1.el6.x86_64 #1
Oct 18 02:58:56 dc016 kernel: Call Trace:
Oct 18 02:58:56 dc016 kernel: [] ? warn_slowpath_common+0x87/0xc0
[...]
Oct 18 02:58:58 dc016 kernel: bnx2x: [bnx2x_clean_tx_queue:1382(eth0)]timeout waiting for queue[6]: txdata->tx_pkt_prod(4970) != txdata->tx_pkt_cons(4723)
Oct 18 02:59:08 dc016 kernel: bnx2x: [bnx2x_state_wait:337(eth0)]timeout waiting for state 7
Oct 18 02:59:09 dc016 kernel: bnx2x :02:00.0: eth0: using MSI-X IRQs: sp 58 fp[0] 60 ... fp[14] 74
Oct 18 02:59:10 dc016 kernel: bnx2x :02:00.0: eth0: NIC Link is Up, 1 Mbps full duplex, Flow control: ON - receive & transmit

and from then on, it gives those Peta/Tera byte peaks.

Cheers,
Arnau
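To illustrate why a driver reset can turn into an absurd traffic peak, here is a small Python sketch of the arithmetic. It is not Ganglia's code, and whether gmond computes its deltas exactly this way is an assumption; the point is only that an unsigned subtraction on a counter that has gone backwards yields a huge number. The function names and counter values are hypothetical.

```python
U64 = 2 ** 64  # size of an unsigned 64-bit counter space


def naive_delta(prev: int, curr: int) -> int:
    """Unsigned 64-bit subtraction, as C does with `unsigned long long`."""
    return (curr - prev) % U64


def guarded_delta(prev: int, curr: int) -> int:
    """Treat a counter that went backwards as a reset: report 0 for that interval."""
    return curr - prev if curr >= prev else 0


# Hypothetical samples: the interface had counted ~5 GB, then the driver
# reset the NIC and the counter restarted near zero.
before_reset = 5_000_000_000
after_reset = 1_000

print(naive_delta(before_reset, after_reset))    # a value close to 2**64
print(guarded_delta(before_reset, after_reset))  # 0
```

Dropping one interval after a reset loses a little data, but it keeps a single bad sample from dominating the graph scale for days.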
Re: [Ganglia-general] ganglia 3.4.0 network traffic wrong
On Thu, 11 Oct 2012 13:17:54 +0200 Jordi Funollet wrote:

> Hi Arnau,

Hi Jordi,

> Just to get more information, you can try the 'multi_interface'
> Python module.
>
> https://github.com/ganglia/gmond_python_modules/tree/master/network/multi_interface

I've installed it. tx_bytes_eth1 shows good values (about 100M) but Bytes Sent still shows petabyte peaks... It seems that ganglia is not doing the sum correctly. I will take a look at the source code... I'm not expecting to understand much, but I'll try.

Cheers,
Arnau
Re: [Ganglia-general] ganglia 3.4.0 network traffic wrong
Sorry, I missed some extra info about our host:

# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:25:B3:A8:98:5C
     inet addr:193.109.172.3 Bcast:193.109.172.127 Mask:255.255.255.128
     inet6 addr: fe80::225:b3ff:fea8:985c/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
     RX packets:8934881006 errors:0 dropped:0 overruns:0 frame:0
     TX packets:6698459450 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:5
     RX bytes:53557237568441 (48.7 TiB) TX bytes:34716302471941 (31.5 TiB)
     Interrupt:40 Memory:fa00-fa7f

# cat /proc/net/dev
Inter-| Receive| Transmit
face |bytespackets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo:347338000910 39495163000 0 0 0 347338000910 39495163000 0 0 0
eth0: 0 0000 0 0 00 0000 0 0 0
eth1:53561590928109 8934936792000 0 0 35383 34716607052645 6698508232000 0 0 0
[Ganglia-general] ganglia 3.4.0 network traffic wrong
Hi all,

We're running ganglia 3.4.0 (built from sources) and we're seeing wrong network traffic information.

First, we suffer from the old problem "Petabytes in network traffic":

http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg07394.html

I've patched ganglia's code (the patch is in the thread) and the petabyte peaks disappeared, but now I see Terabyte peaks :-), so it seems that there are still some issues with the network counter.

I've read the code but, honestly, I don't understand many things. I'd like to understand how ganglia collects network info (it seems to use /proc/net/dev) and how it calculates it, because I'd like to know if the problem is in ganglia or in the host.

Could anyone write a few lines about how ganglia collects net info? Is anyone seeing this problem? Any solution?

TIA,
Cheers,
Arnau
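As a rough aid for telling a host problem from a Ganglia problem, the counters in /proc/net/dev can be sampled independently and turned into rates by hand. The Python sketch below is not Ganglia's implementation; `parse_net_dev` and `rate` are hypothetical helper names. It relies only on the documented /proc/net/dev layout (eight receive fields followed by eight transmit fields per interface), and it also shows the bytes versus packets distinction.

```python
from typing import Dict, Tuple


def parse_net_dev(text: str) -> Dict[str, Tuple[int, int, int, int]]:
    """Return {iface: (rx_bytes, rx_packets, tx_bytes, tx_packets)}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        iface, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # malformed or truncated line
        # Receive columns are fields 0-7, transmit columns are fields 8-15.
        stats[iface.strip()] = (int(fields[0]), int(fields[1]),
                                int(fields[8]), int(fields[9]))
    return stats


def rate(prev: int, curr: int, seconds: float) -> float:
    """Per-second rate between two counter samples; a backwards counter gives 0."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    if curr < prev:
        return 0.0  # counter reset (e.g. driver reload): skip this interval
    return (curr - prev) / seconds
```

Reading /proc/net/dev twice a few seconds apart and feeding the byte counters to `rate` gives a number to compare against what the Ganglia graph shows for the same period. If the hand-computed rate is sane while the graph peaks, the counters themselves are probably fine.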
Re: [Ganglia-general] gmond[7771]: slurpfile() read() buffer overflow on file /proc/stat
> Hi Arnau,

Hi,

> Did you get anywhere with this? I'm getting the same annoying
> messages...

nope, only restarting gmond removes that error.

> Cheers;
>
> MAO

Cheers,
Arnau
[Ganglia-general] gmond[7771]: slurpfile() read() buffer overflow on file /proc/stat
Hi all,

some of our RH 6.3 servers running ganglia-gmond-3.4.0-1.x86_64 give some errors about buffer overflow. (I'm not the admin of the host, but I'm trying to debug this error).

# tail /var/log/messages
Sep 7 12:28:14 dc020 /usr/sbin/gmond[7771]: slurpfile() read() buffer overflow on file /proc/stat
Sep 7 12:28:34 dc020 /usr/sbin/gmond[7771]: slurpfile() read() buffer overflow on file /proc/stat
Sep 7 12:28:54 dc020 /usr/sbin/gmond[7771]: slurpfile() read() buffer overflow on file /proc/stat

I've been googling for the error and the replies I found recommend that users decrease the amount of data sent with gmetric, so they place the error on the user side. But I've checked our local cron jobs that report gmetrics and none of them runs every 20 seconds, so I guess it must be some gmond internal metric (it's a guess).

I've run the daemon by hand in debug mode but then the error does not appear...

I'm wondering if someone could give me some tips for debugging the error and finding the source of the problem.

TIA,
Arnau

This is how /proc/stat looks:

# cat /proc/stat
cpu 10756289 137 17430341 33643959 72529266 185 1291792 0 0
cpu0 1080607 36 2179004 1176363 3541632 40 432424 0 0
cpu1 753115 3 1175853 1723232 4777266 17 36135 0 0
cpu2 695025 7 1054593 2039309 4656867 6 33521 0 0
cpu3 633919 1 939243 2308043 4569676 36 32775 0 0
cpu4 1095742 33 2131379 1103066 3632162 55 452535 0 0
cpu5 765130 0 1147698 1718313 4808386 5 36539 0 0
cpu6 701556 2 1021776 2007871 4716173 2 32668 0 0
cpu7 638705 1 907393 2383885 4524930 1 28372 0 0
cpu8 776253 12 1393343 1281476 4993761 1 41341 0 0
cpu9 538701 10 791948 2313163 4826601 4 24482 0 0
cpu10 466077 2 698011 2769755 4544424 2 20526 0 0
cpu11 417917 2 608419 3088402 4368895 2 17170 0 0
cpu12 792239 13 1349030 1255988 5046114 1 41835 0 0
cpu13 528900 2 770950 2459342 4712915 2 23169 0 0
cpu14 462870 2 673045 2810720 4530676 2 20637 0 0
cpu15 409527 1 588648 3205026 4278781 1 17655 0 0
intr 1202851090 440 2 0 2 2 0 0 0 1 0 0 0 4 0 0 0 0 0 2 0 0 43 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 147057779 105867365 19646406 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1078666 931197 546754 3513348 926899 546054 322593 477873 724921 1385440 582666 458788 535118 553596 435680 523281 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1945437 2431038 1406127 1969231 1622643 1537097 1145583 1182591 1387759 2126492 1184920 1143093 1135380 1279714 1086706 927836 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
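One way to check whether the size of /proc/stat itself is what triggers the message is to compare it with the size of the buffer gmond reads it into. The Python sketch below is only a debugging aid, not gmond code. The default of 8192 bytes is an assumption about the buffer size and should be checked against the libmetrics source of the installed version; `describe_proc_file` is a hypothetical helper name.

```python
def describe_proc_file(data: bytes, bufsize: int = 8192) -> dict:
    """Summarise a /proc file's size against an assumed read buffer size.

    bufsize is an assumption, not a value taken from the thread.
    """
    lines = data.split(b"\n")
    longest = max(lines, key=len)
    return {
        "total_bytes": len(data),
        # First word of the longest line, e.g. "intr" on many-IRQ hosts.
        "longest_line_key": longest.split(b" ", 1)[0].decode("ascii", "replace"),
        "longest_line_bytes": len(longest),
        # Leave one byte for a terminating NUL, as a C reader would need.
        "fits": len(data) < bufsize,
    }


def read_proc_stat(path: str = "/proc/stat") -> bytes:
    """Read the file in binary mode so the byte count matches what a C read() sees."""
    with open(path, "rb") as handle:
        return handle.read()
```

On an affected host, `describe_proc_file(read_proc_stat())` shows at a glance whether the long `intr` line pushes the file past the assumed limit.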
Re: [Ganglia-general] Defining custom graphs via json
On Wed, 29 Aug 2012 20:12:15 +0530 Abhijeet R wrote:

> Hi guys,

Hi!

[...]
> files including network_report.json. But, even then I am not able to
> see any network graph that shows me both in/out traffic. I could
> start creating my own file for what I want but right now, even the
> network_report.json (example) is not working.
>
> Do I have to configure anything else to get it working? What am I
> missing? I am using 3.5.2 of ganglia-web.

I'm also using 3.5.2 and it works for me. I've created some json files like:

# cat torque_eff_80_by_group_report.json
{
  "report_name" : "80_eff_by_group",
  "report_type" : "standard",
  "title" : "80_eff_by_group",
  "vertical_label" : "Shares",
  "series" : [
    { "metric": "Eff_80_Group_atlas", "color": "00CED1", "label": "atlas", "stack_width": "2", "type": "stack" },
    { "metric": "Eff_80_Group_atpilot", "color": "00FA9A", "label": "atpilot", "stack_width": "2", "type": "stack" },
    { "metric": "Eff_80_Group_atprd", "color": "00FF00", "label": "atprd", "stack_width": "2", "type": "stack" },
    [...]
    { "metric": "Eff_80_Group_vip", "color": "7CFC00", "label": "vip", "stack_width": "2", "type": "stack" }
  ]
}

Be careful here because the syntax is really important.

A custom file for my host is also needed:

# cat /var/lib/ganglia-web/conf/host_pbs04.pic.es.json
{
  "included_reports": ["torque_eff_report","maui_report","torque_running_queue_report","torque_queued_queue_report","torque_total_report","torque_running_group_report","torque_queued_group_report","torque_eff_20_by_group_report","torque_eff_50_by_group_report","torque_eff_80_by_group_report"]
}

This works fine here.

HTH,
Arnau
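Since a single stray comma breaks these report files, one option is to generate them rather than edit them by hand. The Python sketch below is only an illustration: `build_stacked_report` is a hypothetical helper, and the keys it writes are simply the ones used in the report shown above. It serialises with the standard `json` module and parses the result back, so a syntax error cannot reach the web frontend.

```python
import json
from typing import Iterable, Tuple


def build_stacked_report(name: str, vertical_label: str,
                         series: Iterable[Tuple[str, str, str]]) -> str:
    """Return the JSON text of a stacked report.

    series is an iterable of (metric, color, label) tuples. The key names
    mirror the hand-written report in this thread.
    """
    report = {
        "report_name": name,
        "report_type": "standard",
        "title": name,
        "vertical_label": vertical_label,
        "series": [
            {"metric": metric, "color": color, "label": label,
             "stack_width": "2", "type": "stack"}
            for metric, color, label in series
        ],
    }
    text = json.dumps(report, indent=2)
    json.loads(text)  # round-trip check: raises ValueError on invalid JSON
    return text
```

An existing hand-written file can be checked the same way, by passing its contents to `json.loads` and looking at the line and column in the error message.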
Re: [Ganglia-general] gweb redirects to ganglia servers without domain
On Wed, 23 May 2012 07:20:03 -0600 Aaron Nichols wrote:

[...]
> Within the gmetad.conf for each of the remote grids there should be an
> authority line which specifies the URL to reach that specific
> ganglia-web instance - is that set to the fully qualified URL?

It wasn't, but now it is :-)

Thanks a lot!!! And there I was, looking at the PHP code...

Cheers,
Arnau
[Ganglia-general] gweb redirects to ganglia servers without domain
Hi all,

we have one ganglia-web server that collects info from 3 gmetad servers.

# rpm -qa|egrep 'ganglia|gweb'
gweb-3.3.1-1.noarch
ganglia-3.1.7-3.el6.x86_64
ganglia-gmond-3.1.7-3.el6.x86_64
ganglia-gmetad-3.1.7-3.el6.x86_64

#/etc/ganglia/gmetad.conf
[...]
data_source "Computing Grid" server02.domain.com:8651
data_source "Storage Grid" server01.domain.com:8651
data_source "Other" server03.domain.com:8651
[...]

It works great, but on the web, when you click on one of those grids, it redirects to the host without the domain. I.e., if I click on the "Other" grid, it redirects to server03/gweb/

I've been reading some PHP code, but I can't find where this redirection is done... Could anyone who knows PHP better tell me where this redirection is done and how I can add the domain?

TIA,
Arnau
Re: [Ganglia-general] ganglia 3.3.5 and sl5 (el5)
On Wed, 2 May 2012 11:32:46 +0200 Arnau Bria wrote:

Hi,

> 2.-) dependecy problem with 3.3.5 and sl5 because of expat.

My fault. I compiled ganglia on sl57 and tried to install it on sl53. If I compile ganglia on sl53 it works.

Sorry for the confusion.

Cheers,
Arnau
Re: [Ganglia-general] negative swap memory when starting gmetad
On Fri, 27 Apr 2012 16:50:52 +0200 Ramon Bastiaans wrote:

> Hi,

Hi Ramon,

> This is probably because the time_threshold setting for the
> collection_group that has mem_swap amongst other things is every 1200
> seconds by default. Which is 20 minutes.

yep, swap_total is under a collection group whose time_threshold is 1200.

> That is because the amount of memory etc in a system does not change
> that often.

But I don't understand why it's causing a negative swap value. From the man page:

  The time_threshold is the maximum amount of time that can pass before
  gmond sends all metrics specified in the collection_group to all
  configured udp_send_channels.

Also, gmond sends info about collection groups at boot time, so swap_total should _always_ be a positive value.

Could you please explain to me why time_threshold is causing this negative value?

> Kind regards,
> - Ramon.

Many thanks for your reply Ramon,
Cheers,
Arnau
[Ganglia-general] ganglia 3.3.5 and sl5 (el5)
Hi all,

After installing new ganglia servers I'd like to upgrade our ganglia version from 3.1.7 (rpmforge) to 3.3.5 (self compiled). I'm facing some problems between the old package and the new package:

1.-) installing the new version conflicts with the old one:

file /usr/lib64/ganglia/modcpu.so from install of ganglia-gmond-3.3.5-1.x86_64 conflicts with file from package ganglia-3.1.7-3.el6.x86_64

It seems that the new package does not obsolete the old one. I can work around this problem by removing/installing and not just upgrading.

2.-) dependency problem with 3.3.5 and sl5 because of expat.

I've successfully compiled 3.3.5 on an sl5 system, but now I cannot install it because it needs expat > 1 but sl5 provides expat-0.5:

--> Missing Dependency: libexpat.so.1()(64bit) is needed by package libganglia-3.3.5-1.x86_64 (/libganglia-3.3.5-1.x86_64)

# ls -lsa /lib/libexpat.so.0
4 lrwxrwxrwx 1 root root 17 Apr 25 12:45 /lib/libexpat.so.0 -> libexpat.so.0.5.0

# rpm -ql expat-1.95.8-8.3.el5_5.3.i386
/lib/libexpat.so.0
/lib/libexpat.so.0.5.0
[...]

I've not seen any note about this restriction in the release notes (https://github.com/ganglia/monitor-core/wiki/Release-Notes), so has anyone already faced this problem and knows how to solve it? (I could modify the spec file and remove the expat > 1 restriction, but I'm sure this will cause several problems, am I right?)

TIA,
Arnau
Re: [Ganglia-general] Petabyte network peaks
On Fri, 27 Apr 2012 11:45:28 +0200 Sergio Ballestrero wrote:

> Hello Arnau,

Hi Sergio,

> We've done better - exabytes per second! ;-)

Wow!!

> We've seen that on some specific types of NICs on Scientific Linux 5,
> running Ganglia 3.2.0 I have updated Roger's patch (attached), and
> that seems to have mostly solved the issue. I recently saw another
> occurrence of it but didn't have time to investigate it yet.

Ok, I've downloaded the latest ganglia version and recompiled its source after applying Roger's patch. Next Wednesday I'll apply the update.

> Cheers,
> Sergio

Many thanks for sharing your experience with us!

Cheers,
Arnau
[Ganglia-general] Petabyte network peaks
Hi all,

I'm seeing petabyte network peaks in our current configuration. I've read this old mail:

http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg04261.html

where Roger describes the same problem and offers a patch. I've downloaded the latest ganglia version (3.3.5) and I'm not seeing his lines in libmetrics.c, so I'm wondering whether the latest gmond version still suffers from this bug, or whether I have to apply Roger's patch.

Is anyone seeing this problem, and has anyone solved it in some way?

server: ganglia-gmond-3.1.7-3.el6.x86_64 (sl6.1)
client: ganglia-gmond-3.1.7-3.el6.x86_64 (sl6.0)

At this moment, I've not seen this problem on other hosts/arch/versions.

TIA,
Arnau
[Ganglia-general] negative swap memory when starting gmetad
Hi all,

I've configured a couple of new Ganglia servers (ganglia-gmetad-3.1.7-3.el6.x86_64) on sl6, and I'm seeing a strange negative aggregate for swap memory whenever I restart gmetad on the server. After about 20 minutes the memory is shown correctly, but I'm wondering what could be causing this strange behaviour.

In my gmetad.conf I only have data_source, gridname, authority, trusted_hosts and case_sensitive_hostnames. I don't know what other information I could send.

Is anyone else seeing this behaviour too?

TIA,
Arnau
[Ganglia-general] from multicast to unicast
Hi all,

We're currently using multicast and I'm looking for some advice on a unicast configuration. We have several clusters, and some of them have about 300 hosts (I don't know whether that number is high enough to need extra Ganglia tuning, hence my question).

As I understand it, I must configure one of the hosts belonging to the cluster as the unicast receiver, but:

- is its load going to increase noticeably?
- is there any configuration option that allows the receiver to be a host outside the cluster?
- do I need any extra configuration for a cluster of that size?
- how many backup receivers can I configure?

TIA,
Arnau
Re: [Ganglia-general] grid of grids (ganglia-3.3.1)
On Wed, 28 Mar 2012 17:02:59 +0200 Alexander Karner wrote:

> Hi Arnau!

Hi Alexander!

> Well, some weeks ago there was some kind of discussion about the
> setup that you want to use.

I promise I googled for my problem and also searched the mailing list archive before asking.

> It seems that all Ganglia versions > 3.1.7 contain a bug that
> prevents gmetad from collecting data from other gmetads.
> --> You should run your central gmetad system with version 3.1.7;
> your remote gmetads can be installed at any level that you prefer.

Thanks, I've downgraded Ganglia on ganglia01 and now it works! Many thanks!

> You'll find a more detailed bug report in the bugzilla area.
>
> Mit freundlichen Grüßen / Kind regards
>
> Alexander Karner
>
> Program Manager System Check & Kundentag
> IBM Accredited Senior IT Specialist
> Global Technology Services

Cheers,
Arnau
[Ganglia-general] grid of grids (ganglia-3.3.1)
Hi all,

I'm new to the list, so let me first say hello to everyone. My name is Arnau Bria and I'm quite a newbie when it comes to Ganglia.

We've been working with Ganglia for a long time, but we have never studied all its options. We use multicast and one gmetad that collects from several clusters. We are now planning a fresh install (Ganglia 3.3.1 on SL 6.1, built from the tarball) and we want to introduce two big changes: unicast, and several grids (gmetads/gwebs) collected by one frontend.

This first mail is about configuring a grid of grids. I've not found any clear documentation explaining how to configure it (I've found plenty about one gmetad reading from many clusters, but none about one gmetad collecting from other gmetads), so if someone knows of one, feel free to send me the link and stop reading at this point :-)

If no one knows a link, let me explain my environment:

ganglia01 -> Grid of Grids collector.
ganglia02 -> it has its own grid (and cluster), and gmond is running here.

I'd like ganglia01 to collect the info from ganglia02's gmetad and publish it in its gweb.

My conf:

ganglia01: gmetad.conf:

    # grep . /etc/ganglia/gmetad.conf|grep -v "#"
    data_source "MyTest2" ganglia02.pic.es:8651
    gridname "PICGrid"
    authority "http://ganglia01.domain/gweb/"
    case_sensitive_hostnames 0

ganglia02: gmetad.conf:

    # grep . /etc/ganglia/gmetad.conf|grep -v "#"
    data_source "Ganglia2" localhost:8649
    gridname "MyTest2"
    authority "http://ganglia01.pic.es/gweb/"
    trusted_hosts 127.0.0.1 193.109.175.116 ganglia01.domain
    case_sensitive_hostnames 0

    # gmond info:
    [...]
    cluster {
      name = "cluster1-ganglia2"
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    }
    [...]

ganglia02 has its own gweb, and if I go to http://ganglia02/gweb I can see:

    MyTest2 Grid -> cluster1-ganglia2 -> ganglia02 (which is the only node belonging to that cluster).

This is fine. But ganglia01 only shows:

    PICGrid Grid ->

So it's not collecting any info from ganglia02's gmetad.

The ganglia01 log says:

    poll() timeout from source 0 for [MyTest2] data source after 0 bytes read

and ganglia02's:

    Got a malformed path request from 193.109.175.116

If I telnet from ganglia01 to ganglia02 on port 8651:

    @ganglia01 html]# telnet ganglia02.pic.es 8651
    Trying 193.109.175.117...
    Connected to ganglia02.pic.es.
    Escape character is '^]'.
    [...] http://ganglia01.domain/gweb/" LOCALTIME="1332945021">

So it seems that ganglia02's gmetad is publishing its info correctly, but ganglia01 is not able to collect it.

Running gmetad in debug mode:

    # gmetad -d10
    Going to run as user nobody
    Sources are ...
    Source: [MyTest2, step 15] has 1 sources
    193.109.175.117 -> ganglia02 IP
    xml listening on port 8651
    interactive xml listening on port 8652
    cleanup thread has been started
    Data thread 140307471931136 is monitoring [MyTest2] data source
    193.109.175.117
    [MyTest2] is a 2.5 or later data stream
    hash_create size = 50
    hash->size is 53
    Found a <GRID>, depth is now 1
    Found a </GRID>, depth is now 0
    [MyTest2] is a 2.5 or later data stream
    hash_create size = 50
    [repeats forever]

I see no error in syslog, so could someone help me understand what's wrong with my conf?

TIA,
Arnau
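The manual telnet check in this message can also be scripted. The Python sketch below assumes that gmetad writes a complete XML dump on its XML port (8651 in the debug output above) and then closes the connection, and that the dump nests `GRID`, `CLUSTER` and `HOST` elements carrying a `NAME` attribute. The helper names `fetch_dump` and `summarize_dump` are invented for this example, and the code is a rough diagnostic aid rather than anything shipped with Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_dump(host, port=8651, timeout=10.0):
    """Read everything the remote side sends until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def summarize_dump(xml_bytes):
    """Return {grid name: {cluster name: [host names]}} for an XML dump."""
    root = ET.fromstring(xml_bytes)
    summary = {}
    for grid in root.iter("GRID"):
        clusters = {}
        # Only direct CLUSTER children; nested grids get their own entry.
        for cluster in grid.findall("CLUSTER"):
            hosts = [host.get("NAME") for host in cluster.findall("HOST")]
            clusters[cluster.get("NAME")] = hosts
        summary[grid.get("NAME")] = clusters
    return summary


# Example use against a live gmetad (not executed here):
#   print(summarize_dump(fetch_dump("ganglia02.pic.es")))
```

If the dump from ganglia02 lists `cluster1-ganglia2` under `MyTest2` while the same call against ganglia01 returns a grid with no clusters, the data is being published but not collected, which is consistent with what gweb showed in this report.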