[Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
Hey there folks.

I have three nodes whose gmond.conf clearly specifies cluster A. Let's call them node1a, node2a, and node3a. In gmetad.conf I have configured access to only one node of the cluster (I just let gmetad download the entire cluster's information from one node instead of chatting with all of them). That entry looks like:

data_source cluster A 45 node1a.saksdirect.com:8649

node2a and node3a report properly to cluster A, but node1a reports to some other, completely unrelated cluster, like cluster C. I see other examples where I have to go hunting around for cluster members that aren't reporting into the proper cluster. Any ideas? Saks's installation is now 28 servers strong, with 4 clusters.

Thanks!

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax)
646-315-0119 (C)
www.saks.com

-----Original Message-----
From: Bostjan Skufca [mailto:bost...@a2o.si]
Sent: Monday, March 21, 2011 9:15 PM
To: Bernard Li
Cc: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Network interface byte count over 4GB on 32bit linux causes missing data

Can you explain how it would have made your life easier for rebasing code if we were using Git instead of SVN?

First off, maybe it is because I am not an SVN expert. But my experience has shown me that where I have had to struggle with SVN to manage branches, individual commit cherry-picks, and merging, git just did it. Today, again, I had to manually reformat the patch to apply it to trunk. Again, I am not a heavy SVN user; I switched to git when I started with advanced repo stuff.
With git, my workflow would be:
- clone repo
- checkout tag 3.1.7
- apply external patch
- commit
- rebase to HEAD
- format patch and send it (or even better, send a pull request)

Do you see a comparably easy approach in SVN, where the original patch just won't apply because the code offsets were through the roof compared to the context included in the diff, and the manual page states you can't increase the fuzz factor (offset search) above the number of context lines?

Fact is, occasional contributors will usually not bother creating patches against trunk in the first place, but against the code they have at hand and which they are trying to fix. Complicating the patch submission procedure will only decrease the number of patches flowing upstream.

Take care,
b.

PS: These are just personal thoughts, nothing more. I just realised I am much like you for one of my OSS mini projects, but then again I assumed that, unlike my project, ganglia consumes a relevant part of your work life, which may not be the case.
b.

On 22 March 2011 01:27, Bernard Li bern...@vanhpc.org wrote:

Hi Bostjan:

On Mon, Mar 21, 2011 at 5:16 PM, Bostjan Skufca bost...@a2o.si wrote:

Heh, my first reaction was "He must be joking..." :) Anyway, done. However, rebasing patches with SVN is a major PITA and you should consider yourself lucky that I persisted :) Do you plan on moving to something better (hg, git)?

I am glad you persisted, and thanks again. We'll look into it shortly. We have thought about moving to Git but are not really sure if now is the right time to do it -- at least I do not see it buying us much at this point. Can you explain how it would have made your life easier for rebasing code if we were using Git instead of SVN?

Thanks,
Bernard

-- Enable your software for Intel(R) Active Management Technology to meet the growing manageability and security demands of your customers. Businesses are taking advantage of Intel(R) vPro (TM) technology - will your software be a part of the solution?
Download the Intel(R) Manageability Checker today! http://p.sf.net/sfu/intel-dev2devmar
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
On Mar 22, 2011, at 10:53 AM, Ron Cavallo wrote:

I see other examples where I have to go hunting around for cluster members that aren't reporting into the proper cluster. Any ideas?

Double check the ports in use in the gmond.conf on the machines that are misbehaving.

Also note that machines tend to linger in an old cluster they were reporting to, even if their config file says otherwise. If you look at the XML dump from the gmetad, you may find that a given machine appears twice. The web frontend gives fairly random results when this happens.

These stale entries do eventually expire (default is 30 days I believe), but a restart of all gmond processes and gmetad will clean it up instantly.
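The check for a machine that appears twice in the gmetad XML dump can be sketched roughly as follows. This is a minimal sketch, not anything posted in the thread. It assumes the dump nests HOST elements inside CLUSTER elements, each with a NAME attribute, and that gmetad serves the dump on TCP port 8651 on the local machine. The function names are made up for illustration; adjust the host and port to match the xml_port setting in your gmetad.conf.

```python
import socket
import xml.etree.ElementTree as ET
from collections import defaultdict


def fetch_gmetad_xml(host="localhost", port=8651, timeout=10.0):
    """Read everything gmetad writes to a TCP client and return the raw bytes.

    The host and port defaults are assumptions; check xml_port in gmetad.conf.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmetad closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_multiple_clusters(xml_data):
    """Return {host name: sorted cluster names} for hosts seen in more than one cluster.

    Assumes HOST elements nested under CLUSTER elements, both with a NAME attribute.
    """
    root = ET.fromstring(xml_data)
    seen = defaultdict(set)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            seen[host.get("NAME")].add(cluster.get("NAME"))
    return {name: sorted(clusters) for name, clusters in seen.items() if len(clusters) > 1}
```

Calling `hosts_in_multiple_clusters(fetch_gmetad_xml())` on the gmetad machine would then list any host reported under more than one cluster, which is the situation described above.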
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
Hey there.

I used the script that I wrote to STOP all gmonds, STOP gmetad, then START all gmonds and START gmetad, and I still have non-cluster members reporting into the wrong clusters. Any other ideas that you may have would be appreciated!

-Regards

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax)
646-315-0119 (C)
www.saks.com

-----Original Message-----
From: Seth Graham [mailto:set...@fnal.gov]
Sent: Tuesday, March 22, 2011 12:05 PM
To: Ron Cavallo
Cc: Bostjan Skufca; Bernard Li; ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

On Mar 22, 2011, at 10:53 AM, Ron Cavallo wrote:

I see other examples where I have to go hunting around for cluster members that aren't reporting into the proper cluster. Any ideas?

Double check the ports in use in the gmond.conf on the machines that are misbehaving.

Also note that machines tend to linger in an old cluster they were reporting to, even if their config file says otherwise. If you look at the XML dump from the gmetad, you may find that a given machine appears twice. The web frontend gives fairly random results when this happens.

These stale entries do eventually expire (default is 30 days I believe), but a restart of all gmond processes and gmetad will clean it up instantly.
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
Hi Ron:

On Tue, Mar 22, 2011 at 2:25 PM, Ron Cavallo ron_cava...@s5a.com wrote:

I used the script that I wrote to STOP all gmonds, STOP gmetad, then START All gmonds and START gmetads, and I still have non-cluster members reporting into the wrong clusters. Any other ideas that you may have would be appreciated!

Are you using unicast or multicast in this instance? If you're using multicast, you need to make sure the cluster uses a different port (i.e. other than the default 8649 gmond port) because that's how they are clustered together. Pick something like 8650 in gmond.conf, restart all the daemons and it should work.

Cheers,
Bernard
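To see which clusters would end up merged under this rule, one rough approach is to list the multicast address and port each cluster's gmond.conf uses and look for collisions. The sketch below assumes you have collected those values by hand. The cluster names, the helper name, and the 239.2.11.71 address (used here as the stock multicast group) are illustrative, not taken from the poster's setup.

```python
from collections import defaultdict


def shared_channels(clusters):
    """Return {(mcast_address, port): sorted cluster names} for channels used by 2+ clusters.

    `clusters` maps a cluster name to the (multicast address, port) pair
    configured in that cluster's gmond.conf send/receive channels.
    """
    by_channel = defaultdict(list)
    for name, channel in clusters.items():
        by_channel[channel].append(name)
    return {ch: sorted(names) for ch, names in by_channel.items() if len(names) > 1}


# Hypothetical inventory: A and C still share the default port, B was moved to 8650.
inventory = {
    "cluster A": ("239.2.11.71", 8649),
    "cluster B": ("239.2.11.71", 8650),
    "cluster C": ("239.2.11.71", 8649),
}
collisions = shared_channels(inventory)
```

With this inventory, `collisions` contains one entry, for the channel shared by cluster A and cluster C. Nodes on a shared channel hear each other's announcements, which would be consistent with node1a turning up under cluster C.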
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
I am using multicast. So each cluster needs its own port, is that what you are saying?

-Regards

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax)
646-315-0119 (C)
www.saks.com

-----Original Message-----
From: Bernard Li [mailto:bern...@vanhpc.org]
Sent: Tuesday, March 22, 2011 5:38 PM
To: Ron Cavallo
Cc: Seth Graham; ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

Hi Ron:

On Tue, Mar 22, 2011 at 2:25 PM, Ron Cavallo ron_cava...@s5a.com wrote:

I used the script that I wrote to STOP all gmonds, STOP gmetad, then START All gmonds and START gmetads, and I still have non-cluster members reporting into the wrong clusters. Any other ideas that you may have would be appreciated!

Are you using unicast or multicast in this instance? If you're using multicast, you need to make sure the cluster uses a different port (i.e. other than the default 8649 gmond port) because that's how they are clustered together. Pick something like 8650 in gmond.conf, restart all the daemons and it should work.

Cheers,
Bernard
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
Or its own multicast address. That can make it easier to segment things by switch if you need to (e.g., if you're modeling racks as clusters, using a multicast address or port that matches up with your rack number can make things easier to track down).

Also, your switches will generally propagate based on address, not port, so you get less network traffic if you assign an address per cluster than a port per cluster (again assuming your cluster boundaries match up with your networking boundaries).

Just another option.

-- ReC

On Tue, Mar 22, 2011 at 2:39 PM, Ron Cavallo ron_cava...@s5a.com wrote:

I am using Multicast. So each cluster needs it's own port is that what you are saying?
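One way to follow the rack-number idea is to derive each cluster's multicast address from its rack number. This is only a sketch under assumptions: the 239.2.11.0 base, the 1 to 254 rack range, and the function name are all made up for illustration, so pick a base that suits your own network.

```python
import ipaddress


def rack_mcast_address(rack, base="239.2.11.0"):
    """Return a multicast address whose last octet is the rack number.

    `base` is a hypothetical choice; the rack number must fit in one octet.
    """
    if not 1 <= rack <= 254:
        raise ValueError("rack number must be between 1 and 254")
    addr = ipaddress.IPv4Address(base) + rack
    if not addr.is_multicast:
        raise ValueError(f"{addr} is not a multicast address")
    return str(addr)
```

For example, `rack_mcast_address(12)` gives `239.2.11.12`, which would then be the address set in the gmond.conf multicast channels on every node in rack 12.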
Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend
Fantastic, thanks. I will try!

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-451-3510 (fax)
646-315-0119 (C)
www.saks.com

----- Original Message -----
From: cob...@gmail.com
To: Ron Cavallo
Cc: Bernard Li bern...@vanhpc.org; ganglia-general@lists.sourceforge.net
Sent: Tue Mar 22 18:21:36 2011
Subject: Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

Or its own multicast address. That can make it easier to segment things by switch if you need to (e.g., if you're modeling racks as clusters, using a multicast address or port that matches up with your rack# can make things easier to track down).

Also, your switches will propagate based on address, not port, generally, so you get less network traffic if you assign an address per cluster than a port per cluster (again assuming your cluster boundaries match up with your networking boundaries).

Just another option --

-- ReC