[Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Ron Cavallo
Hey there folks.

I have three nodes whose gmond.conf clearly specifies cluster A. Let's 
call them node1a, node2a, node3a.

In gmetad.conf I have configured access to only one node of the cluster (I 
just let it download the entire cluster's information from one node instead of 
chatting with all of them). That entry looks like:

data_source "cluster A" 45 node1a.saksdirect.com:8649

node2a and node3a report properly to cluster A, but node1a reports to some 
other, completely unrelated cluster, such as cluster C.

I see other examples where I have to go hunting around for cluster members that 
aren't reporting into the proper cluster.

Any ideas?

Saks's Ganglia setup is now 28 servers strong, across 4 clusters.

Thanks!


Ron Cavallo 
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax) 
646-315-0119(C) 
www.saks.com
 

-Original Message-
From: Bostjan Skufca [mailto:bost...@a2o.si] 
Sent: Monday, March 21, 2011 9:15 PM
To: Bernard Li
Cc: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Network interface byte count over 4GB on 32bit 
linux causes missing data

 Can you explain how it would have made your life easier for rebasing
 code if we were using Git instead of SVN?

First off - maybe because I am not a SVN expert. But my experience has
shown me that where I have had to struggle with SVN to manage
branches, individual commit cherry picks and merging, git just did
it. Today again, I had to manually reformat the patch to apply it to
trunk. Again, I am not a heavy SVN user, I switched to git when I
started with advanced repo stuff.

With git, my workflow would be:
- clone repo
- checkout tag 3.1.7
- apply external patch
- commit
- rebase to HEAD
- format patch and send it (or even better, send a pull request)
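
For illustration, the workflow above can be sketched as a list of git
commands assembled in Python. This is only a sketch under assumptions:
Ganglia was in SVN at the time, so the repository URL, the tag name, the
branch name "fix", the mainline ref "origin/master" and the patch filename
are all hypothetical placeholders, and "rebase to HEAD" is read here as
rebasing onto the tip of the mainline branch.

```python
import shlex
import subprocess


def patch_workflow(repo_url, tag, patch_file,
                   workdir="ganglia", mainline="origin/master"):
    """Return the workflow as (argv, cwd) pairs.

    Every value is a caller-supplied placeholder; patch_file is taken
    relative to workdir.
    """
    return [
        # clone repo
        (["git", "clone", repo_url, workdir], None),
        # checkout the release tag on a new local branch
        (["git", "checkout", "-b", "fix", tag], workdir),
        # apply external patch to the working tree and the index
        (["git", "apply", "--index", patch_file], workdir),
        # commit
        (["git", "commit", "-m", "Apply external patch"], workdir),
        # rebase the commit onto the current mainline
        (["git", "rebase", mainline], workdir),
        # write one patch file per commit that is not yet in mainline
        (["git", "format-patch", mainline], workdir),
    ]


def run(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in steps:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run only: prints the commands without touching any repository.
    run(patch_workflow("https://example.org/ganglia.git", "3.1.7",
                       "external.patch"))
```

With dry_run left at True nothing is executed, so the command list can be
reviewed before it is run against a real clone.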

Do you see a comparatively easy approach in SVN when the original patch
just won't apply, because the code offsets are through the roof compared
to the context included in the diff, and the manual page states that you
can't increase the fuzz factor (offset search) above the number of
context lines?

Fact is - occasional contributors will usually not bother creating
patches against trunk in the first place, but against code they have
at hand and which they are trying to fix. Complicating the patch
submission procedure will only decrease the number of patches flowing
upstream.

Take care,
b.

PS: These are just personal thoughts, nothing more. I just realised I
am much like you with one of my OSS mini projects, but then again I
assumed that, unlike my project, Ganglia takes up a significant part of
your work life, which may not be the case.

b.


On 22 March 2011 01:27, Bernard Li bern...@vanhpc.org wrote:
 Hi Bostjan:

 On Mon, Mar 21, 2011 at 5:16 PM, Bostjan Skufca bost...@a2o.si wrote:

 Heh, my first reaction was "He must be joking..." :)
 Anyway, done. However, rebasing patches with SVN is a major PITA and you
 should consider yourself lucky that I persisted :)
 Do you plan on moving to something better (hg, git)?

 I am glad you persisted, and thanks again.  We'll look into it shortly.

 We have thought about moving to Git but not really sure if now is the
 right time to do it -- at least I do not see it buying us much at this
 point.

 Can you explain how it would have made your life easier for rebasing
 code if we were using Git instead of SVN?

 Thanks,

 Bernard


--
Enable your software for Intel(R) Active Management Technology to meet the
growing manageability and security demands of your customers. Businesses
are taking advantage of Intel(R) vPro (TM) technology - will your software 
be a part of the solution? Download the Intel(R) Manageability Checker 
today! http://p.sf.net/sfu/intel-dev2devmar
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general



Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Seth Graham

On Mar 22, 2011, at 10:53 AM, Ron Cavallo wrote:
 
 I see other examples where I have to go hunting around for cluster members 
 that aren't reporting into the proper cluster.
 
 Any ideas?

Double check the ports in use in the gmond.conf on the machines that are 
misbehaving. 

Also note that machines tend to linger in an old cluster they were reporting 
to, even if their config file says otherwise. If you look at the XML dump from 
the gmetad, you may find that a given machine appears twice. The web frontend 
gives fairly random results when this happens.

These stale entries do eventually expire (default is 30 days I believe), but a 
restart of all gmond processes and gmetad will clean it up instantly.
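
To check for the duplicate entries described above, something like the
following sketch could be used. It assumes gmetad serves its XML dump on
TCP port 8651 (the usual xml_port default; adjust if yours differs) and
that hosts appear as HOST elements nested inside CLUSTER elements. The
sample document is made up for illustration and is not real gmetad output.

```python
import socket
import xml.etree.ElementTree as ET
from collections import defaultdict


def fetch_gmetad_xml(host="localhost", port=8651, timeout=10.0):
    """Read the whole XML dump; the server closes the socket when done."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_multiple_clusters(xml_bytes):
    """Map each host listed under more than one CLUSTER to those clusters."""
    root = ET.fromstring(xml_bytes)
    clusters_by_host = defaultdict(set)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            clusters_by_host[host.get("NAME")].add(cluster.get("NAME"))
    return {h: sorted(c) for h, c in clusters_by_host.items() if len(c) > 1}


# Made-up sample in the general shape of a gmetad dump.
SAMPLE = b"""<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
<GRID NAME="example">
<CLUSTER NAME="cluster A"><HOST NAME="node1a"/><HOST NAME="node2a"/></CLUSTER>
<CLUSTER NAME="cluster C"><HOST NAME="node1a"/></CLUSTER>
</GRID>
</GANGLIA_XML>"""

if __name__ == "__main__":
    print(hosts_in_multiple_clusters(SAMPLE))
```

Against a live server the call would be
hosts_in_multiple_clusters(fetch_gmetad_xml("your-gmetad-host")), where the
host name is a placeholder.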




Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Ron Cavallo
Hey there.

I used the script that I wrote to STOP all gmonds, STOP gmetad, then
START all gmonds and START gmetad, and I still have non-cluster members
reporting into the wrong clusters. Any other ideas that you may have
would be appreciated!

-Regards

Ron Cavallo 





Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Bernard Li
Hi Ron:

On Tue, Mar 22, 2011 at 2:25 PM, Ron Cavallo ron_cava...@s5a.com wrote:

 I used the script that I wrote to STOP all gmonds, STOP gmetad, then
 START All gmonds and START gmetads, and I still have non-cluster members
 reporting into the wrong clusters. Any other ideas that you may have
 would be appreciated!

Are you using unicast or multicast in this instance?  If you're using
multicast, you need to make sure each cluster uses a different port
(i.e. not all of them on the default gmond port, 8649), because gmonds
that share a multicast address and port all hear each other and end up
clustered together.  Pick something like 8650 in gmond.conf for one of
the clusters, restart all the daemons and it should work.

Cheers,

Bernard
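
One way to audit this across machines is sketched below. It scans the text
of each gmond.conf for the cluster name and the first udp_send_channel, then
reports any multicast address/port pair that is used under more than one
cluster name. The parsing is a deliberately naive regex pass (comments and
multiple send channels are not handled), and the sample configs are built by
a hypothetical helper purely for illustration.

```python
import re
from collections import defaultdict

CLUSTER_RE = re.compile(r'cluster\s*\{[^}]*?\bname\s*=\s*"([^"]*)"', re.S)
SEND_RE = re.compile(r'udp_send_channel\s*\{([^}]*)\}', re.S)


def send_channel(conf_text):
    """Return (cluster_name, mcast_join, port) or None if not found."""
    cluster = CLUSTER_RE.search(conf_text)
    block = SEND_RE.search(conf_text)
    if not cluster or not block:
        return None
    body = block.group(1)
    addr = re.search(r'\bmcast_join\s*=\s*"?([\w.:]+)"?', body)
    port = re.search(r'\bport\s*=\s*"?(\d+)"?', body)
    return (cluster.group(1),
            addr.group(1) if addr else None,
            int(port.group(1)) if port else None)


def clusters_sharing_channel(confs):
    """confs maps hostname -> gmond.conf text.

    Returns {(mcast_join, port): [cluster names]} for every channel that
    carries more than one cluster name.
    """
    by_channel = defaultdict(set)
    for text in confs.values():
        parsed = send_channel(text)
        if parsed:
            name, addr, port = parsed
            by_channel[(addr, port)].add(name)
    return {ch: sorted(n) for ch, n in by_channel.items() if len(n) > 1}


def make_conf(name, port, addr="239.2.11.71"):
    """Hypothetical helper: build a minimal gmond.conf fragment for the demo."""
    return ('cluster {\n  name = "%s"\n}\n'
            'udp_send_channel {\n  mcast_join = %s\n  port = %d\n  ttl = 1\n}\n'
            % (name, addr, port))


if __name__ == "__main__":
    confs = {
        "node1a": make_conf("cluster A", 8649),
        "node1c": make_conf("cluster C", 8649),  # same channel as cluster A
    }
    print(clusters_sharing_channel(confs))
```

An empty result means no two cluster names share a channel, at least as far
as this simplified parsing can tell.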



Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Ron Cavallo
I am using Multicast.

So each cluster needs its own port, is that what you are saying?

-Regards

Ron Cavallo 




Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Rick Cobb
Or its own multicast address. That can make it easier to segment
things by switch if you need to (e.g., if you're modeling racks as
clusters, using a multicast address or port that matches up with your
rack# can make things easier to track down).  Also, your switches will
propagate based on address, not port, generally, so you get less
network traffic if you assign an address per cluster than a port per
cluster (again assuming your cluster boundaries match up with your
networking boundaries).

Just another option --
-- ReC
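
As a rough sketch of the address-per-cluster idea, the snippet below derives
a multicast address from a rack number and renders the matching gmond.conf
channel sections. The base range 239.2.11.0 and the "rack number becomes the
last octet" scheme are assumptions made for illustration only; pick whatever
range suits your network. The rendered text covers just the cluster name and
channel sections and would need to be merged into a complete gmond.conf.

```python
import ipaddress


def cluster_address(rack, base="239.2.11.0"):
    """Derive a per-rack multicast address (hypothetical numbering scheme)."""
    addr = ipaddress.IPv4Address(base) + rack
    if not addr.is_multicast:
        raise ValueError("%s is not a multicast address" % addr)
    return str(addr)


def render_channels(cluster_name, mcast_join, port=8649):
    """Render the cluster name and channel sections for one cluster."""
    return f'''cluster {{
  name = "{cluster_name}"
}}
udp_send_channel {{
  mcast_join = {mcast_join}
  port = {port}
  ttl = 1
}}
udp_recv_channel {{
  mcast_join = {mcast_join}
  port = {port}
  bind = {mcast_join}
}}
tcp_accept_channel {{
  port = {port}
}}
'''


if __name__ == "__main__":
    # Example: rack 12 modelled as its own cluster.
    print(render_channels("rack12", cluster_address(12)))
```

Keeping the port at its default and varying only the address means the
gmetad data_source lines can keep pointing at the same TCP port on each
cluster.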




Re: [Ganglia-general] Ganglia: Nodes showing up in wrong clusters in web frontend

2011-03-22 Thread Ron Cavallo
Fantastic, thanks. I will try!
Ron Cavallo

