Hi Demetri,

Could you try building from my personal development branch? It is an
up-to-date merge with Ganglia master plus one additional potential bug fix:

https://github.com/satterly/monitor-core/commit/ed3ad9d57b1d582503ef0104e17f7919044c7617

If this version runs without segfaulting I'll push it to the ganglia
feature/cloud branch.

And thanks for the pull request. It looks like it needs to be rebased onto
master. However, if your testing of the above branch proves successful, we
can rebase your patch against that instead.

Let me know how you get on.

Regards,
Nick


On Mon, Jun 17, 2013 at 11:53 PM, Demetri Mouratis <dmour...@gmail.com> wrote:

> Nicholas Satterly <nfsatterly <at> gmail.com> writes:
> >
> > [1]
> https://github.com/ganglia/monitor-core/compare/master...feature/cloud
> >
>
>
> Nick,
>
> Thanks for your work in implementing this feature.  I'm in the same boat
> with a large(ish) EC2 (VPC) deployment and sorely missing ganglia in this
> new environment.
>
> I've found and fixed one bug pertaining to localtime versus GMT in the EC2
> apr request:
>
> https://github.com/ganglia/monitor-core/pull/112
>
> Amazon expects all timestamps to be in GMT.  Some of my hosts have non-GMT
> set localtimes (don't ask).
>
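
To illustrate the GMT point above: a minimal sketch of building a request
timestamp from UTC rather than local time. This is not the code from the
pull request; the helper name format_utc_timestamp is hypothetical, and the
format simply mirrors the Timestamp=2013-06-17T22%3A41%3A39Z value in the
log below (before URL-encoding). It assumes a POSIX system for gmtime_r().

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not part of Ganglia): write `t` into `buf` as an
 * ISO 8601 timestamp in UTC, e.g. "2013-06-17T22:41:39Z".
 * Returns the number of characters written (excluding the terminating
 * NUL), or 0 if the conversion fails or the buffer is too small.
 */
static size_t format_utc_timestamp(time_t t, char *buf, size_t len)
{
    struct tm tm_utc;

    /* gmtime_r() ignores the host's TZ setting; localtime_r() would not. */
    if (gmtime_r(&t, &tm_utc) == NULL)
        return 0;

    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", &tm_utc);
}
```

Calling it with time(NULL) and a 32-byte buffer gives the same string on
every host regardless of its configured timezone, which is the property the
signed request depends on.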
> Now I'm facing a consistent segfault when the number of nodes in the
> cluster is large (>= 17).
>
> The error looks like:
>
> [discovery.ec2] Found 17 matching instances
> [discovery.ec2] adding i-10ad3c25, udp send channel private_ip 10.10.1.211:8649
> [discovery.ec2] adding i-34296506, udp send channel private_ip 10.10.1.204:8649
> [discovery.ec2] adding i-1894ff2a, udp send channel private_ip 10.10.1.240:8649
> [discovery.ec2] adding i-1a94ff28, udp send channel private_ip 10.10.1.241:8649
> [discovery.ec2] adding i-cc99f2fe, udp send channel private_ip 10.10.1.214:8649
> [discovery.ec2] adding i-c81c8dfd, udp send channel private_ip 10.10.2.115:8649
> [discovery.ec2] adding i-a2d36990, udp send channel private_ip 10.10.1.116:8649
> [discovery.ec2] adding i-24235016, udp send channel private_ip 10.10.1.234:8649
> [discovery.ec2] adding i-2401bc11, udp send channel private_ip 10.10.2.216:8649
> [discovery.ec2] adding i-2a235018, udp send channel private_ip 10.10.1.235:8649
> [discovery.ec2] adding i-3a01bc0f, udp send channel private_ip 10.10.2.217:8649
> [discovery.ec2] adding i-3801bc0d, udp send channel private_ip 10.10.2.218:8649
> [discovery.ec2] adding i-d27015e7, udp send channel private_ip 10.10.2.164:8649
> [discovery.ec2] adding i-2823501a, udp send channel private_ip 10.10.1.238:8649
> [discovery.ec2] adding i-3a07620f, udp send channel private_ip 10.10.2.177:8649
> [discovery.ec2] adding i-422a4f77, udp send channel private_ip 10.10.2.64:8649
> [discovery.ec2] adding i-3890f10a, udp send channel private_ip 10.10.1.102:8649
> .  .  .
>
> [discovery.ec2] Refreshing node list...
> [discovery.cloud] access key=AKIAJNY4GBUKJRXY4JDA, secret key=************************************DxvJ
> [discovery.ec2] using host_type [private_ip], tags [environment= TEST], groups [], availability_zones []
> [discovery.ec2] using endpoint ec2.us-west-2.amazonaws.com -> ec2.us-west-2.amazonaws.com
> [discovery.ec2] URL-encoded API request ec2.us-west-2.amazonaws.com?AWSAccessKeyId=AKIAJNY4GBUKJRXY4JDA&Action=DescribeInstances&Filter.1.Name=instance-state-name&Filter.1.Value=running&Filter.2.Name=tag%3Aenvironment&Filter.2.Value=TEST&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2013-06-17T22%3A41%3A39Z&Version=2012-08-15&Signature=O7qmbgbbZnMk8njNQiEo4YLlDIVhM9NAF4171NoMTj4%3D
> [discovery.ec2] HTTP response code 200, 99664 bytes retrieved
> Segmentation fault
>
> The crash is reproducible, happens about 2 minutes after start, and can be
> avoided by renaming one host's environment= tag to remove it from the
> cluster.
>
> I haven't been able to come up with a fix for this issue, and I'm
> sufficiently out of my depth at this point to ask for help.
>
> Thanks.
>
> -D
>
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Ganglia-developers mailing list
> Ganglia-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
>



-- 
gpg: using PGP trust model
pub   4096R/1EE38BD9 2013-01-06 [expires: 2018-01-06]
      Key fingerprint = 3EE9 550D D9D8 DB65 58C2  B58D CE78 EC6C 1EE3 8BD9
uid                  Nicholas Satterly (Debian Key) <nfsatte...@gmail.com>
sub   4096R/23804EE9 2013-01-06 [expires: 2018-01-06]