Guys, in general, please do a detailed, logical analysis whenever the
cluster fails in QA.

Ballpark guesstimates do not help. If you are not sure, gather facts first:
message flows, logs, load on the nodes, etc., as applicable to the behavior
you are seeing.
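
As one example of gathering facts from logs, here is a rough Python sketch
that pairs each Hazelcast "Member joined" event in the ELB log with the
"joined application cluster" event that should follow it. It assumes the log
format quoted further down in this thread; the function names and regular
expressions are illustrative only and are not part of any WSO2 tooling.

```python
import re

# Start of a log entry, e.g. "[2013-08-07 04:51:21,806]  INFO - ..."
ENTRY_START = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")
# Patterns below are assumptions based on the ELB log lines quoted in this thread.
MEMBER_JOINED = re.compile(r"Member joined \[[0-9a-f-]+\]: /([\d.]+):(\d+)")
MEMBER_LEFT = re.compile(r"Member left \[[0-9a-f-]+\]: /([\d.]+):(\d+)")
APP_JOINED = re.compile(
    r"Application member Host:([\d.]+), Remote Host:[^,]*, Port: (\d+),"
    r".*joined application cluster"
)


def iter_entries(lines):
    """Re-join wrapped lines so that each log entry is a single string."""
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line):
            if current is not None:
                yield current
            current = line
        elif current is not None:
            current += " " + line
    if current is not None:
        yield current


def join_timeline(lines):
    """Return (member, group_join_time, app_cluster_join_time or None) tuples."""
    timeline = []
    open_joins = {}  # member -> index of its group join still awaiting an app join
    for entry in iter_entries(lines):
        timestamp = ENTRY_START.match(entry).group(1)
        joined = MEMBER_JOINED.search(entry)
        if joined:
            member = f"{joined.group(1)}:{joined.group(2)}"
            if member not in open_joins:  # the ELB prints each join twice
                open_joins[member] = len(timeline)
                timeline.append([member, timestamp, None])
            continue
        app = APP_JOINED.search(entry)
        if app:
            index = open_joins.pop(f"{app.group(1)}:{app.group(2)}", None)
            if index is not None:
                timeline[index][2] = timestamp
            continue
        left = MEMBER_LEFT.search(entry)
        if left:
            open_joins.pop(f"{left.group(1)}:{left.group(2)}", None)
    return [tuple(row) for row in timeline]
```

Passing the lines of the ELB log to join_timeline() gives, per member, when
it joined the Hazelcast group and when (if ever) it joined the application
cluster. A missing or much later second timestamp is a fact worth reporting
alongside the IS node logs for the same time window.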


On Wed, Aug 7, 2013 at 2:55 PM, Afkham Azeez <az...@wso2.com> wrote:

> Search for the "Cluster initialization completed" log message on the two IS
> nodes. Check whether it is printed even though the member did not join the ELB.
>
>
> On Wed, Aug 7, 2013 at 11:00 AM, Chamara Ariyarathne <chama...@wso2.com> wrote:
>
>> I have a setup with one ELB and two IS nodes (as management nodes).
>>
>> When one IS node starts up, the ELB log shows:
>>
>> [2013-08-07 04:51:21,806]  INFO - HazelcastGroupManagementAgent Member
>> joined [e7792e38-54a7-4212-9d19-de25dbb7afd9]: /192.168.17.20:4001
>> [2013-08-07 04:51:21,806]  INFO - HazelcastGroupManagementAgent Member
>> joined [e7792e38-54a7-4212-9d19-de25dbb7afd9]: /192.168.17.20:4001
>> [2013-08-07 04:51:24,933]  INFO - MemberUtils Added member:
>> Host:192.168.17.20, Remote Host:null, Port: 4001, HTTP:9764, HTTPS:9444,
>> Domain: wso2.identity.domain, Sub-domain:mgt, Active:true
>> [2013-08-07 04:51:24,934]  INFO - HazelcastGroupManagementAgent
>> Application member Host:192.168.17.20, Remote Host:null, Port: 4001,
>> HTTP:9764, HTTPS:9444, Domain: wso2.identity.domain, Sub-domain:mgt,
>> Active:true joined application cluster
>> [2013-08-07 04:51:32,112]  INFO - TimeoutHandler This engine will expire
>> all callbacks after : 86400 seconds, irrespective of the timeout action,
>> after the specified or optional timeout
>>
>> So this member joined. After that, when the next IS node starts up:
>>
>> [2013-08-07 04:53:07,524]  INFO - HazelcastGroupManagementAgent Member
>> joined [882bca71-8802-44f4-89ee-ebe6334e42b1]: /192.168.17.19:4002
>> [2013-08-07 04:53:07,524]  INFO - HazelcastGroupManagementAgent Member
>> joined [882bca71-8802-44f4-89ee-ebe6334e42b1]: /192.168.17.19:4002
>>
>> I am not sure whether this one really joined. But after shutting down the
>> first IS node:
>>
>> [2013-08-07 04:53:38,782]  INFO - HazelcastGroupManagementAgent Member
>> left [e7792e38-54a7-4212-9d19-de25dbb7afd9]: /192.168.17.20:4001
>> [2013-08-07 04:53:38,782]  INFO - HazelcastGroupManagementAgent Member
>> left [e7792e38-54a7-4212-9d19-de25dbb7afd9]: /192.168.17.20:4001
>> [2013-08-07 04:53:43,252]  INFO - HazelcastGroupManagementAgent
>> Application member Host:192.168.17.19, Remote Host:null, Port: 4002,
>> HTTP:9764, HTTPS:9444, Domain: wso2.identity.domain, Sub-domain:mgt,
>> Active:true joined application cluster
>>
>> Now the second node has joined. If I now start up the first node again:
>>
>> [2013-08-07 04:54:48,456]  INFO - HazelcastGroupManagementAgent Member
>> joined [99dad4be-b2f1-4c43-a862-3b4c7b1dfb51]: /192.168.17.20:4001
>> [2013-08-07 04:54:48,456]  INFO - HazelcastGroupManagementAgent Member
>> joined [99dad4be-b2f1-4c43-a862-3b4c7b1dfb51]: /192.168.17.20:4001
>>
>>
>> It seems that only one IS node can be joined to the application cluster at
>> any given time.
>>
>> (Configurations attached.)
>>
>>
>>
>> --
>> *Chamara Ariyarathne*
>> Senior Software Engineer - QA;
>> WSO2 Inc; http://www.wso2.com/.
>> Mobile; *+94772786766*
>>
>
>
>
> --
> *Afkham Azeez*
> Director of Architecture; WSO2, Inc.; http://wso2.com
> Member; Apache Software Foundation; http://www.apache.org/
> email: az...@wso2.com cell: +94 77 3320919
> blog: http://blog.afkham.org
> twitter: http://twitter.com/afkham_azeez
> linked-in: http://lk.linkedin.com/in/afkhamazeez
>
> Lean . Enterprise . Middleware
>



-- 

Thanks,
Samisa...

Samisa Abeysinghe
VP Engineering
WSO2 Inc.
http://wso2.com
http://wso2.org
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
