Hi,

+1 for having a monitoring system. Without one, lost messages may cause
serious problems in production environments.
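
To make the proposed check concrete, here is a rough sketch of one
housekeeping pass. It is only an illustration, not existing Stratos code:
the function names, the member ID strings, and the timeout value are all
hypothetical. The one idea it tries to capture is that the member list
comes from the topology rather than from the health stats received, so a
member whose MemberFault event was lost still gets noticed on the next
pass.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional


def find_stale_members(
    topology_members: Iterable[str],
    last_health_stat: Dict[str, float],
    timeout_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return topology members that have no recent health stat.

    topology_members: member IDs taken from the topology (hypothetical input).
    last_health_stat: member ID -> timestamp of the last health stat seen.
    """
    current = time.time() if now is None else now
    stale = []
    for member_id in topology_members:
        last_seen = last_health_stat.get(member_id)
        # A member that never reported, or reported too long ago, is suspect.
        if last_seen is None or current - last_seen > timeout_seconds:
            stale.append(member_id)
    return stale


def run_housekeeping_once(
    topology_members: Iterable[str],
    last_health_stat: Dict[str, float],
    timeout_seconds: float,
    on_fault: Callable[[str], None],
    now: Optional[float] = None,
) -> List[str]:
    """Run one pass and call on_fault for every suspect member."""
    stale = find_stale_members(
        topology_members, last_health_stat, timeout_seconds, now
    )
    for member_id in stale:
        # on_fault is a placeholder for whatever recovery action is chosen.
        on_fault(member_id)
    return stale
```

Because the pass is repeated periodically, a fault missed once is picked up
again later instead of being lost forever. A real version would also need a
grace period for members that have just been started.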


On Wed, Jul 30, 2014 at 9:55 AM, Udara Liyanage <ud...@wso2.com> wrote:

> Hi Akila,
>
> +1
> The core reason for this is that Stratos depends heavily on the
> message broker. Lost messages, or unavailability of the MB, can put the
> system into a problematic state. $subject is one example scenario.
> We should have a health monitoring system which does not depend on the MB.
>
>
> On Wed, Jul 30, 2014 at 9:45 AM, Akila Ravihansa Perera <
> raviha...@wso2.com> wrote:
>
>> Hi Devs,
>>
>> The current Stratos architecture relies heavily on high availability of
>> the message broker. We faced a situation where, when the MB is down, some
>> of the messages published are lost forever and the system state is
>> never recovered.
>>
>> One such example: when a cartridge instance goes down, the CEP
>> component identifies this and publishes a MemberFault event to
>> the MB's summarized-health-stat topic. The problem is that the CEP
>> component builds its own list of cartridge instance members by
>> looking at the health stats published to the MB; it does not consider
>> the topology. Hence, when a cartridge instance goes down, the MemberFault
>> event is fired only once. If the MB is down at that time, the
>> message is lost forever, resulting in an unstable
>> system state in which Stratos thinks a member exists when in reality it
>> does not.
>>
>> We can introduce a simple housekeeping task to check whether every
>> member is alive. Ideally this should be the auto-scaler's responsibility.
>> It would allow the system to recover by itself from an unstable
>> situation. I think this is a critical bug and should be given high
>> priority.
>>
>> Please share your thoughts.
>>
>> --
>> Akila Ravihansa Perera
>> Software Engineer
>> WSO2 Inc.
>> http://wso2.com
>>
>> Blog: http://ravihansa3000.blogspot.com
>>
>
>
>
> --
>
> Udara Liyanage
> Software Engineer
> WSO2, Inc.: http://wso2.com
> lean. enterprise. middleware
>
> web: http://udaraliyanage.wordpress.com
> phone: +94 71 443 6897
>



-- 
Regards,
Manula Chathurika Thantriwatte
Software Engineer
WSO2 Inc. : http://wso2.com
lean . enterprise . middleware

email : manu...@wso2.com / man...@apache.org
phone : +94 772492511
blog : http://manulachathurika.blogspot.com/
