Hi,

+1 for having a monitoring system. Without one, this issue may cause serious problems in production environments.
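
To make the housekeeping idea from the quoted message below more concrete, here is a rough sketch of what such a task might look like. It is only an illustration under stated assumptions, not Stratos code: `MemberHousekeeper`, `is_alive`, `max_failures`, and `sweep` are hypothetical names, and the probe is assumed to reach the member directly (for example a port check) rather than through the MB. A member is reported faulty only after several consecutive missed probes, so a single transient failure does not trigger a fault.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class MemberHousekeeper:
    """Hypothetical periodic task; none of these names come from Stratos."""

    # Assumed direct liveness probe that does not go through the MB.
    is_alive: Callable[[str], bool]
    # Consecutive missed probes before a member is reported faulty.
    max_failures: int = 3
    _failures: Dict[str, int] = field(default_factory=dict, init=False)

    def sweep(self, topology_members: Iterable[str]) -> List[str]:
        """Probe each topology member once and return the faulty ones."""
        members = list(topology_members)
        faulty = []
        for member_id in members:
            if self.is_alive(member_id):
                # A successful probe resets the miss counter.
                self._failures.pop(member_id, None)
                continue
            count = self._failures.get(member_id, 0) + 1
            self._failures[member_id] = count
            if count >= self.max_failures:
                faulty.append(member_id)
        # Drop counters for members that have left the topology.
        for stale in set(self._failures) - set(members):
            del self._failures[stale]
        return faulty


# Example: "m3" is in the topology but no longer responds to probes.
live = {"m1", "m2"}
keeper = MemberHousekeeper(is_alive=lambda m: m in live, max_failures=2)
topology = ["m1", "m2", "m3"]
first = keeper.sweep(topology)
second = keeper.sweep(topology)
```

The members returned by `sweep` are the ones for which the component running the task (ideally the auto-scaler, as suggested below) would raise the equivalent of a MemberFault on its own, so recovery no longer depends on a single message surviving in the MB.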

On Wed, Jul 30, 2014 at 9:55 AM, Udara Liyanage <ud...@wso2.com> wrote:

> Hi Akila,
>
> +1
> The core reason for this is that Stratos functionality depends heavily on
> the message broker. Losing messages, or unavailability of the MB, causes
> the system to go into a problematic state. $subject is one example scenario.
> We should have a health monitoring system which does not depend on the MB.
>
> On Wed, Jul 30, 2014 at 9:45 AM, Akila Ravihansa Perera
> <raviha...@wso2.com> wrote:
>
>> Hi Devs,
>>
>> The current Stratos architecture relies heavily on high availability of
>> the message broker. We faced a situation where, when the MB is down, some
>> of the published messages are lost forever and the system state is
>> never recovered.
>>
>> One such example: when a cartridge instance goes down, the CEP
>> component identifies this event and publishes a MemberFault event to
>> the MB's summarized-health-stat topic. The problem is that the CEP
>> component creates its own list of cartridge instance members by
>> looking at the health stats published to the MB; it does not consider
>> the topology. Hence, when a cartridge instance goes down, the
>> MemberFault event is fired only once. If the MB is down at that time,
>> the message is lost forever, resulting in an unstable system state in
>> which Stratos thinks a member exists when in reality it does not.
>>
>> We can introduce a simple housekeeping task to check whether every
>> member is alive. Ideally this should be the auto-scaler's
>> responsibility. It would allow the system to recover by itself from an
>> unstable situation. I think this is a critical bug and should be given
>> high priority.
>>
>> Please share your thoughts.
>>
>> --
>> Akila Ravihansa Perera
>> Software Engineer
>> WSO2 Inc.
>> http://wso2.com
>>
>> Blog: http://ravihansa3000.blogspot.com
>
> --
> Udara Liyanage
> Software Engineer
> WSO2, Inc.: http://wso2.com
> lean. enterprise. middleware
>
> web: http://udaraliyanage.wordpress.com
> phone: +94 71 443 6897

--
Regards,
Manula Chathurika Thantriwatte
Software Engineer
WSO2 Inc. : http://wso2.com
lean . enterprise . middleware

email : manu...@wso2.com / man...@apache.org
phone : +94 772492511
blog : http://manulachathurika.blogspot.com/