Hi Imesh,

Thanks for replying.
> This issue might occur if the cartridge agent starts processing member events
> before consuming the Complete Topology event.

The issue happened way after that. I had Stratos running for a day or so, and in the logs I saw some “waiting for complete topology event ..” messages, but they went away pretty quickly (way before this happened).

Is this the code that’s supposed to do the updates?

https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L328

I ask because I don’t see anything there that actually updates anything (beyond function-local variables like ‘env’).

Michiel

On 20 Apr 2015, at 18:13, Imesh Gunaratne <im...@apache.org> wrote:

> Hi Michiel,
>
> This issue might occur if the cartridge agent starts processing member events
> before consuming the Complete Topology event.
>
> This is how the topology gets initialized in any component that listens to the
> topology topic in the message broker: when the component starts up, it first
> waits to receive the Complete Topology event. This event is periodically
> published by the Cloud Controller and contains the entire topology at a
> given moment in time.
>
> Once it is received, the component initializes the local topology and
> starts listening to other events. Since the Complete Topology event has given
> it the latest state of the topology, the component can now consume any other
> event published afterwards.
>
> Thanks
>
> On Mon, Apr 20, 2015 at 7:44 PM, Michiel Blokzijl (mblokzij)
> <mblok...@cisco.com> wrote:
> Hi,
>
> I’m looking at Stratos 4.0.0 code, and I’m having an issue with the
> cartridge agent. It complains about the topology being inconsistent,
> triggered by this code [1].
>
> This causes the extension handler not to fire for cartridges going down.
>
> [2015-04-19 07:19:22,486] INFO - [MemberTerminatedMessageProcessor] Member terminated: [service] XXX [cluster] XXX [member] XXX-0.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486] INFO - [DefaultExtensionHandler] Member terminated event received: [service] XXX [cluster] XX [member] XXX-0.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486] ERROR - [ExtensionUtils] Member id not found in topology [member] XXXX.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486] ERROR - [DefaultExtensionHandler] Topology is inconsistent...failed to execute member terminated event
>
> Any idea what’s going wrong here?
>
> I assume the topology isn’t being maintained correctly for some reason, but I
> haven’t quite figured out how, or whether, the topology is being maintained
> at all. Looking at the complete topology event handler [2], for example, it
> doesn’t actually update the internally stored topology. There’s nothing in the
> cartridge agent that calls the topology manager’s acquireWriteLock function.
>
> Best regards,
>
> Michiel
>
> [1] https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L374
>
> [2] https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L328
>
> --
> Imesh Gunaratne
>
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
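
For reference, the startup sequence described in the quoted reply (wait for the Complete Topology event, initialize the local topology from it, and only then consume member events) can be sketched roughly as below. This is a minimal illustration and not the Stratos implementation: the class `TopologySketch`, its methods, and the `Result` enum are hypothetical names invented for this example, and the topology is reduced to a set of member ids.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Stratos code base.
class TopologySketch {
    enum Result { NOT_INITIALIZED, APPLIED, UNKNOWN_MEMBER }

    // Local copy of the topology, reduced here to a set of member ids.
    private final Set<String> memberIds = new HashSet<>();
    private boolean initialized = false;

    // Complete Topology event: replace the local copy with the published snapshot.
    synchronized void onCompleteTopology(Set<String> snapshot) {
        memberIds.clear();
        memberIds.addAll(snapshot);
        initialized = true;
    }

    // Member terminated event: only meaningful once a snapshot has been applied.
    synchronized Result onMemberTerminated(String memberId) {
        if (!initialized) {
            return Result.NOT_INITIALIZED; // event arrived before the snapshot
        }
        // A member missing from the local copy corresponds to the
        // "Member id not found in topology" situation in the logs above.
        return memberIds.remove(memberId) ? Result.APPLIED : Result.UNKNOWN_MEMBER;
    }

    public static void main(String[] args) {
        TopologySketch topology = new TopologySketch();
        System.out.println(topology.onMemberTerminated("member-1")); // NOT_INITIALIZED
        topology.onCompleteTopology(Collections.singleton("member-1"));
        System.out.println(topology.onMemberTerminated("member-1")); // APPLIED
        System.out.println(topology.onMemberTerminated("member-1")); // UNKNOWN_MEMBER
    }
}
```

In this sketch, the third call returns `UNKNOWN_MEMBER` because the local copy no longer contains the member. If the local copy were never updated after startup, later member events could end up in that same state even though the initial wait for the snapshot had completed long before.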