Hi Imesh,

Thanks for replying.

> This issue might occur if the cartridge agent starts processing member events 
> before consuming the Complete Topology event.


The issue happened way after that. I had Stratos running for a day or so, and 
in the logs I saw some “waiting for complete topology event ..” messages, but 
they went away pretty quickly (way before this happened).

Is this the code that’s supposed to do the updates? 
https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L328

Because I don’t see anything that actually updates anything (beyond 
function-local variables like ‘env’).
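
For reference, the initialization flow described in the quoted reply below (wait for the Complete Topology event, initialize the local topology, then apply later events to it) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and not taken from the Stratos code base, the topology is reduced to a member-to-cluster map, and only the first snapshot is applied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Stratos code base.
class LocalTopology {
    // memberId -> clusterId; a simplified stand-in for a real topology model.
    private final Map<String, String> members = new HashMap<>();
    private boolean initialized = false;

    // Complete Topology event: copy the published snapshot into shared state.
    // Assumption: only the first snapshot is applied; later ones are ignored.
    synchronized void onCompleteTopology(Map<String, String> snapshot) {
        if (initialized) {
            return;
        }
        members.clear();
        members.putAll(snapshot);
        initialized = true;
    }

    // Member events are rejected until the snapshot has been consumed.
    synchronized boolean onMemberActivated(String memberId, String clusterId) {
        if (!initialized) {
            return false;
        }
        members.put(memberId, clusterId);
        return true;
    }

    // Returns false if the topology is not initialized or the member is unknown,
    // which corresponds to the "member id not found in topology" situation.
    synchronized boolean onMemberTerminated(String memberId) {
        if (!initialized) {
            return false;
        }
        return members.remove(memberId) != null;
    }

    synchronized boolean contains(String memberId) {
        return members.containsKey(memberId);
    }
}
```

The point of the sketch is that every handler writes to state that outlives the handler call. If handlers only touch function-local variables, a later lookup for a terminated member has nothing to find.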

Michiel

On 20 Apr 2015, at 18:13, Imesh Gunaratne <im...@apache.org> wrote:

> Hi Michiel,
> 
> This issue might occur if the cartridge agent starts processing member events 
> before consuming the Complete Topology event.
> 
> This is how the topology gets initialized in any component that listens to 
> the topology topic in the message broker. When the component starts up, it 
> first waits to receive the Complete Topology event. This event is 
> periodically published by the Cloud Controller and contains the entire 
> topology at that moment in time.
> 
> Once it is received, the component initializes the local topology and 
> starts listening to other events. Since the Complete Topology event has 
> given it the latest state of the topology, the component can now consume any 
> other event published afterwards.
> 
> Thanks
> 
> 
> 
> On Mon, Apr 20, 2015 at 7:44 PM, Michiel Blokzijl (mblokzij) 
> <mblok...@cisco.com> wrote:
> Hi,
> I’m looking at the Stratos 4.0.0 code, and I’m having an issue with the 
> cartridge agent. It complains about the topology being inconsistent, 
> triggered by this code [1].
> 
> This causes the extension handler not to fire for cartridges going down.
> 
> [2015-04-19 07:19:22,486]  INFO - [MemberTerminatedMessageProcessor] Member 
> terminated: [service] XXX [cluster] XXX [member] 
> XXX-0.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486]  INFO - [DefaultExtensionHandler] Member terminated 
> event received: [service] XXX [cluster] XX [member] 
> XXX-0.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486] ERROR - [ExtensionUtils] Member id not found in 
> topology [member] XXXX.dom2a4618d5-edd9-4a99-9d9c-918715c761bd
> [2015-04-19 07:19:22,486] ERROR - [DefaultExtensionHandler] Topology is 
> inconsistent...failed to execute member terminated event
> 
> Any idea what’s going wrong here?
> 
> I assume the topology isn’t being maintained correctly for some reason, but I 
> haven’t quite figured out how/if the topology is being maintained at all. 
> Looking at the complete topology event handler [2] for example, it doesn’t 
> actually update the internally stored topology. There’s nothing in the 
> cartridge agent that calls the topology manager’s acquireWriteLock function.
> 
> Best regards,
> 
> Michiel
> 
> [1] 
> https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L374
> 
> [2] 
> https://github.com/apache/stratos/blob/4.0.0/components/org.apache.stratos.cartridge.agent/src/main/java/org/apache/stratos/cartridge/agent/extensions/DefaultExtensionHandler.java#L328
> 
> 
> 
> -- 
> Imesh Gunaratne
> 
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos

