Hi Martin,

I have gone through your log; please see my findings below. This particular scenario does not appear to be a bug in the undeployment process itself. It can be addressed by changing a configuration value, although we will also need to investigate on the cartridge agent side.
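To see that the delay lines up with a timeout rather than with slow undeployment work, it helps to compare the timestamps. The snippet below is only an illustrative sketch: it uses the two di-000-002 timestamps from your mail and the expiry log entry quoted in step 4 below, it assumes the logged expiry time of 1800000 is in milliseconds, and the variable names are mine and not part of Stratos.

```python
from datetime import datetime, timedelta

# Timestamp layout used in wso2carbon.log, e.g. "2015-06-30 15:52:50,383".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse(ts: str) -> datetime:
    """Parse a log timestamp with a comma before the milliseconds."""
    return datetime.strptime(ts, LOG_TIME_FORMAT)


# Values copied from the log lines quoted in this thread.
undeploy_started = parse("2015-06-30 15:52:50,383")   # undeployment process started
pending_expired = parse("2015-06-30 16:22:56,505")    # termination pending state expired
undeploy_finished = parse("2015-06-30 16:24:04,083")  # application un-deployed

# Assumption: the "[expiry time] 1800000" value is in milliseconds.
expiry_timeout = timedelta(milliseconds=1800000)

total = undeploy_finished - undeploy_started
waited_before_expiry = pending_expired - undeploy_started
cleanup_after_expiry = undeploy_finished - pending_expired

print("configured expiry timeout:", expiry_timeout)
print("start -> pending expired: ", waited_before_expiry)
print("pending expired -> done:  ", cleanup_after_expiry)
print("total undeployment time:  ", total)
```

Almost all of the elapsed time falls between the start of the undeployment and the expiry of the termination pending state, which is what you would expect if the member was simply waiting out the timeout.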
1. After you invoked the undeploy, the member sent the in-maintenance event.
2. The autoscaler received it and marked the member as termination pending.
3. Stratos then waited for the ReadyToShutdown event from the member, but the member did not send it, possibly because of a network issue or a problem in its cleanup process.
4. The autoscaler has a pendingTerminationMemberExpiryTimeout setting in autoscaler.xml (STRATOS-HOME/repository/conf). I believe this timeout is 30 minutes in your setup (the log entry below shows an expiry time of 1800000, which would be 30 minutes in milliseconds). Once this timeout expires, the autoscaler moves the member to the obsolete list so that it gets terminated immediately. Here is the log entry I found:

TID: [0] [STRATOS] [2015-06-30 16:22:56,505] INFO {org.apache.stratos.autoscaler.context.partition.ClusterLevelPartitionContext$TerminationPendingMemberWatcher} - *Termination pending state of member is expired, member will be moved to obsolete list* [termination pending member] di-000-002.di-000-002.cisco-qvpc-cf-02-0.domain2b83bd51-a85a-473b-b143-adc2c8feac94 [expiry time] 1800000 [cluster] di-000-002.di-000-002.cisco-qvpc-cf-02-0.domain [cluster instance] di-000-002-1

5. The member was then terminated successfully, and so the cluster and then the application were terminated successfully as well.

As a solution, can you try reducing pendingTerminationMemberExpiryTimeout to a value that suits your setup? Also, if you see this issue again, please collect the cartridge agent logs as well, so that we can try to find out why the cartridge agent failed to send the ReadyToShutdown event.

Thanks,
Reka

On Wed, Jul 1, 2015 at 2:27 AM, Martin Eppel (meppel) <mep...@cisco.com> wrote:

> Hi Reka,
>
> Here is the issue with application removal in one of our use cases:
>
> In the scenario there are 5 applications deployed, cartridge-proxy, di-000-001, di-000-002, di-000-003, di-000-004
>
> The scenario tries to remove applications di-000-001, di-000-002, di-000-003, di-000-004.
>
> Removing di-000-001 succeeds but, at least at first sight, removing di-000-002 ends up with the application di-000-002 stuck in “undeploying” mode and the cartridge in maintenance mode. However, as it turned out, di-000-002 is being removed successfully but it took *32 minutes*.
>
> I think the question is why does it take that long, is it expected and what are the conditions for it to take that long (in comparison, removing di-000-001 only took a few seconds).
>
> Also, the applications were removed using the force flag. Shouldn’t force remove go quick in any case?
>
> Removing di-000-001:
>
> “
> *TID: [0] [STRATOS] [2015-06-30 15:52:09,073] INFO {org.apache.stratos.autoscaler.services.impl.AutoscalerServiceImpl} - Application undeployment process started: [application-id] di-000-001*
> *…*
> *TID: [0] [STRATOS] [2015-06-30 15:52:12,982] INFO {org.apache.stratos.autoscaler.applications.topic.ApplicationBuilder} - Application un-deployed successfully: [application-id] di-000-001*
> “
>
> Removing di-000-002
>
> “
> …
> *TID: [0] [STRATOS] [2015-06-30 15:52:50,383] INFO {org.apache.stratos.autoscaler.services.impl.AutoscalerServiceImpl} - Application undeployment process started: [application-id] di-000-002*
> *…*
> *TID: [0] [STRATOS] [2015-06-30 16:24:04,083] INFO {org.apache.stratos.autoscaler.applications.topic.ApplicationBuilder} - Application un-deployed successfully: [application-id] di-000-002*
> *…*
> “
>
> Application jsons are listed below, wso2carbon.log is attached.
>
> Thanks
>
> Martin
>
> stratos> list-applications
> Applications found:
> +-----------------+-----------------+-------------+
> | Application ID  | Alias           | Status      |
> +-----------------+-----------------+-------------+
> | cartridge-proxy | cartridge-proxy | Deployed    |
> +-----------------+-----------------+-------------+
> | di-000-002      | di-000-002      | Undeploying |
> +-----------------+-----------------+-------------+
> | di-000-003      | di-000-003      | Deployed    |
> +-----------------+-----------------+-------------+
> | di-000-004      | di-000-004      | Deployed    |
>
> Waiting for a long time:
>
> +-----------------+-----------------+----------+
> | Application ID  | Alias           | Status   |
> +-----------------+-----------------+----------+
> | cartridge-proxy | cartridge-proxy | Deployed |
> +-----------------+-----------------+----------+
> | di-000-002      | di-000-002      | Created  |
> +-----------------+-----------------+----------+
> | di-000-003      | di-000-003      | Deployed |
> +-----------------+-----------------+----------+
> | di-000-004      | di-000-004      | Deployed |
> +-----------------+-----------------+----------+
>
> See below json files for the applications:
>
> stratos> describe-application di-000-001
> Application not found: di-000-001
>
> stratos> describe-application di-000-002
> Application: di-000-002
> {
>   "applicationId": "di-000-002",
>   "multiTenant": false,
>   "alias": "di-000-002",
>   "status": "Undeploying",
>   "components": {
>     "cartridges": [
>       {
>         "type": "cisco-qvpc-cf-02-0",
>         "cartridgeMin": 1,
>         "cartridgeMax": 1,
>         "subscribableInfo": {
>           "alias": "di-000-002",
>           "deploymentPolicy": "static-1",
>           "autoscalingPolicy": "economyPolicy",
>           "maxMembers": 0,
>           "minMembers": 0,
>           "artifactRepository": {
>             "alias": "di-000-002",
>             "privateRepo": true,
>             "repoUrl": "http://octl.qmog.cisco.com:10080/git/default.git",
>             "repoUsername": "user",
>             "repoPassword": "password"
>           },
>           "property": [
>             {
>               "name": "payload_parameter.VOLUME_INFO",
>               "value": "di-000-002:ca6bbb1b-47a5-44c5-af41-6264eb2bdcfd"
>             }
>           ],
>           "persistence": {
>             "isRequired": false
>           }
>         }
>       }
>     ]
>   }
> }
>
> ////
> Application: di-000-003
> {
>   "applicationId": "di-000-003",
>   "multiTenant": false,
>   "alias": "di-000-003",
>   "status": "Deployed",
>   "components": {
>     "cartridges": [
>       {
>         "type": "cisco-qvpc-sf-0",
>         "cartridgeMin": 1,
>         "cartridgeMax": 1,
>         "subscribableInfo": {
>           "alias": "di-000-003",
>           "deploymentPolicy": "static-1",
>           "autoscalingPolicy": "economyPolicy",
>           "maxMembers": 0,
>           "minMembers": 0,
>           "artifactRepository": {
>             "alias": "di-000-003",
>             "privateRepo": true,
>             "repoUrl": "http://octl.qmog.cisco.com:10080/git/default.git",
>             "repoUsername": "user",
>             "repoPassword": "password"
>           }
>         }
>       }
>     ]
>   }
> }
>
> /////
> stratos> describe-application di-000-004
> Application: di-000-004
> {
>   "applicationId": "di-000-004",
>   "multiTenant": false,
>   "alias": "di-000-004",
>   "status": "Deployed",
>   "components": {
>     "cartridges": [
>       {
>         "type": "cisco-qvpc-sf-0",
>         "cartridgeMin": 1,
>         "cartridgeMax": 1,
>         "subscribableInfo": {
>           "alias": "di-000-004",
>           "deploymentPolicy": "static-1",
>           "autoscalingPolicy": "economyPolicy",
>           "maxMembers": 0,
>           "minMembers": 0,
>           "artifactRepository": {
>             "alias": "di-000-004",
>             "privateRepo": true,
>             "repoUrl": "http://octl.qmog.cisco.com:10080/git/default.git",
>             "repoUsername": "user",
>             "repoPassword": "password"
>           }
>         }
>       }
>     ]
>   }
> }
>
> ////
>
> Application: cartridge-proxy
> {
>   "applicationId": "cartridge-proxy",
>   "multiTenant": false,
>   "alias": "cartridge-proxy",
>   "status": "Deployed",
>   "components": {
>     "cartridges": [
>       {
>         "type": "cartridge-proxy",
>         "cartridgeMin": 1,
>         "cartridgeMax": 1,
>         "subscribableInfo": {
>           "alias": "cartridge-proxy",
>           "deploymentPolicy": "static-1",
>           "autoscalingPolicy": "economyPolicy",
>           "maxMembers": 0,
>           "minMembers": 0,
>           "artifactRepository": {
>             "alias": "cartridge-proxy",
>             "privateRepo": true,
>             "repoUrl": "http://octl.qmog.cisco.com:10080/git/default.git",
>             "repoUsername": "user",
>             "repoPassword": "password"
>           }
>         }
>       }
>     ]
>   }
> }
>

--
Reka Thirunavukkarasu
Senior Software Engineer,
WSO2, Inc.:http://wso2.com,
Mobile: +94776442007