Hi Martin,

I also ran into a similar issue with application un-deployment when using
PCA, but I guess you are using JCA.

I can see that Anuruddha has fixed the issue I'm referring to in the
commit below:
https://github.com/apache/stratos/commit/2fe84b91843b20e91e8cafd06011f42d218f231c

Regarding the "member context not found" error, this can occur if a
termination request is made for a member that has already been terminated.
It is possible that the Autoscaler makes a second terminate request when the
first request takes some time to execute; by the time the second request hits
the Cloud Controller, the member has already been terminated by the first one.
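
To illustrate the race I mean, here is a minimal sketch of a terminate call
that treats a repeated request as a no-op instead of throwing. This is not the
actual Cloud Controller code; the class and method names are made up for
illustration, and the real IaaS call is only a placeholder comment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, NOT Stratos code: names are illustrative assumptions.
class IdempotentTerminator {

    // Members that are currently known and not yet terminated.
    private final Set<String> activeMembers = ConcurrentHashMap.newKeySet();

    void register(String memberId) {
        activeMembers.add(memberId);
    }

    /**
     * Returns true if this call performed the termination, false if the
     * member was unknown or had already been terminated by an earlier request.
     */
    boolean terminate(String memberId) {
        // remove() on a concurrent set is atomic, so when two requests race
        // for the same member only one of them sees 'true' here.
        if (!activeMembers.remove(memberId)) {
            return false; // duplicate or late request: nothing left to do
        }
        // Placeholder: the real instance termination would be triggered here.
        return true;
    }
}
```

With something like this, the second request from the Autoscaler would simply
return false rather than raising an exception for a missing member context.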

Can you please confirm whether the members were properly terminated and it's
just these exceptions that you are seeing?

Thanks


On Sat, Jun 6, 2015 at 12:36 AM, Martin Eppel (meppel) <mep...@cisco.com>
wrote:

>  Hi Udara,
>
>
>
> Picked up your commit and reran the test case:
>
>
>
> Attached is the log file (artifacts are the same as before).
>
>
>
> *Didn’t see the issue with* “*Member is in the wrong list” …*
>
>
>
> but see the following exception after the undeploy application message:
>
> *TID: [0] [STRATOS] [2015-06-05 18:09:46,836] ERROR
> {org.apache.stratos.messaging.message.receiver.topology.TopologyEventMessageDelegator}
> -  Failed to retrieve topology event message*
>
> *org.apache.stratos.common.exception.InvalidLockRequestedException: System
> error, cannot acquire a write lock while having a read lock on the same
> thread: [lock-name] application-holder [thread-id] 114 [thread-name]
> pool-24-thread-2*
>
> *                    at
> org.apache.stratos.common.concurrent.locks.ReadWriteLock.acquireWriteLock(ReadWriteLock.java:114)*
>
> *                    at
> org.apache.stratos.autoscaler.applications.ApplicationHolder.acquireWriteLock(ApplicationHolder.java:60)*
>
>
>
>
>
> *Also, after the “Application undeployment process started” message is
> logged, new members are still being instantiated:*
>
>
>
> *TID: [0] [STRATOS] [2015-06-05 18:07:46,545]  INFO
> {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
> -  Publishing member created event*:
>
>
>
>
>
> *Eventually, these VMs get terminated :*
>
>
>
> *TID: [0] [STRATOS] [2015-06-05 18:42:42,413] ERROR
> {org.apache.stratos.cloud.controller.services.impl.CloudControllerServiceImpl}
> -  Could not terminate instance: [member-id]
> g-sc-G12-1.c1-0x0.c1.domaindd9c1d40-70cc-4950-9757-418afe19ba7f*
>
> *org.apache.stratos.cloud.controller.exception.InvalidMemberException:
> Could not terminate instance, member context not found: [member-id]
> g-sc-G12-1.c1-0x0.c1.domaindd9c1d40-70cc-4950-9757-418afe19ba7f*
>
> *                    at
> org.apache.stratos.cloud.controller.services.impl.CloudControllerServiceImpl.terminateInstance(CloudControllerServiceImpl.java:595)*
>
> *                    at
> sun.reflect.GeneratedMethodAccessor408.invoke(Unknown Source)*
>
> *                    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)*
>
> *                    at java.lang.reflect.Method.invoke(Method.java:606)*
>
>
>
>
>
> *but the application remains:*
>
>
>
> stratos> list-applications
>
> Applications found:
>
> +----------------+------------+----------+
>
> | Application ID | Alias      | Status   |
>
> +----------------+------------+----------+
>
> | g-sc-G12-1     | g-sc-G12-1 | Deployed |
>
> +----------------+------------+----------+
>
>
>
> ['g-sc-G12-1: applicationInstances 1, groupInstances 2, clusterInstances
> 3, members 0 ()\n']
>
>
>
>
>
>
>
> *From:* Martin Eppel (meppel)
> *Sent:* Friday, June 05, 2015 10:04 AM
> *To:* dev@stratos.apache.org
> *Subject:* RE: Testing Stratos 4.1: Application undeployment: application
> fails to undeploy (nested grouping, group scaling)
>
>
>
> Ok:
>
>
>
> log4j.logger.org.apache.stratos.manager=DEBUG
>
> log4j.logger.org.apache.stratos.autoscaler=DEBUG
>
> log4j.logger.org.apache.stratos.messaging=INFO
>
> log4j.logger.org.apache.stratos.cloud.controller=DEBUG
>
> log4j.logger.org.wso2.andes.client=ERROR
>
> # Autoscaler rule logs
>
> log4j.logger.org.apache.stratos.autoscaler.rule.RuleLog=DEBUG
>
>
>
> *From:* Udara Liyanage [mailto:ud...@wso2.com <ud...@wso2.com>]
> *Sent:* Friday, June 05, 2015 10:00 AM
> *To:* dev
> *Subject:* Re: Testing Stratos 4.1: Application undeployment: application
> fails to undeploy (nested grouping, group scaling)
>
>
>
> Hi Martin,
>
>
>
> It would be better if you could enable debug logs for AS, CC and the
> cartridge agent.
>
>
>
> On Fri, Jun 5, 2015 at 10:23 PM, Udara Liyanage <ud...@wso2.com> wrote:
>
> Hi,
>
>
>
> Please enable AS debug logs.
>
>
>
> On Fri, Jun 5, 2015 at 9:38 PM, Martin Eppel (meppel) <mep...@cisco.com>
> wrote:
>
> Hi Udara,
>
>
>
> Yes, this issue seems to be fairly reproducible. Which debug logs do you
> want me to enable, the cartridge agent logs?
>
>
>
> Thanks
>
>
>
> Martin
>
>
>
> *From:* Udara Liyanage [mailto:ud...@wso2.com]
> *Sent:* Thursday, June 04, 2015 11:11 PM
> *To:* dev
> *Subject:* Re: Testing Stratos 4.1: Application undeployment: application
> fails to undeploy (nested grouping, group scaling)
>
>
>
> Hi,
>
>
>
> This might be possible if AS did not receive the member activated event
> published by CC. Is it possible to enable debug logs if this is
> reproducible?
>
> Otherwise I can add INFO logs and commit.
>
>
>
>
>
> On Fri, Jun 5, 2015 at 9:11 AM, Udara Liyanage <ud...@wso2.com> wrote:
>
> Hi,
>
>
>
>
>
> For the first issue you have mentioned, the particular member is
> activated, but it is still identified as an obsolete member and is being
> marked to be terminated since pending time expired. Does that mean member
> is still in Obsolete list even though it is being activated?
>
>
>
> //member started
>
> TID: [0] [STRATOS] [2015-06-04 19:53:04,706]  INFO
> {org.apache.stratos.autoscaler.context.cluster.ClusterContext} -  Member
> stat context has been added: [application] g-sc-G12-1 [cluster]
> g-sc-G12-1.c1-0x0.c1.domain [clusterInstanceContext] g-sc-G12-1-1
> [partitionContext] whole-region [member-id]
> g-sc-G12-1.c1-0x0.c1.domainb0aa0188-49f1-47f6-a040-c2eab4acb5b1
>
>
>
> //member activated
>
> TID: [0] [STRATOS] [2015-06-04 19:56:00,907]  INFO
> {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
> -  Publishing member activated event: [service-name] c1 [cluster-id]
> g-sc-G12-1.c1-0x0.c1.domain [cluster-instance-id] g-sc-G12-1-1 [member-id]
> g-sc-G12-1.c1-0x0.c1.domainb0aa0188-49f1-47f6-a040-c2eab4acb5b1
> [network-partition-id] RegionOne [partition-id] whole-region
>
> TID: [0] [STRATOS] [2015-06-04 19:56:00,916]  INFO
> {org.apache.stratos.messaging.message.processor.topology.MemberActivatedMessageProcessor}
> -  Member activated: [service] c1 [cluster] g-sc-G12-1.c1-0x0.c1.domain
> [member] g-sc-G12-1.c1-0x0.c1.domainb0aa0188-49f1-47f6-a040-c2eab4acb5b1
>
>
>
> //after 15 minutes ---member is still in pending state, pending timeout
> expired
>
> TID: [0] [STRATOS] [2015-06-04 20:08:04,713]  INFO
> {org.apache.stratos.autoscaler.context.partition.ClusterLevelPartitionContext$PendingMemberWatcher}
> -  Pending state of member expired, member will be moved to obsolete list.
> [pending member]
> g-sc-G12-1.c1-0x0.c1.domainb0aa0188-49f1-47f6-a040-c2eab4acb5b1 [expiry
> time] 900000 [cluster] g-sc-G12-1.c1-0x0.c1.domain [cluster instance] null
>
>
>
> On Fri, Jun 5, 2015 at 5:14 AM, Martin Eppel (meppel) <mep...@cisco.com>
> wrote:
>
> Hi,
>
>
>
> I am running into a scenario where application un-deployment fails (using
> stratos with latest commit  b1b6bca3f99b6127da24c9af0a6b20faff2907be).
>
>
>
> For application structure see [1.], (debug enabled) wso2carbon.log,
> application.json, cartridge-group.json, deployment-policy, auto-scaling
> policies see attached zip file.
>
>
>
> *It is noteworthy, that while the application is running the following log
> statements /exceptions are observed:*
>
>
>
> *…*
>
> *Member is in the wrong list and it is removed from active members list:
> g-sc-G12-1.c1-0x0.c1.domainb0aa0188-49f1-47f6-a040-c2eab4acb5b1*
>
> *…*
>
> *TID: [0] [STRATOS] [2015-06-04 20:11:03,425] ERROR
> {org.apache.stratos.autoscaler.rule.RuleTasksDelegator} -  Cannot terminate
> instance*
>
> *…*
>
> *// **after receiving the application undeploy event:*
>
> *[2015-06-04 20:12:39,465]  INFO
> {org.apache.stratos.autoscaler.services.impl.AutoscalerServiceImpl} -
> Application undeployment process started: [application-id] g-sc-G12-1*
>
> *// **a new instance is being started up*
>
> *…*
>
> *[2015-06-04 20:13:13,445]  INFO
> {org.apache.stratos.cloud.controller.services.impl.InstanceCreator} -
> Instance started successfully: [cartridge-type] c2 [cluster-id]
> g-sc-G12-1.c2-1x0.c2.domain [instance-id]
> RegionOne/5d4699f7-b00b-42eb-b565-b48fc8f20407*
>
>
>
> *// Also noteworthy seems the following warning which is seen repeatedly
> in the logs:*
>
> *ReadWriteLock} -  System warning! Trying to release a lock which has not
> been taken by the same thread: [lock-name]*
>
>
>
>
>
> [1.] Application structure
>
>
>
> --
>
>
> Udara Liyanage
>
> Software Engineer
>
> WSO2, Inc.: http://wso2.com
>
> lean. enterprise. middleware
>
> web: http://udaraliyanage.wordpress.com
>
> phone: +94 71 443 6897
>
>
>
>
>
>



-- 
Imesh Gunaratne

Senior Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
