Hi Isuru,

While I agree that it is hard to handle scenarios like this in Stratos
given the current architecture and design, I believe pitfalls like this
could end up being a huge overhead for its users. Not only would they have
to maintain a PaaS, they would also have to monitor the logs or an IaaS
level dashboard and manually kill instances whenever Stratos fails to do
so. Perhaps we need to rethink the whole architecture?

As a short-term solution, perhaps the member termination flow could wait
for a certain amount of time until the member is initialized.
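
To make the short-term idea concrete, here is a rough sketch of a bounded
wait in Java. It is not taken from the Stratos code base: the class name
TerminationWait, the InstanceIdLookup interface and the timeout values are
all hypothetical, and the real termination flow may need a different hook.
It only shows the polling pattern: keep checking for a non-blank instance id
until a deadline, then give up.

```java
import java.util.concurrent.TimeUnit;

public class TerminationWait {

    /** Hypothetical callback that returns the member's current instance id, or null/blank if not yet known. */
    public interface InstanceIdLookup {
        String currentInstanceId();
    }

    /**
     * Polls the lookup until it yields a non-blank instance id or the timeout elapses.
     *
     * @return the instance id, or null if the member was still uninitialized at the deadline
     */
    public static String awaitInstanceId(InstanceIdLookup lookup,
                                         long timeoutMillis,
                                         long pollIntervalMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            String id = lookup.currentInstanceId();
            if (id != null && !id.trim().isEmpty()) {
                return id; // member is initialized, safe to call terminate with this id
            }
            long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMillis <= 0) {
                return null; // timed out, the caller decides how to report this
            }
            // Sleep for the poll interval, but never past the deadline.
            Thread.sleep(Math.min(pollIntervalMillis, Math.max(1L, remainingMillis)));
        }
    }
}
```

If the wait returns null, the caller could raise a clear "member not yet
initialized" error instead of silently removing the member from the
topology, which would at least tell the user that an instance may be left
running in the IaaS.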

+1 to go for 4.1.5-rc3.

Thanks.

On Sun, Dec 6, 2015 at 9:45 AM, Imesh Gunaratne <im...@apache.org> wrote:

> Yes, I agree with Isuru; however, we should be able to raise a more
> meaningful error message in such a situation. If an instance has not
> initialized at the time the termination call is made, we should be able to
> tell that to the end user clearly.
>
> [2015-12-06 00:29:51,337]  WARN
> {org.apache.stratos.common.concurrent.locks.ReadWriteLock} -  System
> warning! Trying to release a lock which has not been taken by the same
> thread: [lock-name] topology-manager [thread-id] 214 [thread-name]
> http-nio-9443-exec-17
>
> Regarding the above warning message, we added it purposely to track
> situations where threads try to release locks that have not been
> acquired by the same thread. If this happens there is a slight possibility
> that some functionality will not work properly.
>
> If we are to list down the issues we identified in this release candidate:
>
>    - SNAPSHOT versions available in docker files
>    - Thrift client configuration file not being up to date in load
>    balancer extensions
>    - CEP extension distribution issue
>    - A validation to handle member termination logic when the given
>    member has not initialized
>    - doap_Stratos.rdf file was not up to date with release versions
>
> Considering all of the above +1 to cancel this vote and go for 4.1.5-rc3.
>
> Thanks
>
> On Sun, Dec 6, 2015 at 9:16 AM, Isuru Haththotuwa <isu...@apache.org>
> wrote:
>
>> On Sun, Dec 6, 2015 at 12:37 AM, Akila Ravihansa Perera <
>> raviha...@wso2.com> wrote:
>>
>>> I tried to deploy an application on EC2 and immediately undeployed it,
>>> which caused the following exception. Also, the EC2 instance did not get
>>> terminated. I noticed the following warning in the log:
>>>
>>> [2015-12-06 00:29:51,337]  WARN
>>> {org.apache.stratos.common.concurrent.locks.ReadWriteLock} -  System
>>> warning! Trying to release a lock which has not been taken by the same
>>> thread: [lock-name] topology-manager [thread-id] 214 [thread-name]
>>> http-nio-9443-exec-17
>>>
>>>
>>>
>>>
>>> [2015-12-06 00:29:51,309]  INFO
>>> {org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient} -
>>>  Terminating instance via cloud controller: [member]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>> [2015-12-06 00:29:51,313] ERROR
>>> {org.apache.stratos.cloud.controller.services.impl.CloudControllerServiceImpl}
>>> -  Could not terminate instance, instance id is blank: [member-id]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>> , removing member from topology...
>>> [2015-12-06 00:29:51,319]  INFO
>>> {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
>>> -  Publishing member terminated event: [service-name] php-ec2 [cluster-id]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>> [cluster-instance-id] single-cartridge-app-ec2-1 [member-id]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>> [network-partition-id] network-partition-ec2 [partition-id] partition-1
>>> [group-id] null
>>> [2015-12-06 00:29:51,326]  INFO
>>> {org.apache.stratos.messaging.message.processor.topology.MemberTerminatedMessageProcessor}
>>> -  Member terminated: [service] php-ec2 [cluster]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [member]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>> [2015-12-06 00:29:51,326]  WARN
>>> {org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor} -  Obsolete
>>> member has either been terminated or its obsolete time out has expired and
>>> it is removed from obsolete members list:
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>> [2015-12-06 00:29:51,327]  INFO
>>> {org.apache.stratos.autoscaler.status.processor.cluster.ClusterStatusTerminatedProcessor}
>>> -  Publishing Cluster terminated event for [application]:
>>> single-cartridge-app-ec2 [cluster]:
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>> [2015-12-06 00:29:51,335]  INFO
>>> {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder} -
>>>  Cluster Terminated adding status started for and removing the cluster
>>> instancesingle-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>> *[2015-12-06 00:29:51,337]  WARN
>>> {org.apache.stratos.common.concurrent.locks.ReadWriteLock} -  System
>>> warning! Trying to release a lock which has not been taken by the same
>>> thread: [lock-name] topology-manager [thread-id] 214 [thread-name]
>>> http-nio-9443-exec-17*
>>> [2015-12-06 00:29:51,346]  INFO
>>> {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
>>> -  Publishing Cluster terminated event: [application-id]
>>> single-cartridge-app-ec2 [cluster id]
>>> single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [instance-id]
>>> single-cartridge-app-ec2-1
>>> [2015-12-06 00:29:51,348] ERROR
>>> {org.apache.stratos.autoscaler.rule.RuleTasksDelegator} -  Cannot terminate
>>> instance
>>> org.apache.stratos.cloud.controller.stub.CloudControllerServiceCloudControllerExceptionException:
>>> CloudControllerServiceCloudControllerExceptionException
>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>> at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>> at java.lang.Class.newInstance(Class.java:379)
>>> at
>>> org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.terminateInstance(CloudControllerServiceStub.java:8660)
>>> at
>>> org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient.terminateInstance(AutoscalerCloudControllerClient.java:203)
>>> at
>>> org.apache.stratos.autoscaler.rule.RuleTasksDelegator.terminateObsoleteInstance(RuleTasksDelegator.java:295)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at
>>> org.mvel2.optimizers.impl.refl.nodes.MethodAccessor.getValue(MethodAccessor.java:48)
>>> at
>>> org.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:37)
>>> at org.mvel2.ast.ASTNode.getReducedValueAccelerated(ASTNode.java:108)
>>> at org.mvel2.MVELRuntime.execute(MVELRuntime.java:85)
>>> at
>>> org.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
>>> at
>>> org.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
>>> at org.mvel2.MVEL.executeExpression(MVEL.java:930)
>>> at
>>> org.drools.base.mvel.MVELConsequence.evaluate(MVELConsequence.java:104)
>>> at
>>> org.drools.common.DefaultAgenda.fireActivation(DefaultAgenda.java:1287)
>>> at org.drools.common.DefaultAgenda.fireNextItem(DefaultAgenda.java:1221)
>>> at org.drools.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:1456)
>>> at
>>> org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:710)
>>> at
>>> org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:674)
>>> at
>>> org.drools.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:230)
>>> at
>>> org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.evaluate(ClusterMonitor.java:472)
>>> at
>>> org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.access$200(ClusterMonitor.java:86)
>>> at
>>> org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor$2.run(ClusterMonitor.java:444)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>> [2015-12-06 00:29:51,356]  INFO
>>> {org.apache.stratos.autoscaler.event.receiver.topology.AutoscalerTopologyEventReceiver}
>>> -  [ClusterTerminatedEvent] Received: class
>>> org.apache.stratos.messaging.event.topology.ClusterInstanceTerminatedEvent
>>> [2015-12-06 00:29:51,356]  INFO
>>> {org.apache.stratos.autoscaler.status.processor.group.GroupStatusTerminatedProcessor}
>>> -  Sending application instance terminated for [application]
>>> single-cartridge-app-ec2 [instance] single-cartridge-app-ec2-1
>>>
>>
>> Looking at the logs, it seems that the instance id was null when the CC
>> tried to terminate the instance in the terminateInstance method. The
>> instance id is returned with NodeMetadata when an instance is created in
>> EC2. Maybe the instance id was null because the termination was started at
>> the same moment as the call to start the instance [1]. In such cases the
>> member is removed from the Topology, and therefore the only option is to
>> terminate it manually from the IaaS. IMHO such scenarios are difficult to
>> handle from the Stratos side. If the member is correctly removed from the
>> Topology, then it should be fine.
>>
>> [1]. computeService.createNodesInGroup
>>
>>>
>>>
>>> On Sun, Dec 6, 2015 at 12:20 AM, Akila Ravihansa Perera <
>>> raviha...@wso2.com> wrote:
>>>
>>>> We have a few issues in the Docker scripts. There are some SNAPSHOT
>>>> references [1, 2] which break the Docker images.
>>>>
>>>> [1]
>>>> https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/Dockerfile#L25
>>>>
>>>> [2]
>>>> https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/files/run#L29
>>>>
>>>> On Sat, Dec 5, 2015 at 8:02 PM, Gayan Gunarathne <gay...@wso2.com>
>>>> wrote:
>>>>
>>>>> Modify it as 4.1.5-rc2
>>>>>
>>>>> Thanks,
>>>>> Gayan
>>>>>
>>>>>
>>>>> On Sat, Dec 5, 2015 at 12:04 PM, Akila Ravihansa Perera <
>>>>> raviha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Gayan,
>>>>>>
>>>>>> The vote is for the tag, not the binaries. Therefore we need to tag
>>>>>> the code in order to vote.
>>>>>>
>>>>>> Also we do not tag with a release version (4.1.5) until the vote has
>>>>>> passed.
>>>>>>
>>>>>> Could you please fix it?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> On Friday, 4 December 2015, Imesh Gunaratne <im...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Gayan,
>>>>>>>
>>>>>>> I do not see the 4.1.5-rc2 tag, have we created it as 4.1.5?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Thu, Dec 3, 2015 at 10:01 AM, Gayan Gunarathne <gay...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> This thread is for discussion of the second release candidate for
>>>>>>>> Apache Stratos 4.1.5. Please use this thread for discussion of issues
>>>>>>>> uncovered in the RC, questions you may have about the RC, etc.
>>>>>>>>
>>>>>>>> The RC release packs can be found here [1]. A git tag (4.1.5)
>>>>>>>> [2] has been created for this release and its tree view can be
>>>>>>>> seen here [3].
>>>>>>>>
>>>>>>>> [1] https://dist.apache.org/repos/dist/dev/stratos/releases/4.1.5-rc2/
>>>>>>>> [2] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=commit;h=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>> [3] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=tree;h=bf33be1ad90c9bd071a5ded8dd440eb83de80ead;hb=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> The Apache Stratos team
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Gayan Gunarathne
>>>>>>>> Technical Lead, WSO2 Inc. (http://wso2.com)
>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>> email : gay...@wso2.com  | mobile : +94 775030545
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Imesh Gunaratne
>>>>>>>
>>>>>>> Senior Technical Lead, WSO2
>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Akila Ravihansa Perera
>>>>>> WSO2 Inc.;  http://wso2.com/
>>>>>>
>>>>>> Blog: http://ravihansa3000.blogspot.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Gayan Gunarathne
>>>>> Technical Lead, WSO2 Inc. (http://wso2.com)
>>>>> Committer & PMC Member, Apache Stratos
>>>>> email : gay...@wso2.com  | mobile : +94 775030545
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Akila Ravihansa Perera
>>>> WSO2 Inc.;  http://wso2.com/
>>>>
>>>> Blog: http://ravihansa3000.blogspot.com
>>>>
>>>
>>>
>>>
>>> --
>>> Akila Ravihansa Perera
>>> WSO2 Inc.;  http://wso2.com/
>>>
>>> Blog: http://ravihansa3000.blogspot.com
>>>
>>> --
>>> Thanks and Regards,
>>>
>>> Isuru H.
>>> +94 716 358 048
>>>
>>>
>>>
>
>
> --
> Imesh Gunaratne
>
> Senior Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Akila Ravihansa Perera
WSO2 Inc.;  http://wso2.com/

Blog: http://ravihansa3000.blogspot.com
