Hi,

I've fixed the warning message with commit [1]. The root cause is that CC throws an exception if the member's instance id is null in the member termination flow. The lock, however, is acquired only after the point where the exception is thrown, so execution never reaches the lock acquisition while the release step still runs, which produces the warning. I've fixed it by moving the locking and topology-reading code into a separate try-finally block.
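For anyone following along, the pattern behind the warning can be sketched roughly as below. This is only a minimal illustration, not the code from commit [1]: the class name, the method names and the warning counter are hypothetical, and java.util.concurrent's ReentrantReadWriteLock stands in for Stratos' own ReadWriteLock wrapper.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; none of these names come from the Stratos code base.
class TerminationLockSketch {

    // Stand-in for the "topology-manager" lock.
    static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    // Counts how often a release was attempted without holding the lock.
    static int warnings = 0;

    // Mimics a release that warns instead of failing when the caller
    // does not hold the write lock.
    static void releaseWriteLock() {
        if (LOCK.isWriteLockedByCurrentThread()) {
            LOCK.writeLock().unlock();
        } else {
            warnings++;
            System.out.println("System warning! Trying to release a lock "
                    + "which has not been taken by the same thread");
        }
    }

    // Problematic shape: the validation throws before the lock is acquired,
    // but the shared finally block still runs the release.
    static String terminateBroken(String instanceId) {
        try {
            if (instanceId == null) {
                throw new IllegalStateException("instance id is blank");
            }
            LOCK.writeLock().lock();
            return "terminated " + instanceId;
        } finally {
            releaseWriteLock();
        }
    }

    // Fixed shape: validate first, then acquire and release the lock
    // inside its own try-finally block.
    static String terminateFixed(String instanceId) {
        if (instanceId == null) {
            throw new IllegalStateException("instance id is blank");
        }
        LOCK.writeLock().lock();
        try {
            return "terminated " + instanceId;
        } finally {
            releaseWriteLock();
        }
    }
}
```

With a null instance id, terminateBroken still reaches the release and trips the warning path, while terminateFixed fails before any lock handling takes place.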
Note that this only fixes the System warning message. I'm currently working on a fix for the member termination issue and will start a separate thread for that.

@Sajith: you wouldn't experience this in mock IaaS since members become initialized within milliseconds. That's not the case with EC2, OpenStack, etc.

[1] https://github.com/apache/stratos/commit/bb22134fbbe3232b5e0a20943c30d2cfc5b55322

Thanks.

On Mon, Dec 7, 2015 at 3:30 PM, Sajith Kariyawasam <saj...@wso2.com> wrote:

> Has anyone been able to reproduce this? I was trying with mock IaaS, but I didn't encounter this error.
>
> On Sun, Dec 6, 2015 at 9:02 PM, Gayan Gunarathne <gay...@wso2.com> wrote:
>
>> On Sun, Dec 6, 2015 at 10:47 AM, Isuru Haththotuwa <isu...@apache.org> wrote:
>>
>>> Hi Akila,
>>>
>>> On Sun, Dec 6, 2015 at 10:26 AM, Akila Ravihansa Perera <raviha...@wso2.com> wrote:
>>>
>>>> Hi Isuru,
>>>>
>>>> While I agree that it is hard to handle scenarios like this in Stratos given the current architecture and design, I believe pitfalls like this could end up being a huge overhead for its users. Not only would they have to maintain a PaaS, but they would also have to monitor the logs or the IaaS-level dashboard and manually kill instances whenever Stratos fails to do so. Perhaps we need to rethink the whole architecture?
>>>>
>>> IMHO we need to consider the probability of this happening; for example, in this case, whether users will try to deploy an application and undeploy it again right at the next moment. Even in such cases, if we leave a more meaningful log it should be enough, as Imesh mentioned.
>>>
>> Even if the instance id is blank, this seems to be an issue with acquiring and releasing the locks in our thread model. IMO we need to handle that. I think we first need to identify which thread is trying to release the write lock that was acquired by another thread.
>> I think we may be able to reproduce this in the mock IaaS by setting the instance id to blank. Also, if we print the context class loader in releaseWriteLock as a debug log, I think we can find the exact thread that is causing this issue. Will check on those points.
>>
>> @Akila, were you able to reproduce this regularly?
>>
>>>> As a short-term solution, perhaps we could wait for a certain amount of time until the member is initialized in the member termination flow.
>>>>
>>>> +1 to go for 4.1.5-rc3.
>>>>
>>>> Thanks.
>>>>
>>> +1 for 4.1.5-RC3.
>>>
>>>> On Sun, Dec 6, 2015 at 9:45 AM, Imesh Gunaratne <im...@apache.org> wrote:
>>>>
>>>>> Yes, I agree with Isuru; however, we should be able to raise a more meaningful error message in such a situation. If an instance has not initialized at the time the termination call is made, we should be able to tell that to the end user clearly.
>>>>>
>>>>> [2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17
>>>>>
>>>>> Regarding the above warning message, we added it purposely to track situations where threads try to release locks that were not acquired by the same thread. If this happens, there is a slight possibility that some functionality will not work properly.
>>>>>
>>>>> If we are to list down the issues we identified in this release candidate:
>>>>>
>>>>> - SNAPSHOT versions available in docker files
>>>>> - Thrift client configuration file not being up to date in load balancer extensions
>>>>> - CEP extension distribution issue
>>>>> - A validation to handle member termination logic when the given member has not initialized
>>>>> - doap_Stratos.rdf file was not up to date with release versions
>>>>>
>>>>> Considering all of the above, +1 to cancel this vote and go for 4.1.5-rc3.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Sun, Dec 6, 2015 at 9:16 AM, Isuru Haththotuwa <isu...@apache.org> wrote:
>>>>>
>>>>>> On Sun, Dec 6, 2015 at 12:37 AM, Akila Ravihansa Perera <raviha...@wso2.com> wrote:
>>>>>>
>>>>>>> I tried to deploy an application on EC2 and immediately undeployed it, which caused the following exception. Also, the EC2 instance did not get terminated. I noticed the following warning in the log:
>>>>>>>
>>>>>>> [2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17
>>>>>>>
>>>>>>> [2015-12-06 00:29:51,309] INFO {org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient} - Terminating instance via cloud controller: [member] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>>>> [2015-12-06 00:29:51,313] ERROR {org.apache.stratos.cloud.controller.services.impl.CloudControllerServiceImpl} - Could not terminate instance, instance id is blank: [member-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545 , removing member from topology...
>>>>>>> [2015-12-06 00:29:51,319] INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher} - Publishing member terminated event: [service-name] php-ec2 [cluster-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [cluster-instance-id] single-cartridge-app-ec2-1 [member-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545 [network-partition-id] network-partition-ec2 [partition-id] partition-1 [group-id] null
>>>>>>> [2015-12-06 00:29:51,326] INFO {org.apache.stratos.messaging.message.processor.topology.MemberTerminatedMessageProcessor} - Member terminated: [service] php-ec2 [cluster] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [member] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>>>> [2015-12-06 00:29:51,326] WARN {org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor} - Obsolete member has either been terminated or its obsolete time out has expired and it is removed from obsolete members list: single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>>>> [2015-12-06 00:29:51,327] INFO {org.apache.stratos.autoscaler.status.processor.cluster.ClusterStatusTerminatedProcessor} - Publishing Cluster terminated event for [application]: single-cartridge-app-ec2 [cluster]: single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>>>>>> [2015-12-06 00:29:51,335] INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder} - Cluster Terminated adding status started for and removing the cluster instancesingle-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>>>>>> *[2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17*
>>>>>>> [2015-12-06 00:29:51,346] INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher} - Publishing Cluster terminated event: [application-id] single-cartridge-app-ec2 [cluster id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [instance-id] single-cartridge-app-ec2-1
>>>>>>> [2015-12-06 00:29:51,348] ERROR {org.apache.stratos.autoscaler.rule.RuleTasksDelegator} - Cannot terminate instance
>>>>>>> org.apache.stratos.cloud.controller.stub.CloudControllerServiceCloudControllerExceptionException: CloudControllerServiceCloudControllerExceptionException
>>>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>>>     at java.lang.Class.newInstance(Class.java:379)
>>>>>>>     at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.terminateInstance(CloudControllerServiceStub.java:8660)
>>>>>>>     at org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient.terminateInstance(AutoscalerCloudControllerClient.java:203)
>>>>>>>     at org.apache.stratos.autoscaler.rule.RuleTasksDelegator.terminateObsoleteInstance(RuleTasksDelegator.java:295)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>     at org.mvel2.optimizers.impl.refl.nodes.MethodAccessor.getValue(MethodAccessor.java:48)
>>>>>>>     at org.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:37)
>>>>>>>     at org.mvel2.ast.ASTNode.getReducedValueAccelerated(ASTNode.java:108)
>>>>>>>     at org.mvel2.MVELRuntime.execute(MVELRuntime.java:85)
>>>>>>>     at org.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
>>>>>>>     at org.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
>>>>>>>     at org.mvel2.MVEL.executeExpression(MVEL.java:930)
>>>>>>>     at org.drools.base.mvel.MVELConsequence.evaluate(MVELConsequence.java:104)
>>>>>>>     at org.drools.common.DefaultAgenda.fireActivation(DefaultAgenda.java:1287)
>>>>>>>     at org.drools.common.DefaultAgenda.fireNextItem(DefaultAgenda.java:1221)
>>>>>>>     at org.drools.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:1456)
>>>>>>>     at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:710)
>>>>>>>     at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:674)
>>>>>>>     at org.drools.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:230)
>>>>>>>     at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.evaluate(ClusterMonitor.java:472)
>>>>>>>     at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.access$200(ClusterMonitor.java:86)
>>>>>>>     at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor$2.run(ClusterMonitor.java:444)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>> [2015-12-06 00:29:51,356] INFO {org.apache.stratos.autoscaler.event.receiver.topology.AutoscalerTopologyEventReceiver} - [ClusterTerminatedEvent] Received: class org.apache.stratos.messaging.event.topology.ClusterInstanceTerminatedEvent
>>>>>>> [2015-12-06 00:29:51,356] INFO {org.apache.stratos.autoscaler.status.processor.group.GroupStatusTerminatedProcessor} - Sending application instance terminated for [application] single-cartridge-app-ec2 [instance] single-cartridge-app-ec2-1
>>>>>>>
>>>>>> Looking at the logs, it seems that the instance id has been null when the CC tries to terminate the instance, in the terminateInstance method. The instance id is returned with NodeMetadata when an instance is created in EC2. Maybe the instance id is null since the termination was started at the same moment in which the call to start the instance [1] is happening. In such cases the member is removed from the Topology and therefore the only option is to manually terminate it from the IaaS. IMHO such scenarios are difficult to handle from Stratos side. If the member is correctly removed from the Topology, then it should be fine.
>>>>>>
>>>>>> [1]. computeService.createNodesInGroup
>>>>>>
>>>>>>> On Sun, Dec 6, 2015 at 12:20 AM, Akila Ravihansa Perera <raviha...@wso2.com> wrote:
>>>>>>>
>>>>>>>> We have few issues in Docker scripts. There are some SNAPSHOT references [1, 2] which breaks Docker images.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/Dockerfile#L25
>>>>>>>>
>>>>>>>> [2] https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/files/run#L29
>>>>>>>>
>>>>>>>> On Sat, Dec 5, 2015 at 8:02 PM, Gayan Gunarathne <gay...@wso2.com> wrote:
>>>>>>>>
>>>>>>>>> Modify it as 4.1.5-rc2
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Gayan
>>>>>>>>>
>>>>>>>>> On Sat, Dec 5, 2015 at 12:04 PM, Akila Ravihansa Perera <raviha...@wso2.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Gayan,
>>>>>>>>>>
>>>>>>>>>> The vote is for the tag, not the binaries. Therefore we need to tag the code in order to vote.
>>>>>>>>>>
>>>>>>>>>> Also we do not tag with a release version (4.1.5) until the vote has passed.
>>>>>>>>>>
>>>>>>>>>> Could you please fix it?
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> On Friday, 4 December 2015, Imesh Gunaratne <im...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Gayan,
>>>>>>>>>>>
>>>>>>>>>>> I do not see the 4.1.5-rc2 tag, have we created it as 4.1.5?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Dec 3, 2015 at 10:01 AM, Gayan Gunarathne <gay...@wso2.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> This thread is for discussion of the second release candidate for Apache Stratos 4.1.5. Please use this thread for discussion of issues uncovered in the RC, questions you may have about the RC, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> The RC release packs could be found here [1]. A git tag (4.1.5) [2] has been created for this release and its tree view could be seen here [3].
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://dist.apache.org/repos/dist/dev/stratos/releases/4.1.5-rc2/
>>>>>>>>>>>> [2] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=commit;h=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>>>>>> [3] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=tree;h=bf33be1ad90c9bd071a5ded8dd440eb83de80ead;hb=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> The Apache Stratos team
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Gayan Gunarathne
>>>>>>>>>>>> Technical Lead, WSO2 Inc. (http://wso2.com)
>>>>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>>>> email : gay...@wso2.com | mobile : +94 775030545
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Imesh Gunaratne
>>>>>>>>>>> Senior Technical Lead, WSO2
>>>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Akila Ravihansa Perera
>>>>>>>>>> WSO2 Inc.; http://wso2.com/
>>>>>>>>>> Blog: http://ravihansa3000.blogspot.com
>>>>>>>
>>>>>>> --
>>>>>>> Thanks and Regards,
>>>>>>> Isuru H.
>>>>>>> +94 716 358 048
>
> --
> Sajith Kariyawasam
> Committer and PMC member, Apache Stratos
> WSO2 Inc.; http://wso2.com
> Mobile: 0772269575

--
Akila Ravihansa Perera
WSO2 Inc.; http://wso2.com/
Blog: http://ravihansa3000.blogspot.com