On Sun, Dec 6, 2015 at 10:47 AM, Isuru Haththotuwa <isu...@apache.org> wrote:
> Hi Akila,
>
> On Sun, Dec 6, 2015 at 10:26 AM, Akila Ravihansa Perera <
> raviha...@wso2.com> wrote:
>
>> Hi Isuru,
>>
>> While I agree that it is hard to handle scenarios like this in Stratos
>> given the current architecture and design, I believe pitfalls like this
>> could end up being a huge overhead for its users. Not only would they have
>> to maintain a PaaS, but they would also have to monitor the logs or the IaaS
>> level dashboard and manually kill instances whenever Stratos fails to do
>> so? Perhaps we need to rethink the whole architecture?
>>
> IMHO we need to consider the probability of this happening; for
> example, in this case, whether users will try to deploy an application and
> undeploy it again right at the next moment. Even in such cases, if we leave
> a more meaningful log it should be enough, as Imesh mentioned.
>

Even if the instance id is blank, this seems to be an issue with acquiring and
releasing the locks in our thread model. IMO we need to handle that.

I think we first need to identify which thread is trying to release the write
lock that was acquired by another thread. I think we may be able to reproduce
this in the mock IaaS by setting the instance id to blank. Also, if we print
the context class loader (getContextClassLoader) in releaseWriteLock as a debug
log, I think we can find the exact thread that is causing this issue. I will
check on those points.

@Akila, were you able to reproduce this regularly?

>
>> As a short-term solution, perhaps we could wait for a certain amount of
>> time until the member is initialized in the member termination flow.
>>
>> +1 to go for 4.1.5-rc3.
>>
>> Thanks.
>>
> +1 for 4.1.5-RC3.
>
>>
>> On Sun, Dec 6, 2015 at 9:45 AM, Imesh Gunaratne <im...@apache.org> wrote:
>>
>>> Yes, I agree with Isuru; however, we should be able to raise a more
>>> meaningful error message in such a situation. If an instance has not
>>> initialized at the time the termination call is made, we should be able to
>>> tell that to the end user clearly.
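
To illustrate the two ideas quoted above (wait for the member to be initialized before terminating, and fail with a clear message otherwise), here is a minimal Java sketch. It is not Stratos code: TerminationGuard, MemberNotInitializedException, awaitInstanceId and the timeout/poll parameters are hypothetical names chosen for illustration, and the instance id lookup is passed in as a plain Supplier rather than read from any real Stratos context object.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Stratos): waits for an instance id before termination. */
final class TerminationGuard {

    /** Raised when the member still has no instance id once the timeout has elapsed. */
    static final class MemberNotInitializedException extends IllegalStateException {
        MemberNotInitializedException(String message) {
            super(message);
        }
    }

    private TerminationGuard() {
    }

    /**
     * Polls instanceIdLookup until it returns a non-blank id or timeoutMillis elapses.
     * Returns the instance id, or throws with a message meant for the end user.
     */
    static String awaitInstanceId(String memberId, Supplier<String> instanceIdLookup,
                                  long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            String instanceId = instanceIdLookup.get();
            if (instanceId != null && !instanceId.trim().isEmpty()) {
                return instanceId;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new MemberNotInitializedException(
                        "Cannot terminate member " + memberId
                                + ": instance id is still blank after " + timeoutMillis
                                + " ms. The instance may still be starting in the IaaS"
                                + " and may have to be terminated manually.");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                throw new MemberNotInitializedException(
                        "Interrupted while waiting for instance id of member " + memberId);
            }
        }
    }
}
```

A caller in the termination flow would call awaitInstanceId first and only remove the member from the topology once it either has an id to terminate or has reported the failure; the right timeout would depend on how long the IaaS takes to return the node metadata.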
>>>
>>> [2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17
>>>
>>> Regarding the above warning message, we added it purposely to track
>>> situations where threads try to release locks while they have not been
>>> acquired by the same thread. If this happens there is a slight possibility
>>> that some functionality will not work properly.
>>>
>>> If we are to list down the issues we identified in this release
>>> candidate:
>>>
>>> - SNAPSHOT versions available in docker files
>>> - Thrift client configuration file not being up to date in load
>>> balancer extensions
>>> - CEP extension distribution issue
>>> - A validation to handle member termination logic when the given
>>> member has not initialized
>>> - doap_Stratos.rdf file was not up to date with release versions
>>>
>>> Considering all of the above +1 to cancel this vote and go for 4.1.5-rc3.
>>>
>>> Thanks
>>>
>>> On Sun, Dec 6, 2015 at 9:16 AM, Isuru Haththotuwa <isu...@apache.org>
>>> wrote:
>>>
>>>> On Sun, Dec 6, 2015 at 12:37 AM, Akila Ravihansa Perera <
>>>> raviha...@wso2.com> wrote:
>>>>
>>>>> I tried to deploy an application on EC2 and immediately undeployed it
>>>>> which caused the following exception. Also the EC2 instance did not get
>>>>> terminated. Noticed the following warning in the log;
>>>>>
>>>>> [2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17
>>>>>
>>>>> [2015-12-06 00:29:51,309] INFO {org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient} - Terminating instance via cloud controller: [member] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>> [2015-12-06 00:29:51,313] ERROR {org.apache.stratos.cloud.controller.services.impl.CloudControllerServiceImpl} - Could not terminate instance, instance id is blank: [member-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545 , removing member from topology...
>>>>> [2015-12-06 00:29:51,319] INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher} - Publishing member terminated event: [service-name] php-ec2 [cluster-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [cluster-instance-id] single-cartridge-app-ec2-1 [member-id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545 [network-partition-id] network-partition-ec2 [partition-id] partition-1 [group-id] null
>>>>> [2015-12-06 00:29:51,326] INFO {org.apache.stratos.messaging.message.processor.topology.MemberTerminatedMessageProcessor} - Member terminated: [service] php-ec2 [cluster] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [member] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>> [2015-12-06 00:29:51,326] WARN {org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor} - Obsolete member has either been terminated or its obsolete time out has expired and it is removed from obsolete members list: single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain5545861a-0a1b-4532-830e-1c9beb2d8545
>>>>> [2015-12-06 00:29:51,327] INFO {org.apache.stratos.autoscaler.status.processor.cluster.ClusterStatusTerminatedProcessor} - Publishing Cluster terminated event for [application]: single-cartridge-app-ec2 [cluster]: single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>>>> [2015-12-06 00:29:51,335] INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder} - Cluster Terminated adding status started for and removing the cluster instancesingle-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain
>>>>> *[2015-12-06 00:29:51,337] WARN {org.apache.stratos.common.concurrent.locks.ReadWriteLock} - System warning! Trying to release a lock which has not been taken by the same thread: [lock-name] topology-manager [thread-id] 214 [thread-name] http-nio-9443-exec-17*
>>>>> [2015-12-06 00:29:51,346] INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher} - Publishing Cluster terminated event: [application-id] single-cartridge-app-ec2 [cluster id] single-cartridge-app-ec2.my-php-app-ec2.php-ec2.domain [instance-id] single-cartridge-app-ec2-1
>>>>> [2015-12-06 00:29:51,348] ERROR {org.apache.stratos.autoscaler.rule.RuleTasksDelegator} - Cannot terminate instance
>>>>> org.apache.stratos.cloud.controller.stub.CloudControllerServiceCloudControllerExceptionException: CloudControllerServiceCloudControllerExceptionException
>>>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>> at java.lang.Class.newInstance(Class.java:379)
>>>>> at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.terminateInstance(CloudControllerServiceStub.java:8660)
>>>>> at org.apache.stratos.autoscaler.client.AutoscalerCloudControllerClient.terminateInstance(AutoscalerCloudControllerClient.java:203)
>>>>> at org.apache.stratos.autoscaler.rule.RuleTasksDelegator.terminateObsoleteInstance(RuleTasksDelegator.java:295)
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>> at org.mvel2.optimizers.impl.refl.nodes.MethodAccessor.getValue(MethodAccessor.java:48)
>>>>> at org.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:37)
>>>>> at org.mvel2.ast.ASTNode.getReducedValueAccelerated(ASTNode.java:108)
>>>>> at org.mvel2.MVELRuntime.execute(MVELRuntime.java:85)
>>>>> at org.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
>>>>> at org.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
>>>>> at org.mvel2.MVEL.executeExpression(MVEL.java:930)
>>>>> at org.drools.base.mvel.MVELConsequence.evaluate(MVELConsequence.java:104)
>>>>> at org.drools.common.DefaultAgenda.fireActivation(DefaultAgenda.java:1287)
>>>>> at org.drools.common.DefaultAgenda.fireNextItem(DefaultAgenda.java:1221)
>>>>> at org.drools.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:1456)
>>>>> at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:710)
>>>>> at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:674)
>>>>> at org.drools.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:230)
>>>>> at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.evaluate(ClusterMonitor.java:472)
>>>>> at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor.access$200(ClusterMonitor.java:86)
>>>>> at org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor$2.run(ClusterMonitor.java:444)
>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>> [2015-12-06 00:29:51,356] INFO {org.apache.stratos.autoscaler.event.receiver.topology.AutoscalerTopologyEventReceiver} - [ClusterTerminatedEvent] Received: class org.apache.stratos.messaging.event.topology.ClusterInstanceTerminatedEvent
>>>>> [2015-12-06 00:29:51,356] INFO {org.apache.stratos.autoscaler.status.processor.group.GroupStatusTerminatedProcessor} - Sending application instance terminated for [application] single-cartridge-app-ec2 [instance] single-cartridge-app-ec2-1
>>>>>
>>>>
>>>> Looking at the logs, it seems that the instance id was null when
>>>> the CC tried to terminate the instance, in the terminateInstance method.
>>>> The instance id is returned with NodeMetadata when an instance is created
>>>> in EC2. Maybe the instance id is null since the termination was started at
>>>> the same moment in which the call to start the instance [1] is happening.
>>>> In such cases the member is removed from the Topology and therefore the
>>>> only option is to manually terminate it from the IaaS. IMHO such scenarios
>>>> are difficult to handle from Stratos side. If the member is correctly
>>>> removed from the Topology, then it should be fine.
>>>>
>>>> [1]. computeService.createNodesInGroup
>>>>
>>>>>
>>>>> On Sun, Dec 6, 2015 at 12:20 AM, Akila Ravihansa Perera <
>>>>> raviha...@wso2.com> wrote:
>>>>>
>>>>>> We have a few issues in Docker scripts. There are some SNAPSHOT
>>>>>> references [1, 2] which break Docker images.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/Dockerfile#L25
>>>>>>
>>>>>> [2]
>>>>>> https://github.com/apache/stratos/blob/4.1.5-rc2/tools/docker-images/cartridge-docker-images/base-image/files/run#L29
>>>>>>
>>>>>> On Sat, Dec 5, 2015 at 8:02 PM, Gayan Gunarathne <gay...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Modify it as 4.1.5-rc2
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Gayan
>>>>>>>
>>>>>>> On Sat, Dec 5, 2015 at 12:04 PM, Akila Ravihansa Perera <
>>>>>>> raviha...@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Gayan,
>>>>>>>>
>>>>>>>> The vote is for the tag, not the binaries. Therefore we need to tag
>>>>>>>> the code in order to vote.
>>>>>>>>
>>>>>>>> Also we do not tag with a release version (4.1.5) until the vote
>>>>>>>> has passed.
>>>>>>>>
>>>>>>>> Could you please fix it?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Friday, 4 December 2015, Imesh Gunaratne <im...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Gayan,
>>>>>>>>>
>>>>>>>>> I do not see the 4.1.5-rc2 tag, have we created it as 4.1.5?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Thu, Dec 3, 2015 at 10:01 AM, Gayan Gunarathne <gay...@wso2.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> This thread is for discussion of the second release candidate
>>>>>>>>>> for Apache Stratos 4.1.5. Please use this thread for discussion
>>>>>>>>>> of issues uncovered in the RC, questions you may have about the
>>>>>>>>>> RC, etc.
>>>>>>>>>>
>>>>>>>>>> The RC release packs can be found here [1]. A git tag (4.1.5)
>>>>>>>>>> [2] has been created for this release and its tree view can be
>>>>>>>>>> seen here [3].
>>>>>>>>>>
>>>>>>>>>> [1] https://dist.apache.org/repos/dist/dev/stratos/releases/4.1.5-rc2/
>>>>>>>>>> [2] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=commit;h=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>>>> [3] https://git-wip-us.apache.org/repos/asf?p=stratos.git;a=tree;h=bf33be1ad90c9bd071a5ded8dd440eb83de80ead;hb=a9f1f51a9ae2829d85bf7b8f2d8fb622db991d25
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> The Apache Stratos team
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Gayan Gunarathne
>>>>>>>>>> Technical Lead, WSO2 Inc. (http://wso2.com)
>>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>> email : gay...@wso2.com | mobile : +94 775030545
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Imesh Gunaratne
>>>>>>>>>
>>>>>>>>> Senior Technical Lead, WSO2
>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>
>>>>>>>> --
>>>>>>>> Akila Ravihansa Perera
>>>>>>>> WSO2 Inc.; http://wso2.com/
>>>>>>>>
>>>>>>>> Blog: http://ravihansa3000.blogspot.com
>>>>>>>>
>>>>> --
>>>>> Thanks and Regards,
>>>>>
>>>>> Isuru H.
>>>>> +94 716 358 048
>>>>>

--
Gayan Gunarathne
Technical Lead, WSO2 Inc. (http://wso2.com)
Committer & PMC Member, Apache Stratos
email : gay...@wso2.com | mobile : +94 775030545
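
P.S. On the lock warning discussed at the top of this mail: a rough Java sketch of how a release call could report both the thread attempting the release and the thread that actually holds the write lock. This is not the Stratos ReadWriteLock implementation; OwnerTrackingLock, currentWriteOwner and the message layout are assumptions made for illustration, built on the standard java.util.concurrent.locks.ReentrantReadWriteLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical wrapper (not the Stratos ReadWriteLock) that records the write-lock owner. */
final class OwnerTrackingLock {
    private final String name;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Description of the thread currently holding the write lock; null when it is free.
    private volatile String writeOwner;

    OwnerTrackingLock(String name) {
        this.name = name;
    }

    void acquireWriteLock() {
        lock.writeLock().lock();
        // Safe to record here: only the holder of the write lock reaches this line.
        writeOwner = describe(Thread.currentThread());
    }

    /**
     * Releases the write lock if the calling thread holds it and returns true.
     * Otherwise prints who tried to release it and who holds it, and returns false.
     */
    boolean releaseWriteLock() {
        if (!lock.isWriteLockedByCurrentThread()) {
            System.err.println("Trying to release a lock which has not been taken by the same thread:"
                    + " [lock-name] " + name
                    + " [caller] " + describe(Thread.currentThread())
                    + " [owner] " + writeOwner);
            return false;
        }
        if (lock.getWriteHoldCount() == 1) {
            // Last reentrant hold is about to be released, so clear the owner first.
            writeOwner = null;
        }
        lock.writeLock().unlock();
        return true;
    }

    /** Returns a description of the current write-lock owner, or null if the lock is free. */
    String currentWriteOwner() {
        return writeOwner;
    }

    private static String describe(Thread thread) {
        return "[thread-id] " + thread.getId() + " [thread-name] " + thread.getName();
    }
}
```

With something along these lines in a debug build, the warning would show the HTTP worker thread as the caller next to the thread that really took the topology-manager lock, which should narrow down where the acquire and release calls get split across threads.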