That's what I thought.

So now you see the problem: the extract I provided was the complete lifecycle 
of that instance, and there is nothing in the log to indicate "why" the 
termination happened. Either the log levels in all the callers need to be 
matched (which is itself sucky for both modularity and readability) or the 
callers should supply the reason, along the lines I indicated, so that the 
single terminate message carries the needed info.
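
To make the second option concrete, here is a rough sketch of what I have in 
mind. It is not taken from the Stratos code base: the class name, the 
TerminationReason enum and its values, and the method names are all 
hypothetical, and I have used java.util.logging only to keep the example 
self-contained (Stratos has its own logging setup). The point is just that 
every caller passes a reason down, so the one INFO line is enough by itself.

```java
import java.util.logging.Logger;

public class TerminationLogging {

    /** Hypothetical reasons; illustrative names, not existing Stratos identifiers. */
    public enum TerminationReason {
        HEARTBEAT_LOST, PORT_CHECK_TIMEOUT, SCALE_DOWN, USER_REQUEST
    }

    private static final Logger LOG =
            Logger.getLogger(TerminationLogging.class.getName());

    /** Builds the single terminate message, including why it happened. */
    public static String formatTermination(String memberId,
                                           TerminationReason reason,
                                           String detail) {
        return String.format(
                "Member is terminated: memberId=%s, reason=%s, detail=%s",
                memberId, reason, detail);
    }

    /** Callers must state a reason; the message is logged once, at INFO. */
    public static void terminate(String memberId,
                                 TerminationReason reason,
                                 String detail) {
        // The real termination work would happen here (omitted in this sketch).
        LOG.info(formatTermination(memberId, reason, detail));
    }

    public static void main(String[] args) {
        terminate("example-member-1",
                  TerminationReason.HEARTBEAT_LOST,
                  "no heartbeat within the expected interval");
    }
}
```

With something like this, grepping the log for "reason=" would answer the 
"why" without having to line up DEBUG output from every caller.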

On Wednesday 23 Jul 2014 10:26:38 Nirmal Fernando wrote:
> 
>     [ 
> https://issues.apache.org/jira/browse/STRATOS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071575#comment-14071575
>  ] 
> 
> Nirmal Fernando commented on STRATOS-706:
> -----------------------------------------
> 
> Hi Shaheed,
> 
> All logs should be there in wso2carbon.log file of single JVM.
> 
> > member terminate event should log reason
> > ----------------------------------------
> >
> >                 Key: STRATOS-706
> >                 URL: https://issues.apache.org/jira/browse/STRATOS-706
> >             Project: Stratos
> >          Issue Type: Bug
> >          Components: Autoscaler
> >    Affects Versions: 4.0.0
> >            Reporter: Martin Eppel
> >             Fix For: 4.0.1
> >
> >
> > When Stratos terminates a member it must log the reason for it. Ideally the 
> > logging should be systematic enough so that one can grep for different 
> > severity, or by member, or by event type or some other useful 
> > categorization.
> > The justification for this defect is that it will greatly improve debugging 
> > and troubleshooting capabilities. Without logging it is very difficult to 
> > debug terminations of members.
> >  
> > For example, consider this sequence in the stratos log file:
> >  
> > ===================
> > TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG 
> > {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -  
> > Received an instance spawn request : MemberContext [memberId=null, 
> > nodeId=null, clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=null, 
> > privateIpAddress=null, publicIpAddress=null, allocatedIpAddress=null, 
> > initTime=1405418328649, lbClusterId=null, networkPartitionId=OAM1] 
> > {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> > TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG 
> > {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -  
> > Payload: 
> > SERVICE_NAME=cisco-gilan-appmgr,HOST_NAME=cisco-gilan-appmgr-1.qmog.cisco.com,MULTITENANT=false,TENANT_ID=-1234,TENANT_RANGE=-1234,CARTRIDGE_ALIAS=cisco-gilan-appmgr-1,CLUSTER_ID=cisco-gilan-appmgr-1.cisco-gil,CARTRIDGE_KEY=o1jbiPPmPWBgyNVM,DEPLOYMENT=default,REPO_URL=null,PORTS=9482,PUPPET_IP=PUPPET_IP,PUPPET_HOSTNAME=PUPPET_HOSTNAME,PUPPET_ENV=PUPPET_ENV,HEARTBEAT_AUTHKEY=20c9629a87f53ecdb5278d2ddb5a9d42,TRUSTSTORE_PASSWORD=wso2carbon,CEP_PORT=7611,MONITORING_SERVER_SECURE_PORT=0,MB_PORT=61616,OPENSTACK_COMPUTE_DNS=10.58.10.82,MB_IP=octl-01.qmog.cisco.com,QSB_PUPPET_ENVIR=,CEP_IP=octl-01.qmog.cisco.com,VSM_USER=admin,VEM_IP=192.168.66.43,ENABLE_DATA_PUBLISHER=false,MONITORING_SERVER_ADMIN_PASSWORD=xxxx,MONITORING_SERVER_IP=octl-01.qmog.cisco.com,VEM_USER=ubuntu,VEM_PWD=ubuntu,COMMIT_ENABLED=false,MONITORING_SERVER_ADMIN_USERNAME=xxxx,CERT_TRUSTSTORE=/opt/apache-stratos-cartridge-agent/security/client-truststore.jks,VSM_PWD=Starent123!,VSM_IP=192.168.66.2,MONITORING_SERVER_PORT=0,APPMGR_GITREPO=ssh://jenapper@10.58.10.189/home/jenapper/code/eccentrica.git,MEMBER_ID=cisco-gilan-appmgr-1.cisco-gil7ef7327f-2bb2-4768-820f-d064de29aa59,LB_CLUSTER_ID=null,NETWORK_PARTITION_ID=OAM1,PARTITION_ID=RegionOne-AZ-1
> >  {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> > TID: [0] [STRATOS] [2014-07-15 09:58:55,888]  INFO 
> > {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -  
> > Member is terminated: MemberContext 
> > [memberId=cisco-gilan-appmgr-1.cisco-gil407f5bdc-aad2-4234-80fc-6cdf17be6192,
> >  nodeId=RegionOne/89433818-21ed-48d4-bd8f-c396ab30f6d2, 
> > clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=cisco-gilan-appmgr, 
> > privateIpAddress=192.168.66.1, publicIpAddress=null, 
> > allocatedIpAddress=null, initTime=1405417410736, lbClusterId=null, 
> > networkPartitionId=OAM1] 
> > {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> > ===================
> >  
> > The problem is that Stratos gives no indication of why it is doing this 
> > [1]. Stratos should be enhanced so that the above message gives some 
> > indication of *why* the member is being terminated (loss of heartbeats, 
> > timeout on port knocking, etc.). This is needed as Apache Stratos 
> > expands its user base.
> > This issue has high priority as it affects the efficiency of 
> > troubleshooting and system stability.
> >  
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
