[ https://issues.apache.org/jira/browse/STRATOS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064650#comment-14064650 ]
Martin Eppel (meppel) commented on STRATOS-706:
-----------------------------------------------

Ok, let us verify that the logs from the autoscaler will satisfy the request for sufficient logging. If yes, I’ll close it; otherwise I’ll update the JIRA.

Thanks
Martin

From: Udara Liyanage [mailto:ud...@wso2.com]
Sent: Wednesday, July 16, 2014 9:24 PM
To: dev
Cc: d...@stratos.incubator.apache.org
Subject: Re: [jira] [Commented] (STRATOS-706) member terminate event should log reason

Hi Martin,

The job of the CC is to spawn/terminate instances. AS is the one that decides when/what to start and when to terminate. So, as Nirmal said, have a look at the AS logs in order to find the reason for termination/spawning.

On Thu, Jul 17, 2014 at 5:09 AM, Nirmal Fernando (JIRA) <j...@apache.org> wrote:

[ https://issues.apache.org/jira/browse/STRATOS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064337#comment-14064337 ]

Nirmal Fernando commented on STRATOS-706:
-----------------------------------------

On Thu, Jul 17, 2014 at 1:11 AM, Martin Eppel (JIRA) <j...@apache.org>

All the log lines you quoted are from the Cloud Controller. What CC does is provide an API to terminate instances. The caller of this API, i.e. the auto-scaler, is the one that logs the reason for calling CC to terminate instances. Did you check the auto-scaler logs?

--
Best Regards,
Nirmal

Nirmal Fernando.
PPMC Member & Committer of Apache Stratos,
Senior Software Engineer, WSO2 Inc.
Blog: http://nirmalfdo.blogspot.com/

--
This message was sent by Atlassian JIRA
(v6.2#6252)

--
Udara Liyanage
Software Engineer
WSO2, Inc.: http://wso2.com
lean. enterprise.
middleware
web: http://udaraliyanage.wordpress.com
phone: +94 71 443 6897

> member terminate event should log reason
> ----------------------------------------
>
> Key: STRATOS-706
> URL: https://issues.apache.org/jira/browse/STRATOS-706
> Project: Stratos
> Issue Type: Bug
> Components: Autoscaler
> Affects Versions: 4.0.0
> Reporter: Martin Eppel
> Fix For: 4.0.1
>
>
> When Stratos terminates a member it must log the reason for it. Ideally the
> logging should be systematic enough so that one can grep for different
> severity, or by member, or by event type or some other useful categorization.
> The justification for this defect is that it will greatly improve debugging
> and troubleshooting capabilities. Without logging it is very difficult to
> debug terminations of members.
>
> For example, consider this sequence in the Stratos log file:
>
> ===================
> TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -
> Received an instance spawn request : MemberContext [memberId=null,
> nodeId=null, clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=null,
> privateIpAddress=null, publicIpAddress=null, allocatedIpAddress=null,
> initTime=1405418328649, lbClusterId=null, networkPartitionId=OAM1]
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -
> Payload:
> SERVICE_NAME=cisco-gilan-appmgr,HOST_NAME=cisco-gilan-appmgr-1.qmog.cisco.com,MULTITENANT=false,TENANT_ID=-1234,TENANT_RANGE=-1234,CARTRIDGE_ALIAS=cisco-gilan-appmgr-1,CLUSTER_ID=cisco-gilan-appmgr-1.cisco-gil,CARTRIDGE_KEY=o1jbiPPmPWBgyNVM,DEPLOYMENT=default,REPO_URL=null,PORTS=9482,PUPPET_IP=PUPPET_IP,PUPPET_HOSTNAME=PUPPET_HOSTNAME,PUPPET_ENV=PUPPET_ENV,HEARTBEAT_AUTHKEY=20c9629a87f53ecdb5278d2ddb5a9d42,TRUSTSTORE_PASSWORD=wso2carbon,CEP_PORT=7611,MONITORING_SERVER_SECURE_PORT=0,MB_PORT=61616,OPENSTACK_COMPUTE_DNS=10.58.10.82,MB_IP=octl-01.qmog.cisco.com,QSB_PUPPET_ENVIR=,CEP_IP=octl-01.qmog.cisco.com,VSM_USER=admin,VEM_IP=192.168.66.43,ENABLE_DATA_PUBLISHER=false,MONITORING_SERVER_ADMIN_PASSWORD=xxxx,MONITORING_SERVER_IP=octl-01.qmog.cisco.com,VEM_USER=ubuntu,VEM_PWD=ubuntu,COMMIT_ENABLED=false,MONITORING_SERVER_ADMIN_USERNAME=xxxx,CERT_TRUSTSTORE=/opt/apache-stratos-cartridge-agent/security/client-truststore.jks,VSM_PWD=Starent123!,VSM_IP=192.168.66.2,MONITORING_SERVER_PORT=0,APPMGR_GITREPO=ssh://jenapper@10.58.10.189/home/jenapper/code/eccentrica.git,MEMBER_ID=cisco-gilan-appmgr-1.cisco-gil7ef7327f-2bb2-4768-820f-d064de29aa59,LB_CLUSTER_ID=null,NETWORK_PARTITION_ID=OAM1,PARTITION_ID=RegionOne-AZ-1
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> TID: [0] [STRATOS] [2014-07-15 09:58:55,888] INFO
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} -
> Member is terminated: MemberContext
> [memberId=cisco-gilan-appmgr-1.cisco-gil407f5bdc-aad2-4234-80fc-6cdf17be6192,
> nodeId=RegionOne/89433818-21ed-48d4-bd8f-c396ab30f6d2,
> clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=cisco-gilan-appmgr,
> privateIpAddress=192.168.66.1, publicIpAddress=null, allocatedIpAddress=null,
> initTime=1405417410736, lbClusterId=null, networkPartitionId=OAM1]
> {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> ===================
>
> The problem is that Stratos gives no indication of why it is doing this [1].
> Stratos should be enhanced so that the above message gives some indication of
> *why* the member is being terminated (loss of heartbeats, timeout on port
> knocking, etc.). This is needed as Apache Stratos expands its user base.
> This issue has high priority as it affects the efficiency of troubleshooting
> and system stability.
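
To make the request concrete, here is a minimal sketch of what a grep-friendly termination log line carrying a reason could look like. Everything in it is an assumption made for illustration: the class name TerminationLog, the Reason codes (the first two are modeled on the examples in the issue text) and the key=value field names are hypothetical and are not part of the Stratos code base or its logging setup.

```java
import java.util.logging.Logger;

/** Hypothetical helper, not Stratos code: one key=value line per termination. */
public class TerminationLog {

    /** Illustrative reason codes; the first two mirror the examples in the issue. */
    public enum Reason { HEARTBEAT_LOSS, PORT_CHECK_TIMEOUT, UNKNOWN }

    private static final Logger LOG = Logger.getLogger(TerminationLog.class.getName());

    /** Builds a single-line record so event, reason and member can each be grepped. */
    public static String format(String memberId, String clusterId, Reason reason) {
        // Fall back to UNKNOWN so the reason field is never missing from the line.
        Reason r = (reason == null) ? Reason.UNKNOWN : reason;
        return "event=MEMBER_TERMINATED reason=" + r
                + " memberId=" + memberId
                + " clusterId=" + clusterId;
    }

    /** Writes the formatted record at INFO level. */
    public static void logTermination(String memberId, String clusterId, Reason reason) {
        LOG.info(format(memberId, clusterId, reason));
    }

    public static void main(String[] args) {
        // Member and cluster ids are taken from the log excerpt quoted in the issue.
        logTermination(
                "cisco-gilan-appmgr-1.cisco-gil407f5bdc-aad2-4234-80fc-6cdf17be6192",
                "cisco-gilan-appmgr-1.cisco-gil",
                Reason.HEARTBEAT_LOSS);
    }
}
```

With lines in this shape, something like grep "reason=HEARTBEAT_LOSS" or grep "memberId=<id>" would narrow the log by reason or by member, which is the kind of categorization the issue asks for. Whether such a line belongs in the autoscaler or in the Cloud Controller is the open question in this thread, and the sketch does not settle it.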