Re: Stratos not properly terminating VMs to fail to startup

Vanson Lim Thu, 02 Apr 2015 12:06:11 -0700

On 4/2/15, 1:09 PM, Jeffrey Nguyen (jeffrngu) wrote:

Hi Anuruddha,
The instances that are in Error state on Openstack Horizon were never removed, even after Stratos successfully spawned an instance. It sounds like you might need to enhance the jClouds API to return an object with nodeID info for this case. Or perhaps a better solution would be to modify the jClouds API to delete the failed instance if it wasn't spawned successfully (or make that an option of the API that handles spawning a new instance).

Jeffrey,

If jclouds is not returning a nodeID, it should handle the cleanup itself. It's also reasonable for jclouds to return an object for the failed instance, but that's not much use to Stratos except for leaving the VM around so that we can see that it failed to come up. I don't know if we want an option to have jclouds try to respawn an instance, as this would most likely fail again. It's better to have Stratos manage the retries.
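
To make the proposal concrete, here is a minimal sketch of Stratos-side retries that clean up a failed VM before trying again. It assumes the spawn call can hand back a node ID even when the VM ends up in an error state, which is exactly what is missing today. `Iaas`, `SpawnResult` and `spawnWithRetries` are hypothetical names for illustration only; they are not jclouds or Stratos APIs.

```java
import java.util.Optional;

public class SpawnWithCleanup {

    /** Hypothetical result of one spawn attempt (not a jclouds or Stratos type). */
    public static final class SpawnResult {
        final String nodeId;   // assumed to be set even when the VM failed to start
        final boolean running; // true only if the VM actually came up

        public SpawnResult(String nodeId, boolean running) {
            this.nodeId = nodeId;
            this.running = running;
        }
    }

    /** Hypothetical minimal facade over the IaaS layer. */
    public interface Iaas {
        SpawnResult spawn();
        void destroy(String nodeId);
    }

    /**
     * Tries to spawn a VM up to maxAttempts times. A VM that fails to start
     * is destroyed before the next attempt so failed VMs do not pile up.
     */
    public static Optional<String> spawnWithRetries(Iaas iaas, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            SpawnResult result = iaas.spawn();
            if (result.running) {
                return Optional.of(result.nodeId);
            }
            // Cleanup is only possible when the failed attempt reported an ID.
            if (result.nodeId != null) {
                iaas.destroy(result.nodeId);
            }
        }
        return Optional.empty();
    }
}
```

With a null node ID the cleanup branch is skipped, which mirrors the current behaviour where the failed VM is left behind.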


-Vanson.

Regards,
-Jeffrey

From: Anuruddha Liyanarachchi <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, April 2, 2015 at 4:43 AM
To: "[email protected]" <[email protected]>
Subject: Re: Stratos not properly terminating VMs to fail to startup

Hi Vanson / Jeffrey,

As seen in the logs, the instance ID is not returned to Stratos (instanceId=null) for the members that went into the error state. Therefore Stratos doesn't have control over the instances in the error state, and spawned instances with errors are not being deleted.




On Wed, Apr 1, 2015 at 4:08 AM, Jeffrey Nguyen (jeffrngu) <[email protected]> wrote:

    Hi Vanson,

    I opened a JIRA to track this issue last week:
    https://issues.apache.org/jira/browse/STRATOS-1293

    -Jeffrey

    On 3/31/15, 3:04 PM, "Vanson Lim (vlim)" <[email protected]> wrote:

    >Devs,
    >
    >I've simulated the case where openstack fails to bring up a VM (we've
    >seen this before in cases where required resources are not available
    >or there is some IAAS problem/timeout which causes the VM to fail to
    >launch). In this case we cause the failure by specifying that the
    >cartridge has a fixed ip address that is not part of the network which
    >the VM attaches to.  The network is defined with a 10.0.0.0/24
    >subnet, but I've specified a fixed ip=10.0.8.1 to cause the VM startup
    >to fail.
    >
    >The VM start fails and the VM remains in an error state for the duration
    >of the "pendingMemberExpiryTimeout" period set in the autoscaler.xml file.
    >
    >Stratos fails to delete the VM in error state and attempts to start a new
    >VM, which also fails to launch.
    >
    >This presumably repeats itself, creating an additional VM in error state
    >during each iteration until we've exhausted all the resources in the
    >system.
    >
    >wso2carbon.log and cartridge definition attached.
    >
    >-Vanson
    >
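
For anyone reproducing the failure described above: the fixed address falls outside the subnet, which a small check can confirm. This is only an illustrative sketch; `SubnetCheck` and `contains` are hypothetical helper names, not part of Stratos, jclouds or OpenStack, and it handles dotted-quad IPv4 only.

```java
public final class SubnetCheck {

    /** Packs a dotted-quad IPv4 address into a 32-bit int. */
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Returns true if ipv4 lies inside the CIDR block, e.g. "10.0.0.0/24". */
    public static boolean contains(String cidr, String ipv4) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("not a CIDR block: " + cidr);
        }
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("bad prefix length: " + prefix);
        }
        // Java masks shift counts to 5 bits, so a /0 prefix needs its own case.
        int mask = (prefix == 0) ? 0 : (-1 << (32 - prefix));
        return (toInt(parts[0]) & mask) == (toInt(ipv4) & mask);
    }

    public static void main(String[] args) {
        System.out.println(contains("10.0.0.0/24", "10.0.8.1"));  // false
        System.out.println(contains("10.0.0.0/24", "10.0.0.42")); // true
    }
}
```

A check like this could in principle be run against a cartridge definition before launch, but whether Stratos should validate fixed IPs up front is a separate question from the cleanup problem reported here.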




--
*Thanks and Regards,*
Anuruddha Lanka Liyanarachchi
Software Engineer - WSO2
Mobile : +94 (0) 712762611
Tel      : +94 112 145 345
[email protected]
