Hi Aled,

Thanks for your reply. I am talking about the "when entity is in use" case. I am using a DynamicCluster whose members are EmptySoftwareProcess entities. I am also using an ssh sensor to collect CPU utilization, and an auto-scaling policy based on the aggregated average. When one of the machines is down and not accessible over ssh, Brooklyn does not recognize it and does not put the entity "on-fire" (its state stays running). Is it possible in this case to use org.apache.brooklyn.policy.ha.ServiceReplacer in conjunction with org.apache.brooklyn.policy.ha.SshMachineFailureDetector, and do you have any YAML examples of how to do it?

Thanks,
Galina

_____________________________
Galina Grunin
Senior Software Engineer
IBM Certified IT Architect & Master Inventor
IBM Cloud and IOT, Somers, NY
email: [email protected]

From: Aled Sage <[email protected]>
To: [email protected]
Date: 09/26/2015 06:38 AM
Subject: Re: releasing machine if it is not accessible over ssh

Hi Galina,

Yes. Exactly how depends when you are talking about it failing.

_*On initial provisioning (i.e. never ssh'able)*_

If provisioning fails (e.g. dead-on-arrival, so not ssh'able), the default is to delete the VM.

The config machineCreateAttempts (defaulting to 1) says how many times to try to provision a VM.

The config destroyOnFailure (defaulting to true) says whether to delete the VMs that failed to provision successfully.

_*When entity is in use*_

Assuming your sensor feeds are enabled, Brooklyn will detect the failure and mark the VM as "service.up=false". You can wire up any policy you want to respond to this.

A useful policy is the org.apache.brooklyn.policy.ha.ServiceReplacer, usually used in conjunction with ServiceRestarter and/or ServiceFailureDetector.

The ServiceReplacer is often added to a cluster entity. When a member of the cluster fails in an unrecoverable way, it adds a new member to the cluster and then terminates the failed member. This will release your machine.

The ServiceFailureDetector has config options like entityFailed.stabilizationDelay, which will control how long the machine must be unreachable for: "Time period for which the service must be consistently down for (e.g. doesn't report down-up-down) before emitting ENTITY_FAILED".

Aled

On 25/09/2015 16:45, Galina Grunin wrote:
> Is there a way to make brooklyn release a machine if it fails to ssh it
> for some period of time?
>
> _____________________________
> Galina Grunin
> Senior Software Engineer
> IBM Certified IT Architect & Master Inventor
> IBM Cloud and IOT, Somers, NY
> email: [email protected]
>
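
For reference, the wiring described in the quoted reply (a ServiceReplacer on the cluster, with a ServiceFailureDetector on each member) might be sketched in a blueprint roughly as follows. This is an untested sketch, not a confirmed working configuration: the application name, the cluster size and the 30s stabilization delay are illustrative assumptions, the location is omitted, and it does not cover SshMachineFailureDetector, for which the thread gives no configuration details.

```yaml
# Hypothetical blueprint sketch; name, size and delay values are assumptions.
name: cluster-with-replacer-sketch
services:
- type: org.apache.brooklyn.entity.group.DynamicCluster
  brooklyn.config:
    cluster.initial.size: 2
    dynamiccluster.memberspec:
      $brooklyn:entitySpec:
        type: org.apache.brooklyn.entity.software.base.EmptySoftwareProcess
        brooklyn.enrichers:
        # Emits ENTITY_FAILED once the member has been consistently down
        # for the stabilization delay.
        - type: org.apache.brooklyn.policy.ha.ServiceFailureDetector
          brooklyn.config:
            entityFailed.stabilizationDelay: 30s
  brooklyn.policies:
  # Attached to the cluster: on a member failure, adds a new member and
  # then terminates the failed one, which releases its machine.
  - type: org.apache.brooklyn.policy.ha.ServiceReplacer
```

The detector sits on each member because that is where failure is observed, while the replacer sits on the cluster because only the cluster can add and remove members.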
