Starting more instances than specified (and more).
--------------------------------------------------
Key: WHIRR-576
URL: https://issues.apache.org/jira/browse/WHIRR-576
Project: Whirr
Issue Type: Bug
Components: core, service/hbase
Affects Versions: 0.7.1
Reporter: Zuocheng Ren
When starting an HBase cluster with Whirr, I ran into the problem described below.
This is what my instance templates setting looks like:
whirr.instance-templates=1 zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master,3 hadoop-datanode+hadoop-tasktracker+hbase-regionserver
Apart from the failure to start the cluster, I noticed that Whirr tried to start
more instances than I specified.
I took a look at the AWS console, and indeed 6 (1+3+2) instances were started
instead of the 4 (1+3) I specified.
Moreover, Whirr wrongly reported that the cluster failed to start and destroyed
the cluster.
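To make the mismatch concrete, here is a minimal Python sketch. It is not part of Whirr, and the helper names expected_instances and requested_in_log are hypothetical. It sums the counts in the instance-templates value above and the counts in the "Starting N node(s)" lines of the launch output below. It assumes the template value is a comma-separated list of "<count> <role>+<role>..." entries, as in this recipe.

```python
import re


def expected_instances(templates: str) -> int:
    """Sum the leading count of each '<count> <role>+<role>...' entry."""
    total = 0
    for template in templates.split(","):
        # Each entry starts with the instance count, then the role list.
        count, _roles = template.strip().split(None, 1)
        total += int(count)
    return total


def requested_in_log(log: str) -> int:
    """Sum the N of every 'Starting N node(s)' line in the launch output."""
    return sum(int(n) for n in re.findall(r"Starting (\d+) node\(s\)", log))


# Value of whirr.instance-templates from this report.
TEMPLATES = (
    "1 zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master,"
    "3 hadoop-datanode+hadoop-tasktracker+hbase-regionserver"
)

# The three "Starting ..." lines copied from the launch output.
LOG = """\
Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
Starting 3 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
"""

print("specified:", expected_instances(TEMPLATES))
print("requested:", requested_in_log(LOG))
```

With the values from this report, the first number is 4 and the second is 6, which matches what the AWS console showed.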
whirr-0.7.1 ren$ bin/whirr launch-cluster --config recipes/exalt-hbase-ec2.properties
Bootstrapping cluster
Configuring template
Configuring template
Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
Starting 3 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
Nodes started: [[id=us-east-1/i-a4efefc0, providerId=i-a4efefc0,
group=exalt-hbase, name=exalt-hbase-a4efefc0, location=[id=us-east-1c,
scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA],
metadata={}], uri=null, imageId=us-east-1/ami-ab36fbc2, os=[name=null,
family=ubuntu, version=10.04, arch=paravirtual, is64Bit=false,
description=099720109477/ebs/ubuntu-images/ubuntu-lucid-10.04-i386-server-20110930],
state=RUNNING, loginPort=22, hostname=domU-12-31-39-0B-F6-07,
privateAddresses=[10.214.249.241], publicAddresses=[174.129.105.39],
hardware=[id=m1.small, providerId=m1.small, name=null, processors=[[cores=1.0,
speed=1.0]], ram=1740, volumes=[[id=null, type=LOCAL, size=150.0,
device=/dev/sda2, durable=false, isBootDevice=false], [id=vol-63cbc00f,
type=SAN, size=null, device=/dev/sda1, durable=true, isBootDevice=true]],
supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,Not(is64Bit())),
tags=[]], loginUser=ubuntu, userMetadata={Name=exalt-hbase-a4efefc0}, tags=[]]]
Destroying failed nodes [us-east-1/i-daefefbe, us-east-1/i-c4f5f5a0, us-east-1/i-a6efefc2]
Destroyed failed nodes [us-east-1/i-daefefbe, us-east-1/i-c4f5f5a0, us-east-1/i-a6efefc2]
Unable to start the cluster. Terminating all nodes.
java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at org.apache.whirr.actions.BootstrapClusterAction.doAction(BootstrapClusterAction.java:129)
at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:107)
at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:106)
at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
at org.apache.whirr.cli.Main.run(Main.java:64)
at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.whirr.actions.BootstrapClusterAction.doAction(BootstrapClusterAction.java:124)
... 5 more
Caused by: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:92)
at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:40)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Unable to load cluster state, assuming it has no running nodes.
java.io.FileNotFoundException: /Users/ren/.whirr/exalt-hbase/instances (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at com.google.common.io.Files$1.getInput(Files.java:100)
at com.google.common.io.Files$1.getInput(Files.java:97)
at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
at com.google.common.io.Files.readLines(Files.java:580)
at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
at org.apache.whirr.cli.Main.run(Main.java:64)
at org.apache.whirr.cli.Main.main(Main.java:97)
Starting to run scripts on cluster for phase destroyinstances:
Starting to run scripts on cluster for phase destroyinstances:
Finished running destroy phase scripts on all cluster instances
Destroying exalt-hbase cluster
Cluster exalt-hbase destroyed
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
at org.apache.whirr.cli.Main.run(Main.java:64)
at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at org.apache.whirr.actions.BootstrapClusterAction.doAction(BootstrapClusterAction.java:129)
at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:107)
at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:106)
... 3 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.whirr.actions.BootstrapClusterAction.doAction(BootstrapClusterAction.java:124)
... 5 more
Caused by: java.io.IOException: Too many instance failed while bootstrapping! 2 successfully started instances while 3 instances failed
at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:92)
at org.apache.whirr.compute.StartupProcess.call(StartupProcess.java:40)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)