Additional info on the AMI I'm trying to run is: ami-d1e525b8 (slightly customized version of ami-63be790a)
On Wed, Aug 24, 2011 at 8:35 PM, Joris Poort <gpo...@gmail.com> wrote:
> Not sure if I'm testing the right thing here, but just made the AMI
> public to ensure it has no private settings giving issues. Executed
> same as before:
>
> Terminal:
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
> Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
>
> Whirr.log:
> 2011-08-24 20:21:09,014 DEBUG [jclouds.compute] (pool-3-thread-3) << started instances([region=us-east-1, name=sir-a65d6614])
> 2011-08-24 20:21:09,167 DEBUG [jclouds.compute] (pool-3-thread-4) << started instances([region=us-east-1, name=sir-bf822814])
> 2011-08-24 20:21:09,422 DEBUG [jclouds.compute] (pool-3-thread-3) << present instances([region=us-east-1, name=sir-a65d6614])
> 2011-08-24 20:21:09,672 DEBUG [jclouds.compute] (pool-3-thread-4) << present instances([region=us-east-1, name=sir-bf822814])
> 2011-08-24 20:30:44,216 DEBUG [jclouds.compute] (user thread 2) >> blocking on socket [address=184.72.163.250, port=22] for 600000 seconds
> 2011-08-24 20:30:58,316 DEBUG [jclouds.compute] (user thread 2) << socket [address=184.72.163.250, port=22] opened
> 2011-08-24 20:31:05,739 DEBUG [jclouds.compute] (user thread 6) >> blocking on socket [address=50.16.31.67, port=22] for 600000 seconds
> 2011-08-24 20:31:12,968 DEBUG [jclouds.compute] (user thread 6) << socket [address=50.16.31.67, port=22] opened
>
> Any further help on ideas of anything to try to help debug this issue
> would be greatly appreciated!
>
> Cheers,
>
> Joris
>
> On Wed, Aug 24, 2011 at 8:02 PM, Joris Poort <gpo...@gmail.com> wrote:
>> I think you're probably right that it's an auth issue - although I was
>> expecting a more direct/clear error message if the keypair wasn't
>> working.
>>
>> I created the AMI by taking an EBS snapshot then converting to
>> instance-store. I've tried both the EBS-backed AMI and instance-store
>> with the same results. My understanding is that the keypair used to
>> create the AMI is generally one of the accepted keys in addition to
>> the key pair used to launch the instance created by jclouds. I'm not
>> sure how to confirm this - is the jclouds keypair stored
>> anywhere that can be used to test this?
>>
>> Thanks again for your help,
>>
>> Joris
>>
>> On Wed, Aug 24, 2011 at 7:49 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>>> I'm not sure but it looks like an auth issue to me. Whirr creates its
>>> own key pair using the local SSH keys as specified in the properties
>>> file.
>>>
>>> You've created the custom AMI by taking an EBS snapshot? Can you use
>>> that custom AMI with a different key pair?
>>>
>>> -- Andrei Savu / andreisavu.ro
>>>
>>>
>>> On Wed, Aug 24, 2011 at 7:31 PM, Joris Poort <gpo...@gmail.com> wrote:
>>>> Andrei - thanks for the response!
>>>>
>>>> I logged into the custom AMI using ssh and a key pair on my local
>>>> machine (I'm executing whirr via ubuntu virtual machine). I've tried
>>>> both spot instances and regular instances and am getting the same
>>>> behavior.
>>>>
>>>> Full output on terminal looks like (lines between "Starting 1 node"
>>>> and "Dying because" are not always there):
>>>> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>>>> Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>> <<kex done>> woke to: net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>> << (root@174.129.128.120:22) error acquiring SSHClient(root@174.129.128.120:22): Broken transport; encountered EOF
>>>> net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>>     at net.schmizz.sshj.transport.Reader.run(Reader.java:70)
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>>
>>>> Last few entries on whirr.log:
>>>> 2011-08-24 19:20:05,428 DEBUG [jclouds.compute] (pool-3-thread-2) >> requesting 1 spot instances region(us-east-1) price(0.250000) spec([instanceType=m1.large, imageId=ami-d1e525b8, kernelId=null, ramdiskId=null, availabilityZone=null, keyName=jclouds#hadoop_custom_spot_1#us-east-1#45, securityGroupIdToNames={}, blockDeviceMappings=[], securityGroupIds=[], securityGroupNames=[jclouds#hadoop_custom_spot_1#us-east-1], monitoringEnabled=null, userData=null]) options([formParameters={}])
>>>> 2011-08-24 19:20:05,642 DEBUG [jclouds.compute] (pool-3-thread-4) << started instances([region=us-east-1, name=sir-4f589c11])
>>>> 2011-08-24 19:20:05,682 DEBUG [jclouds.compute] (pool-3-thread-2) << started instances([region=us-east-1, name=sir-59cec612])
>>>> 2011-08-24 19:20:05,864 DEBUG [jclouds.compute] (pool-3-thread-4) << present instances([region=us-east-1, name=sir-4f589c11])
>>>> 2011-08-24 19:20:05,917 DEBUG [jclouds.compute] (pool-3-thread-2) << present instances([region=us-east-1, name=sir-59cec612])
>>>> 2011-08-24 19:27:18,150 DEBUG [jclouds.compute] (user thread 8) >> blocking on socket [address=50.17.135.8, port=22] for 600000 seconds
>>>> 2011-08-24 19:27:21,132 DEBUG [jclouds.compute] (user thread 7) >> blocking on socket [address=174.129.128.120, port=22] for 600000 seconds
>>>> 2011-08-24 19:27:24,222 DEBUG [jclouds.compute] (user thread 7) << socket [address=174.129.128.120, port=22] opened
>>>> 2011-08-24 19:27:32,255 DEBUG [jclouds.compute] (user thread 8) << socket [address=50.17.135.8, port=22] opened
>>>>
>>>> After ssh onto node, didn't find any logs in /tmp.
>>>>
>>>> Thanks again for any help on this!
>>>>
>>>> Joris
>>>>
>>>> On Wed, Aug 24, 2011 at 7:12 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>>>>> I suspect this is an authentication issue. How do you login to the custom
>>>>> AMI?
>>>>>
>>>>> Also check whirr.log for more details and on the remote machines look
>>>>> in /tmp for jclouds script execution logs.
>>>>>
>>>>> I know from IRC that you are using spot instances. Are you seeing the
>>>>> same behavior with regular ones?
>>>>>
>>>>> -- Andrei Savu / andreisavu.ro
>>>>>
>>>>>
>>>>> On Wed, Aug 24, 2011 at 7:07 PM, Joris Poort <gpo...@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm new to whirr and I'm running custom AMI configuration (application
>>>>>> installed on working canonical image). Executing with whirr 0.6.0
>>>>>> everything executes fine until I get the following error:
>>>>>> "Dying because - java.net.SocketTimeoutException: Read timed out"
>>>>>>
>>>>>> The instances are running fine, I can ssh into them, but the whirr
>>>>>> code stalls and I get the above error (2x number of instances), no
>>>>>> proxy shell is created.
>>>>>> If I run the exact same code with vanilla
>>>>>> canonical images I don't have any issues.
>>>>>>
>>>>>> Anyone have any ideas on things to test, debug or work around this?
>>>>>> Would really appreciate it!
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Joris
>>>>>>
>>>>>
>>>>
>>>
>>
>
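
A note on narrowing this down: the whirr.log entries in the thread show port 22 opening on each node, yet the SSH session still ends in "encountered EOF" or "Read timed out". One rough way to tell a network-level problem apart from an sshd or key problem is to check whether the node sends its SSH identification line once the socket is open. The following is only a minimal Python sketch under that assumption. The helper name read_ssh_banner is made up for illustration and is not part of Whirr, jclouds, or sshj.

```python
import socket


def read_ssh_banner(host, port=22, timeout=10.0):
    """Return the server's SSH identification line, or None.

    Hypothetical diagnostic helper (not a Whirr/jclouds API).
    None means the connection failed, timed out, or was closed
    before the server sent a complete line.
    """
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # An SSH server announces itself with a line such as
            # "SSH-2.0-OpenSSH_..." terminated by a newline.
            while not data.endswith(b"\n") and len(data) < 255:
                chunk = sock.recv(1)
                if not chunk:
                    # Peer closed the connection early (EOF).
                    break
                data += chunk
    except OSError:
        # Covers refused connections and socket timeouts.
        return None
    if not data.endswith(b"\n"):
        return None
    return data.decode("ascii", errors="replace").strip() or None
```

Calling read_ssh_banner("174.129.128.120") against one of the node addresses from the log would return a string starting with "SSH-" if sshd is answering. In that case the failure is more likely in key exchange or authentication than in the network path. A None result would instead point at sshd not responding on the custom image, though neither outcome is conclusive on its own.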