Additional info on the AMI I'm trying to run is:
ami-d1e525b8 (slightly customized version of ami-63be790a)
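
Since whirr.log shows the port 22 socket opening before the "Read timed out" errors, one possible check is whether sshd on the custom AMI actually sends its identification banner in a reasonable time. Below is a minimal sketch of that check, not something Whirr or jclouds provides: the function name `read_ssh_banner` and the 10 second default timeout are my own assumptions.

```python
import socket


def read_ssh_banner(host, port=22, timeout=10.0):
    """Connect to host:port and return the first line the server sends.

    An SSH server normally sends an identification line starting with
    "SSH-" right after the TCP connection is accepted. Raises
    socket.timeout if nothing arrives within `timeout` seconds.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # Read byte by byte until end of line; the identification
        # string is limited to 255 characters.
        while not data.endswith(b"\n") and len(data) < 255:
            chunk = sock.recv(1)
            if not chunk:  # server closed the connection early
                break
            data += chunk
    return data.decode("ascii", errors="replace").strip()
```

Calling `read_ssh_banner("184.72.163.250")` against a freshly started node would be the idea. If a banner comes back promptly, sshd is answering and the problem is more likely in the key exchange or authentication step; if the call raises `socket.timeout`, the port is open but the daemon is not responding yet, which would at least be consistent with the read timeouts above.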

On Wed, Aug 24, 2011 at 8:35 PM, Joris Poort <gpo...@gmail.com> wrote:
> Not sure if I'm testing the right thing here, but just made the AMI
> public to ensure it has no private settings giving issues.  Executed
> same as before:
>
> Terminal:
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
> Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
> Dying because - java.net.SocketTimeoutException: Read timed out
>
> Whirr.log:
> 2011-08-24 20:21:09,014 DEBUG [jclouds.compute] (pool-3-thread-3) <<
> started instances([region=us-east-1, name=sir-a65d6614])
> 2011-08-24 20:21:09,167 DEBUG [jclouds.compute] (pool-3-thread-4) <<
> started instances([region=us-east-1, name=sir-bf822814])
> 2011-08-24 20:21:09,422 DEBUG [jclouds.compute] (pool-3-thread-3) <<
> present instances([region=us-east-1, name=sir-a65d6614])
> 2011-08-24 20:21:09,672 DEBUG [jclouds.compute] (pool-3-thread-4) <<
> present instances([region=us-east-1, name=sir-bf822814])
> 2011-08-24 20:30:44,216 DEBUG [jclouds.compute] (user thread 2) >>
> blocking on socket [address=184.72.163.250, port=22] for 600000
> seconds
> 2011-08-24 20:30:58,316 DEBUG [jclouds.compute] (user thread 2) <<
> socket [address=184.72.163.250, port=22] opened
> 2011-08-24 20:31:05,739 DEBUG [jclouds.compute] (user thread 6) >>
> blocking on socket [address=50.16.31.67, port=22] for 600000 seconds
> 2011-08-24 20:31:12,968 DEBUG [jclouds.compute] (user thread 6) <<
> socket [address=50.16.31.67, port=22] opened
>
> Any further ideas on what to try to debug this issue would be greatly
> appreciated!
>
> Cheers,
>
> Joris
>
> On Wed, Aug 24, 2011 at 8:02 PM, Joris Poort <gpo...@gmail.com> wrote:
>> I think you're probably right that it's an auth issue - although I was
>> expecting a more direct/clear error message if the keypair wasn't
>> working.
>>
>> I created the AMI by taking an EBS snapshot and then converting it to
>> instance-store.  I've tried both the EBS-backed AMI and the
>> instance-store AMI with the same results.  My understanding is that
>> the keypair used to create the AMI is generally one of the accepted
>> keys, in addition to the key pair used to launch the instance created
>> by jclouds.  I'm not sure how to confirm this - is the jclouds keypair
>> stored anywhere so that it can be used to test this?
>>
>> Thanks again for your help,
>>
>> Joris
>>
>> On Wed, Aug 24, 2011 at 7:49 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>>> I'm not sure, but it looks like an auth issue to me. Whirr creates its
>>> own key pair using the local SSH keys as specified in the properties
>>> file.
>>>
>>> You've created the custom AMI by taking an EBS snapshot? Can you use
>>> that custom AMI with a different key pair?
>>>
>>> -- Andrei Savu / andreisavu.ro
>>>
>>>
>>> On Wed, Aug 24, 2011 at 7:31 PM, Joris Poort <gpo...@gmail.com> wrote:
>>>> Andrei - thanks for the response!
>>>>
>>>> I logged into the custom AMI using ssh and a key pair on my local
>>>> machine (I'm executing Whirr from an Ubuntu virtual machine).  I've
>>>> tried both spot instances and regular instances and am getting the
>>>> same behavior.
>>>>
>>>> Full output on terminal looks like (lines between "Starting 1 node"
>>>> and "Dying because" are not always there):
>>>> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>>>> Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken
>>>> transport; encountered EOF
>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken
>>>> transport; encountered EOF
>>>> <<kex done>> woke to: net.schmizz.sshj.transport.TransportException:
>>>> Broken transport; encountered EOF
>>>> << (root@174.129.128.120:22) error acquiring
>>>> SSHClient(root@174.129.128.120:22): Broken transport; encountered EOF
>>>> net.schmizz.sshj.transport.TransportException: Broken transport; 
>>>> encountered EOF
>>>>        at net.schmizz.sshj.transport.Reader.run(Reader.java:70)
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>>
>>>> Last few entries on whirr.log:
>>>> 2011-08-24 19:20:05,428 DEBUG [jclouds.compute] (pool-3-thread-2) >>
>>>> requesting 1 spot instances region(us-east-1) price(0.250000)
>>>> spec([instanceType=m1.large, imageId=ami-d1e525b8, kernelId=null,
>>>> ramdiskId=null, availabilityZone=null,
>>>> keyName=jclouds#hadoop_custom_spot_1#us-east-1#45,
>>>> securityGroupIdToNames={}, blockDeviceMappings=[],
>>>> securityGroupIds=[],
>>>> securityGroupNames=[jclouds#hadoop_custom_spot_1#us-east-1],
>>>> monitoringEnabled=null, userData=null]) options([formParameters={}])
>>>> 2011-08-24 19:20:05,642 DEBUG [jclouds.compute] (pool-3-thread-4) <<
>>>> started instances([region=us-east-1, name=sir-4f589c11])
>>>> 2011-08-24 19:20:05,682 DEBUG [jclouds.compute] (pool-3-thread-2) <<
>>>> started instances([region=us-east-1, name=sir-59cec612])
>>>> 2011-08-24 19:20:05,864 DEBUG [jclouds.compute] (pool-3-thread-4) <<
>>>> present instances([region=us-east-1, name=sir-4f589c11])
>>>> 2011-08-24 19:20:05,917 DEBUG [jclouds.compute] (pool-3-thread-2) <<
>>>> present instances([region=us-east-1, name=sir-59cec612])
>>>> 2011-08-24 19:27:18,150 DEBUG [jclouds.compute] (user thread 8) >>
>>>> blocking on socket [address=50.17.135.8, port=22] for 600000 seconds
>>>> 2011-08-24 19:27:21,132 DEBUG [jclouds.compute] (user thread 7) >>
>>>> blocking on socket [address=174.129.128.120, port=22] for 600000
>>>> seconds
>>>> 2011-08-24 19:27:24,222 DEBUG [jclouds.compute] (user thread 7) <<
>>>> socket [address=174.129.128.120, port=22] opened
>>>> 2011-08-24 19:27:32,255 DEBUG [jclouds.compute] (user thread 8) <<
>>>> socket [address=50.17.135.8, port=22] opened
>>>>
>>>> After SSHing onto the node, I didn't find any logs in /tmp.
>>>>
>>>> Thanks again for any help on this!
>>>>
>>>> Joris
>>>>
>>>> On Wed, Aug 24, 2011 at 7:12 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>>>>> I suspect this is an authentication issue. How do you log in to the
>>>>> custom AMI?
>>>>>
>>>>> Also check whirr.log for more details and on the remote machines look
>>>>> in /tmp for jclouds script execution logs.
>>>>>
>>>>> I know from IRC that you are using spot instances. Are you seeing the
>>>>> same behavior with regular ones?
>>>>>
>>>>> -- Andrei Savu / andreisavu.ro
>>>>>
>>>>>
>>>>> On Wed, Aug 24, 2011 at 7:07 PM, Joris Poort <gpo...@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm new to Whirr and I'm running a custom AMI configuration (an
>>>>>> application installed on a working Canonical image).  With Whirr
>>>>>> 0.6.0 everything executes fine until I get the following error:
>>>>>> "Dying because - java.net.SocketTimeoutException: Read timed out"
>>>>>>
>>>>>> The instances are running fine and I can ssh into them, but the
>>>>>> Whirr code stalls and I get the above error (2x the number of
>>>>>> instances), and no proxy shell is created.  If I run the exact same
>>>>>> code with vanilla Canonical images I don't have any issues.
>>>>>>
>>>>>> Anyone have any ideas on things to test, debug or work around this?
>>>>>> Would really appreciate it!
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Joris
>>>>>>
>>>>>
>>>>
>>>
>>
>
