Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-11 Thread Bob Doolittle
For the record, once I added a new storage domain the Data center came up.

So in the end, this seems to have been due to known bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1160667
https://bugzilla.redhat.com/show_bug.cgi?id=1160423


Effectively, for hosts with static/manual IP addressing (i.e. not DHCP), the 
DNS and default route information are not set up correctly by 
hosted-engine-setup. I'm not sure why that's not considered a higher priority 
bug (e.g. blocker for 3.5.2?) since I believe the most typical configuration 
for servers is static IP addressing.
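
For anyone who wants to check the DNS half of this on a statically addressed host, here is a minimal sketch in Python. It is not part of hosted-engine-setup; the helper names `nameservers` and `lost_nameservers` are made up for illustration. The idea is to save /etc/resolv.conf before and after the deploy and compare the resolver entries.

```python
def nameservers(resolv_conf_text):
    """Return the addresses on `nameserver` lines of resolv.conf-style text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # resolv.conf comment lines start with '#' or ';'
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def lost_nameservers(before_text, after_text):
    """Nameservers that were listed before the deploy but are missing after."""
    after = set(nameservers(after_text))
    return [ns for ns in nameservers(before_text) if ns not in after]
```

Feeding it the contents of /etc/resolv.conf captured before and after running the deploy should show at a glance whether the resolver entries went away.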

All seems to be working now. Many thanks to Simone for the invaluable 
assistance.

-Bob

On Mar 10, 2015 2:29 PM, Bob Doolittle b...@doolittle.us.com wrote:


 On 03/10/2015 10:20 AM, Simone Tiraboschi wrote:


 - Original Message -

 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Tuesday, March 10, 2015 2:40:13 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on 
 F20 (The VDSM host was found in a failed
 state)


 On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:

 - Original Message -

 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 11:48:03 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
 F20 (The VDSM host was found in a failed
 state)


 On 03/09/2015 02:47 PM, Bob Doolittle wrote:

 Resending with CC to list (and an update).

 On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:

 - Original Message -

 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 6:26:30 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
 on
 F20 (Cannot add the host to cluster ... SSH
 has failed)

 ...

 OK, I've started over. Simply removing the storage domain was insufficient,
 the hosted-engine deploy failed when it found the HA and Broker services
 already configured. I decided to just start over fresh starting with
 re-installing the OS on my host.

 I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts
 files on my host/engine. I did that this time, but have run into a new
 problem:

 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the host
   (Default) [Default]:
 [ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
 [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
 [ ERROR ] Unable to add ovirt-vm to the manager
   Please shutdown the VM allowing the system to launch it as a
   monitored service.
   The system will wait until the VM is down.
 [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused
 [ INFO  ] Stage: Clean up
 [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused


 I've attached my engine log and the ovirt-hosted-engine-setup log. I think I
 had an issue with resolving external hostnames, or else a connectivity issue
 during the install.

 For some reason your engine wasn't able to deploy your hosts but the SSH
 session this time was established.
 2015-03-09 13:05:58,514 ERROR
 [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
 (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
 for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
 java.io.IOException: Command returned failure code 1 during SSH session
 'r...@xion2.smartcity.net'

 Can you please attach host-deploy logs from the engine VM?

 OK, attached.

 Like I said, it looks to me like a name-resolution issue during the yum
 update on the engine. I think I've fixed that, but do you have a better
 suggestion for cleaning up and re-deploying other than installing the OS
 on my host and starting all over again?

 I just finished starting over from scratch, starting with OS installation on
 my host/node, and wound up with a very similar problem - the engine couldn't
 reach the hosts during the yum operation. But this time the error was
 "Network is unreachable". Which is weird, because I can ssh into the engine
 and ping many of those hosts, after the operation has failed.

 Here's my latest host-deploy log from the engine. I'd appreciate any clues.

 It seems that your host is now able to resolve those addresses, but it's not
 able to connect over HTTP.
 On your hosts some of them resolve as 

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-11 Thread Sven Kieske


On 11/03/15 16:37, Bob Doolittle wrote:
 For the record, once I added a new storage domain the Data center came up.
 
 So in the end, this seems to have been due to known bugs:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1160667
 https://bugzilla.redhat.com/show_bug.cgi?id=1160423
 
 
 Effectively, for hosts with static/manual IP addressing (i.e. not DHCP), the 
 DNS and default route information are not set up correctly by 
 hosted-engine-setup. I'm not sure why that's not considered a higher priority 
 bug (e.g. blocker for 3.5.2?) since I believe the most typical configuration 
 for servers is static IP addressing.

+1

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-11 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Tuesday, March 10, 2015 7:29:44 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (The VDSM host was found in a failed
 state)
 
 
 On 03/10/2015 10:20 AM, Simone Tiraboschi wrote:
 
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Tuesday, March 10, 2015 2:40:13 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
  F20 (The VDSM host was found in a failed
  state)
 
 
  On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 11:48:03 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
  F20 (The VDSM host was found in a failed
  state)
 
 
  On 03/09/2015 02:47 PM, Bob Doolittle wrote:
  Resending with CC to list (and an update).
 
  On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 6:26:30 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
  on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
  ...
  OK, I've started over. Simply removing the storage domain was
  insufficient,
  the hosted-engine deploy failed when it found the HA and Broker
  services
  already configured. I decided to just start over fresh starting with
  re-installing the OS on my host.
 
  I can't deploy DNS at the moment, so I have to simply replicate
  /etc/hosts
  files on my host/engine. I did that this time, but have run into a
  new
  problem:
 
  [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
Enter the name of the cluster to which you want to add the
host
(Default) [Default]:
  [ INFO  ] Waiting for the host to become operational in the engine.
  This
  may
  take several minutes...
  [ ERROR ] The VDSM host was found in a failed state. Please check
  engine
  and
  bootstrap installation logs.
  [ ERROR ] Unable to add ovirt-vm to the manager
Please shutdown the VM allowing the system to launch it as
a
monitored service.
The system will wait until the VM is down.
  [ ERROR ] Failed to execute stage 'Closing up': [Errno 111]
  Connection
  refused
  [ INFO  ] Stage: Clean up
  [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection
  refused
 
 
  I've attached my engine log and the ovirt-hosted-engine-setup log. I
  think I
  had an issue with resolving external hostnames, or else a
  connectivity
  issue
  during the install.
  For some reason your engine wasn't able to deploy your hosts but the
  SSH
  session this time was established.
  2015-03-09 13:05:58,514 ERROR
  [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
  (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
  for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
  java.io.IOException: Command returned failure code 1 during SSH
  session
  'r...@xion2.smartcity.net'
 
  Can you please attach host-deploy logs from the engine VM?
  OK, attached.
 
  Like I said, it looks to me like a name-resolution issue during the yum
  update on the engine. I think I've fixed that, but do you have a better
  suggestion for cleaning up and re-deploying other than installing the
  OS
  on my host and starting all over again?
  I just finished starting over from scratch, starting with OS
  installation
  on
  my host/node, and wound up with a very similar problem - the engine
  couldn't
  reach the hosts during the yum operation. But this time the error was
  Network is unreachable. Which is weird, because I can ssh into the
  engine
  and ping many of those hosts, after the operation has failed.
 
  Here's my latest host-deploy log from the engine. I'd appreciate any
  clues.
  It seems that your host is now able to resolve those addresses, but it's
  not able to connect over HTTP.
  On your hosts some of them resolve as IPv6 addresses; can you please try
  to use curl to get one of the files that it wasn't able to fetch?
  Can you please check your network configuration before and after
  host-deploy?
  I can give you the network configuration after host-deploy, at least for
  the
  host/Node. The engine won't start for me this morning, after I shut down
  the
  host for the night.
 
  In order to give you the config before host-deploy (or, apparently for the
  engine), I'll have to re-install the OS on the host and start again from
  scratch. Obviously I'd rather not do that unless absolutely necessary.
 
  

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-10 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Tuesday, March 10, 2015 2:40:13 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (The VDSM host was found in a failed
 state)
 
 
 On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:
 
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 11:48:03 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
  F20 (The VDSM host was found in a failed
  state)
 
 
  On 03/09/2015 02:47 PM, Bob Doolittle wrote:
  Resending with CC to list (and an update).
 
  On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 6:26:30 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
  on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
 ...
  OK, I've started over. Simply removing the storage domain was
  insufficient,
  the hosted-engine deploy failed when it found the HA and Broker
  services
  already configured. I decided to just start over fresh starting with
  re-installing the OS on my host.
 
  I can't deploy DNS at the moment, so I have to simply replicate
  /etc/hosts
  files on my host/engine. I did that this time, but have run into a new
  problem:
 
  [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
Enter the name of the cluster to which you want to add the
host
(Default) [Default]:
  [ INFO  ] Waiting for the host to become operational in the engine.
  This
  may
  take several minutes...
  [ ERROR ] The VDSM host was found in a failed state. Please check
  engine
  and
  bootstrap installation logs.
  [ ERROR ] Unable to add ovirt-vm to the manager
Please shutdown the VM allowing the system to launch it as a
monitored service.
The system will wait until the VM is down.
  [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection
  refused
  [ INFO  ] Stage: Clean up
  [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection
  refused
 
 
  I've attached my engine log and the ovirt-hosted-engine-setup log. I
  think I
  had an issue with resolving external hostnames, or else a connectivity
  issue
  during the install.
  For some reason your engine wasn't able to deploy your hosts but the SSH
  session this time was established.
  2015-03-09 13:05:58,514 ERROR
  [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
  (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
  for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
  java.io.IOException: Command returned failure code 1 during SSH session
  'r...@xion2.smartcity.net'
 
  Can you please attach host-deploy logs from the engine VM?
  OK, attached.
 
  Like I said, it looks to me like a name-resolution issue during the yum
  update on the engine. I think I've fixed that, but do you have a better
  suggestion for cleaning up and re-deploying other than installing the OS
  on my host and starting all over again?
  I just finished starting over from scratch, starting with OS installation
  on
  my host/node, and wound up with a very similar problem - the engine
  couldn't
  reach the hosts during the yum operation. But this time the error was
  Network is unreachable. Which is weird, because I can ssh into the
  engine
  and ping many of those hosts, after the operation has failed.
 
  Here's my latest host-deploy log from the engine. I'd appreciate any
  clues.
  It seems that your host is now able to resolve those addresses, but it's not
  able to connect over HTTP.
  On your hosts some of them resolve as IPv6 addresses; can you please try
  to use curl to get one of the files that it wasn't able to fetch?
  Can you please check your network configuration before and after
  host-deploy?
 
 I can give you the network configuration after host-deploy, at least for the
 host/Node. The engine won't start for me this morning, after I shut down the
 host for the night.
 
 In order to give you the config before host-deploy (or, apparently for the
 engine), I'll have to re-install the OS on the host and start again from
 scratch. Obviously I'd rather not do that unless absolutely necessary.
 
 Here's the host config after the failed host-deploy:
 
 Host/Node:
 
 # ip route
 169.254.0.0/16 dev ovirtmgmt  scope link  metric 1007
 172.16.0.0/16 dev ovirtmgmt  proto kernel  scope link  src 172.16.0.58

You are missing a default gateway, and that is the cause of the issue.
Are you sure that it was properly configured before trying to deploy that host?
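
A quick way to spot this in saved `ip route` output, as a minimal Python sketch. The helper name `default_routes` is hypothetical, and the gateway 172.16.0.1 mentioned in the note below is an assumed example, not something taken from this thread; the routing table is the one quoted just above.

```python
def default_routes(ip_route_output):
    """Return the lines of `ip route` output that describe a default route."""
    return [
        line.strip()
        for line in ip_route_output.splitlines()
        # `ip route` prints a default route as "default via <gw> dev <iface> ..."
        if line.split()[:1] == ["default"]
    ]


# Routing table quoted in the thread, taken after the failed host-deploy.
THREAD_ROUTES = """\
169.254.0.0/16 dev ovirtmgmt  scope link  metric 1007
172.16.0.0/16 dev ovirtmgmt  proto kernel  scope link  src 172.16.0.58
"""

# Empty list: no default route, so anything outside 172.16.0.0/16 is unreachable.
MISSING = default_routes(THREAD_ROUTES) == []
```

On a healthy host the same check would return a line along the lines of "default via 172.16.0.1 dev ovirtmgmt" (gateway address assumed here).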

 # ip addr
 1: lo: LOOPBACK,UP,LOWER_UP mtu 65536 qdisc noqueue state 

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-10 Thread Bob Doolittle

On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:

 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 11:48:03 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (The VDSM host was found in a failed
 state)


 On 03/09/2015 02:47 PM, Bob Doolittle wrote:
 Resending with CC to list (and an update).

 On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 6:26:30 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
 F20 (Cannot add the host to cluster ... SSH
 has failed)

...
 OK, I've started over. Simply removing the storage domain was
 insufficient,
 the hosted-engine deploy failed when it found the HA and Broker services
 already configured. I decided to just start over fresh starting with
 re-installing the OS on my host.

 I can't deploy DNS at the moment, so I have to simply replicate
 /etc/hosts
 files on my host/engine. I did that this time, but have run into a new
 problem:

 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the host
   (Default) [Default]:
 [ INFO  ] Waiting for the host to become operational in the engine. This
 may
 take several minutes...
 [ ERROR ] The VDSM host was found in a failed state. Please check engine
 and
 bootstrap installation logs.
 [ ERROR ] Unable to add ovirt-vm to the manager
   Please shutdown the VM allowing the system to launch it as a
   monitored service.
   The system will wait until the VM is down.
 [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection
 refused
 [ INFO  ] Stage: Clean up
 [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection
 refused


 I've attached my engine log and the ovirt-hosted-engine-setup log. I
 think I
 had an issue with resolving external hostnames, or else a connectivity
 issue
 during the install.
 For some reason your engine wasn't able to deploy your hosts but the SSH
 session this time was established.
 2015-03-09 13:05:58,514 ERROR
 [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
 (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
 for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
 java.io.IOException: Command returned failure code 1 during SSH session
 'r...@xion2.smartcity.net'

 Can you please attach host-deploy logs from the engine VM?
 OK, attached.

 Like I said, it looks to me like a name-resolution issue during the yum
 update on the engine. I think I've fixed that, but do you have a better
 suggestion for cleaning up and re-deploying other than installing the OS
 on my host and starting all over again?
 I just finished starting over from scratch, starting with OS installation on
 my host/node, and wound up with a very similar problem - the engine couldn't
 reach the hosts during the yum operation. But this time the error was
 Network is unreachable. Which is weird, because I can ssh into the engine
 and ping many of those hosts, after the operation has failed.

 Here's my latest host-deploy log from the engine. I'd appreciate any clues.
 It seems that your host is now able to resolve those addresses, but it's not 
 able to connect over HTTP.
 On your hosts some of them resolve as IPv6 addresses; can you please try to 
 use curl to get one of the files that it wasn't able to fetch?
 Can you please check your network configuration before and after host-deploy?
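
To see which address families a mirror name resolves to, something like the following Python sketch could be used alongside curl. The helper names `split_by_family` and `resolve` are made up for illustration; only standard-library calls are used.

```python
import ipaddress
import socket


def split_by_family(addresses):
    """Split textual IP addresses into (ipv4, ipv6) lists."""
    ipv4, ipv6 = [], []
    for text in addresses:
        # Drop a zone suffix such as "%eth0" on link-local IPv6 addresses.
        ip = ipaddress.ip_address(text.split("%", 1)[0])
        (ipv4 if ip.version == 4 else ipv6).append(text)
    return ipv4, ipv6


def resolve(hostname, port=80):
    """Addresses the system resolver returns for hostname (needs DNS or hosts)."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling `split_by_family(resolve(name))` for one of the hostnames from the host-deploy log would show whether it comes back with IPv6 addresses, IPv4 addresses, or both.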

I can give you the network configuration after host-deploy, at least for the 
host/Node. The engine won't start for me this morning, after I shut down the 
host for the night.

In order to give you the config before host-deploy (or, apparently for the 
engine), I'll have to re-install the OS on the host and start again from 
scratch. Obviously I'd rather not do that unless absolutely necessary.

Here's the host config after the failed host-deploy:

Host/Node:

# ip route
169.254.0.0/16 dev ovirtmgmt  scope link  metric 1007 
172.16.0.0/16 dev ovirtmgmt  proto kernel  scope link  src 172.16.0.58 

# ip addr
1: lo: LOOPBACK,UP,LOWER_UP mtu 65536 qdisc noqueue state UNKNOWN group 
default 
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host 
   valid_lft forever preferred_lft forever
2: p3p2: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc pfifo_fast master 
ovirtmgmt state UP group default qlen 1000
link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
inet6 fe80::baca:3aff:fe79:2212/64 scope link 
   valid_lft forever preferred_lft forever
3: bond0: NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP mtu 1500 qdisc noqueue 

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-10 Thread Bob Doolittle

On 03/10/2015 10:20 AM, Simone Tiraboschi wrote:

 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Tuesday, March 10, 2015 2:40:13 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (The VDSM host was found in a failed
 state)


 On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:
 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 11:48:03 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
 F20 (The VDSM host was found in a failed
 state)


 On 03/09/2015 02:47 PM, Bob Doolittle wrote:
 Resending with CC to list (and an update).

 On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 6:26:30 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
 on
 F20 (Cannot add the host to cluster ... SSH
 has failed)

 ...
 OK, I've started over. Simply removing the storage domain was
 insufficient,
 the hosted-engine deploy failed when it found the HA and Broker
 services
 already configured. I decided to just start over fresh starting with
 re-installing the OS on my host.

 I can't deploy DNS at the moment, so I have to simply replicate
 /etc/hosts
 files on my host/engine. I did that this time, but have run into a new
 problem:

 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the
   host
   (Default) [Default]:
 [ INFO  ] Waiting for the host to become operational in the engine.
 This
 may
 take several minutes...
 [ ERROR ] The VDSM host was found in a failed state. Please check
 engine
 and
 bootstrap installation logs.
 [ ERROR ] Unable to add ovirt-vm to the manager
   Please shutdown the VM allowing the system to launch it as a
   monitored service.
   The system will wait until the VM is down.
 [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection
 refused
 [ INFO  ] Stage: Clean up
 [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection
 refused


 I've attached my engine log and the ovirt-hosted-engine-setup log. I
 think I
 had an issue with resolving external hostnames, or else a connectivity
 issue
 during the install.
 For some reason your engine wasn't able to deploy your hosts but the SSH
 session this time was established.
 2015-03-09 13:05:58,514 ERROR
 [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
 (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
 for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
 java.io.IOException: Command returned failure code 1 during SSH session
 'r...@xion2.smartcity.net'

 Can you please attach host-deploy logs from the engine VM?
 OK, attached.

 Like I said, it looks to me like a name-resolution issue during the yum
 update on the engine. I think I've fixed that, but do you have a better
 suggestion for cleaning up and re-deploying other than installing the OS
 on my host and starting all over again?
 I just finished starting over from scratch, starting with OS installation
 on
 my host/node, and wound up with a very similar problem - the engine
 couldn't
 reach the hosts during the yum operation. But this time the error was
 Network is unreachable. Which is weird, because I can ssh into the
 engine
 and ping many of those hosts, after the operation has failed.

 Here's my latest host-deploy log from the engine. I'd appreciate any
 clues.
 It seems that your host is now able to resolve those addresses, but it's not
 able to connect over HTTP.
 On your hosts some of them resolve as IPv6 addresses; can you please try
 to use curl to get one of the files that it wasn't able to fetch?
 Can you please check your network configuration before and after
 host-deploy?
 I can give you the network configuration after host-deploy, at least for the
 host/Node. The engine won't start for me this morning, after I shut down the
 host for the night.

 In order to give you the config before host-deploy (or, apparently for the
 engine), I'll have to re-install the OS on the host and start again from
 scratch. Obviously I'd rather not do that unless absolutely necessary.

 Here's the host config after the failed host-deploy:

 Host/Node:

 # ip route
 169.254.0.0/16 dev ovirtmgmt  scope link  metric 1007
 172.16.0.0/16 dev ovirtmgmt  proto kernel  scope link  src 172.16.0.58
 You are missing a default gateway, and that is the cause of the issue.
 Are you sure that it was properly configured before trying to deploy that 
 host?

It should have been, it was a fresh OS install. So I'm starting again, and 
keeping careful records of my network config.

Here 

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

2015-03-10 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 11:48:03 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (The VDSM host was found in a failed
 state)
 
 
 On 03/09/2015 02:47 PM, Bob Doolittle wrote:
  Resending with CC to list (and an update).
 
  On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 6:26:30 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
 
  On 03/09/2015 12:53 PM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 12:48:37 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
  on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
 
  On 03/09/2015 07:12 AM, Simone Tiraboschi wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Sent: Monday, March 9, 2015 12:02:49 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
  on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
  On Mar 9, 2015 5:23 AM, Simone Tiraboschi stira...@redhat.com
  wrote:
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: users-ovirt users@ovirt.org
  Sent: Friday, March 6, 2015 9:21:20 PM
  Subject: [ovirt-users] Error during hosted-engine-setup for 3.5.1
  on
  F20 (Cannot add the host to cluster ... SSH has
  failed)
 
  Hi,
 
  I'm following the instructions here:
  http://www.ovirt.org/Hosted_Engine_Howto
  My self-hosted install failed near the end:
 
  To continue make a selection from the options below:
(1) Continue setup - engine installation is complete
(2) Power off and restart the VM
(3) Abort setup
(4) Destroy VM and abort setup
 
(1, 2, 3, 4)[1]: 1
  [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
Enter the name of the cluster to which you want to add
the
  host
(Default) [Default]:
  [ ERROR ] Cannot automatically add the host to cluster Default:
  Cannot
  add
  Host. Connecting to host via SSH has failed, verify that the host
  is
  reachable (IP address, routable address etc.) You may refer to the
  engine.log file for further details.
  [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host
  to
  cluster Default
  [ INFO  ] Stage: Clean up
  [ INFO  ] Generating answer file
  '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150306135624.conf'
  [ INFO  ] Stage: Pre-termination
  [ INFO  ] Stage: Termination
 
  I can ssh into the engine VM both locally and remotely. There is no
  /root/.ssh directory, however. Did I need to set that up somehow?
  It's the engine that needs to open an SSH connection to the host
  calling
  it by its hostname.
  So please be sure that you can SSH to the host from the engine using
  its
  hostname and not its IP address.
 
  I'm assuming this should be a password-less login (key-based
  authentication?).
  Yes, it is.
 
  As what user?
  root
  OK, I see a couple of problems.
  First off, I didn't have my deploying-host hostname in the hosts map for my
  engine.
  This is enough by itself to make the deploy procedure fail. If possible,
  we recommend relying on a DNS infrastructure, especially if you are
  deploying more than one host.
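
When DNS is not available and /etc/hosts has to be replicated by hand, a small check like this Python sketch can confirm that every name the engine and host need is present. The helper names and the example hostnames in the note below are hypothetical, not taken from this thread.

```python
def hosts_entries(hosts_text):
    """Map every hostname or alias in /etc/hosts-style text to its address."""
    mapping = {}
    for line in hosts_text.splitlines():
        # Everything after '#' is a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name, as lookups do.
            mapping.setdefault(name.lower(), address)
    return mapping


def missing_hosts(hosts_text, required):
    """Names from `required` that the hosts file does not define."""
    known = hosts_entries(hosts_text)
    return [name for name in required if name.lower() not in known]
```

Running `missing_hosts()` on the contents of /etc/hosts from both the engine VM and the host, with a list such as `["host.example.com", "engine.example.com"]`, should return an empty list on each side before starting the deploy.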
  OK, I've started over. Simply removing the storage domain was
  insufficient,
  the hosted-engine deploy failed when it found the HA and Broker services
  already configured. I decided to just start over fresh starting with
  re-installing the OS on my host.
 
  I can't deploy DNS at the moment, so I have to simply replicate
  /etc/hosts
  files on my host/engine. I did that this time, but have run into a new
  problem:
 
  [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
Enter the name of the cluster to which you want to add the host
(Default) [Default]:
  [ INFO  ] Waiting for the host to become operational in the engine. This
  may
  take several minutes...
  [ ERROR ] The VDSM host was found in a failed state. Please check engine
  and
  bootstrap installation logs.
  [ ERROR ] Unable to add ovirt-vm to the manager
Please shutdown the VM allowing the system to launch it as a
monitored service.
The system will wait until the VM is down.
  [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection
  refused
  [ INFO  ] Stage: Clean up
  [ ERROR ] Failed to execute stage