Re: [Users] Is there a way to force remove a host?

2012-09-28 Thread Itamar Heim

On 09/25/2012 01:45 PM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 04:04 PM, Itamar Heim wrote:

On 09/25/2012 12:32 PM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 01:42 PM, Itamar Heim wrote:

On 09/25/2012 09:44 AM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 03:25 AM, Itamar Heim wrote:

On 09/24/2012 11:53 PM, Jason Brooks wrote:

On Mon 24 Sep 2012 01:24:44 PM PDT, Itamar Heim wrote:

On 09/24/2012 08:49 PM, Dominic Kaiser wrote:

This conversation is fine, but if I want to force remove a host no matter
what, I should be able to from the GUI.  The nodes are no longer available
and I want to get rid of them, but oVirt does not let me.  I can delete
from the database, but why not from the GUI?  I am sure others may run
into this problem as well.


what happens to the status of the host when you right click on the
host and specify you confirm it was shutdown?


I'm having this same issue. Confirming the host is shut down doesn't
make a difference.

I'm seeing lots of "Failed to GlusterHostRemoveVDS, error = Unexpected
exception" errors in my engine log that seem to correspond with the
failed remove host attempts.


is cluster defined as gluster as well?
what is the status of the host after you confirm shutdown?
any error on log on this specific command?

shireesh - not sure if relevant to this flow, but need to make sure
removing a host from the engine isn't blocked on gluster needing to
remove it from the gluster cluster if the host is not available any
more, or last host in gluster cluster?


Yes, currently the system tries the 'gluster peer detach hostname'
command when trying to remove a server, which fails if the server is
unavailable. This can be enhanced to show the error to user and then
allow 'force remove' which can use the 'gluster peer detach hostname
*force*' command that forcefully removes the server from the cluster,
even if it is not available or has bricks on it.
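
For reference, a minimal Python sketch of the two command forms described
above. The helper name peer_detach_command and the example hostname are
hypothetical; only 'gluster peer detach <hostname>' and its 'force' variant
come from the discussion. The sketch only builds the argument list and does
not run anything.

```python
import shlex


def peer_detach_command(hostname, force=False):
    """Build the argv for detaching a server from the gluster cluster.

    Hypothetical helper: it only assembles the command discussed in the
    thread and never executes it.
    """
    # Reject empty names and anything that would be parsed as an option.
    if not hostname or hostname.startswith("-"):
        raise ValueError("hostname must be a non-empty name, not an option")
    argv = ["gluster", "peer", "detach", hostname]
    if force:
        # The 'force' keyword is what allows detaching an unreachable server.
        argv.append("force")
    return argv


if __name__ == "__main__":
    # Example hostname is made up for illustration.
    print(shlex.join(peer_detach_command("node1.example.com", force=True)))
```

The command would have to be run on one of the servers that is still part
of the cluster, since that is where gluster management operations happen.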


what if it is the last server in the cluster?
what if there is another server in the cluster but no communication to
it as well?


A quick look at code tells me that in case of virt, we don't allow
removing a host if it has VM(s) in it (even if the host is currently
not available) i.e. vdsDynamic.getvm_count() > 0. Please correct me if
I'm wrong. If that's correct, and if we want to keep it consistent for
gluster as well, then we should not allow removing a host if it has
gluster volume(s) in it. This is how it behaves in case of 'last server
in cluster' today.


true, but user can fence the host or confirm shutdown manually, which
will release all resources on it, then it can be removed.


I see. In that case, we can just remove the validation and allow
removing the host irrespective of whether it contains volume(s) or not.
Since it's the only host in the cluster, this won't cause any harm.

In case of no up server available in the cluster, we can show the error
and provide a 'force' option that will just remove it from the engine DB
and will not attempt gluster peer detach.


something like that.
i assume the gluster storage will handle this somehow?


What would you expect gluster storage to do in such a case? If all
servers are not accessible to a gluster client, the client can't
read/write from/to volumes of the cluster. Cluster management operations
in gluster (like removing a server from the cluster) are always done
from one of the servers of the cluster. So if no servers are available,
nothing can be done. Vijay can shed more light on this if required.

Assuming that some of the servers come up at a later point in time, they
would continue to consider this (removed from engine) server as one of
the peers. This would create an inconsistency between actual gluster
configuration and the engine DB. This, however can be handled once we
have a feature to sync configuration with gluster (this is WIP). This
feature will automatically identify such servers, and allow the user to
either import them to engine, or remove (peer detach) from the gluster
cluster.
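
A rough sketch of the comparison such a sync feature would need, assuming
the engine's host list and the gluster peer list are both available as plain
hostnames. The function name and data shapes are hypothetical and are not
taken from the engine code.

```python
def diff_servers(engine_hosts, gluster_peers):
    """Compare the engine's view of a cluster with gluster's own peer list.

    Both arguments are iterables of hostnames. Hostnames are compared
    case-insensitively, which is an assumption of this sketch.
    """
    engine = {h.lower() for h in engine_hosts}
    peers = {p.lower() for p in gluster_peers}
    return {
        # Known to gluster but not the engine: import, or peer detach.
        "only_in_gluster": sorted(peers - engine),
        # Known to the engine but no longer a gluster peer.
        "only_in_engine": sorted(engine - peers),
    }


if __name__ == "__main__":
    # Made-up hostnames: node3 was force-removed from the engine DB only.
    print(diff_servers(["node1", "node2"], ["node1", "node2", "node3"]))
```

A server that was force-removed from the engine while unreachable would
show up under only_in_gluster once the remaining peers come back.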


why is that an issue though - worst case the server wouldn't appear in 
the admin console[1] if it is alive, and if it is dead, it is something 
the gluster cluster is supposed to deal with?


[1] though i assume the admin will continue to alert on its presence for 
being out-of-sync on list of servers in cluster.


Dominic

On Sep 22, 2012 4:19 PM, Eli Mesika emes...@redhat.com wrote:

- Original Message -
  From: Douglas Landgraf dougsl...@redhat.com
  To: Dominic Kaiser domi...@bostonvineyard.org
  Cc: Eli Mesika emes...@redhat.com, users@ovirt.org,
      Robert Middleswarth rob...@middleswarth.net
  Sent: Friday, September 21, 2012 8:12:27 PM
  Subject: Re: [Users] Is there a way to force remove a host?
 
  Hi Dominic,
 
 

Re: [Users] VM stuck in state Not Responding

2012-09-28 Thread Itamar Heim

On 09/28/2012 12:34 PM, Patrick Hurrelmann wrote:

Hi List,

in my test lab the iSCSI SAN crashed and caused some mess. My cluster
has 3 hosts running VMs. The SPM node was fenced and automatically
shut down due to the storage crash. All VMs running on the other 2 hosts
were paused. I recovered the storage and powered on the fenced
node. All VMs were restarted or came back to life except one. Since
this incident I am no longer able to start or stop it. It is stuck in
state Not Responding and it seems I cannot revive it anymore.
The engine only offers the stop and shutdown operations, but neither works.

The following is logged when trying to stop it:

2012-09-28 12:29:08,415 INFO  [org.ovirt.engine.core.bll.StopVmCommand]
(pool-3-thread-50) [49165a9b] Running command: StopVmCommand internal:
false. Entities affected :  ID: 0e95f511-62c5-438c-91fe-01c206ceb78f
Type: VM
2012-09-28 12:29:08,416 WARN
[org.ovirt.engine.core.bll.VmOperationCommandBase] (pool-3-thread-50)
[49165a9b] Strange, according to the status NotResponding virtual
machine 0e95f511-62c5-438c-91fe-01c206ceb78f should be running in a
host but it isnt.
2012-09-28 12:29:08,420 ERROR [org.ovirt.engine.core.bll.StopVmCommand]
(pool-3-thread-50) [49165a9b] Transaction rolled-back for command:
org.ovirt.engine.core.bll.StopVmCommand.

and when trying to shutdown:

2012-09-28 12:30:16,213 INFO
[org.ovirt.engine.core.bll.ShutdownVmCommand] (pool-3-thread-48)
[42788145] Running command: ShutdownVmCommand internal: false. Entities
affected :  ID: 0e95f511-62c5-438c-91fe-01c206ceb78f Type: VM
2012-09-28 12:30:16,214 WARN
[org.ovirt.engine.core.bll.VmOperationCommandBase] (pool-3-thread-48)
[42788145] Strange, according to the status NotResponding virtual
machine 0e95f511-62c5-438c-91fe-01c206ceb78f should be running in a
host but it isnt.
2012-09-28 12:30:16,218 ERROR
[org.ovirt.engine.core.bll.ShutdownVmCommand] (pool-3-thread-48)
[42788145] Transaction rolled-back for command:
org.ovirt.engine.core.bll.ShutdownVmCommand.

Is there anything I can do to reset that stuck state and bring the VM
back to life?

Best regards
Patrick



try moving all vm's from that host (migrate them to the other hosts), 
then fence it (or shutdown manually and right click, confirm shutdown) 
to try and release the vm from it.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] oVirt 3.0 - importing OVF

2012-09-28 Thread Piotr Szubiakowski

Hi,
I have VM files from VMware ESX 3.5. I converted this VM to OVF format.
Is it possible to import this VM into oVirt? Should I use the
import/export storage and the virt-v2v tool?


Best regards,
Piotr



Re: [Users] VM stuck in state Not Responding

2012-09-28 Thread Patrick Hurrelmann
 Is there anything I can do to reset that stuck state and bring the VM
 back to life?

 Best regards
 Patrick

 
 try moving all vm's from that host (migrate them to the other hosts), 
 then fence it (or shutdown manually and right click, confirm shutdown) 
 to try and release the vm from it.

In the web interface it is shown with the icon for stopped VMs and an empty
host, but the status shows Not Responding. So the stuck VM is not
assigned to any host? All 3 hosts and the engine itself have already
been rebooted since the storage crash (the hosts one by one, each put
into maintenance first).

Regards
Patrick

-- 
Lobster LOGsuite GmbH, Münchner Straße 15a, D-82319 Starnberg

HRB 178831, Amtsgericht München
Geschäftsführer: Dr. Martin Fischer, Rolf Henrich


Re: [Users] VM stuck in state Not Responding

2012-09-28 Thread Itamar Heim

On 09/28/2012 03:04 PM, Patrick Hurrelmann wrote:

Is there anything I can do to reset that stuck state and bring the VM
back to life?

Best regards
Patrick



try moving all vm's from that host (migrate them to the other hosts),
then fence it (or shutdown manually and right click, confirm shutdown)
to try and release the vm from it.


In the web interface it is shown with the icon for stopped VMs and an empty
host, but the status shows Not Responding. So the stuck VM is not
assigned to any host? All 3 hosts and the engine itself have already
been rebooted since the storage crash (the hosts one by one, each put
into maintenance first).


The shortest solution is for you to change the status of the VM in the DB
to unlock it, but it would be nice to understand why this specific VM
got into this state so the bug can be fixed.
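
A hedged sketch of what such a DB change might look like, as a
parameterized statement built in Python. The table name vm_dynamic, the
columns status and vm_guid, and the value 0 for "Down" are assumptions
about the engine schema, not something stated in this thread; verify them
against your own database, and back it up, before running anything.

```python
import uuid

# Assumption: 0 is the status code the engine uses for "Down".
VM_STATUS_DOWN = 0


def build_reset_status_query(vm_id, status=VM_STATUS_DOWN):
    """Return (sql, params) for resetting a VM's status.

    Table and column names are assumed, see the note above. Nothing is
    executed here; the caller decides whether and how to run it.
    """
    # uuid.UUID raises ValueError on anything that is not a valid UUID.
    vm_guid = str(uuid.UUID(vm_id))
    sql = "UPDATE vm_dynamic SET status = %s WHERE vm_guid = %s"
    return sql, (status, vm_guid)


if __name__ == "__main__":
    # VM ID taken from the log lines earlier in the thread.
    print(build_reset_status_query("0e95f511-62c5-438c-91fe-01c206ceb78f"))
```

The %s placeholders follow the style used by common PostgreSQL drivers for
Python; with the engine service stopped, the same statement could also be
typed into psql directly with the values filled in.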



Re: [Users] oVirt 3.0 - importing OVF

2012-09-28 Thread Itamar Heim

On 09/28/2012 01:21 PM, Piotr Szubiakowski wrote:

Hi,
I have VM files from VMware ESX 3.5. I converted this VM to OVF format.
Is it possible to import this VM into oVirt? Should I use the
import/export storage and the virt-v2v tool?

Best regards,
Piotr



Use virt-v2v with oVirt as the target to convert it and place it into the
export domain, then import it into the system.
Though I remember something about v2v being sensitive to whether the
source is ESX directly or vCenter.
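
A sketch of how such a conversion command might be assembled, assuming the
virt-v2v command line of that era (-ic for the source libvirt URI, -o rhev
and -os for the export domain, --network for the target network). The
helper name, hostnames, export path and network name are all hypothetical,
and the flags should be checked against the virt-v2v man page for the
installed version.

```python
def v2v_esx_to_export_domain(esx_host, guest_name, export_domain,
                             network="rhevm"):
    """Build a virt-v2v argv that pulls a guest from an ESX server.

    Hypothetical helper; it only assembles the argument list and does not
    run virt-v2v.
    """
    return [
        "virt-v2v",
        # Source: libvirt ESX URI; no_verify=1 skips TLS certificate checks.
        "-ic", f"esx://{esx_host}/?no_verify=1",
        # Target: write into an export storage domain.
        "-o", "rhev",
        "-os", export_domain,
        # Map the guest's NICs onto this logical network (assumed name).
        "--network", network,
        guest_name,
    ]


if __name__ == "__main__":
    print(" ".join(v2v_esx_to_export_domain(
        "esx.example.com", "myguest", "storage.example.com:/exportdomain")))
```

Note that this reads from a live ESX server rather than from the OVF
files, so the converted OVF would not be used as input here.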




Re: [Users] oVirt 3.0 - importing OVF

2012-09-28 Thread Richard W.M. Jones
On Fri, Sep 28, 2012 at 03:15:20PM +0200, Itamar Heim wrote:
 On 09/28/2012 01:21 PM, Piotr Szubiakowski wrote:
 Hi,
 I have VM files from VMware ESX 3.5. I converted this VM to OVF format.
 Is it possible to import this VM into oVirt? Should I use the
 import/export storage and the virt-v2v tool?
 
 Best regards,
 Piotr
 
 
 use virt-v2v for ovirt target to convert it and place it into the
 export domain, then import it to the system.
 though i remember something about v2v being sensitive to source
 being esx directly or vcenter.

As Itamar says, currently you can only convert from live ESX servers.
It's in the RHEL 7 schedule to be able to import from OVF/OVA.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


Re: [Users] Is there a way to force remove a host?

2012-09-28 Thread Shireesh Anjal

On Friday 28 September 2012 01:00 PM, Itamar Heim wrote:

On 09/25/2012 01:45 PM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 04:04 PM, Itamar Heim wrote:

On 09/25/2012 12:32 PM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 01:42 PM, Itamar Heim wrote:

On 09/25/2012 09:44 AM, Shireesh Anjal wrote:

On Tuesday 25 September 2012 03:25 AM, Itamar Heim wrote:

On 09/24/2012 11:53 PM, Jason Brooks wrote:

On Mon 24 Sep 2012 01:24:44 PM PDT, Itamar Heim wrote:

On 09/24/2012 08:49 PM, Dominic Kaiser wrote:
This conversation is fine, but if I want to force remove a host no matter
what, I should be able to from the GUI.  The nodes are no longer available
and I want to get rid of them, but oVirt does not let me.  I can delete
from the database, but why not from the GUI?  I am sure others may run
into this problem as well.


what happens to the status of the host when you right click on the
host and specify you confirm it was shutdown?


I'm having this same issue. Confirming the host is shut down doesn't
make a difference.

I'm seeing lots of "Failed to GlusterHostRemoveVDS, error = Unexpected
exception" errors in my engine log that seem to correspond with the
failed remove host attempts.


is cluster defined as gluster as well?
what is the status of the host after you confirm shutdown?
any error on log on this specific command?

shireesh - not sure if relevant to this flow, but need to make sure
removing a host from the engine isn't blocked on gluster needing to
remove it from the gluster cluster if the host is not available any
more, or last host in gluster cluster?


Yes, currently the system tries the 'gluster peer detach hostname'
command when trying to remove a server, which fails if the server is
unavailable. This can be enhanced to show the error to user and then
allow 'force remove' which can use the 'gluster peer detach hostname
*force*' command that forcefully removes the server from the cluster,
even if it is not available or has bricks on it.


what if it is the last server in the cluster?
what if there is another server in the cluster but no communication to
it as well?


A quick look at code tells me that in case of virt, we don't allow
removing a host if it has VM(s) in it (even if the host is currently
not available) i.e. vdsDynamic.getvm_count() > 0. Please correct me if
I'm wrong. If that's correct, and if we want to keep it consistent for
gluster as well, then we should not allow removing a host if it has
gluster volume(s) in it. This is how it behaves in case of 'last server
in cluster' today.


true, but user can fence the host or confirm shutdown manually, which
will release all resources on it, then it can be removed.


I see. In that case, we can just remove the validation and allow
removing the host irrespective of whether it contains volume(s) or not.
Since it's the only host in the cluster, this won't cause any harm.

In case of no up server available in the cluster, we can show the error
and provide a 'force' option that will just remove it from the engine DB
and will not attempt gluster peer detach.


something like that.
i assume the gluster storage will handle this somehow?


What would you expect gluster storage to do in such a case? If all
servers are not accessible to a gluster client, the client can't
read/write from/to volumes of the cluster. Cluster management operations
in gluster (like removing a server from the cluster) are always done
from one of the servers of the cluster. So if no servers are available,
nothing can be done. Vijay can shed more light on this if required.

Assuming that some of the servers come up at a later point in time, they
would continue to consider this (removed from engine) server as one of
the peers. This would create an inconsistency between actual gluster
configuration and the engine DB. This, however can be handled once we
have a feature to sync configuration with gluster (this is WIP). This
feature will automatically identify such servers, and allow the user to
either import them to engine, or remove (peer detach) from the gluster
cluster.


why is that an issue though - worst case the server wouldn't appear in 
the admin console[1] if it is alive, and if it is dead, it is 
something the gluster cluster is supposed to deal with?


It's just that I think it's not good to have the management console 
being out of sync with gluster configuration. However, as I said, we 
will soon have a mechanism to handle such cases.


Also, we're thinking of a simpler approach: just providing a 'force
remove' checkbox on the remove host confirmation dialog (only if the
host belongs to a gluster enabled cluster). The user can then tick this
checkbox when the normal remove flow doesn't work in the scenarios
discussed above.
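
One possible reading of the remove flow discussed in this thread, written
as a small decision function. This is only a sketch: the names are
hypothetical, it is not the engine's actual logic, and treating 'force' as
always selecting the forced detach when another server is up is an
assumption.

```python
from enum import Enum


class RemoveAction(Enum):
    PEER_DETACH = "gluster peer detach <hostname>"
    PEER_DETACH_FORCE = "gluster peer detach <hostname> force"
    ENGINE_DB_ONLY = "remove from engine DB, skip peer detach"
    SHOW_ERROR = "show error, let the user retry with force"


def choose_remove_action(host_reachable, other_up_servers, force):
    """Pick what to do when a gluster host is removed from the engine.

    host_reachable: whether the host being removed can be contacted.
    other_up_servers: number of other servers in the cluster that are up
        and could run the peer detach.
    force: whether the user ticked the 'force remove' checkbox.
    """
    if other_up_servers == 0:
        # Nobody can run 'peer detach'; only the engine DB can be cleaned.
        if force:
            return RemoveAction.ENGINE_DB_ONLY
        return RemoveAction.SHOW_ERROR
    if force:
        return RemoveAction.PEER_DETACH_FORCE
    if host_reachable:
        return RemoveAction.PEER_DETACH
    # Normal detach would fail for an unreachable host.
    return RemoveAction.SHOW_ERROR


if __name__ == "__main__":
    print(choose_remove_action(host_reachable=False, other_up_servers=1,
                               force=True))
```

The ENGINE_DB_ONLY case is the one that can leave the engine out of sync
with gluster once the other servers come back.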




[1] though i assume the admin will continue to alert on its presence 
for being out-of-sync on list of servers in cluster.


Yes - this feature is WIP.

Dominic

On Sep 22, 2012 4:19 PM, Eli Mesika