Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-15 Thread Ruben S. Montero
This is now fixed for the next release: the existing snapshot will be reused.
The only potential problem is that reusing the same volume with old images
from a previous installation may lead to inconsistent copies. That is an
anomalous situation, though, and reusing the snapshot is probably faster and
safer.

Cheers

Ruben

On Sat Dec 13 2014 at 8:58:41 AM Damon (Albino Geek) albinog...@gmail.com
wrote:

 On Fri, 12 Dec 2014 04:15:34 -0800, Fabian Zimmermann dev@gmail.com
 wrote:
  Hi,
 
  Ah! Thanks I just misread the oned.conf
 
  Nevertheless, I used -r and assumed it would re-create the VM, but
  this fails if you use (Ceph) shared storage, because clone aborts
  if the previous cleanup failed. So in my opinion there is a bug: the
  TM should handle this by removing or reusing the old snap/disk, shouldn't it?
 
  Fabian

 I do think that it should be better handled by the Ceph TM.
 ___
 Users mailing list
 Users@lists.opennebula.org
 http://lists.opennebula.org/listinfo.cgi/users-opennebula.org



Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-12 Thread Ruben S. Montero
Hi Fabian,

In OpenNebula 4.10, if the VM is in UNKNOWN it should go directly to boot
(bypassing CLEANUP and PROLOG), provided you are using shared storage...

Cheers

Ruben



Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-12 Thread Ruben S. Montero
Totally, tm-ceph should work in this case... Here is the issue for this:

http://dev.opennebula.org/issues/3446

Thanks for your feedback!

Ruben

On Fri Dec 12 2014 at 1:15:38 PM Fabian Zimmermann dev@gmail.com
wrote:

 Hi,

 On 12.12.14 13:04, Ruben S. Montero wrote:
  If you have shared storage, you can simply issue onevm resched (I think the
  hook can do that with the -m option).
 
  If there is no shared storage, you have to go through the
  delete-recreate step, which will fail on the failed host because it is
  down. The host should be cleaned up manually once it is online again.
 Ah! Thanks I just misread the oned.conf

 Nevertheless, I used -r and assumed it would re-create the VM, but
 this fails if you use (Ceph) shared storage, because clone aborts
 if the previous cleanup failed. So in my opinion there is a bug: the
 TM should handle this by removing or reusing the old snap/disk, shouldn't it?

 Fabian




Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-12 Thread Fabian Zimmermann
Hi Ruben,


On 12.12.14 11:20, Ruben S. Montero wrote:
 In OpenNebula 4.10, if the VM is in UNKNOWN it should go directly to boot
 (bypassing CLEANUP and PROLOG), provided you are using shared storage...
We are using 4.10.1, and it looks like CLEANUP is still executed:

--
Wed Dec 10 16:38:45 2014 [Z0][LCM][I]: New VM state is RUNNING
Thu Dec 11 09:16:01 2014 [Z0][LCM][I]: New VM state is UNKNOWN
Thu Dec 11 09:23:32 2014 [Z0][LCM][I]: New VM state is CLEANUP.
Thu Dec 11 09:23:32 2014 [Z0][DiM][I]: New VM state is PENDING
--
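To pull just the state transitions out of a VM log like the excerpt above, a
small shell helper is enough. This is a minimal sketch: the function name
vm_states is hypothetical, and the path in the usage line assumes the default
per-VM log location /var/log/one/<VMID>.log.

```shell
# Print one lifecycle state per line, in the order the log recorded them.
# vm_states is a hypothetical helper name, not part of OpenNebula.
vm_states() {
    # Keep only "New VM state is X" lines and reduce each one to X.
    sed -n 's/.*New VM state is \([A-Z_]*\).*/\1/p' "$1"
}
```

Usage (42 is a placeholder VM ID): vm_states /var/log/one/42.log
A CLEANUP entry directly after UNKNOWN, as in the log above, shows that the
delete-recreate path was taken rather than a direct boot.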

As I wrote, I'm just using the host ERROR hook to execute
host-error.rb and delete and recreate the VMs.

Fabian



Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-12 Thread Ruben S. Montero
Hi Fabian,

If you have shared storage, you can simply issue onevm resched (I think the
hook can do that with the -m option).

If there is no shared storage, you have to go through the
delete-recreate step, which will fail on the failed host because it is
down. The host should be cleaned up manually once it is online again.
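
For the shared-storage case, flagging VMs by hand with onevm resched can be
sketched in shell as follows. This is only an illustration: the function name
resched_vms and the VM IDs in the usage line are hypothetical, and it assumes
the onevm CLI is on the PATH of a user allowed to manage those VMs.

```shell
# Flag each given VM for rescheduling instead of deleting and recreating it.
# resched_vms is a hypothetical helper name, not part of OpenNebula.
resched_vms() {
    for vmid in "$@"; do
        # "onevm resched" only sets the reschedule flag; the scheduler
        # chooses a new host for the VM on a later pass.
        onevm resched "$vmid" || echo "could not flag VM $vmid" >&2
    done
}
```

Usage (42 and 43 are placeholder VM IDs): resched_vms 42 43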

Cheers



Re: [one-users] Delete Recreate fails in case of Host-Error

2014-12-12 Thread Damon (Albino Geek)
On Fri, 12 Dec 2014 04:15:34 -0800, Fabian Zimmermann dev@gmail.com  
wrote:

Hi,

Ah! Thanks I just misread the oned.conf

Nevertheless, I used -r and assumed it would re-create the VM, but
this fails if you use (Ceph) shared storage, because clone aborts
if the previous cleanup failed. So in my opinion there is a bug: the
TM should handle this by removing or reusing the old snap/disk, shouldn't it?

Fabian


I do think that it should be better handled by the Ceph TM.


[one-users] Delete Recreate fails in case of Host-Error

2014-12-11 Thread Fabian Zimmermann
Hi,

I just set up our fencing system and everything is working so far, but if
I use the HOST-ERROR hook to delete and recreate the VMs, this fails.

CLEANUP is attempted on the broken host (and of course fails), and the
subsequent CLONE then fails because the snapshots/disks already exist.

I worked around this with the attached patch, which simply skips the clone
if the disk/snap already exists, but what is the correct way to handle this?

Thanks a lot,

Fabian Zimmermann
--- clone.old   2014-12-11 13:13:52.001648056 +0100
+++ clone   2014-12-11 09:32:18.882111020 +0100
@@ -76,14 +76,15 @@
 CLONE_CMD=$(cat <<EOF
 set -e

-RBD_FORMAT=\$($RBD info $SRC_PATH | sed -n 's/.*format: // p')
-
-if [ \$RBD_FORMAT = 2 ]; then
-$RBD snap create $SRC_PATH@$RBD_SNAP
-$RBD snap protect $SRC_PATH@$RBD_SNAP
-$RBD clone $SRC_PATH@$RBD_SNAP $RBD_DST
-else
-$RBD copy $SRC_PATH $RBD_DST
+if ! $RBD info $RBD_DST; then
+RBD_FORMAT=\$($RBD info $SRC_PATH | sed -n 's/.*format: // p')
+if [ \$RBD_FORMAT = 2 ]; then
+$RBD snap create $SRC_PATH@$RBD_SNAP
+$RBD snap protect $SRC_PATH@$RBD_SNAP
+$RBD clone $SRC_PATH@$RBD_SNAP $RBD_DST
+else
+$RBD copy $SRC_PATH $RBD_DST
+fi
 fi
 EOF
 )
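
The idea behind the patch can also be written as a standalone function that is
safe to run again after a failed CLEANUP. This is a rough sketch, not the
upstream fix: the function name clone_if_missing is hypothetical, the
arguments mirror SRC_PATH, RBD_SNAP and RBD_DST from the patch, and it assumes
that "rbd info" exits non-zero when the image or snapshot does not exist.

```shell
# RBD mirrors the variable used in the patch; it defaults to the rbd CLI.
RBD="${RBD:-rbd}"

# clone_if_missing SRC SNAP DST (hypothetical helper name)
clone_if_missing() {
    src=$1
    snap=$2
    dst=$3

    # A destination left over from an earlier attempt is reused as-is.
    if $RBD info "$dst" >/dev/null 2>&1; then
        echo "reusing existing image $dst"
        return 0
    fi

    format=$($RBD info "$src" | sed -n 's/.*format: //p')
    if [ "$format" = 2 ]; then
        # Create and protect the snapshot only if an earlier run did not
        # already leave it behind.
        if ! $RBD info "$src@$snap" >/dev/null 2>&1; then
            $RBD snap create "$src@$snap" || return 1
            $RBD snap protect "$src@$snap" || return 1
        fi
        $RBD clone "$src@$snap" "$dst"
    else
        # Format 1 images cannot be cloned, so fall back to a full copy.
        $RBD copy "$src" "$dst"
    fi
}
```

As noted elsewhere in the thread, reusing a leftover image is only safe if it
really belongs to the VM being recreated. The sketch also does not handle a
snapshot that exists but was never protected.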