There seems to be a problem with sparse files and glusterfs and/or the
underlying FS (XFS?).

You can disable that functionality by adding an "exit 1" command at the
top of these scripts:

* /var/lib/one/remotes/datastore/mkfs
* /var/lib/one/remotes/tm/mkimage

Another way of solving this is to change the way the image is created
(so that it is not sparse). The drawback is that creating the image
will take a lot longer. The command to change is 'dd' in those
scripts. For example, for 'mkfs', change:

exec_and_log "$DD if=/dev/zero of=$DST bs=1 count=1 seek=${SIZE}M" \
    "Could not create image $DST"

to

exec_and_log "$DD if=/dev/zero of=$DST bs=1M count=${SIZE}" \
    "Could not create image $DST"
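To see the difference between the two 'dd' invocations outside of
OpenNebula, here is a minimal sketch that creates a tiny image both ways
in a scratch directory and compares apparent size with the space actually
allocated. It assumes GNU dd, stat and du; the variable names (WORKDIR,
SPARSE, FULL) and the 4 MiB size are only illustrative and are not taken
from the driver scripts.

```shell
#!/bin/bash
set -e

SIZE=4                       # image size in MiB, kept tiny for the demo
WORKDIR=$(mktemp -d)         # hypothetical scratch directory
SPARSE="$WORKDIR/sparse.img"
FULL="$WORKDIR/full.img"

# Sparse: seek SIZE MiB into the file and write one byte.
# Everything before that byte is a hole with no blocks allocated.
dd if=/dev/zero of="$SPARSE" bs=1 count=1 seek="${SIZE}M" 2>/dev/null

# Non-sparse: actually write SIZE blocks of 1 MiB of zeros.
dd if=/dev/zero of="$FULL" bs=1M count="$SIZE" 2>/dev/null

# Apparent size in bytes (what ls -l shows).
SPARSE_BYTES=$(stat -c %s "$SPARSE")
FULL_BYTES=$(stat -c %s "$FULL")

# Space allocated on disk in KiB; the exact numbers depend on the FS.
SPARSE_KB=$(du -k "$SPARSE" | cut -f1)
FULL_KB=$(du -k "$FULL" | cut -f1)

echo "sparse: apparent=${SPARSE_BYTES} bytes, allocated=${SPARSE_KB} KiB"
echo "full:   apparent=${FULL_BYTES} bytes, allocated=${FULL_KB} KiB"

rm -rf "$WORKDIR"
```

On most local filesystems the sparse file reports almost no allocated
space while the full one reports roughly SIZE MiB, which is also why the
second form takes much longer for a 51200M image.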

Cheers

On Sat, Jan 17, 2015 at 10:43 AM, Wilma Hermann <wilma.herm...@gmail.com> wrote:
> Hi,
>
> Our OpenNebula setup uses GlusterFS to share /var/lib/one among all
> machines. Yesterday a customer created a new volatile disk for a VM. But
> this image creation crashed the gluster client on the host the VM was
> running on. I assume it has something to do with the fact that the customer
> entered 'ext3' as filesystem type.
>
> This isn't the first time this bug occurred; we also had it almost one year
> ago and there it was also related to the filesystem type of an image. I
> believe that this feature is rarely used by our customers and simply wasn't
> used in the meantime. Now we are using OpenNebula 4.8.0 on Ubuntu 12.04.5
> with glusterfs 3.2.5.
>
> Here's the log of the VM that triggered the crash:
>
> Sat Jan 10 13:24:21 2015 [Z0][VMM][I]: VM successfully rebooted-hard.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Command execution fail:
> /var/lib/one/remotes/tm/shared/mkimage 51200 ext3
> 192.168.128.14:/var/lib/one//datastores/0/346/disk.2 346 0
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkimage: Making filesystem of 51200M
> and type ext3 at 192.168.128.14:/var/lib/one//datastores/0/346/disk.2
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: mkimage: Command "set -e
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: export PATH=/usr/sbin:/sbin:$PATH
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: dd if=/dev/zero
> of=/var/lib/one/datastores/0/346/disk.2 bs=1 count=1 seek=51200M
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkfs -t ext3 -F
> /var/lib/one/datastores/0/346/disk.2" failed: Warning: Permanently added
> '192.168.128.14' (ECDSA) to the list of known hosts.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1+0 records in
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1+0 records out
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1 byte (1 B) copied, 0.000576409 s,
> 1.7 kB/s
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mke2fs 1.42 (29-Nov-2011)
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not erase sector 2:
> Attempt to write block to filesystem resulted in short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not read block 0:
> Attempt to read block from filesystem resulted in short read
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not erase sector 0:
> Attempt to write block to filesystem resulted in short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkfs.ext3: Attempt to write block to
> filesystem resulted in short write while zeroing block 13107184 at end of
> filesystem
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]:
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Could not write 5 blocks in inode
> table starting at 1027: Attempt to write block to filesystem resulted in
> short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: Could not create image
> /var/lib/one/datastores/0/346/disk.2
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: ExitCode: 1
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Failed to execute transfer manager
> driver operation: tm_attach.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: Error attaching new VM Disk: Could
> not create image /var/lib/one/datastores/0/346/disk.2
>
> After that crash, all subsequent operations failed because the frontend was
> unable to log into that particular host (since /var/lib/one was missing and
> passwordless SSH did not work anymore).
>
> I have 2 questions:
> 1) Does anyone have an idea what's going on there?
> 2) Is it possible to disable this filesystem type feature? We don't need it,
> but I would like to prevent these accidental host crashes.
>
> Greetings
> Wilma
>
> _______________________________________________
> Users mailing list
> Users@lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>



-- 
Javier Fontán Muiños
Developer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | @OpenNebula | github.com/jfontan