[ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
Hi everyone,

I always have a bit of trouble wrapping my head around how libvirt seems
to ignore ceph.conf options while qemu/kvm does not, so I thought I'd
ask. Maybe Josh, Wido or someone else can clarify the following.
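
(By libvirt I mean the cache attribute on the disk's driver element in the
domain XML, i.e. something along these lines; the pool/image name, monitor
host and secret UUID are of course just placeholders:

  <disk type='network' device='disk'>
    <driver name='qemu' type='raw' cache='writeback'/>
    <auth username='libvirt'>
      <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
    </auth>
    <source protocol='rbd' name='rbd/myimage'>
      <host name='mon1.example.com' port='6789'/>
    </source>
    <target dev='vda' bus='virtio'/>
  </disk>
)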

http://ceph.com/docs/master/rbd/qemu-rbd/ says:

Important: If you set rbd_cache=true, you must set cache=writeback or
risk data loss. Without cache=writeback, QEMU will not send flush
requests to librbd. If QEMU exits uncleanly in this configuration,
filesystems on top of rbd can be corrupted.

Now this refers to explicitly setting rbd_cache=true on the qemu command
line, not having rbd_cache=true in the [client] section in ceph.conf,
and I'm not even sure whether qemu supports that anymore.
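
(To illustrate the distinction: the doc page shows the option being passed
inline on the qemu command line, roughly like

  qemu -m 1024 -drive format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback

with pool/image names being just examples, as opposed to setting it once in
ceph.conf:

  [client]
      rbd cache = true
)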

Even if it does, I'm still not sure whether the statement is accurate.

qemu has, for some time, had a cache=directsync mode which is intended
to be used as follows (from
http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):

This mode is useful when guests may not be sending flushes when
appropriate and therefore leave data at risk in case of power failure.
When cache=directsync is used, write operations are only completed to
the guest when data is safely on disk.

So even if there are no flush requests to librbd, users should still be
safe from corruption when using cache=directsync, no?

So in summary, I *think* the following considerations apply, but I'd be
grateful if someone could confirm or refute them:

cache = writethrough
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to
use whether or not guest I/O stack sends flushes.

cache = writeback
Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if
guest I/O stack sends flushes. Maps to cache = writethrough until first
flush if rbd_cache_writethrough_until_flush = true (default in master).

cache = none
Maps to rbd_cache=false. No caching, safe to use regardless of guest I/O
stack flush support.

cache = unsafe
Maps to rbd_cache=true, rbd_cache_max_dirty > 0, but also *ignores* all
flush requests from the guest. Not safe to use (except in the unlikely
case that your guest never-ever writes).

cache=directsync
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Bypasses the host page
cache altogether, which I think would be meaningless with the rbd
storage driver because it doesn't use the host page cache (unlike
qcow2). Read cache only, safe to use whether or not guest I/O stack
sends flushes.

Is the above an accurate summary? If so, I'll be happy to send a doc patch.

Cheers,
Florian


Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Mark Wu
2015-02-27 20:56 GMT+08:00 Alexandre DERUMIER aderum...@odiso.com:

 Hi,

 from qemu rbd.c

 if (flags & BDRV_O_NOCACHE) {
     rados_conf_set(s->cluster, "rbd_cache", "false");
 } else {
     rados_conf_set(s->cluster, "rbd_cache", "true");
 }

 and
 block.c

 int bdrv_parse_cache_flags(const char *mode, int *flags)
 {
     *flags &= ~BDRV_O_CACHE_MASK;

     if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
         *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "directsync")) {
         *flags |= BDRV_O_NOCACHE;
     } else if (!strcmp(mode, "writeback")) {
         *flags |= BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "unsafe")) {
         *flags |= BDRV_O_CACHE_WB;
         *flags |= BDRV_O_NO_FLUSH;
     } else if (!strcmp(mode, "writethrough")) {
         /* this is the default */
     } else {
         return -1;
     }

     return 0;
 }


 So rbd_cache is

 disabled for cache=directsync|none

 and enabled for writethrough|writeback|unsafe


 so directsync or none should be safe even if the guest does not send flushes.



 - Mail original -
 From: Florian Haas flor...@hastexo.com
 To: ceph-users ceph-users@lists.ceph.com
 Sent: Friday, 27 February 2015 13:38:13
 Subject: [ceph-users] Possibly misleading/outdated documentation about
 qemu/kvm and rbd cache settings

 Hi everyone,

 I always have a bit of trouble wrapping my head around how libvirt seems
 to ignore ceph.conf options while qemu/kvm does not, so I thought I'd
 ask. Maybe Josh, Wido or someone else can clarify the following.

 http://ceph.com/docs/master/rbd/qemu-rbd/ says:

 Important: If you set rbd_cache=true, you must set cache=writeback or
 risk data loss. Without cache=writeback, QEMU will not send flush
 requests to librbd. If QEMU exits uncleanly in this configuration,
 filesystems on top of rbd can be corrupted.

 Now this refers to explicitly setting rbd_cache=true on the qemu command
 line, not having rbd_cache=true in the [client] section in ceph.conf,
 and I'm not even sure whether qemu supports that anymore.

 Even if it does, I'm still not sure whether the statement is accurate.

 qemu has, for some time, had a cache=directsync mode which is intended
 to be used as follows (from
 http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):

 This mode is useful when guests may not be sending flushes when
 appropriate and therefore leave data at risk in case of power failure.
 When cache=directsync is used, write operations are only completed to
 the guest when data is safely on disk.

 So even if there are no flush requests to librbd, users should still be
 safe from corruption when using cache=directsync, no?

 So in summary, I *think* the following considerations apply, but I'd be
 grateful if someone could confirm or refute them:


 cache = writethrough
 Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to

Actually, qemu doesn't care about the rbd_cache_max_dirty setting. In
writethrough mode, qemu always sends a flush after every write request.
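
If I remember correctly, the write completion path in block.c has roughly
this (paraphrasing from memory, so the exact function it sits in may differ):

    /* flush after the write unless the emulated write cache is enabled,
       i.e. unless cache=writeback or cache=unsafe was requested */
    if (ret == 0 && !bs->enable_write_cache) {
        ret = bdrv_co_flush(bs);
    }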

 use whether or not guest I/O stack sends flushes.

 cache = writeback
 Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if
 guest I/O stack sends flushes. Maps to cache = writethrough until first

Qemu can report to the guest whether a write cache is present, and the
guest kernel then manages that cache the same way it manages a volatile
writeback cache on a physical storage controller (see
https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt).
As long as filesystem barriers are not disabled in the guest, this avoids
data corruption.
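
(For example, with an ext4 filesystem in the guest, barriers/cache flushes
stay enabled unless someone explicitly mounts with nobarrier; the device
name here is only an example:

    mount -o barrier=1 /dev/vda1 /mnt
)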

 flush if rbd_cache_writethrough_until_flush = true (default in master).

 cache = none
 Maps to rbd_cache=false. No caching, safe to use regardless of guest I/O
 stack flush support.

 cache = unsafe
 Maps to rbd_cache=true, rbd_cache_max_dirty > 0, but also *ignores* all
 flush requests from the guest. Not safe to use (except in the unlikely
 case that your guest never-ever writes).

 cache=directsync
 Maps to rbd_cache=true, rbd_cache_max_dirty=0. Bypasses the host page
 cache altogether, which I think would be meaningless with the rbd
 storage driver because it doesn't use the host page cache (unlike
 qcow2). Read cache only, safe to use whether or not guest I/O stack
 sends flushes.

 Is the above an accurate summary? If so, I'll be happy to send a doc patch.

 Cheers,
 Florian



Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Alexandre DERUMIER
Hi,

from qemu rbd.c

if (flags & BDRV_O_NOCACHE) {
    rados_conf_set(s->cluster, "rbd_cache", "false");
} else {
    rados_conf_set(s->cluster, "rbd_cache", "true");
}

and
block.c

int bdrv_parse_cache_flags(const char *mode, int *flags)
{
    *flags &= ~BDRV_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_CACHE_WB;
        *flags |= BDRV_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* this is the default */
    } else {
        return -1;
    }

    return 0;
}


So rbd_cache is 

disabled for cache=directsync|none

and enabled for writethrough|writeback|unsafe


so directsync or none should be safe even if the guest does not send flushes.
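
(If you want to double check at runtime what librbd actually ended up with,
and the client has an admin socket configured, for example with
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.asok in the [client]
section, you can query it directly; the socket path below is just an example:

    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok config show | grep rbd_cache
)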





Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
On 02/27/2015 01:56 PM, Alexandre DERUMIER wrote:
 Hi,
 
 from qemu rbd.c
 
 if (flags & BDRV_O_NOCACHE) {
     rados_conf_set(s->cluster, "rbd_cache", "false");
 } else {
     rados_conf_set(s->cluster, "rbd_cache", "true");
 }

 and
 block.c

 int bdrv_parse_cache_flags(const char *mode, int *flags)
 {
     *flags &= ~BDRV_O_CACHE_MASK;

     if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
         *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "directsync")) {
         *flags |= BDRV_O_NOCACHE;
     } else if (!strcmp(mode, "writeback")) {
         *flags |= BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "unsafe")) {
         *flags |= BDRV_O_CACHE_WB;
         *flags |= BDRV_O_NO_FLUSH;
     } else if (!strcmp(mode, "writethrough")) {
         /* this is the default */
     } else {
         return -1;
     }

     return 0;
 }
 
 
 So rbd_cache is 
 
 disabled for cache=directsync|none
 
 and enabled for writethrough|writeback|unsafe
 
 
 so directsync or none should be safe even if the guest does not send flushes.

That's what I figured too, but then where does the "Important" warning in
the documentation come from that implores people to always set
cache=writeback? As per git blame it came directly from Josh. If anyone's an
authority on RBD, it would be him. :)
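
(For the record, what I ran was just something like the following in a
ceph.git checkout, assuming the page still lives at that path:

    git blame doc/rbd/qemu-rbd.rst
)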

Cheers,
Florian



Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
On 02/27/2015 02:46 PM, Mark Wu wrote:
 
 
 2015-02-27 20:56 GMT+08:00 Alexandre DERUMIER aderum...@odiso.com:
 
 Hi,
 
 from qemu rbd.c
 
 if (flags & BDRV_O_NOCACHE) {
     rados_conf_set(s->cluster, "rbd_cache", "false");
 } else {
     rados_conf_set(s->cluster, "rbd_cache", "true");
 }

 and
 block.c

 int bdrv_parse_cache_flags(const char *mode, int *flags)
 {
     *flags &= ~BDRV_O_CACHE_MASK;

     if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
         *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "directsync")) {
         *flags |= BDRV_O_NOCACHE;
     } else if (!strcmp(mode, "writeback")) {
         *flags |= BDRV_O_CACHE_WB;
     } else if (!strcmp(mode, "unsafe")) {
         *flags |= BDRV_O_CACHE_WB;
         *flags |= BDRV_O_NO_FLUSH;
     } else if (!strcmp(mode, "writethrough")) {
         /* this is the default */
     } else {
         return -1;
     }

     return 0;
 }
 
 
 So rbd_cache is
 
 disabled for cache=directsync|none
 
 and enabled for writethrough|writeback|unsafe
 
 
 so directsync or none should be safe even if the guest does not send flushes.
 
 
 
 - Mail original -
 From: Florian Haas flor...@hastexo.com
 To: ceph-users ceph-users@lists.ceph.com
 Sent: Friday, 27 February 2015 13:38:13
 Subject: [ceph-users] Possibly misleading/outdated documentation about
 qemu/kvm and rbd cache settings
 
 Hi everyone,
 
 I always have a bit of trouble wrapping my head around how libvirt seems
 to ignore ceph.conf options while qemu/kvm does not, so I thought I'd
 ask. Maybe Josh, Wido or someone else can clarify the following.
 
 http://ceph.com/docs/master/rbd/qemu-rbd/ says:
 
 Important: If you set rbd_cache=true, you must set cache=writeback or
 risk data loss. Without cache=writeback, QEMU will not send flush
 requests to librbd. If QEMU exits uncleanly in this configuration,
 filesystems on top of rbd can be corrupted.
 
 Now this refers to explicitly setting rbd_cache=true on the qemu command
 line, not having rbd_cache=true in the [client] section in ceph.conf,
 and I'm not even sure whether qemu supports that anymore.
 
 Even if it does, I'm still not sure whether the statement is accurate.
 
 qemu has, for some time, had a cache=directsync mode which is intended
 to be used as follows (from
 http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):
 
 This mode is useful when guests may not be sending flushes when
 appropriate and therefore leave data at risk in case of power failure.
 When cache=directsync is used, write operations are only completed to
 the guest when data is safely on disk.
 
 So even if there are no flush requests to librbd, users should still be
 safe from corruption when using cache=directsync, no?
 
 So in summary, I *think* the following considerations apply, but I'd be
 grateful if someone could confirm or refute them: 
 
 
 cache = writethrough
 Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to
 
 Actually, qemu doesn't care about the rbd_cache_max_dirty setting. In
 writethrough mode, qemu always sends a flush after every write request.

So how exactly is that functionally different from rbd_cache_max_dirty=0?


 use whether or not guest I/O stack sends flushes.
 
 cache = writeback
 Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if
 guest I/O stack sends flushes. Maps to cache = writethrough until first 
 
 Qemu can report to the guest whether a write cache is present, and the
 guest kernel then manages that cache the same way it manages a volatile
 writeback cache on a physical storage controller (see
 https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt).
 As long as filesystem barriers are not disabled in the guest, this avoids
 data corruption.

You mean block barriers? I thought those were killed upstream like 4
years ago.

Cheers,
Florian
