Oh, sorry, I forgot to mention that all OSDs use bluestore, so the xfs mount options have no effect.
The VMs have cache="none" by default; I then tried "writethrough" and saw no difference. And aren't these rbd cache options enabled by default?

2017-11-07 18:45 GMT+05:00 Peter Maloney <peter.malo...@brockmann-consult.de>:

> I see nobarrier in there... Try without that (unless that's just the
> bluestore xfs... then it probably won't change anything). And are the OSDs
> using bluestore?
>
> And what cache options did you set in the VM config? It's dangerous to set
> writeback without also having this in the client-side ceph.conf:
>
> rbd cache writethrough until flush = true
> rbd_cache = true
>
>
> On 11/07/17 14:36, Дробышевский, Владимир wrote:
>
> Hello!
>
> I've got a weird situation with rbd drive image reliability. I found
> that after a hard reset, a VM with a ceph rbd drive from my new cluster
> becomes corrupted. I found it by accident during HA tests of my new cloud
> cluster: after a host reset, the VM was not able to boot again because of
> virtual drive errors. The same thing happens if you just kill the qemu
> process (as would happen when a host crashes).
>
> At first I thought it was a guest OS problem. But then I tried
> RouterOS (Linux-based), Linux, and FreeBSD: all of them show the same
> behavior.
> Then I blamed the OpenNebula installation. For the sake of testing, I
> installed the latest Proxmox (5.1-36) on another server. The first subtest:
> I created a VM in OpenNebula from a predefined image, shut it down, then
> created a Proxmox VM and pointed it to the image created from OpenNebula.
> The second subtest: I made a clean install from an ISO from the Proxmox
> console, having first created the VM and drive image from Proxmox (on the
> same ceph pool, of course).
> Both results: unbootable VMs.
>
> Finally, I made a clean install to a fresh VM with a local LVM-backed
> drive image. And, guess what, it survived the qemu process kill.
>
> This is the first situation of this kind in my practice, so I would like
> to ask for guidance. I believe that it is a cache problem of some kind, but
> I haven't faced it with earlier releases.
>
> Some cluster details:
>
> It's a small test cluster with 4 nodes, each with:
>
> 2x CPU E5-2665
> 128GB RAM
> 1 OSD with Samsung sm863 1.92TB drive
> IB connection with IPoIB on QDR IB network
>
> OS: Ubuntu 16.04 with 4.10 kernel
> ceph: luminous 12.2.1
>
> Client (kvm host) OSes:
> 1. Ubuntu 16.04 (the same hosts as ceph cluster)
> 2. Debian 9.1 in case of Proxmox
>
>
> *ceph.conf:*
>
> [global]
> fsid = 6a8ffc55-fa2e-48dc-a71c-647e1fff749b
>
> public_network = 10.103.0.0/16
> cluster_network = 10.104.0.0/16
>
> mon_initial_members = e001n01, e001n02, e001n03
> mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
>
> rbd default format = 2
>
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> osd mount options = rw,noexec,nodev,noatime,nodiratime,nobarrier
> osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
> osd_mkfs_type = xfs
>
> bluestore fsck on mount = true
>
> debug_lockdep = 0/0
> debug_context = 0/0
> debug_crush = 0/0
> debug_buffer = 0/0
> debug_timer = 0/0
> debug_filer = 0/0
> debug_objecter = 0/0
> debug_rados = 0/0
> debug_rbd = 0/0
> debug_journaler = 0/0
> debug_objectcatcher = 0/0
> debug_client = 0/0
> debug_osd = 0/0
> debug_optracker = 0/0
> debug_objclass = 0/0
> debug_filestore = 0/0
> debug_journal = 0/0
> debug_ms = 0/0
> debug_monc = 0/0
> debug_tp = 0/0
> debug_auth = 0/0
> debug_finisher = 0/0
> debug_heartbeatmap = 0/0
> debug_perfcounter = 0/0
> debug_asok = 0/0
> debug_throttle = 0/0
> debug_mon = 0/0
> debug_paxos = 0/0
> debug_rgw = 0/0
>
> [osd]
> osd op threads = 4
> osd disk threads = 2
> osd max backfills = 1
> osd recovery threads = 1
> osd recovery max active = 1
>
> --
>
> Best regards,
> Vladimir
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
>
>

--
Best regards,
Vladimir
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com