2017-11-07 19:06 GMT+05:00 Jason Dillaman <jdill...@redhat.com>:
> On Tue, Nov 7, 2017 at 8:55 AM, Дробышевский, Владимир <v...@itgorod.ru> wrote:
> >
> > Oh, sorry, I forgot to mention that all OSDs are on bluestore, so the xfs mount options don't have any influence.
> >
> > VMs have cache="none" by default; then I tried "writethrough". No difference.
> >
> > And aren't these rbd cache options enabled by default?
>
> Yes, they are enabled by default. Note, however, that the QEMU cache
> options for the drive will override the Ceph configuration defaults.
>
> What specifically are you seeing in the guest OS when you state
> "corruption"?
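For reference, the rbd cache options discussed in this thread live in the client-side ceph.conf. A minimal sketch (the section name is standard; the values shown are, as far as I know, the upstream defaults, and QEMU's per-drive cache= setting still overrides them):

```ini
[client]
# Both default to true in recent releases. Writethrough-until-flush
# keeps the cache in writethrough mode until the guest sends its
# first flush, protecting guests that never issue barriers.
rbd cache = true
rbd cache writethrough until flush = true
```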
The guest OS just can't mount its partitions and gets stuck at initramfs. But I found the reason: the image stays locked forever after a hypervisor crash. I hadn't tied the one thing to the other. I had seen that images stay locked when I tried to investigate the problem; I don't know why I didn't try to unlock them before writing here :( It was a problem with permissions; I've changed them and now everything works as it should. Thanks a lot for your help!

And Nico, thank you for pointing out your thread: I found the correct permissions (in Jason's message) there. Going to open a PR with a fix for the OpenNebula docs.

> Assuming you haven't disabled barriers in your guest OS
> mount options and are using a journaled filesystem like ext4 or XFS,
> it should be sending proper flush requests to QEMU / librbd to ensure
> that it remains crash consistent. However, if you disable barriers or
> set a QEMU "cache=unsafe" option, these flush requests will not be
> sent and your data will most likely be corrupt after a hard failure.
>
> > 2017-11-07 18:45 GMT+05:00 Peter Maloney <peter.maloney@brockmann-consult.de>:
> >>
> >> I see nobarrier in there... Try without that. (Unless that's just the bluestore xfs... then it probably won't change anything.) And are the osds using bluestore?
> >>
> >> And what cache options did you set in the VM config? It's dangerous to set writeback without also this in the client-side ceph.conf:
> >>
> >> rbd cache writethrough until flush = true
> >> rbd_cache = true
> >>
> >> On 11/07/17 14:36, Дробышевский, Владимир wrote:
> >>
> >> Hello!
> >>
> >> I've got a weird situation with rbd drive image reliability. I found that after a hard reset, a VM with a ceph rbd drive from my new cluster becomes corrupted. I accidentally found it during HA tests of my new cloud cluster: after a host reset the VM was not able to boot again because of virtual drive errors.
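For anyone hitting the same symptom: a stale exclusive lock left by a crashed hypervisor can be inspected and cleared by hand, and the cleaner fix is to give the client cephx caps that let librbd blacklist the dead lock holder and break the lock itself. The exact capability string from Jason's message is not quoted in this thread, so the following is only a sketch of the commonly recommended form (client name, pool, and image are placeholders):

```shell
# One-off recovery: inspect and manually clear a stale lock
rbd lock ls mypool/myimage
rbd lock remove mypool/myimage "<lock id>" <locker>

# Preferred: grant caps so librbd can break dead-client locks automatically
ceph auth caps client.libvirt \
    mon 'profile rbd' \
    osd 'profile rbd pool=mypool'
```

With `profile rbd` on the mon cap, the client is allowed to blacklist the previous lock owner, so images should no longer stay locked after a host crash.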
> >> The same result occurs if you just kill the qemu process (as would happen at host crash time).
> >>
> >> At first I thought it was a guest OS problem. But then I tried RouterOS (Linux based), Linux, FreeBSD - all options show the same behavior.
> >> Then I blamed the OpenNebula installation. For the test's sake I installed the latest Proxmox (5.1-36) on another server. The first subtest: I created a VM in OpenNebula from a predefined image, shut it down, then created a Proxmox VM and pointed it to the image that was created from OpenNebula.
> >> The second subtest: I made a clean install from ISO via the Proxmox console, having previously created the VM and drive image from Proxmox (on the same ceph pool, of course).
> >> Both results: unbootable VMs.
> >>
> >> Finally I made a clean install to a fresh VM with a local LVM-backed drive image. And - guess what? - it survived a qemu process kill.
> >>
> >> This is the first situation of this kind in my practice, so I would like to ask for guidance. I believe that it is a cache problem of some kind, but I haven't faced it with earlier releases.
> >>
> >> Some cluster details:
> >>
> >> It's a small test cluster with 4 nodes, each has:
> >>
> >> 2x CPU E5-2665
> >> 128GB RAM
> >> 1 OSD with a Samsung sm863 1.92TB drive
> >> IB connection with IPoIB on a QDR IB network
> >>
> >> OS: Ubuntu 16.04 with 4.10 kernel
> >> ceph: luminous 12.2.1
> >>
> >> Client (kvm host) OSes:
> >> 1. Ubuntu 16.04 (the same hosts as the ceph cluster)
> >> 2.
Debian 9.1 in the case of Proxmox
> >>
> >> ceph.conf:
> >>
> >> [global]
> >> fsid = 6a8ffc55-fa2e-48dc-a71c-647e1fff749b
> >>
> >> public_network = 10.103.0.0/16
> >> cluster_network = 10.104.0.0/16
> >>
> >> mon_initial_members = e001n01, e001n02, e001n03
> >> mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
> >>
> >> rbd default format = 2
> >>
> >> auth_cluster_required = cephx
> >> auth_service_required = cephx
> >> auth_client_required = cephx
> >>
> >> osd mount options = rw,noexec,nodev,noatime,nodiratime,nobarrier
> >> osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
> >> osd_mkfs_type = xfs
> >>
> >> bluestore fsck on mount = true
> >>
> >> debug_lockdep = 0/0
> >> debug_context = 0/0
> >> debug_crush = 0/0
> >> debug_buffer = 0/0
> >> debug_timer = 0/0
> >> debug_filer = 0/0
> >> debug_objecter = 0/0
> >> debug_rados = 0/0
> >> debug_rbd = 0/0
> >> debug_journaler = 0/0
> >> debug_objectcacher = 0/0
> >> debug_client = 0/0
> >> debug_osd = 0/0
> >> debug_optracker = 0/0
> >> debug_objclass = 0/0
> >> debug_filestore = 0/0
> >> debug_journal = 0/0
> >> debug_ms = 0/0
> >> debug_monc = 0/0
> >> debug_tp = 0/0
> >> debug_auth = 0/0
> >> debug_finisher = 0/0
> >> debug_heartbeatmap = 0/0
> >> debug_perfcounter = 0/0
> >> debug_asok = 0/0
> >> debug_throttle = 0/0
> >> debug_mon = 0/0
> >> debug_paxos = 0/0
> >> debug_rgw = 0/0
> >>
> >> [osd]
> >> osd op threads = 4
> >> osd disk threads = 2
> >> osd max backfills = 1
> >> osd recovery threads = 1
> >> osd recovery max active = 1
> >>
> >> --
> >> Best regards,
> >> Vladimir
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> >> --
> >> --------------------------------------------
> >> Peter Maloney
> >> Brockmann Consult
> >> Max-Planck-Str. 2
> >> 21502 Geesthacht
> >> Germany
> >> Tel: +49 4152 889 300
> >> Fax: +49 4152 889 333
> >> E-mail: peter.malo...@brockmann-consult.de
> >> Internet: http://www.brockmann-consult.de
> >> --------------------------------------------
> >
> > --
> > Best regards,
> > Vladimir
>
> --
> Jason

--
Best regards,
Vladimir