On 4/23/19 9:46 AM, Armin Schindler wrote:
> On 23.04.2019 09:28, Michael Hierweck wrote:
>> On 23.04.19 09:05, Armin Schindler wrote:
>>> On 20.04.2019 14:38, a...@sysgo.com wrote:
>>>>> On 13 March 2019 at 11:47 Roland Kammerer <roland.kamme...@linbit.com>
>>>>> wrote:
>>>>>
>>>>> On Tue, Mar 12, 2019 at 09:08:42AM +0100, Armin Schindler wrote:
>>>>>> On 3/11/19 1:42 PM, Roland Kammerer wrote:
>>>>>>> On Mon, Mar 11, 2019 at 11:13:11AM +0100, Armin Schindler wrote:
>>>>>>>> 2 hosts Debian 9 (stretch) with default DRBD version 8.4.7.
>>>>>>>
>>>>>>> Please retry with the current 8.4.11 version of DRBD. You can it from
>>>>>>> here:
>>>>>>> https://www.linbit.com/en/drbd-community/drbd-download/
>>>>>>
>>>>>> Okay, thanks. I will test 8.4.11.
>>>>>>
>>>>>> Do I need to change/update the tools as well or just the kernel driver?
>>>>>> I currently use drbd-utils 8.9.10.
>>>>>
>>>>> They should be fine. I don't remember any non-corner cases fixes for 8.4
>>>>> in drbd-utils.
>>>>
>>>> I tried version 8.4.11 and the problem persists.
>>>> When using Qemu/KVM virtio disk with a caching mode that uses host page
>>>> cache, or when using just a filesystem like ext4 on (without Qemu/KVM) on
>>>> the host, the drbd device gets out of sync after a while.
>>
>> Same here:
>>
>> LVM (thick) => DRBD => Virtio (cache=none or cache=directsync)
>>
>> After some weeks of running about 80 VMs on 4 nodes, some of the VM backings
>> report out of sync blocks. We are running an active/passive cluster with
>> locally attached storage.
>>
>> We were not able to reproduce this behaviour when using cache="writethrough"
>> or cache="writeback".
>>
>> We are running this setup since 2011/2012. The first years we were fine but
>> about 3 years ago we run into serious trouble because out-of-sync blocks
>> lead to damaged file system (journals).
>>
>> The issue was discussed in 2014:
>>
>> https://lists.gt.net/drbd/users/25227
>>
>> We love(d) DRBD because of its simplicity and reliability. (Ceph is much
>> more complex...)
>> However we wonder whether DRBD can still be considered that kind of "simple
>> and reliable" it was some years ago.
>
> It sounds like we have the exact same setup and same problems, but
>
>> Even if the situation might be introduced by virtio block driver
>> optimizations some years ago (no stable pages anymore?) a solution is
>> needed.
>
> I don't think it was introduced by virtio block.
> When I use the drbd device locally mounted, e.g. for a LXC root-fs, I
> can reproduce the out-of-sync as well.

Is there something else we can test?
Could a config setting be causing this? We mostly use the defaults.

Any help is welcome.

Armin
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user