Hello,

We would like to run a RAID1 array between a local storage device and an RBD 
device. This would allow us to sustain network or Ceph failures, and would also 
give better read performance, since we would mark the RBD device as 
write-mostly in mdadm.
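
A minimal sketch of the kind of array we have in mind (device paths are just 
examples for illustration):

```
# Mirror a local NVMe partition with an RBD device; marking the RBD side
# write-mostly keeps reads on the local disk.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    /dev/nvme0n1p1 --write-mostly /dev/rbd0
```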

Basically, we would like to implement the setup described in 
https://discord.com/blog/how-discord-supercharges-network-disks-for-extreme-low-latency.

RAID1 is working well, but if there are timeouts, the RBD volume won't fail and 
mdadm will not catch the broken device. Writes then hang, waiting for the 
network/RBD to come back. If we force-unmap the RBD device, it fails as 
expected and writes can continue on the other RAID1 device.
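
For clarity, the manual recovery looks roughly like this (device names are 
placeholders; the mdadm step may happen automatically once I/O starts failing):

```
# Force-unmap the RBD device so outstanding I/O errors out instead of hanging.
rbd unmap -o force /dev/rbd0

# mdadm then sees the failed member (or we mark it explicitly) and the array
# continues degraded on the local device.
mdadm /dev/md0 --fail /dev/rbd0
```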

We tried setting `osd_request_timeout` to a small value (2 or 3 seconds), but 
it only gives us timeouts in the kernel logs:

```
libceph: tid 25792 on osd39 timeout
rbd: rbd0: write at objno 602 0~512 result -110
rbd: rbd0: write result -110
print_req_error: 15 callbacks suppressed
blk_update_request: timeout error, dev rbd0, sector 4931584 op 0x1:(WRITE) 
flags 0x800 phys_seg 1 prio class 0
libceph: tid 25794 on osd39 timeout
rbd: rbd0: write at objno 602 512~512 result -110
rbd: rbd0: write result -110
blk_update_request: timeout error, dev rbd0, sector 4931585 op 0x1:(WRITE) 
flags 0x800 phys_seg 1 prio class 0
```
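
For reference, we pass the timeout as a map option, something like this 
(pool/image names are placeholders):

```
# Ask libceph to fail OSD requests that take longer than 3 seconds.
rbd map mypool/myimage -o osd_request_timeout=3
```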

Is there something we missed, or is it currently impossible with krbd to 
"fail fast" on timeouts and unmap/remove the associated RBD device? Or is there 
another client that can do what we want (rbd-nbd or librbd)?

We found this Rook issue, which is not really helpful but gives some insight: 
https://github.com/rook/rook/issues/376.

Thanks!

--
Mathias Chapelain
Storage Engineer
Proton AG
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
