Hello.

I'm currently verifying the behavior of RBD on failure, and I'm
wondering about the consistency of RBD images after network failures.
In my investigation I found that RBD registers a watcher on an image
when a client maps and mounts it, to prevent multiple mounts. I also
found that if the client is isolated from the network for a long
time, the watcher is released, even though the client still has the
image mounted. In this situation another client can also mount the
image, and if the image is writable from both clients, data
corruption occurs. Could you tell me whether this is a realistic
scenario?
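For reference, this is how I checked the watcher state; the pool and
image names below are just placeholders for my setup:

    # List the clients that currently hold a watch on the image
    # (replace mypool/myimage with the actual pool/image)
    rbd status mypool/myimage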

I tested the following case, in which I released the watcher by hand,
and detected data corruption (a rough command sketch follows the
list).

1. Release the watcher on the node (A) that has the RBD mounted,
using the `ceph osd blocklist add` command
2. Another node (B) mounts the RBD volume.
3. Unblock node (A) using the `ceph osd blocklist rm` command
4. Write from node (B) (the write succeeds)
5. Write from node (A) (the write appears to succeed from the
application's point of view, but in fact it fails)
6. The content written at node (A) is lost.
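In terms of commands, the sequence was roughly the following; the
client address, pool, image, and device names are examples, not the
exact values from my environment:

    # On an admin node: blocklist node A's client address, which
    # releases its watch (step 1)
    ceph osd blocklist add 192.0.2.10:0/123456
    # On node B: map and mount the same image (step 2)
    rbd map mypool/myimage
    mount /dev/rbd0 /mnt
    # On the admin node: remove the blocklist entry (step 3)
    ceph osd blocklist rm 192.0.2.10:0/123456
    # On node B: this write succeeds (step 4)
    dd if=/dev/zero of=/mnt/fileB bs=4K count=1 oflag=direct
    # On node A, which still has the image mounted: this write
    # appears to succeed but is actually lost (steps 5 and 6)
    dd if=/dev/zero of=/mnt/fileA bs=4K count=1 oflag=direct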

In this case I released the watcher by hand to emulate the timeout
caused by a network failure, because I could not reproduce a real
network failure in this test.

I considered using the exclusive-lock feature to restrict access to a
single node. However, we gave up on that, because blocking writes
entirely would make snapshots non-functional.
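What I had in mind was roughly the following; again, the pool and
image names are placeholders:

    # Enable the exclusive-lock feature on the image
    # (if not already enabled)
    rbd feature enable mypool/myimage exclusive-lock
    # Map with the krbd "exclusive" option so the lock is acquired
    # at map time and not released to other clients
    rbd map mypool/myimage -o exclusive

My understanding is that with this mapping, operations that need to
acquire the lock themselves, such as creating a snapshot, can no
longer proceed, which is why we abandoned this approach.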

The version of Ceph we are using is v17.2.6.

Best regards,
Yuma.