Hi again.

It seems I've found the problem, although I don't understand the root cause.

I looked into the OSD data store using ceph-objectstore-tool, and I see that for almost every object there are two copies, like:

2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:28#
2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:head#

Even more interesting is the fact that these two copies don't differ (!).

So the space is taken up by unneeded snapshot copies.
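To count how widespread this is, something like the following could group a listing by object name and report the objects that have both a head and a snapshot copy. This is only a rough sketch: it assumes one object per line in exactly the format of the two entries shown above, and the helper names `group_clones` and `objects_with_snapshot_copy` are made up for illustration.

```python
from collections import defaultdict


def group_clones(lines):
    """Map object name -> set of suffixes ('head' or a snapshot suffix)."""
    groups = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Skip blank lines and anything that does not look like an entry.
        if not line or ":::" not in line:
            continue
        body = line.rstrip("#")                  # drop the trailing '#'
        left, _, suffix = body.rpartition(":")   # suffix is 'head' or e.g. '28'
        name = left.split(":::", 1)[1]           # text after the ':::' separator
        groups[name].add(suffix)
    return dict(groups)


def objects_with_snapshot_copy(groups):
    """Names that have a head object plus at least one other copy."""
    return sorted(n for n, s in groups.items() if "head" in s and len(s) > 1)


# The two entries quoted above, used as sample input.
sample = [
    "2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:28#",
    "2#13:080008d8:::rbd_data.15.3d3e1d6b8b4567.0000000000361a96:head#",
]
groups = group_clones(sample)
duplicated = objects_with_snapshot_copy(groups)
print(duplicated)
```

This only tells which names appear twice; whether the two copies are byte-identical still has to be checked separately.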

rbd_data.15.3d3e1d6b8b4567 is the prefix of the biggest (14 TB) base image we have. This image has 1 snapshot:

[root@sill-01 ~]# rbd info rpool_hdd/rms-201807-golden
rbd image 'rms-201807-golden':
        size 14 TiB in 3670016 objects
        order 22 (4 MiB objects)
        id: 3d3e1d6b8b4567
        data_pool: ecpool_hdd
        block_name_prefix: rbd_data.15.3d3e1d6b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool
        op_features:
        flags:
        create_timestamp: Tue Aug  7 13:00:10 2018
[root@sill-01 ~]# rbd snap ls rpool_hdd/rms-201807-golden
SNAPID NAME      SIZE TIMESTAMP
    37 initial 14 TiB Tue Aug 14 12:42:48 2018
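As a sanity check on the numbers in that output, the object count and the order should multiply out to the reported size. A small sketch, assuming only that "order 22" means an object size of 2^22 bytes (which matches the "4 MiB objects" shown):

```python
# Values copied from the `rbd info` output above.
order = 22
objects = 3670016

object_size = 1 << order            # bytes per object
total_bytes = object_size * objects

print(object_size // 2**20, "MiB per object")
print(total_bytes / 2**40, "TiB total")
```

So a full second copy of every object would roughly double the raw space this image occupies.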

The problem is that this image has NEVER been written to since it was imported into Ceph with RBD. All writes go only to its clones.

So I have 2.. no, 5 questions:

1) Why is the base image snapshot "provisioned" when the image itself is never written to? Could it be related to `rbd snap revert`? (i.e. does `rbd snap revert` just copy all snapshot data into the image itself?)

2) If all parent snapshots are forcibly provisioned on write, is there a way to disable this behaviour? Maybe if I make the base image read-only, its snapshots will stop being "provisioned"?

3) Even if there is no way to disable it: why does Ceph create an extra copy of identical snapshot data during rebalance?

4) What is ":28" in the RADOS object names? The RBD snapshot ID is 37. Even read as hex, 0x28 = 40, not 37. Or does the RADOS snapshot ID not have to equal the RBD snapshot ID?

5) Is it safe to "unprovision" the snapshot (for example, by doing `rbd snap revert`)?
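For question 4, the arithmetic can be checked quickly. This assumes the suffix after the last colon is a snapshot ID at all, which is only my guess; the sketch simply shows that neither a hex nor a decimal reading of "28" gives 37.

```python
suffix = "28"        # taken from the object name ending in ':28#'
rbd_snap_id = 37     # SNAPID reported by `rbd snap ls`

as_hex = int(suffix, 16)   # reading the suffix as hexadecimal
as_dec = int(suffix, 10)   # reading the suffix as decimal

print(as_hex, as_dec, hex(rbd_snap_id))
print(rbd_snap_id in (as_hex, as_dec))
```

If the suffix were the RBD snapshot ID in hex, I would expect to see ":25" instead.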
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
