The steps are pretty straightforward; a rough sketch of the commands follows the list.

- Create a 500G rbd image on the primary
- Enable snapshot-based rbd-mirror on the image
- Map the image on the primary
- Format the block device with ext4
- Mount it and write out 200-300G worth of data (I am using rsync with some
local real data we have)
- Unmap the image from the primary
- Create an rbd snapshot
- Create an rbd mirror snapshot
- Wait for the copy process to complete
- Clone the rbd snapshot on the secondary
- Map the image on secondary 
- Try to mount on secondary 
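
Roughly, as a script (pool/image names, the rsync source path and the mount
point are just placeholders from my test setup):

#!/bin/bash
POOL=CephTestPool1
IMG=vm-100-disk-1

# On the primary cluster
rbd create --size 500G $POOL/$IMG
rbd mirror image enable $POOL/$IMG snapshot
DEV=$(rbd-nbd map $POOL/$IMG)
mkfs.ext4 $DEV
mount $DEV /usr2
rsync -a /some/local/data/ /usr2/     # 200-300G worth of data
umount /usr2
rbd-nbd unmap $DEV
rbd snap create $POOL/$IMG@TestSnapper1
rbd mirror image snapshot $POOL/$IMG

# Wait until the mirror snapshot shows "copied" on the secondary, then on
# the secondary cluster:
rbd --rbd-default-clone-format 2 clone $POOL/$IMG@TestSnapper1 $POOL/${IMG}-CLONE
DEV=$(rbd-nbd map $POOL/${IMG}-CLONE)
blkid $DEV        # on the bad runs this returns nothing
mount $DEV /usr2  # and this fails with "wrong fs type, bad superblock"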

Just as a reference, all of my nodes are the same.

root@Bunkcephtest1:~# ceph --version 
ceph version 15.2.8 (8b89984e92223ec320fb4c70589c39f384c86985) octopus (stable) 

root@Bunkcephtest1:~# dpkg -l | grep rbd-mirror 
ii rbd-mirror 15.2.8-pve2 amd64 Ceph daemon for mirroring RBD images 

This is pretty straightforward; I don't know what I could be missing here.



From: "Jason Dillaman" <jdill...@redhat.com> 
To: "adamb" <ad...@medent.com> 
Cc: "ceph-users" <ceph-users@ceph.io>, "Matt Wilder" <matt.wil...@bitmex.com> 
Sent: Friday, January 22, 2021 2:11:36 PM 
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 

Any chance you could write a small reproducer test script? I can't
reproduce what you are seeing, and we do have test cases that really
hammer random IO on primary images, create snapshots, rinse and repeat,
and they haven't turned up anything yet.

Thanks! 

On Fri, Jan 22, 2021 at 1:50 PM Adam Boyhan <ad...@medent.com> wrote: 
> 
> I have been doing a lot of testing. 
> 
> The size of the RBD image doesn't have any effect. 
> 
> I run into the issue once I actually write data to the rbd. The more data I 
> write out, the larger the chance of reproducing the issue. 
> 
> I seem to hit the issue of missing the filesystem altogether the most, but
> I have also had a few instances where some of the data was simply missing. 
> 
> I monitor the mirror status on the remote cluster until the snapshot is 100% 
> copied and also make sure all the IO is done. My setup has no issue maxing 
> out my 10G interconnect during replication, so it's pretty obvious once it's
> done. 
> 
> The only way I have found to resolve the issue is to call a mirror resync on 
> the secondary array. 
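> 
> For what it's worth, the resync itself is just (using the image from my
> test as an example):
> 
> root@Bunkcephtest1:~# rbd mirror image resync CephTestPool1/vm-100-disk-1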
> 
> I can then map the rbd on the primary, write more data to it, snap it again, 
> and I am back in the same position. 
> 
> ________________________________ 
> From: "adamb" <ad...@medent.com> 
> To: "dillaman" <dilla...@redhat.com> 
> Cc: "ceph-users" <ceph-users@ceph.io>, "Matt Wilder" <matt.wil...@bitmex.com> 
> Sent: Thursday, January 21, 2021 3:11:31 PM 
> Subject: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> 
> Sure thing. 
> 
> root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-1 
> SNAPID  NAME  SIZE  PROTECTED  TIMESTAMP  NAMESPACE
> 12192  TestSnapper1  2 TiB  Thu Jan 21 14:15:02 2021  user
> 12595  .mirror.non_primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.34c4a53e-9525-446c-8de6-409ea93c5edd  2 TiB  Thu Jan 21 15:05:02 2021  mirror (non-primary peer_uuids:[] 6c26557e-d011-47b1-8c99-34cf6e0c7f2f:12801 copied)
> 
> 
> root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1 
> vm-100-disk-1: 
> global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403 
> state: up+replaying 
> description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}
> service: admin on Bunkcephmon1 
> last_update: 2021-01-21 15:06:24 
> peer_sites: 
> name: ccs 
> state: up+stopped 
> description: local image is primary 
> last_update: 2021-01-21 15:06:23 
> 
> 
> root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 
> CephTestPool1/vm-100-disk-1-CLONE 
> root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE 
> /dev/nbd0 
> root@Bunkcephtest1:~# blkid /dev/nbd0 
> root@Bunkcephtest1:~# mount /dev/nbd0 /usr2 
> mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing 
> codepage or helper program, or other error. 
> 
> 
> Primary still looks good. 
> 
> root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 
> CephTestPool1/vm-100-disk-1-CLONE 
> root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE 
> /dev/nbd0 
> root@Ccscephtest1:~# blkid /dev/nbd0 
> /dev/nbd0: UUID="830b8e05-d5c1-481d-896d-14e21d17017d" TYPE="ext4" 
> root@Ccscephtest1:~# mount /dev/nbd0 /usr2 
> root@Ccscephtest1:~# cat /proc/mounts | grep nbd0 
> /dev/nbd0 /usr2 ext4 rw,relatime 0 0 
> 
> 
> 
> 
> 
> 
> From: "Jason Dillaman" <jdill...@redhat.com> 
> To: "adamb" <ad...@medent.com> 
> Cc: "Eugen Block" <ebl...@nde.ag>, "ceph-users" <ceph-users@ceph.io>, "Matt 
> Wilder" <matt.wil...@bitmex.com> 
> Sent: Thursday, January 21, 2021 3:01:46 PM 
> Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> 
> On Thu, Jan 21, 2021 at 11:51 AM Adam Boyhan <ad...@medent.com> wrote: 
> > 
> > I was able to trigger the issue again. 
> > 
> > - On the primary I created a snap called TestSnapper for disk vm-100-disk-1 
> > - Allowed the next RBD-Mirror scheduled snap to complete 
> > - At this point the snapshot is showing up on the remote side. 
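> > 
> > For reference, the scheduled mirror snaps come from a snapshot schedule
> > along these lines (the 30m interval is only an example, not necessarily
> > what I have configured):
> > 
> > rbd mirror snapshot schedule add --pool CephTestPool1 --image vm-100-disk-1 30m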
> > 
> > root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1 
> > vm-100-disk-1: 
> > global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403 
> > state: up+replaying 
> > description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611247200,"remote_snapshot_timestamp":1611247200,"replay_state":"idle"}
> > service: admin on Bunkcephmon1 
> > last_update: 2021-01-21 11:46:24 
> > peer_sites: 
> > name: ccs 
> > state: up+stopped 
> > description: local image is primary 
> > last_update: 2021-01-21 11:46:28 
> > 
> > root@Ccscephtest1:/etc/pve/priv# rbd snap ls --all 
> > CephTestPool1/vm-100-disk-1 
> > SNAPID  NAME  SIZE  PROTECTED  TIMESTAMP  NAMESPACE
> > 11532  TestSnapper  2 TiB  Thu Jan 21 11:21:25 2021  user
> > 11573  .mirror.primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.9525e4eb-41c0-499c-8879-0c7d9576e253  2 TiB  Thu Jan 21 11:35:00 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
> > 
> > Seems like the sync is complete, So I then clone it, map it and attempt to 
> > mount it. 
> 
> Can you run "snap ls --all" on the non-primary cluster? The 
> non-primary snapshot will list its status. On my cluster (with a much 
> smaller image): 
> 
> # 
> # CLUSTER 1 
> # 
> $ rbd --cluster cluster1 create --size 1G mirror/image1 
> $ rbd --cluster cluster1 mirror image enable mirror/image1 snapshot 
> Mirroring enabled 
> $ rbd --cluster cluster1 device map -t nbd mirror/image1 
> /dev/nbd0 
> $ mkfs.ext4 /dev/nbd0 
> mke2fs 1.45.5 (07-Jan-2020) 
> Discarding device blocks: done 
> Creating filesystem with 262144 4k blocks and 65536 inodes 
> Filesystem UUID: 50e0da12-1f99-4d45-b6e6-5f7a7decaeff 
> Superblock backups stored on blocks: 
> 32768, 98304, 163840, 229376 
> 
> Allocating group tables: done 
> Writing inode tables: done 
> Creating journal (8192 blocks): done 
> Writing superblocks and filesystem accounting information: done 
> $ blkid /dev/nbd0 
> /dev/nbd0: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" 
> BLOCK_SIZE="4096" TYPE="ext4" 
> $ rbd --cluster cluster1 snap create mirror/image1@fs 
> Creating snap: 100% complete...done. 
> $ rbd --cluster cluster1 mirror image snapshot mirror/image1 
> Snapshot ID: 6 
> $ rbd --cluster cluster1 snap ls --all mirror/image1 
> SNAPID  NAME  SIZE  PROTECTED  TIMESTAMP  NAMESPACE
> 5  fs  1 GiB  Thu Jan 21 14:50:24 2021  user
> 6  .mirror.primary.f9f692b8-2405-416c-9247-5628e303947a.39722e17-f7e6-4050-acf0-3842a5620d81  1 GiB  Thu Jan 21 14:50:51 2021  mirror (primary peer_uuids:[cd643f30-4982-4caf-874d-cf21f6f4b66f])
> 
> # 
> # CLUSTER 2 
> # 
> 
> $ rbd --cluster cluster2 mirror image status mirror/image1 
> image1: 
> global_id: f9f692b8-2405-416c-9247-5628e303947a 
> state: up+replaying 
> description: replaying, {"bytes_per_second":1140872.53,"bytes_per_snapshot":17113088.0,"local_snapshot_timestamp":1611258651,"remote_snapshot_timestamp":1611258651,"replay_state":"idle"}
> service: mirror.0 on cube-1 
> last_update: 2021-01-21 14:51:18 
> peer_sites: 
> name: cluster1 
> state: up+stopped 
> description: local image is primary 
> last_update: 2021-01-21 14:51:27 
> $ rbd --cluster cluster2 snap ls --all mirror/image1 
> SNAPID  NAME  SIZE  PROTECTED  TIMESTAMP  NAMESPACE
> 5  fs  1 GiB  Thu Jan 21 14:50:52 2021  user
> 6  .mirror.non_primary.f9f692b8-2405-416c-9247-5628e303947a.0a13b822-0508-47d6-a460-a8cc4e012686  1 GiB  Thu Jan 21 14:50:53 2021  mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)
> $ rbd --cluster cluster2 --rbd-default-clone-format 2 clone 
> mirror/image1@fs mirror/image2 
> $ rbd --cluster cluster2 device map -t nbd mirror/image2 
> /dev/nbd1 
> $ blkid /dev/nbd1 
> /dev/nbd1: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" 
> BLOCK_SIZE="4096" TYPE="ext4" 
> $ mount /dev/nbd1 /mnt/ 
> $ mount | grep nbd 
> /dev/nbd1 on /mnt type ext4 (rw,relatime,seclabel) 
> 
> > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper 
> > CephTestPool1/vm-100-disk-1-CLONE 
> > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id 
> > admin --keyring /etc/ceph/ceph.client.admin.keyring 
> > /dev/nbd0 
> > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2 
> > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, 
> > missing codepage or helper program, or other error. 
> > 
> > On the primary still no issues 
> > 
> > root@Ccscephtest1:/etc/pve/priv# rbd clone 
> > CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE 
> > root@Ccscephtest1:/etc/pve/priv# rbd-nbd map 
> > CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring 
> > /etc/ceph/ceph.client.admin.keyring 
> > /dev/nbd0 
> > root@Ccscephtest1:/etc/pve/priv# mount /dev/nbd0 /usr2 
> > 
> > 
> > 
> > 
> > 
> > ________________________________ 
> > From: "Jason Dillaman" <jdill...@redhat.com> 
> > To: "adamb" <ad...@medent.com> 
> > Cc: "Eugen Block" <ebl...@nde.ag>, "ceph-users" <ceph-users@ceph.io>, "Matt 
> > Wilder" <matt.wil...@bitmex.com> 
> > Sent: Thursday, January 21, 2021 9:42:26 AM 
> > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> > 
> > On Thu, Jan 21, 2021 at 9:40 AM Adam Boyhan <ad...@medent.com> wrote: 
> > > 
> > > After the resync finished. I can mount it now. 
> > > 
> > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > CephTestPool1/vm-100-disk-0-CLONE 
> > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id 
> > > admin --keyring /etc/ceph/ceph.client.admin.keyring 
> > > /dev/nbd0 
> > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2 
> > > 
> > > Makes me a bit nervous how it got into that position and everything 
> > > appeared ok. 
> > 
> > We unfortunately need to create the snapshots that are being synced as 
> > a first step, but perhaps there are some extra guardrails we can put 
> > on the system to prevent premature usage if the sync status doesn't 
> > indicate that it's complete. 
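> > 
> > In the meantime, a crude client-side guard (just a sketch, run on the
> > non-primary cluster with your pool/image) is to check that the
> > non-primary mirror snapshot reports "copied" before cloning:
> > 
> > $ rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep non-primary | grep -q copied && echo "sync complete"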
> > 
> > > ________________________________ 
> > > From: "Jason Dillaman" <jdill...@redhat.com> 
> > > To: "adamb" <ad...@medent.com> 
> > > Cc: "Eugen Block" <ebl...@nde.ag>, "ceph-users" <ceph-users@ceph.io>, 
> > > "Matt Wilder" <matt.wil...@bitmex.com> 
> > > Sent: Thursday, January 21, 2021 9:25:11 AM 
> > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> > > 
> > > On Thu, Jan 21, 2021 at 8:34 AM Adam Boyhan <ad...@medent.com> wrote: 
> > > > 
> > > > When cloning the snapshot on the remote cluster I can't see my ext4 
> > > > filesystem. 
> > > > 
> > > > Using the same exact snapshot on both sides. Shouldn't this be 
> > > > consistent? 
> > > 
> > > Yes. Has the replication process completed ("rbd mirror image status 
> > > CephTestPool1/vm-100-disk-0")? 
> > > 
> > > > Primary Site 
> > > > root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | 
> > > > grep TestSnapper1 
> > > > 10621 TestSnapper1 2 TiB Thu Jan 21 08:15:22 2021 user 
> > > > 
> > > > root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > CephTestPool1/vm-100-disk-0-CLONE 
> > > > root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id 
> > > > admin --keyring /etc/ceph/ceph.client.admin.keyring 
> > > > /dev/nbd0 
> > > > root@Ccscephtest1:~# mount /dev/nbd0 /usr2 
> > > > 
> > > > Secondary Site 
> > > > root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | 
> > > > grep TestSnapper1 
> > > > 10430 TestSnapper1 2 TiB Thu Jan 21 08:20:08 2021 user 
> > > > 
> > > > root@Bunkcephtest1:~# rbd clone 
> > > > CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > CephTestPool1/vm-100-disk-0-CLONE 
> > > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE 
> > > > --id admin --keyring /etc/ceph/ceph.client.admin.keyring 
> > > > /dev/nbd0 
> > > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2 
> > > > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, 
> > > > missing codepage or helper program, or other error. 
> > > > 
> > > > 
> > > > 
> > > > ________________________________ 
> > > > From: "adamb" <ad...@medent.com> 
> > > > To: "dillaman" <dilla...@redhat.com> 
> > > > Cc: "Eugen Block" <ebl...@nde.ag>, "ceph-users" <ceph-users@ceph.io>, 
> > > > "Matt Wilder" <matt.wil...@bitmex.com> 
> > > > Sent: Wednesday, January 20, 2021 3:42:46 PM 
> > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> > > > 
> > > > Awesome information. I knew I had to be missing something.
> > > > 
> > > > All of my clients will be far newer than Mimic, so I don't think that
> > > > will be an issue.
> > > > 
> > > > Added the following to my ceph.conf on both clusters. 
> > > > 
> > > > rbd_default_clone_format = 2 
> > > > 
> > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > CephTestPool2/vm-100-disk-0-CLONE 
> > > > root@Bunkcephmon2:~# rbd ls CephTestPool2 
> > > > vm-100-disk-0-CLONE 
> > > > 
> > > > I am sure I will be back with more questions. Hoping to replace our 
> > > > Nimble storage with Ceph and NVMe. 
> > > > 
> > > > Appreciate it! 
> > > > 
> > > > ________________________________ 
> > > > From: "Jason Dillaman" <jdill...@redhat.com> 
> > > > To: "adamb" <ad...@medent.com> 
> > > > Cc: "Eugen Block" <ebl...@nde.ag>, "ceph-users" <ceph-users@ceph.io>, 
> > > > "Matt Wilder" <matt.wil...@bitmex.com> 
> > > > Sent: Wednesday, January 20, 2021 3:28:39 PM 
> > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> > > > 
> > > > On Wed, Jan 20, 2021 at 3:10 PM Adam Boyhan <ad...@medent.com> wrote: 
> > > > > 
> > > > > That's what I thought as well, especially based on this.
> > > > > 
> > > > > 
> > > > > 
> > > > > Note 
> > > > > 
> > > > > You may clone a snapshot from one pool to an image in another pool. 
> > > > > For example, you may maintain read-only images and snapshots as 
> > > > > templates in one pool, and writeable clones in another pool. 
> > > > > 
> > > > > root@Bunkcephmon2:~# rbd clone 
> > > > > CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > > CephTestPool2/vm-100-disk-0-CLONE 
> > > > > 2021-01-20T15:06:35.854-0500 7fb889ffb700 -1 
> > > > > librbd::image::CloneRequest: 0x55c7cf8417f0 validate_parent: parent 
> > > > > snapshot must be protected 
> > > > > 
> > > > > root@Bunkcephmon2:~# rbd snap protect 
> > > > > CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > > rbd: protecting snap failed: (30) Read-only file system 
> > > > 
> > > > You have two options: (1) protect the snapshot on the primary image so 
> > > > that the protection status replicates or (2) utilize RBD clone v2 
> > > > which doesn't require protection but does require Mimic or later 
> > > > clients [1]. 
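> > > > 
> > > > For option (1) that would just be, on the primary cluster (using the
> > > > image from your output):
> > > > 
> > > > $ rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1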
> > > > 
> > > > > 
> > > > > From: "Eugen Block" <ebl...@nde.ag> 
> > > > > To: "adamb" <ad...@medent.com> 
> > > > > Cc: "ceph-users" <ceph-users@ceph.io>, "Matt Wilder" 
> > > > > <matt.wil...@bitmex.com> 
> > > > > Sent: Wednesday, January 20, 2021 3:00:54 PM 
> > > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses 
> > > > > 
> > > > > But you should be able to clone the mirrored snapshot on the remote 
> > > > > cluster even though it’s not protected, IIRC. 
> > > > > 
> > > > > 
> > > > > Zitat von Adam Boyhan <ad...@medent.com>: 
> > > > > 
> > > > > > Two separate 4-node clusters with 10 OSDs in each node. Micron 9300
> > > > > > NVMes are the OSD drives. Heavily based on the Micron/Supermicro
> > > > > > white papers. 
> > > > > > 
> > > > > > When I attempt to protect the snapshot on a remote image, it errors 
> > > > > > with read only. 
> > > > > > 
> > > > > > root@Bunkcephmon2:~# rbd snap protect 
> > > > > > CephTestPool1/vm-100-disk-0@TestSnapper1 
> > > > > > rbd: protecting snap failed: (30) Read-only file system 
> > > > 
> > > > [1] https://ceph.io/community/new-mimic-simplified-rbd-image-cloning/ 
> > > > 
> > > > -- 
> > > > Jason 
> > > > 
> > > 
> > > 
> > > -- 
> > > Jason 
> > 
> > 
> > 
> > -- 
> > Jason 
> 
> 
> 
> -- 
> Jason 



-- 
Jason 
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
