[linux-lvm] exposing snapshot block device
Hi,

When you create a snapshot of a logical volume, a new virtual dm device is created holding the changed content relative to the origin. This COW device can then be used to read the changed contents etc. In case of an incident, this COW device can be used to merge the changed content back into its origin using the "lvconvert --merge" command.

The question I have is whether there is a way to couple an external COW device to an empty, equally sized logical volume, so that the empty logical volume is aware that all changed content is placed on this attached COW device. If that is possible, it would help make instant recovery of LVs possible from an external source - for example a backup server - using the native merge command.

[EMPTY LOGICAL VOLUME]
          ^
          | lvconvert --merge
          |
[ATTACHED COW DEVICE]

Regards
Tomas

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
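[Editor's note: the device-mapper "snapshot" target does accept an arbitrary COW device, so the idea above can be sketched with raw dmsetup, below the LVM layer. Device names are placeholders and this is an untested illustration, not a supported recipe:]

```shell
# Size of the origin LV in 512-byte sectors
SIZE=$(blockdev --getsz /dev/vg/testlv)

# dm snapshot table: <start> <len> snapshot <origin> <COW dev> <P|N> <chunksize>
# P = persistent COW, chunksize 8 sectors = 4 KiB
echo "0 $SIZE snapshot /dev/vg/testlv /dev/external/cow P 8" \
    | dmsetup create testlv-snap

# A merge back into the origin uses the "snapshot-merge" target instead
# of "snapshot"; LVM normally drives this via lvconvert --merge.
```

The catch, as discussed later in the thread, is that LVM keeps its own metadata about snapshots, so a hand-built mapping like this is invisible to lvconvert.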
Re: [linux-lvm] exposing snapshot block device
Dne 22. 10. 19 v 12:47 Dalebjörk, Tomas napsal(a):
> When you create a snapshot of a logical volume, a new virtual dm device
> will be created with the changed content from the origin. [...] The
> question I have is if there is a way to couple an external COW device
> to an empty, equally sized logical volume, so that the empty logical
> volume is aware that all changed content is placed on this attached
> COW device?

For most info on how the old snapshot for so-called 'thick' LVs works, check these papers: http://people.redhat.com/agk/talks/

lvconvert --merge is in fact an 'instant' operation - when it happens, you can immediately access the 'already merged' content while the merge proceeds in the background (you can watch the copy percentage in the lvs command).

However, 'thick' LVs with old snapshots are rather 'dated' technology - you should probably check out the usage of thinly provisioned LVs.

Regards
Zdenek
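[Editor's note: the "instant" merge workflow Zdenek describes looks roughly like this; LV and VG names are illustrative:]

```shell
# Take a thick snapshot of vg/testlv with 10G of COW space
lvcreate -s -L 10G -n snaplv vg/testlv

# ... origin is modified; decide to roll back to the snapshot state ...

# Start the merge: the origin immediately presents the merged content,
# while copying continues in the background
lvconvert --merge vg/snaplv

# Watch the background merge progress
lvs vg
```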
Re: [linux-lvm] exposing snapshot block device
Thanks for the feedback.

I know that thick LV snapshots are dated, and that one should use thin LV snapshots. But my understanding is that the COW and origin dm devices are still present and available in thin too?

Example of a scenario:

1. Create a snapshot of LV testlv with the name snaplv
2. Perform a full copy of snaplv to a block device, using for example dd
3. Delete the snapshot

Now I would like to re-attach this external block device as a snapshot again. After all, it is just a dm device plus LVM config, right? So for example:

1. Create a snapshot of testlv with the name snaplv
2. Re-create the -cow metadata device: tell the origin that all data has been changed and is in the COW device (the raw device)
3. If the above were possible, then it would be possible to instantly get a copy of the LV data using the lvconvert --merge command

I have already invented a way to perform "block level incremental forever" using the -cow device, and a possibility to reverse the blocks, to copy back only changed content from external devices. But it would be better if the COW device could be recreated in a faster way, declaring that all blocks are present on an external device, so that the LV can be restored much quicker using the "lvconvert --merge" command.

That would be super cool! Imagine backing up multi-terabyte volumes in minutes to external destinations, and restoring the data in seconds using instant recovery - by re-creating or emulating the COW device and associating all blocks with an external device.

Regards
Tomas

Den 2019-10-22 kl. 15:57, skrev Zdenek Kabelac:
> For most info on how the old snapshot for so-called 'thick' LVs works,
> check these papers: http://people.redhat.com/agk/talks/
> lvconvert --merge is in fact an 'instant' operation [...] However,
> 'thick' LVs with old snapshots are rather 'dated' technology - you
> should probably check out the usage of thinly provisioned LVs.
Re: [linux-lvm] exposing snapshot block device
Dne 22. 10. 19 v 17:29 Dalebjörk, Tomas napsal(a):
> Thanks for feedback,
> But it would be better if the COW device could be recreated in a
> faster way, mentioning that all blocks are present on an external
> device, so that the LV volume can be restored much quicker using the
> "lvconvert --merge" command. That would be super cool! Imagine backing
> up multi-terabyte volumes in minutes to external destinations, and
> restoring the data in seconds using instant recovery [...]

Hi

I do not want to break your imagination here, but that is exactly the thing you can do with thin provisioning and the thin_delta tool.

You just work with the LV, take snapshot1, take snapshot2, send the delta between s1 -> s2 to the remote machine, remove s1, take s3, send the delta s2 -> s3... It's just not automated by lvm2 ATM...

Using this with old snapshots would be insanely inefficient...

Regards
Zdenek
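[Editor's note: the snapshot/delta cycle Zdenek outlines can be sketched as follows. Pool, LV, and device-mapper names are illustrative; thin_delta takes the numeric thin device ids (visible via `lvs -o name,thin_id`) and reads the pool metadata, so a metadata snapshot must be reserved first:]

```shell
lvcreate -s -n s1 vg/thinlv          # snapshot 1
# ... writes happen on vg/thinlv ...
lvcreate -s -n s2 vg/thinlv          # snapshot 2

# Reserve a metadata snapshot so thin_delta sees a consistent view
dmsetup message /dev/mapper/vg-thinpool-tpool 0 reserve_metadata_snap

# Emit an XML description of the mappings that differ between the two
# thin devices (ids 1 and 2 here), to ship to a remote machine
thin_delta --metadata-snap --snap1 1 --snap2 2 \
    /dev/mapper/vg-thinpool_tmeta > delta.xml

dmsetup message /dev/mapper/vg-thinpool-tpool 0 release_metadata_snap

lvremove vg/s1                       # keep only the newest snapshot
```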
Re: [linux-lvm] exposing snapshot block device
That is cool. But are there any practical examples of how this could work in reality? E.g.:

  lvcreate -s mysnap vg/testlv
  thin_dump vg/mysnap > deltafile   # I assume this should be the name of the snapshot?

But... how to recreate only the metadata, so that the metadata changes are associated with an external device?

  thin_restore -i metadata < deltafile   # restores the metadata, but I also want the
                                         # restored metadata to point at the location of
                                         # the data - for example a file or a raw device

I have created a way to perform block level incremental forever by reading the -cow device, and thin_dump would be a nice replacement for that. This can also be reversed, so that thin_restore can be used to restore the metadata and the data at the same time (if I know its format).

But it would be much better if one could do the restoration in the background using the "lvconvert --merge" tool: first restore the metadata (I can understand that this part is needed), and associate all the data with an external raw disk - or, better still, a file - so that all changes associated with this restored snapshot can be found in the file. It is not easy to explain, but I hope you understand how I am thinking.

A destroyed thin pool could then be restored instantly, using a backup server as a COW-like device.

Regards
Tomas

Den 2019-10-22 kl. 17:36, skrev Zdenek Kabelac:
> I do not want to break your imagination here, but that is exactly the
> thing you can do with thin provisioning and the thin_delta tool. [...]
> It's just not automated by lvm2 ATM... Using this with old snapshots
> would be insanely inefficient...
Re: [linux-lvm] exposing snapshot block device
On Tue, 22 Oct 2019, Zdenek Kabelac wrote:

> Dne 22. 10. 19 v 17:29 Dalebjörk, Tomas napsal(a):
>> But it would be better if the COW device could be recreated in a
>> faster way, mentioning that all blocks are present on an external
>> device, so that the LV volume can be restored much quicker using the
>> "lvconvert --merge" command.
> I do not want to break your imagination here, but that is exactly the
> thing you can do with thin provisioning and the thin_delta tool.

lvconvert --merge does a "rollback" to the point at which the snapshot was taken - the master LV already has current data. What Tomas wants is to be able to do a "rollforward" from the point at which the snapshot was taken. He also wants to be able to put the COW volume on an external/remote medium, and to add a snapshot using an already existing COW.

This way, restoring means copying the full volume from backup, creating a snapshot using the existing external COW, then letting lvconvert --merge instantly (logically) apply the COW changes while updating the master LV.

Pros:

"Old" snapshots are exactly as efficient as thin when there is exactly one; they only get inefficient with multiple snapshots. On the other hand, thin volumes are as inefficient as an old LV with one snapshot. An old LV is as efficient, and as anti-fragile, as a partition. Thin volumes are much more flexible, but depend on much more fragile, database-like metadata.

For this reason, I always prefer "old" LVs when the functionality of thin LVs is not actually needed. I can even manually recover from trashed metadata by editing it, as it is human-readable text.

Updates to the external COW can be pipelined (but then properly handling reads becomes non-trivial - there are mature remote block device implementations for Linux that will do the job).

Cons:

For the external COW to be useful, updates to it must be *strictly* serialized. This is doable, but not as obvious or trivial as it might seem at first glance. (Remote block device software will take care of this as well.)

The "rollforward" must be applied to the backup image of the snapshot. If the admin gets it paired with the wrong backup, massive corruption ensues. This could be automated: e.g. the full image backup and the external COW could carry unique matching names, or the full image backup could compute an md5 in parallel, to be stored with the COW. But none of those tools currently exist.

-- 
Stuart D. Gathman
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
Re: [linux-lvm] exposing snapshot block device
Hi,

Il 22-10-2019 18:15 Stuart D. Gathman ha scritto:
> "Old" snapshots are exactly as efficient as thin when there is exactly
> one. They only get inefficient with multiple snapshots. On the other
> hand, thin volumes are as inefficient as an old LV with one snapshot.
> An old LV is as efficient, and as anti-fragile, as a partition. Thin
> volumes are much more flexible, but depend on much more fragile
> database like meta-data.

This is both true and false: while in the single-snapshot case performance remains acceptable even with fat snapshots, the btree representation (and more modern code) of the "new" (7+ years old now) thin snapshots guarantees significantly higher performance, at least in my tests.

Note #1: I know that the old snapshot code uses 4K chunks by default, versus the 64K chunks of thinsnap. That said, I recorded higher thinsnap performance even when using a 64K chunk size for old fat snapshots.

Note #2: I generally disable thinpool zeroing (as I use a filesystem layer on top of thin volumes).

I 100% agree that the old LVM code, with its plain-text metadata and continuous plain-text backups, is extremely reliable and easy to fix/correct.

> For this reason, I always prefer "old" LVs when the functionality of
> thin LVs are not actually needed. I can even manually recover from
> trashed meta data by editing it, as it is human readable text.

My main use of fat logical volumes is for boot and root filesystems, while thin vols (and zfs datasets, but this is another story...) are used for data partitions. The main thing that somewhat scares me is that (if things have not changed) thinvol uses a single root btree node: losing it means losing *all* thin volumes of a specific thin pool. Coupled with the fact that metadata dumps are not as handy as with the old LVM code (no vgcfgrestore), it worries me.

> The "rollforward" must be applied to the backup image of the snapshot.
> If the admin gets it paired with the wrong backup, massive corruption
> ensues. [...] But none of those tools currently exist.

This is the reason why I have not used thin_delta in production: an error on my part in recovering the volume (i.e. applying the wrong delta) would cause massive data corruption.

My current setup for instant recovery *and* added resilience is somewhat similar to that: RAID -> DRBD -> THINPOOL -> THINVOL w/periodic snapshots (with the DRBD layer replicating to a sibling machine).

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
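[Editor's note: while there is no vgcfgrestore equivalent for thin metadata, periodic dumps of the pool metadata are possible with thin_dump. Pool and device names below are illustrative, and the restore path assumes a spare metadata LV prepared for thin_restore:]

```shell
# Reserve a metadata snapshot so the dump is consistent on a live pool
dmsetup message /dev/mapper/vg-thinpool-tpool 0 reserve_metadata_snap

# Dump the pool metadata (mappings of all thin volumes) as XML
thin_dump --metadata-snap /dev/mapper/vg-thinpool_tmeta > thinpool-meta.xml

dmsetup message /dev/mapper/vg-thinpool-tpool 0 release_metadata_snap

# Last-resort restore: write the dump into a replacement metadata LV
# (then swap it into the pool with lvconvert --thinpool ... --poolmetadata)
# thin_restore -i thinpool-meta.xml -o /dev/vg/repair_meta
```

Note this only preserves the pool's mapping metadata, not the data blocks themselves - it protects against metadata corruption, not data loss.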
Re: [linux-lvm] exposing snapshot block device
On Tue, 22 Oct 2019, Gionatan Danti wrote:

> The main thing that somewhat scares me is that (if things had not
> changed) thinvol uses a single root btree node: losing it means losing
> *all* thin volumes of a specific thin pool. Coupled with the fact that
> metadata dumps are not as handy as with the old LVM code (no
> vgcfgrestore), it worries me.

If you can find all the leaf nodes belonging to the root (in my btree database they are marked with the root id and can be found by a sequential scan of the volume), then reconstructing the btree data is straightforward - even in place. I remember realizing this was the only way to recover a major customer's data - and had the utility written, tested, and applied in a 36-hour programming marathon (which I hope never to repeat).

If this hasn't occurred to the thin pool programmers, I am happy to flesh out the procedure. Having such a utility available as a last resort would ratchet up the reliability of thin pools.
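[Editor's note: the shape of the recovery Stuart describes - scan for leaves tagged with the root id, then rebuild the mapping bottom-up - can be shown with a toy model. This is not the dm-thin on-disk format, just an illustration of the algorithm:]

```python
# Toy sketch: reconstruct a btree's logical mapping from its leaves alone.
# Each "leaf" found by a sequential scan carries (root_id, keys, values);
# interior nodes are unnecessary, since sorting the recovered key/value
# pairs regenerates the logical order the tree encoded.

def recover_mapping(scanned_blocks, root_id):
    """Collect every leaf marked with root_id and merge their entries."""
    mapping = {}
    for block in scanned_blocks:
        if block.get("root_id") != root_id:
            continue  # leaf belongs to a different tree (or is garbage)
        mapping.update(zip(block["keys"], block["values"]))
    # Sorting by key restores the btree's logical order
    return dict(sorted(mapping.items()))

# Leaves discovered out of order during the scan, plus one foreign leaf:
blocks = [
    {"root_id": 7, "keys": [4, 5], "values": ["d", "e"]},
    {"root_id": 9, "keys": [1], "values": ["x"]},       # different tree
    {"root_id": 7, "keys": [1, 2, 3], "values": ["a", "b", "c"]},
]
print(recover_mapping(blocks, root_id=7))
# {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
```

The real dm-thin format stores mappings in fixed-size metadata blocks with checksums, so a practical tool would also need to validate each candidate leaf before trusting it.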
Re: [linux-lvm] exposing snapshot block device
Il 23-10-2019 00:53 Stuart D. Gathman ha scritto:
> If you can find all the leaf nodes belonging to the root (in my btree
> database they are marked with the root id and can be found by
> sequential scan of the volume), then reconstructing the btree data is
> straightforward - even in place. [...] Having such a utility available
> as a last resort would ratchet up the reliability of thin pools.

Very interesting. Can I ask you what product/database you recovered?

Anyway, giving a similar ability to thin vols would be awesome.

Thanks.
Re: [linux-lvm] exposing snapshot block device
Thanks for the feedback.

Think of lvmsync as a tool which reads the block changes from the COW device.

Let's assume that I am able to recreate this COW format instantly back on the server, and present it as a file with the name "cowfile" on the file system, for simplicity. Is it possible then, in some way, to use this cowfile to inform LVM about the location of the snapshot area, so that lvconvert --merge can be used to restore the data more quickly using this cowfile? The cowfile would include all blocks for the logical volume.

Regards
Tomas

Den tis 22 okt. 2019 kl 18:15 skrev Stuart D. Gathman:
> lvconvert --merge does a "rollback" to the point at which the snapshot
> was taken. The master LV already has current data. What Tomas wants
> is to be able to do a "rollforward" from the point at which the
> snapshot was taken. He also wants to be able to put the cow volume on
> an extern/remote medium, and add a snapshot using an already existing
> cow. This way, restoring means copying the full volume from backup,
> creating a snapshot using existing external cow, then lvconvert
> --merge instantly logically applies the cow changes while updating the
> master LV. [...]
Re: [linux-lvm] exposing snapshot block device
Dne 23. 10. 19 v 0:53 Stuart D. Gathman napsal(a):
> If you can find all the leaf nodes belonging to the root (in my btree
> database they are marked with the root id and can be found by
> sequential scan of the volume), then reconstructing the btree data is
> straightforward - even in place. [...] Having such a utility available
> as a last resort would ratchet up the reliability of thin pools.

There have been great enhancements made in the thin_repair tool (>= 0.8.5), but of course further fixes and extensions are always welcomed by Joe.

There are unfortunately some 'limitations' on what can be fixed with the current metadata format, but many of the troubles we have witnessed in the past are now mostly 'covered' by the recent kernel driver.

But if there is a known case causing troubles - please open a BZ so we can look it over.

Regards
Zdenek
Re: [linux-lvm] exposing snapshot block device
Dne 22. 10. 19 v 18:13 Dalebjörk, Tomas napsal(a):
> That is cool. But are there any practical examples of how this could
> work in reality?

There is no practical example available from our lvm2 team yet - so far we are only describing the 'model' & 'plan' we have ATM...

> I have created a way to perform block level incremental forever by
> reading the -cow device, and thin_dump would be a nice replacement for
> that.

COW is dead technology from our perspective - it can't cope with the performance of modern drives like NVMe... So our plan is to focus on thinp technology here.

Zdenek
Re: [linux-lvm] exposing snapshot block device
Dne 22. 10. 19 v 18:15 Stuart D. Gathman napsal(a):
> "Old" snapshots are exactly as efficient as thin when there is exactly
> one. They only get inefficient with multiple snapshots. On the other
> hand, thin volumes are as inefficient as an old LV with one snapshot.
> An old LV is as efficient, and as anti-fragile, as a partition. Thin
> volumes are much more flexible, but depend on much more fragile
> database like meta-data.

Just a few comments - it's not really comparable: the efficiency of thin-pool metadata outperforms old snapshots in a BIG way (there is no point in talking about snapshots that take just a couple of MiB).

There is also a BIG difference in the usage of an old snapshot origin versus its snapshot: the COW of an old snapshot effectively cuts performance in half if you write to the origin.

> For this reason, I always prefer "old" LVs when the functionality of
> thin LVs are not actually needed. I can even manually recover from
> trashed meta data by editing it, as it is human readable text.

On the other hand, you can lose a COW snapshot at any moment if your COW storage is not big enough - this is very different from thin-pool.

Regards
Zdenek
Re: [linux-lvm] exposing snapshot block device
On 23/10/19 12:46, Zdenek Kabelac wrote:
> Just few 'comments' - it's not really comparable - the efficiency of
> thin-pool metadata outperforms old snapshot in BIG way (there is no
> point to talk about snapshots that takes just couple of MiB)

Yes, this matches my experience.

> There is also BIG difference about the usage of old snapshot origin
> and snapshot. COW of old snapshot effectively cuts performance 1/2 if
> you write to origin.

If used without a non-volatile RAID controller, 1/2 is generous - I have measured performance as low as 1/5 (with a fat snapshot).

Talking about thin snapshots, an obvious performance optimization which seems not to be implemented is to skip reading the source data when overwriting in larger-than-chunksize blocks.

For example, consider a completely filled 64k-chunk thin volume (with the thinpool having ample free space). Snapshotting it and writing a 4k block on the origin will obviously cause a read of the original 64k chunk, an in-memory change of the 4k block, and a write of the entire modified 64k chunk to a new location. But writing, say, a 1 MB block should *not* cause the same read on the source: after all, the read data will be immediately discarded, overwritten by the changed 1 MB block.

However, my testing shows that source chunks are always read, even when completely overwritten. Am I missing something?
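[Editor's note: one way to reproduce the observation above is to watch reads on the backing device while overwriting a snapshotted thin LV in large aligned blocks. Device and LV names are illustrative; run only against scratch volumes:]

```shell
# Snapshot a filled thin LV, then overwrite it in 1 MiB direct writes
lvcreate -s -n snap vg/thinlv

# Watch per-device reads on the physical volume backing the pool
iostat -d 1 /dev/sdb &

# Fully overwrite 1 GiB in blocks much larger than the 64k chunk size
dd if=/dev/zero of=/dev/vg/thinlv bs=1M count=1024 oflag=direct

# If overwritten chunks were never read, the read column would stay near
# zero; the observation in this thread is that it tracks the writes.
```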
Re: [linux-lvm] exposing snapshot block device
On 23.10.2019 14:08, Gionatan Danti wrote:
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). [...] However, my testing shows
> that source chunks are always read, even when completely overwritten.
> Am I missing something?

Not only read, but sometimes written. I watched it without a snapshot, with only zeroing enabled: before new chunks were written with "dd bs=1048576 ...", the chunks were zeroed. But for security it's good.

IMHO, in this case a better choice would be to first write the chunks to disk and then hand those chunks to the volume.
Re: [linux-lvm] exposing snapshot block device
Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
> Talking about thin snapshot, an obvious performance optimization which
> seems to not be implemented is to skip reading source data when
> overwriting in larger-than-chunksize blocks.

Hi

There is no such optimization possible for old snapshots. You would need to write ONLY to snapshots. As soon as you start to write to the origin, you have to read the original data from the origin and copy it to the COW storage; once this is finished, you can overwrite the origin data area with your writing I/O. This is simply never going to be fast ;) - the fast way is the thin-pool...

Old snapshots were designed for 'short-lived' snapshots (so you can take a backup of a volume which is not being modified underneath). Any idea for improving this old snapshot target sooner or later ends up at thin-pool anyway :) (we've been down this river many, many years back...)

> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k
> block on origin

There is no support for snapshots of snapshots with old snaps... It would be extremely slow to use...

> However, my testing shows that source chunks are always read, even
> when completely overwritten. Am I missing something?

Yep - you would need to always jump to your 'snapshot' - so instead of keeping the 'origin' on major:minor, it would need to become a 'snapshot'... A seriously complex concept to work with - especially when there is thin-pool...

Regards
Zdenek
Re: [linux-lvm] exposing snapshot block device
Dne 23. 10. 19 v 14:20 Ilia Zykov napsal(a):
> On 23.10.2019 14:08, Gionatan Danti wrote:
>> However, my testing shows that source chunks are always read, even
>> when completely overwritten.
> Not only read but sometimes write. I watched it without snapshot. Only
> zeroing was enabled. Before new chunks were written with "dd
> bs=1048576 ..." the chunks were zeroed. But for security it's good.

Yep - we recommend disabling zeroing as soon as the chunksize is >512K.

But the 'security' option is up to users - to select what fits their needs best; there is no 'one solution fits them all' in this case.

Clearly, when you put a modern filesystem (ext4, xfs...) on top of a thinLV, you can't read junk data - the filesystem knows very well which portions have been written. But if you access the thinLV device at the 'block level' with the 'dd' command, you might see some old data trash if zeroing is disabled...

For smaller chunksizes zeroing is usually not a big deal; with bigger chunks it slows down initial provisioning in a major way - but once a block is provisioned there are no further costs.

Regards
Zdenek
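[Editor's note: the zeroing recommendation above maps onto these lvm2 knobs; VG and pool names are illustrative:]

```shell
# Create a thin pool with a 1M chunk size and zeroing disabled
lvcreate -L 100G -c 1M -Zn --thinpool vg/thinpool

# Or switch zeroing off on an existing pool
lvchange -Zn vg/thinpool

# Or set the default for new pools in lvm.conf:
#   allocation {
#       thin_pool_zero = 0
#   }
```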
Re: [linux-lvm] exposing snapshot block device
On 23/10/19 14:59, Zdenek Kabelac wrote:
> Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
>> Talking about thin snapshots, an obvious performance optimization which
>> seems not to be implemented is to skip reading source data when
>> overwriting in larger-than-chunksize blocks.
>
> Hi
>
> There is no such optimization possible for old snapshots. You would need
> to write ONLY to snapshots. As soon as you start to write to the origin,
> you have to 'read' the original data from the origin and copy it to the
> COW storage; once this is finished, you can overwrite the origin data area
> with your writing I/O. This is simply never going to work fast ;) - the
> fast way is the thin-pool...
>
> Old snapshots were designed for 'short-lived' snapshots (so you can take a
> backup of a volume which is not being modified underneath). Any idea for
> improving this old snapshot target is sooner or later going to end up at
> the thin-pool anyway :) (we've been down this river many many years back
> in time...)
>
>> For example, consider a completely filled 64k chunk thin volume (with
>> thinpool having ample free space). Snapshotting it and writing a 4k block
>> on origin
>
> There is no support for snapshots of snapshots with old snaps... It would
> be extremely slow to use...
>
>> However, my testing shows that source chunks are always read, even when
>> completely overwritten. Am I missing something?
>
> Yep - you would need to always jump to your 'snapshot' - so instead of
> keeping the 'origin' on major:minor, it would need to become a
> 'snapshot'... Seriously complex concept to work with - especially when
> there is a thin-pool...

Hi, I was speaking about *thin* snapshots here. Rewriting the example given above (for clarity):

"For example, consider a completely filled 64k chunk thin volume (with thinpool having ample free space). Snapshotting it and writing a 4k block on origin will obviously cause a read of the original 64k chunk, an in-memory change of the 4k block and a write of the entire modified 64k block to a new location. But writing, say, a 1 MB block should *not* cause the same read on source: after all, the read data will be immediately discarded, overwritten by the changed 1 MB block."

I would expect that such a large-block *thin* snapshot rewrite would not cause a read/modify/write, but it really does. Is this low-hanging fruit, or is there a more fundamental problem in avoiding read/modify/write in this case?

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
Re: [linux-lvm] exposing snapshot block device
On 23/10/19 15:05, Zdenek Kabelac wrote:
> Yep - we are recommending to disable zeroing as soon as chunksize >512K.
>
> But for 'security' reasons the option is there - it's up to users to
> select what fits their needs best; there is no 'one solution fits all'
> in this case.

Sure, but again: when writing a block larger than the underlying chunk, zeroing can (and should) be skipped. Yet I seem to remember that the new block is zeroed in any case, even if it is going to be rewritten entirely.

Do I remember wrongly?

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
Re: [linux-lvm] exposing snapshot block device
Dne 23. 10. 19 v 16:37 Gionatan Danti napsal(a):
> On 23/10/19 14:59, Zdenek Kabelac wrote:
>> Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
>>> Talking about thin snapshots, an obvious performance optimization which
>>> seems not to be implemented is to skip reading source data when
>>> overwriting in larger-than-chunksize blocks.
>
> "For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k block
> on origin will obviously cause a read of the original 64k chunk, an
> in-memory change of the 4k block and a write of the entire modified 64k
> block to a new location. But writing, say, a 1 MB block should *not* cause
> the same read on source: after all, the read data will be immediately
> discarded, overwritten by the changed 1 MB block."
>
> I would expect that such a large-block *thin* snapshot rewrite would not
> cause a read/modify/write, but it really does. Is this low-hanging fruit,
> or is there a more fundamental problem in avoiding read/modify/write in
> this case?

Hi

If you use a 1MiB chunk size for the thin-pool, use 'dd' with a proper bs size, and write 'aligned' on a 1MiB boundary (and be sure you use direct I/O, so you are not a victim of page cache flushing...) - there should not be any useless read.

If you still see such reads - and you can easily reproduce this with the latest kernel - please report a bug with your reproducer and results.

Regards

Zdenek
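The alignment rule behind this dd advice can be made explicit with a small helper (a sketch of the arithmetic only, not the dm-thin implementation; function names are mine): a write avoids the copy-on-write read exactly when every chunk it touches is fully covered, which requires both chunk-aligned offset and a multiple-of-chunk length.

```python
# Sketch: which chunks does a write touch, and does any of them need a
# read-modify-write because it is only partially overwritten?

def chunks_touched(offset: int, length: int, chunk_size: int):
    """Return (chunk_index, fully_covered) for each chunk the write hits."""
    out = []
    end = offset + length
    first = offset // chunk_size
    last = (end - 1) // chunk_size
    for c in range(first, last + 1):
        c_start, c_end = c * chunk_size, (c + 1) * chunk_size
        fully = offset <= c_start and end >= c_end
        out.append((c, fully))
    return out

def needs_read_modify_write(offset: int, length: int, chunk_size: int) -> bool:
    """True if any touched chunk is only partially overwritten."""
    return any(not fully for _, fully in
               chunks_touched(offset, length, chunk_size))
```

With a 1MiB chunk size, "dd bs=1M oflag=direct" on a 1MiB boundary makes every touched chunk fully covered, so no old data needs to be read; a 4k write in the middle of a chunk does not.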
Re: [linux-lvm] exposing snapshot block device
On 23.10.2019 17:40, Gionatan Danti wrote:
> On 23/10/19 15:05, Zdenek Kabelac wrote:
>> Yep - we are recommending to disable zeroing as soon as chunksize >512K.
>>
>> But for 'security' reasons the option is there - it's up to users to
>> select what fits their needs best; there is no 'one solution fits all'
>> in this case.
>
> Sure, but again: when writing a block larger than the underlying chunk,
> zeroing can (and should) be skipped. Yet I seem to remember that the new
> block is zeroed in any case, even if it is going to be rewritten entirely.
>
> Do I remember wrongly?

In this case, if we get a reset before a full chunk is written, the tail of the chunk will be foreign old data (if the metadata was already written) - a little security problem. We would need to first write the data to disk and then hand the fully written chunk to the volume. But I think that complicates matters a little.
Re: [linux-lvm] exposing snapshot block device
Il 23-10-2019 17:37 Zdenek Kabelac ha scritto:
> Hi
>
> If you use a 1MiB chunk size for the thin-pool, use 'dd' with a proper bs
> size, and write 'aligned' on a 1MiB boundary (be sure you use direct I/O,
> so you are not a victim of page cache flushing...) - there should not be
> any useless read.
>
> If you still see such reads - and you can easily reproduce this with the
> latest kernel - please report a bug with your reproducer and results.
>
> Regards
>
> Zdenek

OK, I triple-checked my numbers and you are right: on a fully updated CentOS 7.7 x86-64 box with kernel-3.10.0-1062.4.1 and lvm2-2.02.185-2, the behavior I observed on older kernels (>2 years ago) is no longer present.

Take this original lvm setup:

[root@localhost ~]# lvs -o +chunk_size
  LV       VG     Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
  root     centos -wi-ao      <6.20g                                                             0
  swap     centos -wi-ao     512.00m                                                             0
  thinpool centos twi-aot---   1.00g                 25.00  14.16                           64.00k
  thinvol  centos Vwi-a-t--- 256.00m thinpool        100.00                                      0

Taking a snapshot (lvcreate -s /dev/centos/thinvol -n thinsnap) and overwriting 32 MB of data on origin via "dd if=/dev/urandom of=/dev/centos/thinvol bs=1M count=32 oflag=direct" results in the following I/O to/from disk:

[root@localhost ~]# dstat -d -D sdc
 --dsk/sdc--
  read  writ
 1036k   32M

As you can see, while 1 MB was indeed read (metadata reads?), no other read amplification occurred.

Now I got curious to see whether zeroing behaves in the same manner. So I deleted thinsnap & thinvol, toggled zeroing on (lvchange -Zy centos/thinpool), and recreated thinvol:

[root@localhost ~]# lvs -o +chunk_size
  LV       VG     Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
  root     centos -wi-ao      <6.20g                                                             0
  swap     centos -wi-ao     512.00m                                                             0
  thinpool centos twi-aotz--   1.00g                  0.00  11.04                           64.00k
  thinvol  centos Vwi-a-tz-- 256.00m thinpool          0.00                                      0

[root@localhost ~]# dstat -d -D sdc
 --dsk/sdc--
  read  writ
     0   13M
  520k   19M

Again, no write amplification occurred.

Kudos to the whole team for optimizing lvmthin in this manner; it really is a flexible and great-performing tool.

Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
Re: [linux-lvm] exposing snapshot block device
And the block size for thick snapshots can be set with the lvcreate command. The automatic growing of a snapshot can be configured in the lvm configuration, too. Both thin and thick have the same issue if you run out of space.

//T

Den ons 23 okt. 2019 kl 13:24 skrev Tomas Dalebjörk <tomas.dalebj...@gmail.com>:
> I have tested FusionIO together with old thick snapshots.
> I created the thick snapshot on a separate old traditional SATA drive, just
> to check if that could be used as a snapshot target for high performance
> disks, like a FusionIO card. For those who don't know about FusionIO: they
> can deal with 150-250,000 IOPS.
>
> And to be honest, I couldn't bottleneck the SATA disk I used as a thick
> snapshot target. The reason why is simple:
> - thick snapshots use sequential write techniques
>
> If I had been using thin snapshots, then the writes would most likely be
> more randomized on disk, which would have required more spindles to cope
> with this.
>
> Anyhow;
> I am still eager to hear how to use an external device to import snapshots.
> And when I say "import", I am not talking about copy-back; more to use it
> to read data from.
>
> Regards Tomas
>
> Den ons 23 okt. 2019 kl 13:08 skrev Gionatan Danti:
>> On 23/10/19 12:46, Zdenek Kabelac wrote:
>>> Just a few 'comments' - it's not really comparable - the efficiency of
>>> thin-pool metadata outperforms old snapshots in a BIG way (there is no
>>> point talking about snapshots that take just a couple of MiB)
>>
>> Yes, this matches my experience.
>>
>>> There is also a BIG difference in the usage of old snapshot origin and
>>> snapshot.
>>>
>>> COW of the old snapshot effectively cuts performance in half if you
>>> write to the origin.
>>
>> If used without a non-volatile RAID controller, 1/2 is generous - I
>> measured performance as low as 1/5 (with a fat snapshot).
>>
>> Talking about thin snapshots, an obvious performance optimization which
>> seems not to be implemented is to skip reading source data when
>> overwriting in larger-than-chunksize blocks.
>>
>> For example, consider a completely filled 64k chunk thin volume (with
>> thinpool having ample free space). Snapshotting it and writing a 4k block
>> on origin will obviously cause a read of the original 64k chunk, an
>> in-memory change of the 4k block and a write of the entire modified 64k
>> block to a new location. But writing, say, a 1 MB block should *not*
>> cause the same read on source: after all, the read data will be
>> immediately discarded, overwritten by the changed 1 MB block.
>>
>> However, my testing shows that source chunks are always read, even when
>> completely overwritten.
>>
>> Am I missing something?
>>
>> --
>> Danti Gionatan
>> Supporto Tecnico
>> Assyoma S.r.l. - www.assyoma.it
>> email: g.da...@assyoma.it - i...@assyoma.it
>> GPG public key ID: FF5F32A8
Re: [linux-lvm] exposing snapshot block device
I have tested FusionIO together with old thick snapshots.
I created the thick snapshot on a separate old traditional SATA drive, just to check if that could be used as a snapshot target for high performance disks, like a FusionIO card. For those who don't know about FusionIO: they can deal with 150-250,000 IOPS.

And to be honest, I couldn't bottleneck the SATA disk I used as a thick snapshot target. The reason why is simple:
- thick snapshots use sequential write techniques

If I had been using thin snapshots, then the writes would most likely be more randomized on disk, which would have required more spindles to cope with this.

Anyhow;
I am still eager to hear how to use an external device to import snapshots. And when I say "import", I am not talking about copy-back; more to use it to read data from.

Regards Tomas

Den ons 23 okt. 2019 kl 13:08 skrev Gionatan Danti:
> On 23/10/19 12:46, Zdenek Kabelac wrote:
>> Just a few 'comments' - it's not really comparable - the efficiency of
>> thin-pool metadata outperforms old snapshots in a BIG way (there is no
>> point talking about snapshots that take just a couple of MiB)
>
> Yes, this matches my experience.
>
>> There is also a BIG difference in the usage of old snapshot origin and
>> snapshot.
>>
>> COW of the old snapshot effectively cuts performance in half if you
>> write to the origin.
>
> If used without a non-volatile RAID controller, 1/2 is generous - I
> measured performance as low as 1/5 (with a fat snapshot).
>
> Talking about thin snapshots, an obvious performance optimization which
> seems not to be implemented is to skip reading source data when
> overwriting in larger-than-chunksize blocks.
>
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k block
> on origin will obviously cause a read of the original 64k chunk, an
> in-memory change of the 4k block and a write of the entire modified 64k
> block to a new location. But writing, say, a 1 MB block should *not*
> cause the same read on source: after all, the read data will be
> immediately discarded, overwritten by the changed 1 MB block.
>
> However, my testing shows that source chunks are always read, even when
> completely overwritten.
>
> Am I missing something?
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.da...@assyoma.it - i...@assyoma.it
> GPG public key ID: FF5F32A8
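The "thick snapshots use sequential write techniques" point can be illustrated with a toy allocator (my own sketch, not lvm2/dm-snapshot code): in the old snapshot COW store, each origin chunk that gets its first write is simply remapped to the next free chunk of the COW device, in arrival order, so COW writes stay sequential even when origin writes are random.

```python
# Toy model (NOT dm-snapshot code) of sequential COW allocation in old
# 'thick' snapshots: exception slots are handed out in arrival order.

class CowAllocator:
    def __init__(self):
        self.next_chunk = 1   # chunk 0 is reserved for the on-disk header
        self.exceptions = {}  # origin_chunk -> cow_chunk

    def remap(self, origin_chunk: int) -> int:
        """First write to an origin chunk grabs the next sequential COW
        chunk; later writes reuse the existing mapping."""
        if origin_chunk not in self.exceptions:
            self.exceptions[origin_chunk] = self.next_chunk
            self.next_chunk += 1
        return self.exceptions[origin_chunk]
```

This is why a slow SATA disk can keep up as a thick-snapshot target: no matter where on the origin the writes land, the COW device only ever sees an append-style stream.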
Re: [linux-lvm] exposing snapshot block device
On 23.10.2019 14:08, Gionatan Danti wrote:
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k block
> on origin will obviously cause a read of the original 64k chunk, an
> in-memory change of the 4k block and a write of the entire modified 64k
> block to a new location. But writing, say, a 1 MB block should *not* cause
> the same read on source: after all, the read data will be immediately
> discarded, overwritten by the changed 1 MB block.
>
> However, my testing shows that source chunks are always read, even when
> completely overwritten.
>
> Am I missing something?

Not only read, but sometimes written. I watched it without a snapshot, with only zeroing enabled. Before new chunks were written with "dd bs=1048576 ...", the chunks were zeroed. But for security it's good. IMHO, in this case the best choice is to first write the chunks to disk and then hand these chunks to the volume.
Re: [linux-lvm] exposing snapshot block device
Thanks,

OK, looking at thin then. Is there a way to achieve something similar using thin instead?

Regards Tomas

Den ons 23 okt. 2019 kl 12:26 skrev Zdenek Kabelac:
> Dne 22. 10. 19 v 18:13 Dalebjörk, Tomas napsal(a):
>> That is cool,
>>
>> But are there any practical examples of how this could work in reality?
>
> There is no practical example available from our lvm2 team yet.
> So we are only describing the 'model' & 'plan' we have ATM...
>
>> I have created a way to perform block-level incremental-forever by
>> reading the -cow device, and thin_dump would be a nice replacement for
>> that.
>
> COW is dead technology from our perspective - it can't cope with the
> performance of modern drives like NVMe...
>
> So our plan is to focus on the thinp technology here.
>
> Zdenek
Re: [linux-lvm] exposing snapshot block device
Many thanks for all the feedback.

The idea works for those applications that support snapshots. Like Sybase/SAP Adaptive Server Enterprise, Sybase/SAP IQ Server, DB2, MongoDB, MariaDB/MySQL, PostgreSQL, etc.

Anyhow, back to the original question:
Is there a way to re-create the COW format, so that lvconvert --merge can be used? Or to have lvconvert --merge accept reading from a "cow file"?

If that were possible, then instant recovery would be possible from an external source, like a backup server.

Regards Tomas

Den ons 23 okt. 2019 kl 08:58 skrev Gionatan Danti:
> Il 23-10-2019 00:53 Stuart D. Gathman ha scritto:
>> If you can find all the leaf nodes belonging to the root (in my btree
>> database they are marked with the root id and can be found by a
>> sequential scan of the volume), then reconstructing the btree data is
>> straightforward - even in place.
>>
>> I remember realizing this was the only way to recover a major customer's
>> data - and had the utility written, tested, and applied in a 36-hour
>> programming marathon (which I hope to never repeat). If this hasn't
>> occurred to the thin-pool programmers, I am happy to flesh out the
>> procedure. Having such a utility available as a last resort would
>> ratchet up the reliability of thin pools.
>
> Very interesting. Can I ask you what product/database you recovered?
>
> Anyway, giving a similar ability to thin vols would be awesome.
>
> Thanks.
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.da...@assyoma.it - i...@assyoma.it
> GPG public key ID: FF5F32A8
Re: [linux-lvm] exposing snapshot block device
Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
> I have tested FusionIO together with old thick snapshots.
> I created the thick snapshot on a separate old traditional SATA drive,
> just to check if that could be used as a snapshot target for high
> performance disks, like a FusionIO card. For those who don't know about
> FusionIO: they can deal with 150-250,000 IOPS.
>
> And to be honest, I couldn't bottleneck the SATA disk I used as a thick
> snapshot target. The reason why is simple:
> - thick snapshots use sequential write techniques
>
> If I had been using thin snapshots, then the writes would most likely be
> more randomized on disk, which would have required more spindles to cope
> with this.
>
> Anyhow;
> I am still eager to hear how to use an external device to import
> snapshots. And when I say "import", I am not talking about copy-back;
> more to use it to read data from.

The format of the 'on-disk' snapshot metadata for old snapshots is trivial - a small header plus pairs of data offsets (TO-FROM) - I think googling will reveal a couple of python tools playing with it.

You can add a pre-created COW image to an LV with lvconvert --snapshot, and to avoid 'zeroing' the metadata use the option -Zn. (BTW, in the same way you can detach a snapshot from an LV with --splitsnapshot, so you can look at how the metadata looks...)

Although it's pretty unusual that anyone would first create the COW image with all its special layout and then merge it into an LV - instead of merging directly... There is only the 'little' advantage of minimizing the 'offline' time of such a device (and that is the reason why --splitsnapshot exists).

Regards

Zdenek
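As a sketch of how trivial that on-disk layout is: the persistent exception store begins with a small header (magic, valid flag, version, chunk size in 512-byte sectors) followed by metadata areas holding little-endian u64 pairs of (old_chunk, new_chunk). This parser is my own illustration based on the layout in the kernel's drivers/md/dm-snap-persistent.c; verify the field layout and the magic value against your kernel sources before relying on it.

```python
import struct

# Assumed dm-snapshot persistent COW magic: b"SnAp" stored little-endian.
SNAP_MAGIC = 0x70416E53

def parse_cow_header(buf: bytes) -> dict:
    """Parse the header at the start of a COW device (sketch, not lvm2)."""
    magic, valid, version, chunk_size = struct.unpack_from("<IIII", buf, 0)
    if magic != SNAP_MAGIC:
        raise ValueError("not a dm-snapshot COW device")
    return {"valid": valid, "version": version,
            "chunk_size_sectors": chunk_size}  # chunk size in 512B sectors

def parse_exceptions(area: bytes):
    """Decode one metadata area: (old_chunk, new_chunk) u64le pairs.
    A pair with new_chunk == 0 marks an unused slot (chunk 0 holds the
    header, so it can never be a valid COW data chunk)."""
    out = []
    for off in range(0, len(area) - 15, 16):
        old, new = struct.unpack_from("<QQ", area, off)
        if new == 0:
            break
        out.append((old, new))
    return out
```

A tool like this, pointed at an image split off with lvconvert --splitsnapshot, would let you inspect which origin chunks the COW holds - which is exactly the information a backup server would need to fabricate such an image.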
Re: [linux-lvm] exposing snapshot block device
Wow!

Impressive. This will make history!

If this is possible, then we are able to implement a solution which can do:
- progressive block-level incremental forever (always incremental at the block level: this already exists)
- instant recovery to a point in time (using the methods you just described)

For example, let's say that a client wants to restore a file system, or a logical volume, to how it looked yesterday, even though there is no snapshot, nor any data. Then the client (with some coding) can start from an empty volume, re-attach a COW device, and convert that using lvconvert --merge, so that the copying can be done in the background from the backup server.

If you forget about "how we will re-create the cow device" and just focus on the LVM idea of re-attaching a COW device - do you think that I have understood it correctly?

Den tors 24 okt. 2019 kl 18:01 skrev Zdenek Kabelac:
> Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
>> I have tested FusionIO together with old thick snapshots.
>> I created the thick snapshot on a separate old traditional SATA drive,
>> just to check if that could be used as a snapshot target for high
>> performance disks, like a FusionIO card. For those who don't know about
>> FusionIO: they can deal with 150-250,000 IOPS.
>>
>> And to be honest, I couldn't bottleneck the SATA disk I used as a thick
>> snapshot target. The reason why is simple:
>> - thick snapshots use sequential write techniques
>>
>> If I had been using thin snapshots, then the writes would most likely be
>> more randomized on disk, which would have required more spindles to cope
>> with this.
>>
>> Anyhow;
>> I am still eager to hear how to use an external device to import
>> snapshots. And when I say "import", I am not talking about copy-back;
>> more to use it to read data from.
>
> Format of 'on-disk' snapshot metadata for old snapshots is trivial -
> a small header plus pairs of data offsets (TO-FROM) - I think googling
> will reveal a couple of python tools playing with it.
>
> You can add a pre-created COW image to an LV with lvconvert --snapshot,
> and to avoid 'zeroing' the metadata use the option -Zn.
> (BTW, in the same way you can detach a snapshot from an LV with
> --splitsnapshot, so you can look at how the metadata looks...)
>
> Although it's pretty unusual that anyone would first create the COW image
> with all its special layout and then merge it into an LV - instead of
> merging directly... There is only the 'little' advantage of minimizing
> the 'offline' time of such a device (and that is the reason why
> --splitsnapshot exists).
>
> Regards
>
> Zdenek
Re: [linux-lvm] exposing snapshot block device
Hi

I have some additional questions related to this.

Regarding this statement: "While the merge is in progress, reads or writes to the origin appear as they were directed to the snapshot being merged."

What exactly does that mean? Does it mean that before changes are placed on the origin device, it has to first read the data from the snapshot back to the origin, copy the data back from the origin to the snapshot, and only then allow the changes to happen? If that is the case, does it keep track that this block should not be copied again? And will the ongoing merge prioritize this block over the other background copying?

How about read operations? Will requested read operations on the origin volume be prioritized over the copying of snapshot data?

I didn't find much information about this, hence why I ask here, assuming that someone has executed: lvconvert --merge -b snapshot

Thanks for the feedback

Skickat från min iPhone

> 25 okt. 2019 kl. 18:31 skrev Tomas Dalebjörk:
>
> Wow!
>
> Impressive. This will make history!
>
> If this is possible, then we are able to implement a solution which can do:
> - progressive block-level incremental forever (always incremental at the
> block level: this already exists)
> - instant recovery to a point in time (using the methods you just
> described)
>
> For example, let's say that a client wants to restore a file system, or a
> logical volume, to how it looked yesterday, even though there is no
> snapshot, nor any data. Then the client (with some coding) can start from
> an empty volume, re-attach a COW device, and convert that using lvconvert
> --merge, so that the copying can be done in the background from the backup
> server.
>
> If you forget about "how we will re-create the cow device" and just focus
> on the LVM idea of re-attaching a COW device - do you think that I have
> understood it correctly?
>
> Den tors 24 okt. 2019 kl 18:01 skrev Zdenek Kabelac:
>> Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
>>> I have tested FusionIO together with old thick snapshots.
>>> I created the thick snapshot on a separate old traditional SATA drive,
>>> just to check if that could be used as a snapshot target for high
>>> performance disks, like a FusionIO card. For those who don't know about
>>> FusionIO: they can deal with 150-250,000 IOPS.
>>>
>>> And to be honest, I couldn't bottleneck the SATA disk I used as a thick
>>> snapshot target. The reason why is simple:
>>> - thick snapshots use sequential write techniques
>>>
>>> If I had been using thin snapshots, then the writes would most likely
>>> be more randomized on disk, which would have required more spindles to
>>> cope with this.
>>>
>>> Anyhow;
>>> I am still eager to hear how to use an external device to import
>>> snapshots. And when I say "import", I am not talking about copy-back;
>>> more to use it to read data from.
>>
>> Format of 'on-disk' snapshot metadata for old snapshots is trivial -
>> a small header plus pairs of data offsets (TO-FROM) - I think googling
>> will reveal a couple of python tools playing with it.
>>
>> You can add a pre-created COW image to an LV with lvconvert --snapshot,
>> and to avoid 'zeroing' the metadata use the option -Zn.
>> (BTW, in the same way you can detach a snapshot from an LV with
>> --splitsnapshot, so you can look at how the metadata looks...)
>>
>> Although it's pretty unusual that anyone would first create the COW
>> image with all its special layout and then merge it into an LV - instead
>> of merging directly... There is only the 'little' advantage of
>> minimizing the 'offline' time of such a device (and that is the reason
>> why --splitsnapshot exists).
>>
>> Regards
>>
>> Zdenek
Re: [linux-lvm] exposing snapshot block device
Dne 04. 11. 19 v 6:54 Tomas Dalebjörk napsal(a):
> Hi
>
> I have some additional questions related to this.
>
> Regarding this statement: "While the merge is in progress, reads or writes
> to the origin appear as they were directed to the snapshot being merged."
>
> What exactly does that mean? Does it mean that before changes are placed
> on the origin device, it has to first read the data from the snapshot back
> to the origin, copy the data back from the origin to the snapshot, and
> only then allow the changes to happen? If that is the case, does it keep
> track that this block should not be copied again?

Hi

When the 'merge' is in progress, your 'origin' is no longer accessible for normal usage (it is hidden, active and only usable by the snapshot-merge target).

So during 'merging' you can already use your snapshot as if it were the origin - and in the background there is a process that reads data from the 'snapshot' COW device and copies it back to the hidden origin (this is what you can observe with 'lvs' and the copy%).

So any 'new' writes to such a device land at the right place - reads come either from the COW (if the block has not yet been merged) or from the origin.

Once all blocks from the 'COW' are merged into the origin, the tables are remapped again, all the 'supportive' devices are removed, and only your now fully merged origin remains present for usage (while still being fully online).

Hopefully this makes it clearer. For more explanation of how DM works, visit: http://people.redhat.com/agk/talks/

> And will the ongoing merge prioritize this block over the other background
> copying? How about read operations? Will requested read operations on the
> origin volume be prioritized over the copying of snapshot data?

The priority is that you always get the proper block. Don't look for 'top' performance there - correctness was always the priority, and for a long time there has been no development effort on this ancient target, since thin-pool usage is simply far superior.

1st note - a major difficulty comes from ONLINE usage. If you do NOT need the device to be online (i.e. you keep a 'reserve' copy of the device), you can merge things directly into the device - and I simply don't see why you would want to complicate all this with the extra step of transforming data into the COW format first and then doing an online merge.

2nd note - clearly one cannot start a 'merge' of a snapshot into an origin while the origin device is in use (i.e. mounted), as that would lead to 'modification' of the filesystem under its hands.

Regards

Zdenek
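The routing rule Zdenek describes ("reads come either from the COW, if the block has not yet been merged, or from the origin; new writes land at the right place") can be captured in a toy model - my own simplification for illustration, not lvm2 or kernel code:

```python
# Toy model (NOT dm code) of I/O routing while a snapshot merge runs.

class MergeState:
    def __init__(self, cow_map):
        # cow_map: origin_chunk -> data still sitting in the COW,
        # waiting to be merged back into the (hidden) origin.
        self.cow = dict(cow_map)
        self.origin = {}

    def merge_one(self):
        """Background copy step: move one pending chunk into the origin."""
        if self.cow:
            chunk, data = self.cow.popitem()
            self.origin[chunk] = data

    def read(self, chunk, default=b"origin"):
        # Not yet merged -> serve from the COW; otherwise from the origin.
        return self.cow.get(chunk, self.origin.get(chunk, default))

    def write(self, chunk, data):
        # New writes land at the final place; drop any stale COW copy so
        # the background merge cannot later overwrite fresh data.
        self.cow.pop(chunk, None)
        self.origin[chunk] = data
```

Note how a write to a still-pending chunk removes it from the merge backlog: the caller always sees the newest data, whether the background copy has reached that chunk or not - which is the "correctness first" property described above.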
Re: [linux-lvm] exposing snapshot block device
Dne 04. 11. 19 v 15:40 Tomas Dalebjörk napsal(a):
> Thanks for the feedback.
>
> Scenario 2: a write wants to write block LP 100, but lvconvert has not yet
> copied that LP block (yes, I understand that the origin is hidden now).
> Will lvconvert prioritize copying data from /dev/vg00/lv00-snap to
> /dev/vg00/lv00 for that block, and let the requestor write the changes
> directly to the origin after the copying has been performed?
> Or will the write be blocked until lvconvert has finished copying the
> requested block, and only then accept a write to the origin?
> Or where will the changes be written?

Since the COW device contains not only 'data' but also 'metadata' blocks, and during the 'merge' it is being updated so it 'knows' which data has already been merged back to the origin (in other words, during the merge the usage of the COW is reduced towards 0) - I assume your 'plan' stops right here, and there is not much point in exploring how sub-optimal the rest of the merging process is (and as said, the primary aspect was robustness - so if there is a crash at any moment in time, the data remains correct).

Regards

Zdenek
Re: [linux-lvm] exposing snapshot block device
Thanks,

I understand that the metadata blocks need to be updated.

How about the other questions? Like: which device will a data write go to - the COW device, or the origin disk after the copying has completed?

Skickat från min iPhone

> 4 nov. 2019 kl. 16:04 skrev Zdenek Kabelac:
>
> Dne 04. 11. 19 v 15:40 Tomas Dalebjörk napsal(a):
>> Thanks for the feedback.
>>
>> Scenario 2: a write wants to write block LP 100, but lvconvert has not
>> yet copied that LP block (yes, I understand that the origin is hidden
>> now).
>> Will lvconvert prioritize copying data from /dev/vg00/lv00-snap to
>> /dev/vg00/lv00 for that block, and let the requestor write the changes
>> directly to the origin after the copying has been performed?
>> Or will the write be blocked until lvconvert has finished copying the
>> requested block, and only then accept a write to the origin?
>> Or where will the changes be written?
>
> Since the COW device contains not only 'data' but also 'metadata' blocks,
> and during the 'merge' it is being updated so it 'knows' which data has
> already been merged back to the origin (in other words, during the merge
> the usage of the COW is reduced towards 0) - I assume your 'plan' stops
> right here, and there is not much point in exploring how sub-optimal the
> rest of the merging process is (and as said, the primary aspect was
> robustness - so if there is a crash at any moment in time, the data
> remains correct).
>
> Regards
>
> Zdenek
Re: [linux-lvm] exposing snapshot block device
Thanks for feedback. Let me try to describe different scenarios:

We have an origin volume, let's call it /dev/vg00/lv00.
We convert a snapshot volume to the origin volume, let's call it /dev/vg00/lv00-snap - all blocks have been changed and are represented in /dev/vg00/lv00-snap when we start the lvconvert process.

I assume that something reads the data from /dev/vg00/lv00-snap and copies it to /dev/vg00/lv00. It will most likely start from the first block and proceed to the last block. The block size is 1MB on /dev/vg00/lv00-snap, and for simplicity we have the same block size on the origin /dev/vg00/lv00.

Scenario 1: A read arrives wanting to read block LP 100, but lvconvert has not yet copied that LP block.
Will the read come from /dev/vg00/lv00-snap directly and be delivered to the requestor?
Or will lvconvert prioritize copying the data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor wait until the copying has completed, so that the read can happen from the origin?
Or will the requestor have to wait until the copy of that block from /dev/vg00/lv00-snap to /dev/vg00/lv00 has completed, without any prioritization?

Scenario 2: A write arrives wanting to write block LP 100, but lvconvert has not yet copied that LP block (yes, I understand that the origin is hidden now).
Will lvconvert prioritize copying the data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor write the changes directly to the origin after the copying has been performed?
Or will the write be blocked until lvconvert has finished copying the requested block, and then a write can be accepted to the origin?
Or where will the changes be written?

It is important for me to understand, as the backup device that I want to map as a COW device is a read-only target and is not allowed to be written to. If reads happen from the backup COW device and writes happen to the origin, then it is possible to create an instant recovery. If writes happen to the backup COW device, then it is not that easy to implement an instant recovery solution, as the backup device is write-protected.

Thanks in advance.

On Mon, 4 Nov 2019 at 11:07, Zdenek Kabelac wrote:
> On 04. 11. 19 at 6:54, Tomas Dalebjörk wrote:
> > Hi
> >
> > I have some additional questions related to this, regarding this statement:
> > "While the merge is in progress, reads or writes to the origin appear as they were directed to the snapshot being merged."
> >
> > What exactly does that mean? Will that mean that before changes are placed on the origin device, it has to first read the data from the snapshot back to the origin, copy the data back from the origin to the snapshot, and only then allow changes to happen? If that is the case, does it keep track that this block should not be copied again?
>
> Hi
>
> When the 'merge' is in progress, your 'origin' is no longer accessible for normal usage. It is hiddenly active and only usable by the snapshot-merge target.
>
> So during 'merging' you can already use your snapshot as if it were the origin - and in the background there is a process that reads data from the 'snapshot' COW device and copies it back to the hidden origin (this is what you can observe with 'lvs' and copy%).
>
> So any 'new' writes to such a device land at the right place - reads come either from the COW (if the block has not yet been merged) or from the origin.
>
> Once all blocks from the 'COW' are merged into the origin, the tables are remapped again, all 'supportive' devices are removed, and only your 'now fully merged' origin remains present for usage (while still being fully online).
>
> Hopefully it gets more clear.
>
> For more explanation of how DM works, probably visit: http://people.redhat.com/agk/talks/
>
> > and will the ongoing merge prioritize this block before the other background copying?
> > how about read operations? Will the requested read operations on the origin volume be prioritized before the copying of snapshot data?
>
> The priority is that you always get the proper block. Don't look for 'top most' performance here - correctness was always the priority, and for a long time there has been no real development effort on this ancient target, since thin-pool usage is simply way more superior.
>
> 1st note - the major difficulty comes from ONLINE usage. If you do NOT need the device to be online (i.e. you keep a 'reserve' copy of the device), you can merge things directly into the device - and I simply don't see why you would want to complicate all this with the extra step of transforming data into COW format first and then doing an online merge.
>
> 2nd note - clearly one cannot start a 'merge' of a snapshot into an origin while that origin device is in use (i.e. mounted), as that would lead to 'modification' of the filesystem under its hands.
>
> Regards
>
> Zdenek
Re: [linux-lvm] exposing snapshot block device
On 04. 11. 19 at 18:28, Tomas Dalebjörk wrote:
> Thanks - I understand that the metadata blocks need to be updated. How about the other questions? E.g.: which device will a data write go to - the COW device, or the origin disk after the copying has been completed?

Hi

I'd assume that if the block is still mapped in the COW and has not yet been merged into the origin, the 'write' needs to land in the COW - as there is no 'extra' information about which 'portion' of the chunk has already been 'merged'. If you happen to 'write' your I/O to the chunk currently being merged, you will wait until the chunk gets merged and the metadata are updated, and then your I/O lands in the origin.

But I don't think there are any optimizations made - it doesn't really matter much in terms of the actual merging speed. If a couple of I/Os are repeated - who cares - on the overall time of the whole merging process it has a negligible impact. And as said, the preference was made towards simplicity and correctness.

For the most details, just feel free to take a look at linux/drivers/md/dm-snap.c, i.e. the function snapshot_merge_next_chunks().

The snapshot was designed to be small and map a very low percentage of the origin device - it was never assumed to be used with 200GiB and similar snapshot COW sizes.

Regards

Zdenek
Re: [linux-lvm] exposing snapshot block device
On Mon, 4 Nov 2019, Tomas Dalebjörk wrote:

> Thanks for feedback.
>
> Let me try to describe different scenarios:
>
> We have an origin volume, let's call it /dev/vg00/lv00.
> We convert a snapshot volume to the origin volume, let's call it /dev/vg00/lv00-snap - all blocks have been changed and are represented in /dev/vg00/lv00-snap when we start the lvconvert process.
>
> I assume that something reads the data from /dev/vg00/lv00-snap and copies it to /dev/vg00/lv00. It will most likely start from the first block and proceed to the last block.

Merging starts from the last block of the lv00-snap device and proceeds backward to the beginning.

> The block size is 1MB on /dev/vg00/lv00-snap, and for simplicity we have the same block size on the origin /dev/vg00/lv00.
>
> Scenario 1: A read arrives wanting to read block LP 100, but lvconvert has not yet copied that LP block.
> Will the read come from /dev/vg00/lv00-snap directly and be delivered to the requestor?

Yes.

> Or will lvconvert prioritize copying the data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor wait until the copying has completed, so that the read can happen from the origin?
> Or will the requestor have to wait until the copy of that block from /dev/vg00/lv00-snap to /dev/vg00/lv00 has completed, without any prioritization?

It only waits if you attempt to read or write the block that is currently being copied. If you read data that hasn't been merged yet, it reads from the snapshot; if you read data that has been merged, it reads from the origin; if you read data that is currently being copied, it waits.

> Scenario 2: A write arrives wanting to write block LP 100, but lvconvert has not yet copied that LP block (yes, I understand that the origin is hidden now).
> Will lvconvert prioritize copying the data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor write the changes directly to the origin after the copying has been performed?

No.

> Or will the write be blocked until lvconvert has finished copying the requested block, and then a write can be accepted to the origin?
> Or where will the changes be written?

The changes will be written to the lv00-snap device. If you write data that hasn't been merged yet, the write is redirected to the lv00-snap device. If you write data that has already been merged, the write is directed to the origin device. If you write data that is currently being merged, it waits.

> It is important for me to understand, as the backup device that I want to map as a COW device is a read-only target and is not allowed to be written to.

You can't have a read-only COW device. Both metadata and data on the COW device are updated during the merge.

> If reads happen from the backup COW device and writes happen to the origin, then it is possible to create an instant recovery.
> If writes happen to the backup COW device, then it is not that easy to implement an instant recovery solution, as the backup device is write-protected.
>
> Thanks in advance.

Mikulas
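The routing rules described in this reply can be summarized in a small model - a hypothetical sketch for illustration, not device-mapper code:

```python
# Toy model of I/O routing while an old-style snapshot is being merged back
# into its origin: chunks still in the COW are served by the COW, chunks
# already merged are served by the origin, and the chunk currently being
# copied must wait.  Names and granularity are illustrative only.

def route(chunk, cow_chunks, merging_chunk):
    """Decide where an I/O for `chunk` goes during a merge."""
    if chunk == merging_chunk:
        return "wait"        # block until this chunk finishes merging
    if chunk in cow_chunks:
        return "cow"         # not merged yet: redirect to the COW device
    return "origin"          # already merged: hit the origin directly

cow = {98, 99, 100}          # chunks still waiting to be merged
assert route(100, cow, merging_chunk=None) == "cow"
assert route(50, cow, merging_chunk=None) == "origin"
assert route(99, cow, merging_chunk=99) == "wait"
```

The same function covers both reads and writes, which matches the answers above: a write to a not-yet-merged chunk lands on the COW device, not on the origin.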
Re: [linux-lvm] exposing snapshot block device
Thanks, that really helped me to understand how the snapshot works.

Last question: let's say that block 100, which is 1MB in size, is in the COW device, and a write happens that wants to change some or all data in the region of block 100. Then I assume, based on what has been said here, that the block in the COW device will be overwritten with the new changes.

Regards Tomas

On Tue, 5 Nov 2019 at 17:40, Mikulas Patocka wrote:
> [...]
> It only waits if you attempt to read or write the block that is currently being copied.
Re: [linux-lvm] exposing snapshot block device
On 05. 11. 19 at 21:56, Tomas Dalebjörk wrote:
> Thanks, that really helped me to understand how the snapshot works.
> Last question: let's say that block 100, which is 1MB in size, is in the COW device, and a write happens that wants to change some or all data in the region of block 100. Then I assume, based on what has been said here, that the block in the COW device will be overwritten with the new changes.

Yes - it needs to be written to the 'COW' device - since when the block is merged, it would overwrite whatever had been written to the 'origin' (as said, there is nothing else in the snapshot metadata than a 'from->to' block mapping table - so there is no way to store information about a portion of a 'chunk' already having been written to the origin) - and the 'merge' needs to work reliably in cases like a 'power-off' in the middle of the merge operation...

Regards

Zdenek
Re: [linux-lvm] exposing snapshot block device
On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:

> Thanks, that really helped me to understand how the snapshot works.
> Last question: let's say that block 100, which is 1MB in size, is in the COW device, and a write happens that wants to change some or all data in the region of block 100. Then I assume, based on what has been said here, that the block in the COW device will be overwritten with the new changes.

Yes, the block in the COW device will be overwritten.

Mikulas
Re: [linux-lvm] exposing snapshot block device
Great, thanks!

On Thu, 7 Nov 2019 at 17:54, Mikulas Patocka wrote:
> [...]
> Yes, the block in the COW device will be overwritten.
Re: [linux-lvm] exposing snapshot block device
On 04. 09. 20 at 14:09, Tomas Dalebjörk wrote:
> Hi
> I tried to perform as suggested:
> # lvconvert --splitsnapshot vg/lv-snap
> works fine
> # lvconvert -s vg/lv vg/lv-snap
> works fine too
>
> but... if I try converting COW data directly from the raw device, then it doesn't work, e.g.:
> # lvconvert -s vg/lv /dev/mycowdev
> the tool doesn't like the path. I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev

Hi

lvm2 only supports 'objects' within a VG, with no plan to support 'external' devices. So a user may not take any 'random' device in the system and use it with commands like lvconvert. There is always a very strict requirement to make a block device a VG member first (pvcreate, vgextend...), and only then can the user allocate space on that device for various LVs.

> conclusion: even though the cow device is an exact copy of the cow device that I had saved on /dev/mycowdev before the split, it wouldn't work to convert it back into an lvm snapshot

COW data simply needs to be stored on an LV for use with lvm2.

You may of course use the 'dmsetup' command directly and arrange your snapshot setup in a way that combines various kinds of devices - but this goes completely without any lvm2 command involved - in that case you have to fully manipulate all devices in your device stack with the dmsetup command.

Regards

Zdenek
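For reference, the dmsetup route mentioned above looks roughly like this. This is a sketch only: it needs root and real block devices, `/dev/vg/lv` and `/dev/mycowdev` are placeholders, and the table formats are from the kernel's device-mapper snapshot documentation (`snapshot-origin <origin>`, `snapshot <origin> <COW> <P|N> <chunksize>`):

```shell
# Build a snapshot stack by hand with device-mapper tables.
# Sizes are in 512-byte sectors; "P 8" = persistent COW, 4KiB chunks.
SECTORS=$(blockdev --getsz /dev/vg/lv)

# Expose the origin through a snapshot-origin target
dmsetup create lv-real --table "0 $SECTORS snapshot-origin /dev/vg/lv"

# Attach the external COW as a snapshot of it
dmsetup create lv-snap --table "0 $SECTORS snapshot /dev/vg/lv /dev/mycowdev P 8"

# To merge the COW back instead, a snapshot-merge table is used:
#   "0 $SECTORS snapshot-merge /dev/vg/lv /dev/mycowdev P 8"
```

Note that with this approach lvm2 knows nothing about the stack: activation, teardown, and crash recovery are entirely your responsibility.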
Re: [linux-lvm] exposing snapshot block device
Hi

I tried to perform as suggested:

# lvconvert --splitsnapshot vg/lv-snap
works fine
# lvconvert -s vg/lv vg/lv-snap
works fine too

but... if I try converting COW data directly from the raw device, then it doesn't work, e.g.:

# lvconvert -s vg/lv /dev/mycowdev

the tool doesn't like the path. I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev and retried the operation:

# lvconvert -s vg/lv /dev/vg/mycowdev

but this doesn't work either.

Conclusion: even though the cow device is an exact copy of the cow device that I had saved on /dev/mycowdev before the split, it wouldn't work to convert it back into an lvm snapshot.

I am not sure if I understand the tool correctly, or if there are other steps needed, such as creating virtual information about the LVM VGDA data at the start of this virtual volume named /dev/mycowdev. Let me know what more steps are needed.

Best regards Tomas

> On 7 Nov 2019, at 18:29, Tomas Dalebjörk wrote:
> Great, thanks!
Re: [linux-lvm] exposing snapshot block device
On Fri, 4 Sep 2020, Tomas Dalebjörk wrote:

> Hi
> I tried to perform as suggested:
> # lvconvert --splitsnapshot vg/lv-snap
> works fine
> # lvconvert -s vg/lv vg/lv-snap
> works fine too
>
> but... if I try converting COW data directly from the raw device, then it doesn't work, e.g.:
> # lvconvert -s vg/lv /dev/mycowdev
> the tool doesn't like the path. I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev and retried the operation:
> # lvconvert -s vg/lv /dev/vg/mycowdev
> but this doesn't work either.
>
> Conclusion: even though the cow device is an exact copy of the cow device that I had saved on /dev/mycowdev before the split, it wouldn't work to convert it back into an lvm snapshot.
>
> I am not sure if I understand the tool correctly, or if there are other steps needed, such as creating virtual information about the LVM VGDA data at the start of this virtual volume named /dev/mycowdev.

AFAIK LVM doesn't support taking an existing cow device and attaching it to an existing volume. When you create a snapshot, you start with an empty cow.

Mikulas

> Let me know what more steps are needed.
>
> Best regards Tomas
Re: [linux-lvm] exposing snapshot block device
On 07. 09. 20 at 16:14, Dalebjörk, Tomas wrote:
> Hi Mikulas,
>
> Thanks for the replies. I am confused now by the last message: LVM doesn't support taking an existing cow device and attaching it to an existing volume? Isn't that what "lvconvert --splitsnapshot" & "lvconvert -s" are meant to do?
>
> Let's say that I create the snapshot on a different device using these steps:
> root@src# lvcreate -s -L 10GB -n lvsnap vg/lv /dev/sdh
> root@src# lvconvert --splitsnapshot vg/lvsnap
> root@src# echo "I now move /dev/sdb to another server"
> root@tgt# lvconvert -s newvg/newlv vg/lvsnap

Hi

This is only supported as long as you stay within a VG, so newlv & lvsnap must be in a single VG.

Note - you can 'vgreduce' a PV from VG1 and 'vgextend' it into VG2. But this always works on a whole-PV basis - you can't mix LVs between VGs.

Zdenek
Re: [linux-lvm] exposing snapshot block device
Thanks for the feedback.

So if I understand this correctly:

# fallocate -l 100M /tmp/pv1
# fallocate -l 100M /tmp/pv2
# fallocate -l 100M /tmp/pv3
# losetup --find --show /tmp/pv1
# losetup --find --show /tmp/pv2
# losetup --find --show /tmp/pv3
# vgcreate vg0 /dev/loop0
# lvcreate -n lv0 -l 1 vg0
# vgextend vg0 /dev/loop1
# lvcreate -s -l 1 -n lvsnap vg0/lv0 /dev/loop1
# vgchange -a n vg0
# lvconvert --splitsnapshot vg0/lvsnap
# vgreduce vg0 /dev/loop1
# vgcreate vg1 /dev/loop2
# lvcreate -n lv0 -l 1 vg1
# vgextend vg1 /dev/loop1
# lvconvert -s vg1/lvsnap vg1/lv0

Not sure if the steps are correct?

Regards Tomas

> On 7 Sep 2020, at 16:17, Zdenek Kabelac wrote:
> [...]
> This is only supported as long as you stay within a VG.
Re: [linux-lvm] exposing snapshot block device
On 07. 09. 20 at 18:34, Tomas Dalebjörk wrote:
> Thanks for the feedback. So if I understand this correctly:
>
> # fallocate -l 100M /tmp/pv1
> # fallocate -l 100M /tmp/pv2
> # fallocate -l 100M /tmp/pv3
> # losetup --find --show /tmp/pv1
> # losetup --find --show /tmp/pv2
> # losetup --find --show /tmp/pv3
> # vgcreate vg0 /dev/loop0
> # lvcreate -n lv0 -l 1 vg0
> # vgextend vg0 /dev/loop1
> # lvcreate -s -l 1 -n lvsnap vg0/lv0 /dev/loop1
> # vgchange -a n vg0
> # lvconvert --splitsnapshot vg0/lvsnap
> # vgreduce vg0 /dev/loop1

Hi

Here you would need to use 'vgsplit' instead - otherwise you lose the mapping for whatever was living on /dev/loop1.

> # vgcreate vg1 /dev/loop2
> # lvcreate -n lv0 -l 1 vg1
> # vgextend vg1 /dev/loop1

And 'vgmerge'.

> # lvconvert -s vg1/lvsnap vg1/lv0
> Not sure if the steps are correct?

I hope you realize the content of vg1/lv0 must be exactly the same as vg0/lv0. A snapshot COW volume contains only 'diff chunks' - so if you attach the snapshot to a 'different' LV, you get only a mess.

Zdenek
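Why attaching the COW to an LV with different content produces a mess can be shown with a tiny model. This is an illustrative sketch, not LVM code: the COW stores only diff chunks, and the snapshot view overlays them on whatever origin it is attached to.

```python
# Toy model: reconstruct the snapshot view by overlaying COW diff chunks
# on the origin's chunks.  Any chunk not in the COW is read from the origin,
# so the origin's content must match what the COW was captured against.

def snapshot_view(origin, cow):
    """origin: list of chunk contents; cow: {chunk_index: diff content}."""
    return [cow.get(i, data) for i, data in enumerate(origin)]

cow = {1: "B'"}                              # diff captured against ["A","B","C"]
good = snapshot_view(["A", "B", "C"], cow)   # correct origin content
mess = snapshot_view(["X", "Y", "Z"], cow)   # different origin content
assert good == ["A", "B'", "C"]
assert mess == ["X", "B'", "Z"]              # unmapped chunks come from the wrong LV
```

This is also why the dd of vg0/lv0 onto vg1/lv0 mentioned later in the thread matters: it makes the new origin's unmapped chunks identical to the ones the COW was captured against.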
Re: [linux-lvm] exposing snapshot block device
Thanks - OK, vgsplit/vgmerge instead, and after that lvconvert -s.

Yes, I am aware of the corruption issues. But if the cow device has all the data, then no corruption will happen, right? If the COW has a copy of all blocks, then an lvconvert --merge, or a mount of the snapshot volume, will be without issues, right?

Regards Tomas

> On 7 Sep 2020, at 18:42, Zdenek Kabelac wrote:
> [...]
> I hope you realize the content of vg1/lv0 must be exactly the same as vg0/lv0.
Re: [linux-lvm] exposing snapshot block device
On 07. 09. 20 at 19:37, Tomas Dalebjörk wrote:
> Thanks - OK, vgsplit/vgmerge instead, and after that lvconvert -s.
> Yes, I am aware of the corruption issues. But if the cow device has all the data, then no corruption will happen, right? If the COW has a copy of all blocks, then an lvconvert --merge, or a mount of the snapshot volume, will be without issues.

If the 'COW' has all the data - why do you need the snapshot then? Why not transfer the whole LV instead of the snapshot?

Also - nowadays this old (so-called 'thick') snapshot is really slow compared with thin provisioning - it might be worth checking what kind of features you would gain/lose if you switched to a thin pool (clearly the whole thin pool (both data & metadata) would need to travel between your VGs).

Regards

Zdenek
Re: [linux-lvm] exposing snapshot block device
Hi

I tried all these steps, but when I associated the snapshot cow device back to an empty origin and typed the lvs command, the data% output showed 0% instead of 37%. So it looks like the lvconvert -s vg1/lvsnap vg1/lv0 loses the cow data?

Perhaps you can guide me how this can be done?

Btw, just to emulate a full copy, I executed

dd if=/dev/vg0/lv0 of=/dev/vg1/lv0

before the lvconvert -s, to make sure the latest data was there. And then I tried to mount vg1/lv0, which worked fine, but the data was not the snapshot view. Even mounting vg1/lvsnap works fine, but with the wrong data.

I am confused about how and why vgmerge should be used, as vgsplit does the work?

Regards Tomas

> On 7 Sep 2020, at 18:42, Zdenek Kabelac wrote:
> [...]
> Here you would need to use 'vgsplit' instead - otherwise you lose the mapping for whatever was living on /dev/loop1.
Re: [linux-lvm] exposing snapshot block device
Yes, we need the snapshot data, as it is provisioned from the backup target and can't be changed.

We will definitely look into thin snapshots later, but first we want to make sure that we can reanimate the cow device as a device and associate it with an empty origin. If possible, we want to be able to associate this cow with a new empty VG/LV using a new vgname/lvname - after all, it is just a virtual volume.

> On 7 Sep 2020, at 21:56, Tomas Dalebjörk wrote:
> [...]
> so it looks like the lvconvert -s vg1/lvsnap vg1/lv0 loses the cow data?
Re: [linux-lvm] exposing snapshot block device
It worked! I missed the -Zn flag.

Sent from my iPhone

> On 7 Sep 2020, at 21:56, Tomas Dalebjörk wrote:
>
> Hi,
> I tried all these steps,
> but when I associated the snapshot COW device back to an empty origin and
> typed the lvs command,
> the data% output shows 0% instead of 37%?
> So it looks like lvconvert -s vg1/lvsnap vg1/lv0 loses the COW data?
> [...]
Re: [linux-lvm] exposing snapshot block device
Hi Mikulas,

Thanks for the replies. I am confused now by the last message?
LVM doesn't support taking an existing COW device and attaching it to an existing volume?
Isn't that what "lvconvert --splitsnapshot" & "lvconvert -s" are meant to do?

Let's say that I create the snapshot on a different device using these steps:

root@src# lvcreate -s -L 10GB -n lvsnap vg/lv /dev/sdh
root@src# lvconvert --splitsnapshot vg/lvsnap
root@src# echo "I now move /dev/sdh to another server"
root@tgt# lvconvert -s newvg/newlv vg/lvsnap

Regards, Tomas

Den 2020-09-07 kl. 15:09, skrev Mikulas Patocka:
> On Fri, 4 Sep 2020, Tomas Dalebjörk wrote:
>
>> Hi,
>> I tried to perform as suggested:
>> # lvconvert --splitsnapshot vg/lv-snap
>> works fine
>> # lvconvert -s vg/lv vg/lv-snap
>> works fine too
>> but... if I try to convert the COW data directly from the meta device, it doesn't work, e.g.
>> # lvconvert -s vg/lv /dev/mycowdev
>> the tool doesn't like the path.
>> I tried to place a link /dev/vg/mycowdev -> /dev/mycowdev and retried the operation:
>> # lvconvert -s vg/lv /dev/vg/mycowdev
>> but this doesn't work either.
>> Conclusion: even though the COW device is an exact copy of the COW device that I saved to /dev/mycowdev before the split, it doesn't work to use it to convert back as an LVM snapshot.
>> Not sure if I understand the tool correctly, or if there are other steps needed, such as creating virtual information about the LVM VGDA data at the start of this virtual volume named /dev/mycowdev.
>
> AFAIK LVM doesn't support taking an existing COW device and attaching it
> to an existing volume. When you create a snapshot, you start with an
> empty COW.
>
> Mikulas
>
>> Let me know what more steps are needed.
>> Best regards, Tomas
>>
>> Sent from my iPhone
>>
>>> On 7 Nov 2019, at 18:29, Tomas Dalebjörk wrote:
>>> Great, thanks!
>>>
>>> Den tors 7 nov. 2019 kl 17:54 skrev Mikulas Patocka:
>>>> On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:
>>>>> Thanks,
>>>>>
>>>>> That really helped me to understand how the snapshot works.
>>>>> Last question:
>>>>> - Let's say that block 100, which is 1MB in size, is in the COW device, and a write happens that wants to change some or all data in that region of block 100.
>>>>> Then I assume, based on what has been said here, that the block in the COW device will be overwritten with the new changes.
>>>>
>>>> Yes, the block in the COW device will be overwritten.
>>>>
>>>> Mikulas
>>>>
>>>>> Regards Tomas
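Mikulas's point about overwriting an already-copied chunk can be checked from userspace: the device-mapper snapshot target reports allocated versus total COW sectors, and a rewrite inside an already-tracked chunk does not grow the allocation. A sketch, assuming root and an active snapshot named vg/lvsnap (names illustrative):

```shell
# dm snapshot status reports "<allocated>/<total> <metadata>" in
# 512-byte sectors; the first number grows only when a *new* chunk
# is copied into the COW, not when an existing chunk is rewritten
dmsetup status /dev/vg/lvsnap

# The same allocated/total ratio is what lvs shows as the data%
lvs -o lv_name,origin,snap_percent vg
```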
Re: [linux-lvm] exposing snapshot block device
Hi,

These are the steps that I did.
- The COW data exists on /dev/loop1, including space for the PV header + metadata.

I created a fakevg template file from vgcfgbackup: /tmp/fakevg.bkp
(in the content of this file I created fake UUIDs etc...)

I created a fake UUID for the PV:
# pvcreate -ff -u fake-uidx-nrxx- --restorefile /tmp/fakevg.bkp

And created the metadata from the backup:
# vgcfgrestore -f /tmp/fakevg.bkp fakevg

I can now see the lvsnap in fakevg.
Perhaps the restore can be done directly to the destination VG? Not sure...

Anyhow, I then used vgsplit to move the fakevg data to the destination VG:
# vgsplit fakevg destvg /dev/loop1

I now have the lvsnap volume in the correct volume group.

From here, I connected lvsnap to a destination LV using:
# lvconvert -Zn -s destvg/lvsnap destvg/destlv

I now have a snapshot connected to the origin destlv.

From here, I can either mount the snapshot and start using it, or revert to the snapshot:
# lvchange -a n destvg/destlv
# lvconvert --merge -b destvg/lvsnap
# lvchange -a y destvg/destlv

Now to my questions... Is there any D-Bus API that can perform the vgcfgrestore operations, that I can use from C? Or another way to recreate the metadata?
I currently have to use two steps, pvcreate + vgcfgrestore, where I actually only need to restore the metadata (just vgcfgrestore).
If I run vgcfgrestore without pvcreate, then vgcfgrestore will not find the pvid, and it can't be executed with a parameter like:
# vgcfgrestore -f vgXX.bkp /dev/nbd
Instead it has to be used with the parameter vgXX pointing out the volume group...
I can live with vgcfgrestore + pvcreate, but would prefer to use libblockdev (D-Bus) or another API from C directly.
What options do I have?

Thanks for the excellent help.
God Bless,
Tomas

Den 2020-09-07 kl. 19:50, skrev Zdenek Kabelac:
> Dne 07. 09. 20 v 19:37 Tomas Dalebjörk napsal(a):
>> Thanks. OK, vgsplit/vgmerge instead, and after that lvconvert -s.
>> Yes, I am aware of the issues with corruption,
>> but if the COW device has all the data, then no corruption will happen, right?
>> If the COW has a copy of all blocks, then a lvconvert --merge, or a mount of the snapshot volume, will be without issues.
>
> If the 'COW' has all the data - why do you then need the snapshot?
> Why not travel the whole LV instead of the snapshot?
>
> Also - nowadays this old (so called 'thick') snapshot is really slow
> compared with thin provisioning - it might be good to check what kind of
> features you would gain/lose if you switched to thin pools
> (clearly the whole thin-pool (both data & metadata) would need to
> travel between your VGs).
>
> Regards
>
> Zdenek
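Collected into one sequence, the recovery workflow above looks like the sketch below. It runs under the message's own assumptions: /dev/loop1 carries the transported COW plus PV-header space, /tmp/fakevg.bkp is the hand-edited vgcfgbackup file, destvg/destlv already holds a matching copy of the origin, and the UUID is the placeholder from the message; the pvcreate target device is inferred from context.

```shell
set -e

# Recreate the PV label on the transported device, using the fake UUID
# recorded in the edited metadata backup (target device inferred: /dev/loop1)
pvcreate -ff --uuid fake-uidx-nrxx- --restorefile /tmp/fakevg.bkp /dev/loop1

# Restore the VG metadata; lvsnap becomes visible inside fakevg
vgcfgrestore -f /tmp/fakevg.bkp fakevg

# Move the PV (and lvsnap with it) into the destination VG
vgsplit fakevg destvg /dev/loop1

# Attach the COW to the destination origin; -Zn keeps the restored
# COW content instead of zeroing its header
lvconvert -Zn -s destvg/lvsnap destvg/destlv

# Either mount the snapshot now, or revert the origin to it:
lvchange -a n destvg/destlv
lvconvert --merge -b destvg/lvsnap
lvchange -a y destvg/destlv
```

The -Zn flag is what keeps the restored COW content intact; without it, the reattach appears to succeed but lvs reports data% as 0%.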