Re: [ceph-users] snapshot, clone and mount a VM-Image
On 02/16/2013 03:51 AM, Jens Kristian Søgaard wrote:
> Hi Sage,
>
> > 1) Decide what output format to use. We want to use something that is
>
> I have given it some thought, and my initial suggestion to keep things
> simple is to use the QCOW2 image format. The bird's-eye view of the
> process would be as follows:
>
> * Initial backup
>
> User-supplied information: pool, image name.
>
> Create an rbd snapshot of the image named backup_1, where 1 could be a
> timestamp or an integer count. Save the snapshot to a standard qcow2
> image, similar to:
>
>   qemu-img convert rbd:data/myimage@backup_1 -O qcow2 data_myimage_backup_1.qcow2
>
> Note: I don't know if qemu-img actually supports reading from
> snapshots currently.

It does.

> * Incremental backup
>
> User-supplied information: pool, image name, path to the initial
> backup or the previous incremental file.
>
> Create an rbd snapshot of the image named backup_2, where 2 could be a
> timestamp or an integer count. Determine the previous snapshot
> identifier from the given file name. Determine the objects changed
> between the snapshot given by that identifier and the newly created
> snapshot. Construct QCOW2 L1 and L2 tables in memory from that
> changeset. Create a new qcow2 image with the previous backup file as
> the backing image, and write out the tables and changed blocks. Delete
> the previous rbd snapshot.
>
> * Restoring and mounting
>
> The use of the QCOW2 format means that we can use existing tools for
> restoring and mounting the backups. To restore a backup, the user can
> simply choose either the initial backup file or an incremental, and
> use qemu-img to copy it to a new rbd image. To mount the initial
> backup or an incremental, the user can use qemu-nbd to mount and
> explore the backup to determine which one to restore.
>
> The performance of restores and mounts would of course degrade if the
> backup consists of a large number of incrementals. In that case the
> existing qemu-img tool could be used to flatten the backup.
>
> * Pros/cons
>
> The QCOW2 format supports compression, so we could implement
> compressed backups without much effort.
>
> The disadvantage of using QCOW2 like this is that we do not have any
> checksumming or safeguards against potential errors such as users
> mixing up images. Another disadvantage of this approach is that vital
> information is stored in the actual filename of the backup file. I
> don't see any place in the QCOW2 file format for storing this
> information inside the file, sadly. We could opt for storing it inside
> a plain text file that accompanies the QCOW2 file, or tarballing the
> qcow2 file and that plain text file.

qcow2 seems like a good initial format given the existing tools. We
could always add another format later, or wrap it with extra
information like you suggest.

Have you had a chance to start implementing this yet? It'd be great to
get it working in the next month.

Josh
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
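The table-construction step of the incremental scheme above ("construct QCOW2 L1 and L2 tables in memory from that changeset") can be sketched as follows. This is a minimal illustration of the bookkeeping only: the function names, the extent list, and the dict-based table representation are invented for the example; a real implementation would serialize proper qcow2 headers, tables, and cluster data.

```python
# Sketch: given the (offset, length) byte extents that changed between
# two snapshots, work out which qcow2 clusters must be written and
# build the corresponding L1/L2 lookup structure in memory.

CLUSTER_BITS = 16                  # qcow2 default: 64 KiB clusters
CLUSTER_SIZE = 1 << CLUSTER_BITS
L2_ENTRIES = CLUSTER_SIZE // 8     # each L2 entry is 8 bytes

def changed_clusters(extents):
    """Map changed byte extents to the set of qcow2 cluster indices."""
    clusters = set()
    for offset, length in extents:
        first = offset >> CLUSTER_BITS
        last = (offset + length - 1) >> CLUSTER_BITS
        clusters.update(range(first, last + 1))
    return clusters

def build_tables(extents):
    """Build an in-memory {l1_index: {l2_index: cluster}} mapping.

    Only the L2 tables that contain at least one changed cluster appear
    in the result, mirroring how only touched tables need to be written
    to the incremental file.
    """
    tables = {}
    for cluster in sorted(changed_clusters(extents)):
        l1_index, l2_index = divmod(cluster, L2_ENTRIES)
        tables.setdefault(l1_index, {})[l2_index] = cluster
    return tables

# Two changed extents: one spanning a cluster boundary, one far away.
extents = [(65000, 2000), (10 * CLUSTER_SIZE, 512)]
tables = build_tables(extents)
```

With 64 KiB clusters an L2 table holds 8192 entries, so each L1 entry covers 512 MiB of the virtual disk; the sparse dict mirrors the fact that an incremental qcow2 only needs L2 tables for regions that actually changed.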
Re: [ceph-users] snapshot, clone and mount a VM-Image
On Thu, 28 Feb 2013, Josh Durgin wrote:
> On 02/16/2013 03:51 AM, Jens Kristian Søgaard wrote:
> > We could opt for storing it inside a plain text file that accompanies
> > the QCOW2 file, or tarballing the qcow2 file and that plain text file.
>
> qcow2 seems like a good initial format given the existing tools. We
> could always add another format later, or wrap it with extra
> information like you suggest.
>
> Have you had a chance to start implementing this yet? It'd be great to
> get it working in the next month.

Just so you know, David has been working on the librados "list snaps"
operation that you'll need to let the tool tell which blocks have
changed or not. The code is currently in the wip-4207 branch. We expect
it will be merged in the next few days, and should be part of v0.59.

I suspect the next step would be a function in the 'rbd' tool that would
do the export. Then a similar 'import' tool to go with it?

sage
Re: [ceph-users] snapshot, clone and mount a VM-Image
Please see:

http://tracker.ceph.com/issues/4084 rbd: incremental backups
http://tracker.ceph.com/issues/3387 librbd: expose changed objects since a given snapshot
http://tracker.ceph.com/issues/3272 send/receive rbd snapshots

It would be great if we could track this discussion in those tickets.

Ian R. Colle
Ceph Program Manager
Inktank
Cell: +1.303.601.7713
Email: i...@inktank.com
http://www.linkedin.com/in/ircolle
http://www.twitter.com/ircolle

On 2/28/13 4:33 PM, Josh Durgin <josh.dur...@inktank.com> wrote:
> [...]
Re: [ceph-users] snapshot, clone and mount a VM-Image
Hi Sage,

> 1) Decide what output format to use. We want to use something that is

I have given it some thought, and my initial suggestion to keep things
simple is to use the QCOW2 image format. The bird's-eye view of the
process would be as follows:

* Initial backup

User-supplied information: pool, image name.

Create an rbd snapshot of the image named backup_1, where 1 could be a
timestamp or an integer count. Save the snapshot to a standard qcow2
image, similar to:

  qemu-img convert rbd:data/myimage@backup_1 -O qcow2 data_myimage_backup_1.qcow2

Note: I don't know if qemu-img actually supports reading from snapshots
currently.

* Incremental backup

User-supplied information: pool, image name, path to the initial backup
or the previous incremental file.

Create an rbd snapshot of the image named backup_2, where 2 could be a
timestamp or an integer count. Determine the previous snapshot
identifier from the given file name. Determine the objects changed
between the snapshot given by that identifier and the newly created
snapshot. Construct QCOW2 L1 and L2 tables in memory from that
changeset. Create a new qcow2 image with the previous backup file as the
backing image, and write out the tables and changed blocks. Delete the
previous rbd snapshot.

* Restoring and mounting

The use of the QCOW2 format means that we can use existing tools for
restoring and mounting the backups. To restore a backup, the user can
simply choose either the initial backup file or an incremental, and use
qemu-img to copy it to a new rbd image. To mount the initial backup or
an incremental, the user can use qemu-nbd to mount and explore the
backup to determine which one to restore.

The performance of restores and mounts would of course degrade if the
backup consists of a large number of incrementals. In that case the
existing qemu-img tool could be used to flatten the backup.

* Pros/cons

The QCOW2 format supports compression, so we could implement compressed
backups without much effort.

The disadvantage of using QCOW2 like this is that we do not have any
checksumming or safeguards against potential errors such as users mixing
up images. Another disadvantage of this approach is that vital
information is stored in the actual filename of the backup file. I don't
see any place in the QCOW2 file format for storing this information
inside the file, sadly. We could opt for storing it inside a plain text
file that accompanies the QCOW2 file, or tarballing the qcow2 file and
that plain text file.

--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
Re: [ceph-users] snapshot, clone and mount a VM-Image
On Sat, 16 Feb 2013, Jens Kristian Søgaard wrote:
> Hi Sage,
>
> > 1) Decide what output format to use. We want to use something that is
>
> I have given it some thought, and my initial suggestion to keep things
> simple is to use the QCOW2 image format.

Would qcow2 let you store the incremental portion of a backup without
the actual image? A common use case would be:

- take snapshot1 on cluster A
- copy image to cluster B, possibly without writing to an intermediate
  file.. e.g. 'rbd export ... - | ssh otherhost rbd import - ...'

and later,

- take snapshot2 on cluster A
- stream snapshot1-2 incremental to cluster B

sage

> [...]
Re: [ceph-users] snapshot, clone and mount a VM-Image
Hi Sage,

> Would qcow2 let you store the incremental portion of a backup without
> the actual image? A common use case would be:
>
> - take snapshot1 on cluster A
> - copy image to cluster B, possibly without writing to an intermediate
>   file.. e.g. 'rbd export ... - | ssh otherhost rbd import - ...'
>
> and later,
>
> - take snapshot2 on cluster A
> - stream snapshot1-2 incremental to cluster B

Well, qcow2 is designed for storing image files on disk, not as a
streamed file format - so this use is not covered by the specification.
However, I don't see anything that would keep us from using it the way
you describe. The only drawback is that standard qcow2 tools will not be
able to work with the incremental file on its own - they would need the
base image to be file-based.

Basically, a qcow2 incremental file contains:

- a string holding the path and filename of the backing file
- a lookup table that tells which blocks are included in the file
- the actual data in 512-byte blocks

In this streaming case we would not have a path and file for the backing
file. Instead we could store something like rbd:pool/imagename@snapname
in place of a traditional path and file name.

--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
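A minimal sketch of what the suggestion above amounts to at the byte level, assuming the qcow2 version 2 header layout (72 fixed bytes, with backing_file_offset/backing_file_size pointing at a free-form string): since the format does not interpret the backing-file string, an rbd:pool/image@snap pseudo-path fits wherever a filesystem path would go. Field values and helper names here are illustrative.

```python
# Sketch of a qcow2 v2 fixed header with an rbd pseudo-path stored as
# the backing file, per the qcow2 specification's header layout.
import struct

QCOW2_HEADER = struct.Struct(">4s I Q I I Q I I Q Q I I Q")
HEADER_SIZE = QCOW2_HEADER.size    # 72 bytes for version 2

def make_header(virtual_size, backing):
    backing_bytes = backing.encode()
    return QCOW2_HEADER.pack(
        b"QFI\xfb",                # magic
        2,                         # version
        HEADER_SIZE,               # backing_file_offset (right after header)
        len(backing_bytes),        # backing_file_size
        16,                        # cluster_bits (64 KiB clusters)
        virtual_size,              # virtual disk size in bytes
        0,                         # crypt_method (none)
        0, 0,                      # l1_size, l1_table_offset (unset here)
        0, 0,                      # refcount_table_offset/clusters (unset)
        0, 0,                      # nb_snapshots, snapshots_offset
    ) + backing_bytes

def read_backing_file(header):
    """Extract the backing-file string, as a qcow2 reader would."""
    magic, version, bf_off, bf_len = struct.unpack_from(">4sIQI", header)
    assert magic == b"QFI\xfb" and version == 2
    return header[bf_off:bf_off + bf_len].decode()

hdr = make_header(10 << 30, "rbd:rbd/myimage@backup_1")
```

As noted above, generic qcow2 tools would not know what to do with such a backing string; qemu builds with rbd support can resolve rbd: pseudo-paths, which is what makes the trick plausible in practice.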
Re: [ceph-users] snapshot, clone and mount a VM-Image
On 02/14/2013 12:53 PM, Sage Weil wrote:
> Hi Jens-
>
> On Thu, 14 Feb 2013, Jens Kristian Søgaard wrote:
> > Hi Sage,
> >
> > > block device level. We plan to implement an incremental backup
> > > function for the relative change between two snapshots (or a
> > > snapshot and the head). It's O(n) in the size of the device vs the
> > > number of files, but should be more efficient for all but the most
> > > sparse of images. The implementation should be simple; the
> > > challenge is mostly around the incremental file format, probably.
> > > That doesn't help you now, but would be a relatively self-contained
> > > piece of functionality for someone to contribute to RBD. This isn't
> > > a top priority yet.
> >
> > I'm very interested in having an incremental backup tool for Ceph,
> > so if it is possible for me to do, I would like to take a shot at
> > implementing it. It will be a spare time project, so I cannot say
> > how fast it will progress though. If you have any details on how you
> > would like to see the implementation work, please let me know!
>
> Great to hear you're interested in this! There is a feature in the
> tracker open: http://tracker.ceph.com/issues/4084 (Not that there is
> much information there yet!)
>
> I think this breaks down into a few different pieces:
>
> 1) Decide what output format to use. We want to use something that
> resembles a portable, standard way of representing an incremental set
> of changes to a block device (or large file). I'm not sure what is out
> there, but we should look carefully before making up our own format.
>
> 2) Expose changed objects between rados snapshots. This is some
> generic functionality we would bake into librbd that would probably
> work similarly to how read_iterate() currently does (you specify a
> callback). We probably also want to provide this information directly
> to a user, so that they can get a dump of (offset, length) pairs for
> integration with their own tool. I expect this is just a core librbd
> method.

It'd be nice to implement it as more than one request at once (unlike
read_iterate()'s current implementation). The interface could still be
the same though.

> 3) Write a dumper based on #2 that outputs in the format from #1. The
> callback would (instead of printing file offsets) write the data to
> the output stream with appropriate metadata indicating which part of
> the image it is. Ideally the output part would be modular, too, so
> that we can come back later and implement support for new formats
> easily. The output data stream should be able to be directed at
> stdout or a file.
>
> 4) Write an importer for #1. It would take as input an existing
> image, assumed to be in the state of the reference snapshot, and
> write all the changed bits. Take input from stdin or a file.

I think it'd be good to have some kind of safety check here by default.
Storing a checksum of the original snapshot with the backup and
comparing to the image being restored onto would work, but would be
pretty slow. Any ideas for better ways to do this?

> 5) If necessary, extend the above so that image resize events are
> properly handled.

Couldn't this be handled by storing the size of the original snapshot
in the diff, and resizing to the size of the diff when restoring? Is
there another issue you're thinking of?

> Probably the trickiest bit here is #2, as it will probably involve
> adding some low-level rados operations to efficiently query the
> snapshot state from the client. With this (and any of the rest), we
> can help figure out how to integrate it cleanly. My suggestion is to
> start with #1, though (and make sure the rest of this all makes sense
> to everyone).
>
> Thanks!
> sage
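As a toy instance of points 1-4 (including the checksum safety check), here is a sketch of a dump/import pair for an invented extent-based stream format. The MAGIC string, the record layout, and the use of a full SHA-256 of the base image are all illustrative, since choosing a real format is exactly point 1; the block-by-block comparison of two in-memory images stands in for the changed-object list librbd would provide.

```python
# Sketch of an incremental stream format: a header carrying a checksum
# of the reference snapshot, followed by (offset, length, data)
# records, plus a dumper and an importer that refuses to apply the
# diff to the wrong base image.
import hashlib
import io
import struct

MAGIC = b"RBDDIFF1"

def dump_diff(base, current, block=4096):
    """Write the extents where `current` differs from `base` to a stream."""
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(hashlib.sha256(base).digest())   # safety check (point 4)
    for off in range(0, len(current), block):
        chunk = current[off:off + block]
        if chunk != base[off:off + block]:
            out.write(struct.pack(">QI", off, len(chunk)))
            out.write(chunk)
    return out.getvalue()

def apply_diff(base, diff):
    """Apply a diff to an image assumed to match the reference snapshot."""
    buf = io.BytesIO(diff)
    if buf.read(8) != MAGIC:
        raise ValueError("not a diff stream")
    if buf.read(32) != hashlib.sha256(base).digest():
        raise ValueError("base image does not match reference snapshot")
    img = bytearray(base)
    while True:
        rec = buf.read(12)
        if not rec:
            break
        off, length = struct.unpack(">QI", rec)
        img[off:off + length] = buf.read(length)
    return bytes(img)
```

Point 5 (resizes) is deliberately not handled here; a real format would carry the post-snapshot image size in the header, as suggested above.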
Re: [ceph-users] snapshot, clone and mount a VM-Image
On Mon, 11 Feb 2013, Wolfgang Hennerbichler wrote:
> On 02/11/2013 03:02 PM, Wido den Hollander wrote:
> > You are looking at a way to extract the snapshot, correct?
>
> No.
>
> > Why would you want to mount it and backup the files?
>
> because then I can do things like incremental backups. There will be a
> ceph cluster at an ISP soon, who hosts various services on various
> VMs, and it is important that the mail spool, for example, is backed
> up efficiently, because it's huge and the number of files is also
> high.

Note that an alternative way to approach incremental backups is at the
block device level. We plan to implement an incremental backup function
for the relative change between two snapshots (or a snapshot and the
head). It's O(n) in the size of the device vs the number of files, but
should be more efficient for all but the most sparse of images. The
implementation should be simple; the challenge is mostly around the
incremental file format, probably.

That doesn't help you now, but would be a relatively self-contained
piece of functionality for someone to contribute to RBD. This isn't a
top priority yet, so it will be a while before the Inktank devs can get
to it.

sage

> > Couldn't you better handle this in the Virtual Machine itself?
>
> not really. open, changing files, a lot of virtual machines that one
> needs to take care of, and so on.
>
> > If you want to backup the virtual machines to an external location
> > you could use either rbd or qemu-img to get the snapshot out of the
> > Ceph cluster:
> >
> >   $ rbd export --snap snapshotname imagename destination
> >
> > Or use qemu-img:
> >
> >   $ qemu-img convert -f raw -O qcow2 -s snapshot rbd:rbd/image image.qcow2
> >
> > You then get files which you can backup externally. Would that work?
>
> sure, but this is a very inefficient way of backing things up, because
> one would back up on block level. I want to back up on filesystem
> level.
>
> Wido, thanks a lot for your answers,
>
> Wolfgang
>
> --
> DI (FH) Wolfgang Hennerbichler
> Software Development
> Unit Advanced Computing Technologies
> RISC Software GmbH
> A company of the Johannes Kepler University Linz
>
> IT-Center
> Softwarepark 35
> 4232 Hagenberg
> Austria
>
> Phone: +43 7236 3343 245
> Fax: +43 7236 3343 250
> wolfgang.hennerbich...@risc-software.at
> http://www.risc-software.at
>
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com