Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-28 Thread Josh Durgin

On 02/16/2013 03:51 AM, Jens Kristian Søgaard wrote:

Hi Sage,


1) Decide what output format to use.  We want to use something that is


I have given it some thought, and my initial suggestion to keep things
simple is to use the QCOW2 image format.

The bird's-eye view of the process would be as follows:

* Initial backup

User supplied information: pool, image name

Create rbd snapshot of the image named backup_1, where 1 could be a
timestamp or an integer count.

Save the snapshot to a standard qcow2 image. Similar to:

qemu-img convert rbd:data/myimage@backup_1 -O qcow2
data_myimage_backup_1.qcow2

Note: I don't know if qemu-img actually supports reading from snapshots
currently.


It does.


* Incremental backup

User supplied information: pool, image name, path to initial backup or
previous incremental file

Create rbd snapshot of the image named backup_2, where 2 could be a
timestamp or an integer count.

Determine previous snapshot identifier from given file name.

Determine objects changed from the snapshot given by that identifier and
the newly created snapshot.

Construct QCOW2 L1- and L2-tables in memory from that changeset.

Create new qcow2 image with the previous backup file as the backing
image, and write out the tables and changed blocks.

Delete previous rbd snapshot.
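The table-construction step above can be sketched as follows. This is a hypothetical helper, not code from the thread, and it assumes qcow2's default 64 KiB clusters with 8-byte L2 entries:

```python
# Sketch: which qcow2 L1/L2 entries an incremental image needs, given the
# changed (offset, length) extents reported between two snapshots.
CLUSTER_BITS = 16                      # 64 KiB clusters (qcow2 default)
CLUSTER_SIZE = 1 << CLUSTER_BITS
L2_ENTRIES = CLUSTER_SIZE // 8         # 8-byte entries per L2 table

def tables_for_changes(extents):
    """Map changed (offset, length) extents to {l1_index: set(l2_indexes)}."""
    tables = {}
    for offset, length in extents:
        first = offset >> CLUSTER_BITS
        last = (offset + length - 1) >> CLUSTER_BITS
        for cluster in range(first, last + 1):
            l1_index = cluster // L2_ENTRIES
            l2_index = cluster % L2_ENTRIES
            tables.setdefault(l1_index, set()).add(l2_index)
    return tables

# A 4 KiB write at offset 0 plus a write straddling a cluster boundary;
# both clusters fall under L1 entry 0.
changes = [(0, 4096), (CLUSTER_SIZE - 100, 200)]
print(tables_for_changes(changes))
```

Only the tables touched by the changeset need to be written out; untouched clusters fall through to the backing file.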


* Restoring and mounting


The use of the QCOW2 format means that we can use existing tools for
restoring and mounting the backups.

To restore a backup the user can simply choose either the initial backup
file or an incremental, and use qemu-img to copy that to a new rbd image.

To mount the initial backup or an incremental, the user can use qemu-nbd
to mount and explore the backup to determine which one to restore.

The performance of restores and mounts would of course suffer if the
backup consists of a large number of incrementals. In that case the
existing qemu-img tool could be used to flatten the backup.


* Pros/cons

The QCOW2 format supports compression, so we could implement compressed
backups without much effort.

The disadvantage to using QCOW2 like this is that we do not get any
checksumming or safeguards against potential errors, such as users
mixing up images.

Another disadvantage to this approach is that vital information is
stored in the actual filename of the backup file. I don't see any place
in the QCOW2 file format for storing this information inside the file,
sadly.

We could opt for storing it inside a plain text file that accompanies
the QCOW2 file, or tarballing the qcow2 file and that plain text file.


qcow2 seems like a good initial format given the existing tools. We
could always add another format later, or wrap it with extra
information like you suggest.

Have you had a chance to start implementing this yet? It'd be great to
get it working in the next month.

Josh
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-28 Thread Sage Weil
On Thu, 28 Feb 2013, Josh Durgin wrote:
 On 02/16/2013 03:51 AM, Jens Kristian Søgaard wrote:
  We could opt for storing it inside a plain text file that accompanies
  the QCOW2 file, or tarballing the qcow2 file and that plain text file.
 
 qcow2 seems like a good initial format given the existing tools. We
 could always add another format later, or wrap it with extra
 information like you suggest.
 
 Have you had a chance to start implementing this yet? It'd be great to
 get it working in the next month.

Just so you know, David has been working on the librados list snaps 
operation that you'll need so the tool can tell which blocks have changed. 
The code is currently in the wip-4207 branch.  We expect it will be merged 
in the next few days, and should be part of v0.59.

I suspect the next step would be a function in the 'rbd' tool that would 
do the export.  Then a similar 'import' tool to go with it?

sage


Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-28 Thread Ian Colle
Please see:
http://tracker.ceph.com/issues/4084 rbd: incremental backups

http://tracker.ceph.com/issues/3387 librbd: expose changed objects since a
given snapshot

http://tracker.ceph.com/issues/3272 send/receive rbd snapshots


It would be great if we could track this discussion in those tickets.

Ian R. Colle
Ceph Program Manager
Inktank
Cell: +1.303.601.7713
Email: i...@inktank.com


 http://www.linkedin.com/in/ircolle
 http://www.twitter.com/ircolle






Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-16 Thread Jens Kristian Søgaard

Hi Sage,

1) Decide what output format to use.  We want to use something that is 


I have given it some thought, and my initial suggestion to keep things 
simple is to use the QCOW2 image format.


The bird's-eye view of the process would be as follows:

* Initial backup

User supplied information: pool, image name

Create rbd snapshot of the image named backup_1, where 1 could be a 
timestamp or an integer count.


Save the snapshot to a standard qcow2 image. Similar to:

qemu-img convert rbd:data/myimage@backup_1 -O qcow2 
data_myimage_backup_1.qcow2


Note: I don't know if qemu-img actually supports reading from snapshots 
currently.



* Incremental backup

User supplied information: pool, image name, path to initial backup or 
previous incremental file


Create rbd snapshot of the image named backup_2, where 2 could be a 
timestamp or an integer count.


Determine previous snapshot identifier from given file name.

Determine objects changed from the snapshot given by that identifier and 
the newly created snapshot.


Construct QCOW2 L1- and L2-tables in memory from that changeset.

Create new qcow2 image with the previous backup file as the backing 
image, and write out the tables and changed blocks.


Delete previous rbd snapshot.


* Restoring and mounting


The use of the QCOW2 format means that we can use existing tools for 
restoring and mounting the backups.


To restore a backup the user can simply choose either the initial backup 
file or an incremental, and use qemu-img to copy that to a new rbd image.


To mount the initial backup or an incremental, the user can use qemu-nbd 
to mount and explore the backup to determine which one to restore.


The performance of restores and mounts would of course suffer if the 
backup consists of a large number of incrementals. In that case the 
existing qemu-img tool could be used to flatten the backup.



* Pros/cons

The QCOW2 format supports compression, so we could implement compressed 
backups without much effort.


The disadvantage to using QCOW2 like this is that we do not get any 
checksumming or safeguards against potential errors, such as users 
mixing up images.


Another disadvantage to this approach is that vital information is 
stored in the actual filename of the backup file. I don't see any place 
in the QCOW2 file format for storing this information inside the file, 
sadly.


We could opt for storing it inside a plain text file that accompanies 
the QCOW2 file, or tarballing the qcow2 file and that plain text file.


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/


Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-16 Thread Sage Weil
On Sat, 16 Feb 2013, Jens Kristian Søgaard wrote:
 Hi Sage,
 
  1) Decide what output format to use.  We want to use something that is 
 
 I have given it some thought, and my initial suggestion to keep things simple
 is to use the QCOW2 image format.

Would qcow2 let you store the incremental portion of a backup without the 
actual image?  A common use case would be:

- take snapshot1 on cluster A
- copy image to cluster B, possibly without writing to an intermediate 
  file.. e.g. 'rbd export ... - | ssh otherhost rbd import - ...'

and later,

- take snapshot2 on cluster A
- stream snapshot1-2 incremental to cluster B

sage



 


Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-16 Thread Jens Kristian Søgaard

Hi Sage,

Would qcow2 let you store the incremental portion of a backup without the 
actual image?  A common use case would be:



- take snapshot1 on cluster A
- copy image to cluster B, possibly without writing to an intermediate 
  file.. e.g. 'rbd export ... - | ssh otherhost rbd import - ...'

and later,
- take snapshot2 on cluster A
- stream snapshot1-2 incremental to cluster B


Well, qcow2 is designed for storing image files on disk, not as a 
streamed file format - so streaming is not covered by the specification. 
However, I don't see anything that would keep us from using it the way you 
describe.


The only drawback is that standard qcow2 tools will not be able to work 
with the incremental file on its own - they would need the base image to 
be file-based.


Basically a qcow2 incremental file contains:

 - string holding the path and filename of the backing file
 - lookup table that tells which blocks are included in the file
 - actual data in 512 byte blocks
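The header part of that layout can be sketched directly; the field order below follows the qcow2 version 2 specification, while the rbd: backing-file string is this thread's idea, not standard qemu behavior:

```python
# Sketch: pack a minimal qcow2 v2 header whose backing-file string is an
# rbd URI rather than a path on disk. Table/refcount offsets are left at
# zero here; a real writer fills them in after laying out the clusters.
import struct

QCOW2_V2_HEADER = ">4sIQIIQIIQQIIQ"    # 72 bytes, all fields big-endian
HEADER_SIZE = struct.calcsize(QCOW2_V2_HEADER)

def make_header(virtual_size, backing):
    backing_bytes = backing.encode()
    return struct.pack(
        QCOW2_V2_HEADER,
        b"QFI\xfb",          # magic
        2,                   # version
        HEADER_SIZE,         # backing_file_offset: string right after header
        len(backing_bytes),  # backing_file_size
        16,                  # cluster_bits: 64 KiB clusters
        virtual_size,        # guest-visible image size in bytes
        0,                   # crypt_method: none
        0, 0,                # l1_size, l1_table_offset (filled in later)
        0, 0,                # refcount_table_offset, refcount_table_clusters
        0, 0,                # nb_snapshots, snapshots_offset
    ) + backing_bytes

hdr = make_header(10 * 2**30, "rbd:pool/imagename@backup_1")
assert hdr[:4] == b"QFI\xfb"
```

Standard tools would choke on the rbd: string when opening the backing file, which is exactly the drawback noted above.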

In this streaming case we would not have a path and filename for the 
backing file. Instead we could store something like 
rbd:pool/imagename@snapname in place of a traditional path and file name.


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/


Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-14 Thread Josh Durgin

On 02/14/2013 12:53 PM, Sage Weil wrote:

Hi Jens-

On Thu, 14 Feb 2013, Jens Kristian Søgaard wrote:

Hi Sage,


block device level.  We plan to implement an incremental backup function for
the relative change between two snapshots (or a snapshot and the head).
It's O(n) the size of the device vs the number of files, but should be more
efficient for all but the most sparse of images.  The implementation should
be simple; the challenge is mostly around the incremental file format,
probably.
That doesn't help you now, but would be a relatively self-contained piece of
functionality for someone to contribute to RBD.  This isn't a top


I'm very interested in having an incremental backup tool for Ceph, so if it
is possible for me to do, I would like to take a shot at implementing it. It
will be a spare-time project, so I cannot say how fast it will progress,
though.

If you have any details on how you would like to see the implementation work,
please let me know!


Great to hear you're interested in this!  There is a feature in the
tracker open:

http://tracker.ceph.com/issues/4084

(Not that there is much information there yet!)

I think this breaks down into a few different pieces:

1) Decide what output format to use.  We want to use something that is
resembles a portable, standard way of representing an incremental set of
changes to a block device (or large file).  I'm not sure what is out
there, but we should look carefully before making up our own format.

2) Expose changed objects between rados snapshots.  This is some generic
functionality we would bake into librbd that would probably work similarly
to how read_iterate() currently does (you specify a callback).  We
probably also want to provide this information directly to a user, so that
they can get a dump of (offsets, length) pairs for integration with their
own tool.  I expect this is just a core librbd method.


It'd be nice to implement it with more than one request in flight at a
time (unlike read_iterate()'s current implementation). The interface
could still be the same, though.
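The callback interface item 2 describes could look like the sketch below. The name diff_iterate and the toy backend are hypothetical, not the actual librbd API; two dicts mapping object number to version stand in for rados snapshot state:

```python
# Sketch: report changed regions between two snapshots via a callback,
# merging runs of adjacent changed objects into single extents.
OBJECT_SIZE = 4 << 20                  # 4 MiB rbd objects

def diff_iterate(snap_a, snap_b, callback):
    """Invoke callback(offset, length) for each changed extent."""
    changed = sorted(o for o in snap_b if snap_a.get(o) != snap_b[o])
    start = prev = None
    for obj in changed:
        if prev is not None and obj == prev + 1:
            prev = obj
            continue
        if start is not None:
            callback(start * OBJECT_SIZE, (prev - start + 1) * OBJECT_SIZE)
        start = prev = obj
    if start is not None:
        callback(start * OBJECT_SIZE, (prev - start + 1) * OBJECT_SIZE)

extents = []
diff_iterate({0: 1, 1: 1, 2: 1}, {0: 1, 1: 2, 2: 2},
             lambda off, length: extents.append((off, length)))
print(extents)   # objects 1-2 merged: [(4194304, 8388608)]
```

The same function can drive both uses Sage mentions: a dumper passes a data-writing callback, while a user who just wants the (offset, length) pairs passes a collector like the lambda above.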


3) Write a dumper based on #2 that outputs in format from #1.  The
callback would (instead of printing file offsets) write the data to the
output stream with appropriate metadata indicating which part of the image
it is.  Ideally the output part would be modular, too, so that we can come
back later and implement support for new formats easily.  The output data
stream should be able to be directed at stdout or a file.
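A dumper in the shape item 3 describes might wrap the changed-extent callback like this. The 8-byte-offset / 4-byte-length record framing is invented for the sketch; choosing a real format is item 1:

```python
# Sketch: a callback factory for the dumper. Each changed extent is
# written to the output stream as a small header followed by the data.
import io
import struct

def make_dump_callback(read_at, out):
    """Return a callback(offset, length) that copies image data to out."""
    def callback(offset, length):
        out.write(struct.pack(">QI", offset, length))
        out.write(read_at(offset, length))
    return callback

# Toy image reader over a bytes buffer standing in for the rbd image:
image = b"A" * 8192
out = io.BytesIO()
cb = make_dump_callback(lambda off, ln: image[off:off + ln], out)
cb(4096, 16)          # as if the diff scan reported this extent
assert out.getvalue()[:12] == struct.pack(">QI", 4096, 16)
```

Because out is just a writable stream, the same code covers both stdout and a file, as the paragraph above asks for.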

4) Write an importer for #1.  It would take as input an existing image,
assumed to be in the state of the reference snapshot, and write all the
changed bits.  Take input from stdin or a file.
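The importer in item 4 is then the mirror image: read records and write each at its offset in the existing image. Same caveat as above, the record framing is invented for the sketch:

```python
# Sketch: apply a stream of (offset, length, data) records to an image.
import io
import struct

def apply_diff(stream, image):
    """Read framed records from stream and write each into image."""
    while True:
        head = stream.read(12)
        if not head:
            break
        offset, length = struct.unpack(">QI", head)
        image.seek(offset)
        image.write(stream.read(length))

# Round-trip a single record against an in-memory "image":
diff = struct.pack(">QI", 4096, 5) + b"hello"
img = io.BytesIO(b"\0" * 8192)
apply_diff(io.BytesIO(diff), img)
assert img.getvalue()[4096:4101] == b"hello"
```

Regions not mentioned in the stream are left untouched, which is what makes the reference-snapshot assumption below matter.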


I think it'd be good to have some kind of safety check here by default.
Storing a checksum of the original snapshot with the backup and comparing
it to the image being restored onto would work, but would be pretty slow.
Any ideas for better ways to do this?
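The mechanics of that checksum check are simple either way; the cost Josh mentions is in reading the whole snapshot. A sketch, with a plain callable standing in for however the image data is actually read:

```python
# Sketch: digest the reference snapshot at export time, store the digest
# in the diff's metadata, and recompute it before applying the diff.
import hashlib
import io

def snapshot_digest(read_chunk, chunk_size=4 << 20):
    """Digest an image by pulling chunks from a reader callable."""
    h = hashlib.sha256()
    while True:
        chunk = read_chunk(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()

base = io.BytesIO(b"x" * 100)
stored = snapshot_digest(base.read)          # recorded at export time
base.seek(0)
assert snapshot_digest(base.read) == stored  # verified before import
```

A cheaper alternative would be to check only identifying metadata (image id, snapshot name, size), which catches mixed-up images but not modified ones.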


5) If necessary, extend the above so that image resize events are properly
handled.


Couldn't this be handled by storing the size of the original snapshot
in the diff, and resizing to the size of the diff when restoring? Is
there another issue you're thinking of?


Probably the trickiest bit here is #2, as it will probably involve adding
some low-level rados operations to efficiently query the snapshot state
from the client.  With this (and any of the rest), we can help figure out
how to integrate it cleanly.  My suggestion is to start with #1, though
(and make sure the rest of this all makes sense to everyone).

Thanks!
sage





Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Wolfgang Hennerbichler wrote:
 
 
 On 02/11/2013 03:02 PM, Wido den Hollander wrote:
 
  You are looking at a way to extract the snapshot, correct?
 
 No.
 
  Why would
  you want to mount it and backup the files?
 
 because then I can do things like incremental backups. There will be a
 ceph cluster at an ISP soon, who hosts various services on various VMs,
 and it is important that the mailspool for example is backed up
 efficiently, because it's huge and the number of files is also high.

Note that an alternative way to approach incremental backups is at the 
block device level.  We plan to implement an incremental backup function 
for the relative change between two snapshots (or a snapshot and the 
head).  It's O(n) in the size of the device rather than in the number of 
files, but should be more efficient for all but the most sparse of images.  The 
implementation should be simple; the challenge is mostly around the 
incremental file format, probably.

That doesn't help you now, but would be a relatively self-contained piece 
of functionality for someone to contribute to RBD.  This isn't a top 
priority yet, so it will be a while before the inktank devs can get to it.

sage


 
  Couldn't you better handle this in the Virtual Machine itself?
 
 not really. open, changing files, a lot of virtual machines that one
 needs to take care of, and so on.
 
  If you want to backup the virtual machines to an extern location you
  could use either rbd or qemu-img to get the snapshot out of the Ceph
  cluster:
  
  $ rbd export --snap <snapshot> <image> <destination>
  
  Or use qemu-img
  
  $ qemu-img convert -f raw -O qcow2 -s <snapshot> rbd:rbd/<image> <image>.qcow2
  
  You then get files which you can backup externally.
  
  Would that work?
 
 sure, but this is a very inefficient way of backing things up, because
 one would back up at the block level. I want to back up at the filesystem level.
 
  Wido
  
  thanks a lot for you answers
  Wolfgang
 
  
  
 
 
 -- 
 DI (FH) Wolfgang Hennerbichler
 Software Development
 Unit Advanced Computing Technologies
 RISC Software GmbH
 A company of the Johannes Kepler University Linz
 
 IT-Center
 Softwarepark 35
 4232 Hagenberg
 Austria
 
 Phone: +43 7236 3343 245
 Fax: +43 7236 3343 250
 wolfgang.hennerbich...@risc-software.at
 http://www.risc-software.at
 ___
 ceph-users mailing list
 ceph-us...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 