Hi again,

I've been looking around the backup/restore code a bit. I'm focused on restore acceleration on Ceph RBD right now.

Sorry if I have misunderstood something; I have never developed for Proxmox/QEMU before.

I see that at line 563 of
https://git.proxmox.com/?p=pve-qemu-kvm.git;a=blob;f=debian/patches/pve/0011-introduce-new-vma-archive-format.patch;h=1c26209648c210f3b18576abc2c5a23768fd7c7b;hb=HEAD
the function restore_write_data calls full_write (for a direct-to-file restore) and bdrv_write (which I suppose is QEMU's abstraction of a block device).

This is called from restore_extents, where a comment explicitly says "try to write whole clusters to speedup restore". So we are writing chunks of 64KB (minus 8 bytes), which gives Ceph RBD a hard time because it means lots of ~64KB IOPS.
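To put rough numbers on it (just back-of-the-envelope, assuming ~64KB per write): restoring a 32GB disk image means on the order of 500,000 write requests to RBD, whereas 4MB writes would cut that to roughly 8,000.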

So, I suggest the following solution for your consideration:
- Create a write buffer on startup (let's assume 4MB for example, a size Ceph RBD will like much more than 64KB). This could even be configurable, skipping the buffer altogether if buffer_size == cluster_size.
- Wrap the current "restore_write_data" with a "restore_write_data_with_buffer" that copies data into the 4MB buffer and only calls "restore_write_data" when it is full.
- Create a new "flush_restore_write_data_buffer" to flush the write buffer when reading for the device restore is complete.
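
To make the idea concrete, here is a very rough C sketch of what I mean. All names, the buffer size and the restore_write_data signature are just assumptions for illustration, not the actual vma reader code:

/* Rough sketch only -- names, sizes and signatures are assumptions for
 * illustration, not the real vma reader API. */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RESTORE_BUFFER_SIZE (4 * 1024 * 1024)  /* 4MB; could be made configurable */

/* Stand-in prototype for the existing writer, i.e. whatever ends up calling
 * full_write() or bdrv_write() today. */
int restore_write_data(int dev_id, uint64_t offset, const uint8_t *buf, size_t len);

typedef struct RestoreWriteBuffer {
    uint8_t *data;           /* allocated once at startup */
    uint64_t start_offset;   /* device offset of the first buffered byte */
    size_t used;             /* bytes currently buffered */
} RestoreWriteBuffer;

/* Flush whatever has been buffered with a single large write. Called when the
 * buffer is full, and once more when reading for the device restore is done. */
static int flush_restore_write_data_buffer(int dev_id, RestoreWriteBuffer *wb)
{
    int ret = 0;
    if (wb->used) {
        ret = restore_write_data(dev_id, wb->start_offset, wb->data, wb->used);
        wb->used = 0;
    }
    return ret;
}

/* Buffered wrapper around restore_write_data(): accumulate sequential cluster
 * writes and only hit the storage backend in RESTORE_BUFFER_SIZE chunks.
 * Assumes len <= RESTORE_BUFFER_SIZE (a cluster is much smaller than that). */
static int restore_write_data_with_buffer(int dev_id, RestoreWriteBuffer *wb,
                                          uint64_t offset, const uint8_t *buf,
                                          size_t len)
{
    /* Flush first if this chunk is not contiguous with the buffered data or
     * would not fit into the remaining space. */
    if (wb->used &&
        (wb->start_offset + wb->used != offset ||
         wb->used + len > RESTORE_BUFFER_SIZE)) {
        int ret = flush_restore_write_data_buffer(dev_id, wb);
        if (ret < 0) {
            return ret;
        }
    }
    if (wb->used == 0) {
        wb->start_offset = offset;
    }
    memcpy(wb->data + wb->used, buf, len);
    wb->used += len;
    return 0;
}

The flush would also run on any non-sequential write, so in the worst case we only lose a bit of buffering at extent boundaries.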

Do you think this is a good idea? If so, I will find time to implement and test it to check whether restore time improves.

Thanks a lot
Eneko


On 20/07/16 at 08:24, Eneko Lacunza wrote:
On 16/02/16 at 15:52, Stefan Priebe - Profihost AG wrote:
On 16.02.2016 at 15:50, Dmitry Petuhov wrote:
16.02.2016 13:20, Dietmar Maurer wrote:
The storage backend is Ceph using 2x 10Gbit/s, and I'm able to read from it
at 500-1500MB/s. See below for an example.
The backup process reads 64KB blocks, and it seems this slows down Ceph.
This is a known behavior, but I found no solution to speed it up.
I just made a script to speed up my backups from Ceph. It simply does
(actually a little more):
rbd snap create $SNAP
rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
rbd snap rm $SNAP
for every image in selected pools.

When exporting to a file, it's faster than my temporary HDD can write
(about 120MB/s). But exporting to STDOUT ('-' instead of a filename, with
or without compression) noticeably decreases speed to qemu's levels
(20-30MB/s). That's a little strange.

This method is incompatible with PVE's backup-restore tools, but good
enough for manual disaster recovery from CLI.
Right - that's working for me too, but only at night and not when a
single user wants a backup (incl. config) RIGHT now.
Do we have any improvement related to this in the pipeline? Yesterday our 9-OSD 3-node cluster restored a backup at 6MB/s... it was very boring, painful and expensive to wait for it :) (I decided to buy a new server to replace our 7.5-year-old IBM while waiting ;) )

Our backups are slow too, but we do those during the weekend... but usually we want to restore fast... :)

Dietmar, I haven't looked at the backup/restore code, but do you think we could do something to read from/write to storage in larger chunks than the current 64KB? I'm coming out of a high-workload period and maybe could look at this issue this summer.

Thanks
Eneko



--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
      943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
