[pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG
Hi,

is there any way to speed up PVE Backups?

I'm trying to evaluate the optimal method for doing backups, but they take
very long.

I'm trying to use vzdump on top of NFS on top of Btrfs with zlib
compression.

The target FS is totally idle, but the backup is running at a very low speed.

The output after 15 minutes is:
INFO: starting new backup job: vzdump 132 --remove 0 --mode snapshot
--storage vmbackup --node 1234
INFO: Starting Backup of VM 132 (qemu)
INFO: status = running
INFO: update VM 132: -lock backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating archive
'/mnt/pve/vmbackup/dump/vzdump-qemu-132-2016_02_16-09_05_28.vma'
INFO: started backup task 'e75cc760-2be6-4731-b65b-f78b2832baf9'
INFO: status: 0% (79036416/1073741824000), sparse 0% (10665984),
duration 3, 26/22 MB/s
INFO: status: 1% (10742530048/1073741824000), sparse 0% (6539657216),
duration 411, 26/10 MB/s
INFO: status: 2% (21485322240/1073741824000), sparse 1% (17281466368),
duration 820, 26/0 MB/s

This is PVE 3.4 running Qemu 2.4

Greets,
Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Andreas Steinel
Hi Stefan,

That's really slow.

I use a similar setup, but with ZFS, and I back up 6 nodes in parallel to the
storage, saturating the 1 GBit network connection.
I use LZOP on the Proxmox side as the best tradeoff between size and
online compression speed.





Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Martin Waschbüsch
Hi Stefan,

> Am 16.02.2016 um 09:22 schrieb Stefan Priebe - Profihost AG 
> :
> 
> Hi,
> 
> is there any way to speed up PVE Backups?
> 
> I'm trying to evaluate the optimal method doing backups but they took
> very long.
> 
> I'm trying to use vzdump on top of nfs on top of btrfs using zlib
> compression.
> 
> The target FS it totally idle but the backup is running at a very low speed.
> 
> The output after 15 minutes is:
> INFO: starting new backup job: vzdump 132 --remove 0 --mode snapshot
> --storage vmbackup --node 1234
> INFO: Starting Backup of VM 132 (qemu)
> INFO: status = running
> INFO: update VM 132: -lock backup
> INFO: backup mode: snapshot
> INFO: ionice priority: 7
> INFO: snapshots found (not included into backup)
> INFO: creating archive
> '/mnt/pve/vmbackup/dump/vzdump-qemu-132-2016_02_16-09_05_28.vma'
> INFO: started backup task 'e75cc760-2be6-4731-b65b-f78b2832baf9'
> INFO: status: 0% (79036416/1073741824000), sparse 0% (10665984),
> duration 3, 26/22 MB/s
> INFO: status: 1% (10742530048/1073741824000), sparse 0% (6539657216),
> duration 411, 26/10 MB/s
> INFO: status: 2% (21485322240/1073741824000), sparse 1% (17281466368),
> duration 820, 26/0 MB/s
> 
> This is PVE 3.4 running Qemu 2.4

To me this looks like the compression is the limiting factor? What speed do you 
get for this NFS mount when just copying an existing file?

Anyway, first of all, if network bandwidth and backup disk space are not 
limiting factors, lzo compression is *way* faster than gzip.

Having said that, there is a way to speed up gzip compression: I am not sure if 
that option already exists in PVE 3.4, but new versions of vzdump recognize an 
option in /etc/vzdump.conf:
pigz: n
where n is the number of threads that pigz (parallel implementation of gzip) 
should use.
Obviously, you also need to have pigz installed (apt-get install pigz).
This can drastically speed up the backups, but at the cost of much higher load
during the backup.
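
For illustration, a minimal sketch (the thread count of 4 here is just an
example; check the vzdump.conf man page of your PVE version for the exact
option semantics):

# apt-get install pigz
# echo "pigz: 4" >> /etc/vzdump.conf
(4 = number of pigz threads to use instead of single-threaded gzip)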

The other thing I did to minimize backup times is to ensure that unused disk space
inside KVM guests is zeroed so that there are more sparse blocks. Helps a lot!
Personally, I do this by having images on a ceph cluster which supports using 
discard (trim), so VMs take care of this themselves.
But there are other options like zerofree, etc.
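
For example, assuming the guest disks are attached with discard/trim support,
running this inside the guest releases the unused blocks:

# fstrim -av
(trims all mounted filesystems that support it, verbose output)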

Just my five cents...

Martin


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG
Am 16.02.2016 um 09:54 schrieb Andreas Steinel:
> Hi Stefan,
> 
> That's really slow.

Yes

> I use a similar setup, but with ZFS and I backup 6 nodes in parallel to
> the storage and saturate the 1 GBit network connection.

Currently vzdump / qemu only uses around 100 KB/s of the 10 Gbit/s
connection.

> I use LZOP on the Proxmox-side as best tradeoff between size and
> online-compression speed. 

In my tests I'm using no compression at all - but the dump speed is
still very slow.

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG
Am 16.02.2016 um 09:57 schrieb Martin Waschbüsch:
> Hi Stefan,

>> This is PVE 3.4 running Qemu 2.4
> 
> To me this looks like the compression is the limiting factor? What speed do 
> you get for this NFS mount when just copying an existing file?

Which compression? There is only FS compression involved, on the target side.
I'm not using any compression on the source side.

Writing to the NFS is very fast:
# du -sh /root/testfile
640M    /root/testfile

# dd if=/root/testfile of=/mnt/pve/vmbackup/dump/testfile bs=4M oflag=direct
159+1 records in
159+1 records out
671088620 bytes (671 MB) copied, 0,855779 s, 784 MB/s

> Anyway, first of all, if network bandwidth and backup disk space are not 
> limiting factors, lzo compression is *way* faster than gzip.

Sure it is but that's not the problem. The problem seems to be the
dumping speed of qemu.

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Martin Waschbüsch

> Am 16.02.2016 um 10:32 schrieb Stefan Priebe - Profihost AG 
> :
> 
> Am 16.02.2016 um 09:57 schrieb Martin Waschbüsch:
>> Hi Stefan,
> 
>>> This is PVE 3.4 running Qemu 2.4
>> 
>> To me this looks like the compression is the limiting factor? What speed do 
>> you get for this NFS mount when just copying an existing file?
> 
> Which compression? There is only FS compression on the target side
> involed. I'm not using any compression on the source side.

Sorry, reading the 26MB/s I just assumed it sort of HAD to be compressed for it 
to be so slow...
What kind of storage backend do you use for the images on the source side?
Can you dd a disk image from that backend to the nfs mount with good speed?

> Writing to the NFS is very fast:
> # du -sh /root/testfile
> 640M/root/testfile
> 
> # dd if=/root/testfile of=/mnt/pve/vmbackup/dump/testfile bs=4M oflag=direct
> 159+1 records in
> 159+1 records out
> 671088620 bytes (671 MB) copied, 0,855779 s, 784 MB/s

That looks like there should be ample bandwidth. ;-)


Martin


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Eduard Ahmatgareev
If you have cgroup restrictions on your VM, the VM is backed up with those
restrictions in place.




Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Martin Waschbüsch
Stefan,

> The output after 15 minutes is:
> INFO: starting new backup job: vzdump 132 --remove 0 --mode snapshot
> --storage vmbackup --node 1234
> INFO: Starting Backup of VM 132 (qemu)
> INFO: status = running
> INFO: update VM 132: -lock backup
> INFO: backup mode: snapshot
> INFO: ionice priority: 7
> INFO: snapshots found (not included into backup)

The last line above indicates that there are live snapshots (and that does not 
mean the snapshot made to create the backup).
Could that be the culprit?

Can you test with a VM that does not have a live snapshot?

Martin


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG
Am 16.02.2016 um 11:02 schrieb Martin Waschbüsch:
> 
>> Am 16.02.2016 um 10:32 schrieb Stefan Priebe - Profihost AG 
>> :
>>
>> Am 16.02.2016 um 09:57 schrieb Martin Waschbüsch:
>>> Hi Stefan,
>>
 This is PVE 3.4 running Qemu 2.4
>>>
>>> To me this looks like the compression is the limiting factor? What speed do 
>>> you get for this NFS mount when just copying an existing file?
>>
>> Which compression? There is only FS compression on the target side
>> involed. I'm not using any compression on the source side.
> 
> Sorry, reading the 26MB/s I just assumed it sort of HAD to be compressed for 
> it to be so slow...
> What kind of storage backend do you use for the images on the source side?

The storage backend is Ceph using 2x 10 Gbit/s, and I'm able to read from it
at 500-1500 MB/s. See below for an example.

> Can you dd a disk image from that backend to the nfs mount with good speed?

Yes.

# time rbd -p cephstor4 export vm-264-disk-1 abc.img
Exporting image: 100% complete...done.

real    1m30.828s
user    1m3.858s
sys     0m43.645s

# du -h abc.img
30G abc.img

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Dietmar Maurer
> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
> with 500-1500MB/s. See below for an example.

The backup process reads 64KB blocks, and it seems this slows down ceph.
This is a known behavior, but I found no solution to speed it up.



Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG

Am 16.02.2016 um 11:20 schrieb Dietmar Maurer:
>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
>> with 500-1500MB/s. See below for an example.
> 
> The backup process reads 64KB blocks, and it seems this slows down ceph.
> This is a known behavior, but I found no solution to speed it up.

Thanks for that hint. Currently it's awfully slow. Do you know where to
change this to other values? (just for testing)

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG

Am 16.02.2016 um 11:20 schrieb Dietmar Maurer:
>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
>> with 500-1500MB/s. See below for an example.
> 
> The backup process reads 64KB blocks, and it seems this slows down ceph.
> This is a known behavior, but I found no solution to speed it up.
> 

Is it enough to just change these:
debian/patches/0002-add-basic-backup-support-to-block-driver.patch:Currently backup cluster size is hardcoded to 65536 bytes.
debian/patches/0004-introduce-new-vma-archive-format.patch:+We use a cluster size of 65536, and use 8 bytes for each
debian/patches/0004-introduce-new-vma-archive-format.patch:+char buf[65536*header_clusters];
debian/patches/0004-introduce-new-vma-archive-format.patch:+#if VMA_CLUSTER_SIZE != 65536
debian/patches/backup-vma-remove-async-queue.patch:-char buf[65536*header_clusters];
debian/patches/backup-add-vma-binary.patch:+char buf[65536*header_clusters];
debian/patches/backup-add-vma-binary.patch:+#if VMA_CLUSTER_SIZE != 65536

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Dietmar Maurer
> Is it enough to just change these:

The whole backup algorithm is based on 64KB blocksize, so it
is not trivial (or impossible?) to change that.

Besides, I do not understand why reading 64KB is slow - ceph libraries
should have/use a reasonable readahead cache to make it fast?



Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Michael Rasmussen
On Tue, 16 Feb 2016 11:55:07 +0100 (CET)
Dietmar Maurer  wrote:

> 
> Besides, I do not understand why reading 64KB is slow - ceph libraries
> should have/use a reasonable readahead cache to make it fast?
> 
Due to the nature of the operation, that reading is treated as random
block reads by ceph, so a randread fio test with a block size of 64KB
should show similarly bad numbers.
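
Something like the following should reproduce it (a sketch; the pool and image
names are placeholders, and fio must be built with the rbd ioengine):

# fio --name=backup-read-sim --ioengine=rbd --clientname=admin \
      --pool=rbd --rbdname=vm-132-disk-1 \
      --rw=randread --bs=64k --iodepth=1 --direct=1 \
      --runtime=60 --time_based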

-- 
Hilsen/Regards
Michael Rasmussen



Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG
Am 16.02.2016 um 11:55 schrieb Dietmar Maurer:
>> Is it enough to just change these:
> 
> The whole backup algorithm is based on 64KB blocksize, so it
> is not trivial (or impossible?) to change that.
> 
> Besides, I do not understand why reading 64KB is slow - ceph libraries
> should have/use a reasonable readahead cache to make it fast?
> 

OK, I found the culprit, at least in my case. I have reasonable IOPS limits
set for my VMs, and it seems they are also active for the backup process.

But they're not tuned for 64k sequential I/O. This also means the
user does not get full performance while backups are running.

Have you seen the same while testing? If I remove the disk throttle
limits I see around 130 MB/s for the vma backup.
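
For testing, the limits can also be lifted at runtime via QMP (a sketch; the
device name is a placeholder, and this QEMU version expects all six values):

{ "execute": "block_set_io_throttle",
  "arguments": { "device": "drive-virtio0",
                 "bps": 0, "bps_rd": 0, "bps_wr": 0,
                 "iops": 0, "iops_rd": 0, "iops_wr": 0 } }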

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG

Am 16.02.2016 um 12:58 schrieb Stefan Priebe - Profihost AG:
> Am 16.02.2016 um 11:55 schrieb Dietmar Maurer:
>>> Is it enough to just change these:
>>
>> The whole backup algorithm is based on 64KB blocksize, so it
>> is not trivial (or impossible?) to change that.
>>
>> Besides, I do not understand why reading 64KB is slow - ceph libraries
>> should have/use a reasonable readahead cache to make it fast?

At least for Ceph, and also for the I/O limits, a block size of 4MB would
be ideal. Any chance to hack this in?

Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Dmitry Petuhov

16.02.2016 13:20, Dietmar Maurer wrote:
>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
>> with 500-1500MB/s. See below for an example.
>
> The backup process reads 64KB blocks, and it seems this slows down ceph.
> This is a known behavior, but I found no solution to speed it up.
I've just written a script to speed up my backups from ceph. It simply does
(actually a little more):

rbd snap create $SNAP
rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
rbd snap rm $SNAP
for every image in selected pools.
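
A minimal sketch of such a loop (pool name, dump directory and snapshot naming
are just placeholders):

POOL=rbd
DUMPDIR=/mnt/backup/dump
DATE=$(date +%Y_%m_%d)
for VOLUME in $(rbd -p $POOL ls); do
    SNAP=$POOL/$VOLUME@backup-$DATE
    rbd snap create $SNAP
    rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
    rbd snap rm $SNAP
done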

When exporting to a file, it's faster than my temporary HDD can write
(about 120MB/s). But exporting to STDOUT ('-' instead of a filename, with
compression or without it) noticeably decreases speed to qemu's levels
(20-30MB/s). That's a little strange.


This method is incompatible with PVE's backup-restore tools, but good 
enough for manual disaster recovery from CLI.




Re: [pve-devel] Speed up PVE Backup

2016-02-16 Thread Stefan Priebe - Profihost AG

Am 16.02.2016 um 15:50 schrieb Dmitry Petuhov:
> 16.02.2016 13:20, Dietmar Maurer wrote:
>>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
>>> with 500-1500MB/s. See below for an example.
>> The backup process reads 64KB blocks, and it seems this slows down ceph.
>> This is a known behavior, but I found no solution to speed it up.
> Just done script to speedup my backups from ceph. It's simply does
> (actually little more):
> rbd snap create $SNAP
> rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
> rbd snap rm $SNAP
> for every image in selected pools.
> 
> When exporting to file, it's faster than my temporary HDD can write
> (about 120MB/s). But exporting to STDOUT ('-' instead of filename, with
> compression or without it) noticeably decreases speed to qemu's levels
> (20-30MB/s). That's little strange.
> 
> This method is incompatible with PVE's backup-restore tools, but good
> enough for manual disaster recovery from CLI.

Right - that's working for me too, but only at night, and not when a
single user wants a backup incl. the config RIGHT NOW.



Re: [pve-devel] Speed up PVE Backup

2016-02-18 Thread Stefan Priebe

Hello Dietmar,

Am 16.02.2016 um 14:55 schrieb Stefan Priebe - Profihost AG:
> Am 16.02.2016 um 12:58 schrieb Stefan Priebe - Profihost AG:
>> Am 16.02.2016 um 11:55 schrieb Dietmar Maurer:
>>>> Is it enough to just change these:
>>>
>>> The whole backup algorithm is based on 64KB blocksize, so it
>>> is not trivial (or impossible?) to change that.



Regarding readahead - you mentioned it earlier - this is not working
with ceph and qemu backup.

Ceph drops its own readahead cache after the first 50MB to let the OS
do its own readahead, which makes sense. But qemu itself does not do
any readahead while doing backups.


Any reason to use 64k blocks and not something bigger? Backups should be 
sequential on all storage backends so something bigger shouldn't hurt.


Greets,
Stefan


Re: [pve-devel] Speed up PVE Backup

2016-02-18 Thread Alexandre DERUMIER
>> Any reason to use 64k blocks and not something bigger? Backups should be
>> sequential on all storage backends so something bigger shouldn't hurt.

Just found an old thread about this, from when Dietmar sent the first backup
patches to the qemu mailing list:
https://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg00387.html

(I haven't read the whole discussion.)

I wonder how the native qemu backup blockjob performs vs. the proxmox vma
backup format?







Re: [pve-devel] Speed up PVE Backup

2016-02-18 Thread Alexandre DERUMIER
>> Ceph drops it's own readahead cache after the first 50MB to let the OS
>> do his own readahead which makes sense. But qemu itself does not have
>> it's own readahead while doing backups.

Does it help (for backup) if you always force readahead with:

rbd readahead disable after bytes = 0
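
i.e. something like this in the client section of ceph.conf (a sketch; the
max-bytes value is only an example, defaults differ per release):

[client]
rbd readahead disable after bytes = 0
rbd readahead max bytes = 4194304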





Re: [pve-devel] Speed up PVE Backup

2016-02-18 Thread Dietmar Maurer
> Any reason to use 64k blocks and not something bigger? 

Again, take a look at the qemu backup code. I guess it is
possible, but not trivial.



Re: [pve-devel] Speed up PVE Backup

2016-02-18 Thread Dietmar Maurer
> Any reason to use 64k blocks and not something bigger? Backups should be 
> sequential on all storage backends so something bigger shouldn't hurt.

Maybe you can raise that question on the qemu devel list?



Re: [pve-devel] Speed up PVE Backup

2016-02-19 Thread Dietmar Maurer
> I wonder how perform the native qemu backup blockjob vs proxmox vma backup
> format ?

We use the qemu backup blockjob, just slightly modified...



Re: [pve-devel] Speed up PVE Backup

2016-03-01 Thread Alexandre DERUMIER
Hi, the qemu devs have sent patches to make the backup cluster size configurable:

http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg06062.html
http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg06064.html

It would be great to see if it improves ceph backup performance.




Re: [pve-devel] Speed up PVE Backup

2016-03-01 Thread Stefan Priebe - Profihost AG

Am 01.03.2016 um 11:03 schrieb Alexandre DERUMIER:
> Hi, qemu devs have send patches to configure backup cluster size:
> 
> http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg06062.html
> http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg06064.html
> 
> 
> Could be great to see if it's improve ceph backup performance.

Yes it does. We already tested it - it just does not work with VMA. We
need to dump qcow2 files, because vma crashes when the cluster size it
gets is not 64kb.



Re: [pve-devel] Speed up PVE Backup

2016-03-01 Thread Alexandre DERUMIER
>>Yes it does. We already tested it - it just does not work with VMA. We
>>need to dump qcow2 files cause vma is crashing when the clustersize it
>>get's is not 64kb.

How do you back up to qcow2?

@Dietmar: is it possible to improve the vma format to handle different
cluster sizes?


Re: [pve-devel] Speed up PVE Backup

2016-03-01 Thread Stefan Priebe - Profihost AG

Am 01.03.2016 um 14:05 schrieb Alexandre DERUMIER:
>>> Yes it does. We already tested it - it just does not work with VMA. We
>>> need to dump qcow2 files cause vma is crashing when the clustersize it
>>> get's is not 64kb.
> 
> How do you backup to qcow2 ?

We're currently implementing our own backup system, which uses the
drive-backup QMP command to dump qcow2. We don't need the config files.
And this works with ceph and a 4MB cluster size.
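
For illustration, the QMP call looks roughly like this (a sketch; the device
name and target path are placeholders):

{ "execute": "drive-backup",
  "arguments": { "device": "drive-virtio0",
                 "sync": "full",
                 "format": "qcow2",
                 "target": "/mnt/pve/vmbackup/dump/vm-132-disk-1.qcow2" } }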

Greets,
Stefan



Re: [pve-devel] Speed up PVE Backup

2016-03-02 Thread Alexandre DERUMIER
>>We're currently implementing our own backup system which uses the 
>>drive-backup qmp routine to dump qcow2. We don't need the config files. 
>>And this works with ceph and 4MB cluster size. 

How does it perform (64K vs 4MB) and how is the stability, compared to the
proxmox vma format?

I'm curious, because I'm still afraid to use the proxmox feature when I see all
the stability bug reports on the forum.
And I would like to test incremental backup too (full backups only is not an
option for me).










Re: [pve-devel] Speed up PVE Backup

2016-03-04 Thread Timo Grodzinski
Hi Alexandre,

Am 02.03.2016 um 15:23 schrieb Alexandre DERUMIER:
>>>We're currently implementing our own backup system which uses the
>>>drive-backup qmp routine to dump qcow2. We don't need the config files.
>>>And this works with ceph and 4MB cluster size.
> 
> How about speed performance (64K,4MB) and stability   vs proxmox vma
> format ?

We (Stefan and I) do not have benchmarks that are fully comparable.

We tested backing up a 100GB virtio drive from ceph to NFS:

- with VMA, default 64KB cluster size and without read/write limits
  (times in mm:ss):

| Compression | Snapshot | Suspend |  Stop |
|-------------+----------+---------+-------|
| none        |    09:55 |   09:37 | 09:35 |
| LZO         |    10:24 |   10:26 | 10:20 |
| GZip        |    31:52 |   32:22 |       |

- with QMP drive-backup (params sync => full, format => qcow2), 4MB cluster
  size and limits of 130 MB/s, 215 ops/s read; 90 MB/s, 155 ops/s write:

  09:15 minutes.

QMP drive-backup with above limits and default cluster size 64KB took
very long, more than 40 minutes...

> I'm curious, because I still fear to use proxmox feature when I see all
> the stability bugs reports on the forum.
> And I would like to test incremental backup too. (only full backup is
> not possible for me)

We don't have much experience with stability for plain QMP drive-backup
yet, sorry.

Greets,
Timo


Re: [pve-devel] Speed up PVE Backup

2016-07-19 Thread Eneko Lacunza

Hi all,

El 16/02/16 a las 15:52, Stefan Priebe - Profihost AG escribió:
> Am 16.02.2016 um 15:50 schrieb Dmitry Petuhov:
>> 16.02.2016 13:20, Dietmar Maurer wrote:
>>>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read from it
>>>> with 500-1500MB/s. See below for an example.
>>> The backup process reads 64KB blocks, and it seems this slows down ceph.
>>> This is a known behavior, but I found no solution to speed it up.
>> Just done script to speedup my backups from ceph. It's simply does
>> (actually little more):
>> rbd snap create $SNAP
>> rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
>> rbd snap rm $SNAP
>> for every image in selected pools.
>>
>> When exporting to file, it's faster than my temporary HDD can write
>> (about 120MB/s). But exporting to STDOUT ('-' instead of filename, with
>> compression or without it) noticeably decreases speed to qemu's levels
>> (20-30MB/s). That's little strange.
>>
>> This method is incompatible with PVE's backup-restore tools, but good
>> enough for manual disaster recovery from CLI.
> right - that'S working for me too but just at night and not when a
> single user wants RIGHT now a backup incl. config.
Do we have any improvement related to this in the pipeline? Yesterday
our 9-OSD 3-node cluster restored a backup at 6MB/s... it was very
boring, painful and expensive to wait for it :) (I decided to buy a new
server to replace our 7.5-year-old IBM while waiting ;) )


Our backups are slow too, but we do those during the weekend... but usually
we want to restore fast... :)


Dietmar, I haven't looked at the backup/restore code, but do you think
we could do something to read/write to storage in larger chunks than the
current 64KB? I'm coming out of a high-workload period and maybe could look
at this issue this summer.


Thanks
Eneko

--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



Re: [pve-devel] Speed up PVE Backup

2016-07-20 Thread Eneko Lacunza

Hi again,

I've been looking around the backup/restore code a bit. I'm focused on
restore acceleration on Ceph RBD right now.


Sorry if I have gotten something wrong; I have never developed for Proxmox/Qemu.

In line 563 of
https://git.proxmox.com/?p=pve-qemu-kvm.git;a=blob;f=debian/patches/pve/0011-introduce-new-vma-archive-format.patch;h=1c26209648c210f3b18576abc2c5a23768fd7c7b;hb=HEAD
I see the function restore_write_data; it calls full_write (for direct-to-file
restore) and bdrv_write (which I suppose is a QEMU abstraction of the block
device).


This is called from restore_extents, where a comment precisely says "try 
to write whole clusters to speedup restore", so this means we're writing 
64KB-8Byte chunks, which is giving a hard time to Ceph-RBD because this 
means lots of ~64KB IOPS.


So, I suggest the following solution for your consideration:
- Create a write buffer on startup (let's assume it's 4MB for example, a
number ceph rbd would like much more than 64KB). This could even be
configurable, and skip the buffer altogether if buffer_size=cluster_size.
- Wrap current "restore_write_data" with a 
"restore_write_data_with_buffer", that does a copy to the 4MB buffer, 
and only calls "restore_write_data" when it's full.
* Create a new "flush_restore_write_data_buffer" to flush the write 
buffer when device restore reading is complete.


Do you think this is a good idea? If so I will find time to implement 
and test this to check whether restore time improves.


Thanks a lot
Eneko




Re: [pve-devel] Speed up PVE Backup

2016-07-20 Thread Dietmar Maurer
> This is called from restore_extents, where a comment precisely says "try 
> to write whole clusters to speedup restore", so this means we're writing 
> 64KB-8Byte chunks, which is giving a hard time to Ceph-RBD because this 
> means lots of ~64KB IOPS.
> 
> So, I suggest the following solution to your consideration:
> - Create a write buffer on startup (let's asume it's 4MB for example, a 
> number ceph rbd would like much more than 64KB). This could even be 
> configurable and skip the buffer altogether if buffer_size=cluster_size
> - Wrap current "restore_write_data" with a 
> "restore_write_data_with_buffer", that does a copy to the 4MB buffer, 
> and only calls "restore_write_data" when it's full.
>  * Create a new "flush_restore_write_data_buffer" to flush the write 
> buffer when device restore reading is complete.
> 
> Do you think this is a good idea? If so I will find time to implement 
> and test this to check whether restore time improves.

We store those 64KB blocks out of order, so your suggestion will not work
in general.

But you can try to assemble larger blocks, and write them once you get
an out of order block...

I always thought the ceph libraries do (or should do) that anyway
(write combining)?



Re: [pve-devel] Speed up PVE Backup

2016-07-20 Thread Lindsay Mathieson

On 20/07/2016 4:24 PM, Eneko Lacunza wrote:
> Yesterday our 9-osd 3-node cluster restored a backup at 6MB/s... it
> was very boring, painfull and expensive to wait for it


One of the reasons we migrated away from ceph - snapshot and backup 
restores were unusably slow.


--
Lindsay Mathieson



Re: [pve-devel] Speed up PVE Backup

2016-07-20 Thread Eneko Lacunza

El 20/07/16 a las 17:46, Dietmar Maurer escribió:
>> This is called from restore_extents, where a comment precisely says "try
>> to write whole clusters to speedup restore", so this means we're writing
>> 64KB-8Byte chunks, which is giving a hard time to Ceph-RBD because this
>> means lots of ~64KB IOPS.
>>
>> So, I suggest the following solution to your consideration:
>> - Create a write buffer on startup (let's asume it's 4MB for example, a
>> number ceph rbd would like much more than 64KB). This could even be
>> configurable and skip the buffer altogether if buffer_size=cluster_size
>> - Wrap current "restore_write_data" with a
>> "restore_write_data_with_buffer", that does a copy to the 4MB buffer,
>> and only calls "restore_write_data" when it's full.
>>   * Create a new "flush_restore_write_data_buffer" to flush the write
>> buffer when device restore reading is complete.
>>
>> Do you think this is a good idea? If so I will find time to implement
>> and test this to check whether restore time improves.
> We store those 64KB blocks out of order, so your suggestion will not work
> in general.

But I suppose they're mostly ordered?

> But you can try to assemble larger blocks, and write them once you get
> an out of order block...

Yes, this is the plan.

> I always thought the ceph libraries does (or should do) that anyways?
> (write combining)

Reading the docs:
http://docs.ceph.com/docs/hammer/rbd/rbd-config-ref/

It should be true when write-back rbd cache is activated. This seems to
be the default, but maybe we're using disk cache setting on restore too?


I'll try to change the disk cache setting and will report the results.
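
If an admin socket is configured for the librbd client, the effective cache
settings can also be checked at runtime (a sketch; the socket path is a
placeholder):

# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok config show | grep rbd_cache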

Thanks
Eneko




Re: [pve-devel] Speed up PVE Backup

2016-07-21 Thread Dietmar Maurer
> But I suppose they're mostly ordered?

Yes - it depends on how many writes happen during the backup...

> > But you can try to assemble larger blocks, and write them once you get
> > an out of order block...
> Yes, this is the plan.
> > I always thought the ceph libraries does (or should do) that anyways?
> > (write combining)
> Reading the docs:
> http://docs.ceph.com/docs/hammer/rbd/rbd-config-ref/
> 
> It should be true when write-back rbd cache is activated. This seems to 
> be the default, but maybe we're using disk cache setting on restore too?
> 
> I'll try to change the disk cache setting and will report the results.

thanks!



Re: [pve-devel] Speed up PVE Backup

2016-07-21 Thread Eneko Lacunza

Hi,

El 21/07/16 a las 09:34, Dietmar Maurer escribió:
>>> But you can try to assemble larger blocks, and write them once you get
>>> an out of order block...
>> Yes, this is the plan.
>>> I always thought the ceph libraries does (or should do) that anyways?
>>> (write combining)
>> Reading the docs:
>> http://docs.ceph.com/docs/hammer/rbd/rbd-config-ref/
>>
>> It should be true when write-back rbd cache is activated. This seems to
>> be the default, but maybe we're using disk cache setting on restore too?
>>
>> I'll try to change the disk cache setting and will report the results.
> thanks!


Looking at more docs:
http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/

This says:
"
QEMU’s cache settings override Ceph’s default settings (i.e., settings 
that are not explicitly set in the Ceph configuration file). If you 
explicitly set RBD Cache settings in your
Ceph configuration file, your Ceph settings override the QEMU cache 
settings. If you set cache settings on the QEMU command line, the QEMU 
command line settings override the Ceph configuration file settings.

"
I have been doing tests all morning with a different backup (only one 
10GB disk) so that I could perform tests faster.


I thought maybe we were restoring without writeback cache (rbd cache), 
but have tried the following ceph.conf tweaks and conclude that rbd 
cache is enabled:


1. If I set rbd cache = true I get the same performance.
2. If I set rbd cache writethrough until flush = false (rbd cache = true is
not necessary), I get 2-3x the restore performance. The default (true) is a
safety measure for non-flushing virtio drivers - no writeback happens until a
flush is detected - but turning it off is safe for a restore.

I think qmrestore isn't issuing any flush request (except maybe at the end),
so for the ceph storage backend we should set
rbd_cache_writethrough_until_flush=false for better performance.
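
For reference, the tweak tested above in ceph.conf form (a sketch):

[client]
rbd cache = true
rbd cache writethrough until flush = false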


Restore is happening at about 30-45MB/s vs 15MB/s before, but all this 
may be affected by a slow OSD, so I don't think my absolute figures are 
good, only the fact that there is a noticeable improvement. (we'll have 
this fixed next week).


If someone can test and confirm this, it should be quite easy to patch 
qmrestore...


Thanks

Eneko



Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>But you can try to assemble larger blocks, and write them once you get 
>>an out of order block... 

>>I always thought the ceph libraries does (or should do) that anyways? 
>>(write combining) 

librbd does this if writeback is enabled (it merges/coalesces blocks).
But I'm not sure (I don't remember exactly, it needs to be verified) that it
works fine with the current backup restore or offline disk cloning.
(Maybe there is an fsync after each 64k block.)




Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>I think qmrestore isn't issuing any flush request (until maybe the end), 
Need to be checked! (but I think we open the restore block storage with
writeback, so I hope we send a flush)

>>so for ceph storage backend we should set 
>>rbd_cache_writethrough_until_flush=false for better performance. 

I think it's possible to pass this flag as a qemu block driver option when
opening the rbd storage:


http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/

qemu-img {command} [options] 
rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]


For qemu-img, or with the qemu drive option, I think it's possible to pass it
as an option: ":rbd_cache_writethrough_until_flush=false"
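
For example, a restore via qemu-img could pass it like this (a sketch; the
pool, image and source path are placeholders):

# qemu-img convert -p -O raw /mnt/pve/vmbackup/dump/vm-132-disk-1.raw \
      'rbd:rbd/vm-132-disk-1:rbd_cache=true:rbd_cache_writethrough_until_flush=false'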


But if the missing flush really is the problem, it should be added to the
restore command directly (maybe one flush every 4MB, for example).





Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Eneko Lacunza

Hi,

El 26/07/16 a las 10:04, Alexandre DERUMIER escribió:
>>> I think qmrestore isn't issuing any flush request (until maybe the end),
> Need to be checked! (but if I think we open restore block storage with
> writeback, so I hope we send flush)
>
>>> so for ceph storage backend we should set
>>> rbd_cache_writethrough_until_flush=false for better performance.
> I think it's possible to pass theses flag in qemu block driver option, when
> opening the rbd storage
>
> http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/
>
> qemu-img {command} [options]
> rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]
>
> for qemu-img or with qemu drive option, I think it's possible to send as
> option, ":rbd_cache_writethrough_until_flush=false"

I developed a small patch to do this, waiting to test it in our setup
(today or tomorrow).

> But if missing flush if really the problem, it should be added to restore
> command directly. (maybe 1 flush each 4MB for example)

This flush is needed only for Ceph RBD, so I think using the flag above
would be more correct.


There is no reason to flush a restored disk until the very end, really.
Issuing flushes every X MB could hurt other storages without need.

In fact all this is because Ceph is trying to "fix" broken virtio drivers... :)

Thanks
Eneko



Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>There is no reason to flush a restored disk until just the end, really. 
>>Issuing flushes every x MB could hurt other storages without need. 

I'm curious to see the host memory usage of a big restore (100GB) to local
file storage, with writeback and without any flushes?




Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Eneko Lacunza

Hi,

El 26/07/16 a las 10:32, Alexandre DERUMIER escribió:
>>> There is no reason to flush a restored disk until just the end, really.
>>> Issuing flushes every x MB could hurt other storages without need.
> I'm curious to see host memory usage of a big local file storage restore
> (100GB), with writeback without any flush ?
This is how it works right now ;) - not flushing doesn't mean the system
won't write data; it can just do so when it thinks it's a good time.


Cheers




Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>This is how it works right now ;) - not flushing doesn't mean system 
>>won't write data; it can just do so when it thinks is a good time.

I think that is true with filesystems (the fs will try to flush at regular
intervals), but I'm not sure when you write to a block device without doing
any flush.

I remember in the past, using cache=unsafe with qemu and a scsi drive, I was
able to write gigabytes of data into host memory
without any flush occurring.


