Re: [pve-devel] ZFS-over-NFS

2016-07-26 Thread Dmitry Petuhov
26.07.2016 19:08, Andreas Steinel wrote:
>> Why not use the qcow2 format over generic NFS? It will give you
>> snapshot-rollback
>> features, and I don't think it will be much slower than the same features
>> at the ZFS level.
> I want to have send/receive as well, and I use QCOW2 on top of ZFS to have
> a "switch-to-that-snapshot" capability, which ZFS does not support.
What do you mean by send/receive?
So you want to combine ZFS and qcow2 exclusive snapshot features at the same time?
I don't think that's possible.

>> But in my experience, NFS storage for VMs is a bad idea: it causes huge
>> latencies under
>> load, leading to crashes. If you want ZFS-based shared storage, then ZFS
>> over iSCSI
>> is your choice.
> I don't want to use another FS on top of ZFS, which I need when using LXC
> (in addition to KVM). Then I need to trim it manually to get the free
> space back in ZFS. I have this at the moment and I do not like it.
I don't understand that either. You asked about "ZFS filesystems exported by NFS".
NFS IS another FS on top of ZFS.

So you want to use part of the host's local ZFS [almost] directly in containers,
like it was in
PVE 1-3 with OpenVZ?

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] ZFS-over-NFS

2016-07-26 Thread Andreas Steinel
On Mon, Jul 25, 2016 at 9:07 PM, Dmitry Petuhov wrote:

> Why not use the qcow2 format over generic NFS? It will give you
> snapshot-rollback
> features, and I don't think it will be much slower than the same features
> at the ZFS level.
>

I want to have send/receive as well, and I use QCOW2 on top of ZFS to have
a "switch-to-that-snapshot" capability, which ZFS does not support.


> But in my experience, NFS storage for VMs is a bad idea: it causes huge
> latencies under
> load, leading to crashes. If you want ZFS-based shared storage, then ZFS
> over iSCSI
> is your choice.
>

I don't want to use another FS on top of ZFS, which I need when using LXC
(in addition to KVM). Then I need to trim it manually to get the free
space back in ZFS. I have this at the moment and I do not like it.
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH] rbd: disable rbd_cache_writethrough_until_flush with cache=unsafe

2016-07-26 Thread Alexandre Derumier
Signed-off-by: Alexandre Derumier 
---
 ...-rbd_cache_writethrough_until_flush-with-.patch | 29 ++
 debian/patches/series  |  1 +
 2 files changed, 30 insertions(+)
 create mode 100644 
debian/patches/pve/0054-rbd-disable-rbd_cache_writethrough_until_flush-with-.patch

diff --git 
a/debian/patches/pve/0054-rbd-disable-rbd_cache_writethrough_until_flush-with-.patch
 
b/debian/patches/pve/0054-rbd-disable-rbd_cache_writethrough_until_flush-with-.patch
new file mode 100644
index 000..e1fab0b
--- /dev/null
+++ 
b/debian/patches/pve/0054-rbd-disable-rbd_cache_writethrough_until_flush-with-.patch
@@ -0,0 +1,29 @@
+From da5bf657823ed2f5a790363b5338f30be68de62b Mon Sep 17 00:00:00 2001
+From: Alexandre Derumier 
+Date: Tue, 26 Jul 2016 16:51:00 +0200
+Subject: [PATCH] rbd: disable rbd_cache_writethrough_until_flush with
+ cache=unsafe
+
+Signed-off-by: Alexandre Derumier 
+---
+ block/rbd.c | 4 
+ 1 file changed, 4 insertions(+)
+
+diff --git a/block/rbd.c b/block/rbd.c
+index 5bc5b32..5656028 100644
+--- a/block/rbd.c
++++ b/block/rbd.c
+@@ -544,6 +544,10 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
+ rados_conf_set(s->cluster, "rbd_cache", "true");
+ }
+ 
++if (flags & BDRV_O_NO_FLUSH) {
++  rados_conf_set(s->cluster, "rbd_cache_writethrough_until_flush", "false");
++}
++
+ r = rados_connect(s->cluster);
+ if (r < 0) {
+ error_setg(errp, "error connecting");
+-- 
+2.1.4
+
diff --git a/debian/patches/series b/debian/patches/series
index 3614309..c77c5da 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch
 pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch
 pve/0052-vnc-refactor-to-QIOChannelSocket.patch
 pve/0053-vma-use-BlockBackend-on-extract.patch
+pve/0054-rbd-disable-rbd_cache_writethrough_until_flush-with-.patch
 #see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all
 extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch
 extra/0001-i386-kvmvapic-initialise-imm32-variable.patch
-- 
2.1.4

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating writeback cache.

2016-07-26 Thread Alexandre DERUMIER
>>I wonder if this couldn't be fixed directly in rbd.c block driver

I have sent a patch, can you test it?



- Mail original -
De: "aderumier" 
À: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 16:46:58
Objet: Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, 
activating writeback cache.

I wonder if this couldn't be fixed directly in rbd.c block driver 


currently: 

block/rbd.c 

if (flags & BDRV_O_NOCACHE) { 
rados_conf_set(s->cluster, "rbd_cache", "false"); 
} else { 
rados_conf_set(s->cluster, "rbd_cache", "true"); 
} 


and in block.c 

int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough) 
{ 
*flags &= ~BDRV_O_CACHE_MASK; 

if (!strcmp(mode, "off") || !strcmp(mode, "none")) { 
*writethrough = false; 
*flags |= BDRV_O_NOCACHE; 
} else if (!strcmp(mode, "directsync")) { 
*writethrough = true; 
*flags |= BDRV_O_NOCACHE; 
} else if (!strcmp(mode, "writeback")) { 
*writethrough = false; 
} else if (!strcmp(mode, "unsafe")) { 
*writethrough = false; 
*flags |= BDRV_O_NO_FLUSH; 
} else if (!strcmp(mode, "writethrough")) { 
*writethrough = true; 
} else { 
return -1; 
} 

return 0; 
} 



As in qemu-img convert we don't send flushes, and in vma restore too we use 
BDRV_O_NO_FLUSH. 


I think we should add in rbd.c something like 

if (flags & BDRV_O_NOCACHE) { 
rados_conf_set(s->cluster, "rbd_cache", "false"); 
} else { 
rados_conf_set(s->cluster, "rbd_cache", "true"); 
} 

+ if (flags & BDRV_O_NO_FLUSH) { 
+ rados_conf_set(s->cluster, "rbd_cache_writethrough_until_flush", "false"); 
+ } 


I really think it's the right way (otherwise it's impossible to use the unsafe cache 
option), and it could be pushed to qemu upstream. 


- Mail original - 
De: "Eneko Lacunza"  
À: "pve-devel"  
Envoyé: Mardi 26 Juillet 2016 16:05:52 
Objet: Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, 
activating writeback cache. 

Hi, 

I'm unsure about whether it would be better to have one patch or two 
independent patches? 

El 26/07/16 a las 16:02, Alexandre DERUMIER escribió: 
> Hi, 
> 
> I think that qemu-img convert have the same problem 
> 
> I have found a commit from 2015 
> https://git.greensocs.com/fkonrad/mttcg/commit/80ccf93b884a2edab5ec62634758e942bba81b7c
>  
> 
> By default it don't send flush (cache=unsafe), and the commit add a flush on 
> image closing, because some storage like sheepdog, 
> can't do the flush by themself. 
> 
> maybe can we add the same flush just after the opening in convert. 
> 
> 
> 
> - Mail original - 
> De: "Eneko Lacunza"  
> À: "pve-devel"  
> Cc: "Eneko Lacunza"  
> Envoyé: Mardi 26 Juillet 2016 15:18:55 
> Objet: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating 
> writeback cache. 
> 
> From: Eneko Lacunza  
> 
> Signed-off-by: Eneko Lacunza  
> --- 
> .../0054-vma-force-enable-rbd-cache-for-qmrestore.patch | 17 
> + 
> debian/patches/series | 1 + 
> 2 files changed, 18 insertions(+) 
> create mode 100644 
> debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> 
> diff --git 
> a/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> new file mode 100644 
> index 000..d9722c7 
> --- /dev/null 
> +++ b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> @@ -0,0 +1,17 @@ 
> +Issue a bogus flush so that Ceph activates rbd cache, accelerating qmrestore 
> to RBD. 
> +--- 
> +Index: b/vma.c 
> +=== 
> +--- a/vma.c 
>  b/vma.c 
> +@@ -335,6 +335,9 @@ static int extract_content(int argc, cha 
> + 
> + BlockDriverState *bs = blk_bs(blk); 
> + 
> ++ /* This is needed to activate rbd cache (writeback/coalesce) */ 
> ++ bdrv_flush(bs); 
> ++ 
> + if (vma_reader_register_bs(vmar, i, bs, write_zero, ) < 0) { 
> + g_error("%s", error_get_pretty(errp)); 
> + } 
> + 
> diff --git a/debian/patches/series b/debian/patches/series 
> index 3614309..c858a30 100644 
> --- a/debian/patches/series 
> +++ b/debian/patches/series 
> @@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch 
> pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch 
> pve/0052-vnc-refactor-to-QIOChannelSocket.patch 
> pve/0053-vma-use-BlockBackend-on-extract.patch 
> +pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> #see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all 
> extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch 
> extra/0001-i386-kvmvapic-initialise-imm32-variable.patch 


-- 
Zuzendari Teknikoa / Director Técnico 
Binovo IT Human Project, S.L. 
Telf. 943493611 
943324914 
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 

Re: [pve-devel] OpenVZ 7

2016-07-26 Thread Dietmar Maurer
> With openvz 7 just being released
> (https://lists.openvz.org/pipermail/announce/2016-July/000664.html), are there
> any possible plans to add openvz back into the latest proxmox versions?

No (no way). We moved to LXC a long time ago...

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] OpenVZ 7

2016-07-26 Thread Thomas Lamprecht



On 07/26/2016 04:52 PM, Alex Wacker wrote:

Hello,

With openvz 7 just being released 
(https://lists.openvz.org/pipermail/announce/2016-July/000664.html), are there 
any possible plans to add openvz back into the latest proxmox versions?



OpenVZ, while keeping the same name, is now not the container technology itself but a 
management tool for VMs and CTs, so I'm not sure that question makes 
sense... :)


Quoting your posted link:


[..]
* Containers use cgroups and namespaces that limit, account for, and isolate
resource usage as isolated namespaces of a collection of processes. The
beancounters interface remains in place for backward compatibility and, at the
same time, acts as a proxy for actual cgroups and namespaces implementation.



That is exactly what LXC uses, so what reason would there be to add 
OpenVZ back to PVE, even if there were a possibility to do so (which 
there really isn't)?


Did you read the mail? They now have KVM/QEMU, which Proxmox VE has had since 
it exists; they no longer use modified custom kernels to support containers but 
technologies similar to LXC (I did not look closely, but maybe they even 
use LXC). That should be on the level of what we have been using since PVE 
4.0 (i.e. LXC).


I do not see a single feature we do not have:

KSM, memory hot-plugging, UUIDs for VMs and CTs (in our case named 
VMIDs), qcow2 disks (even if I would not count that as a new feature, as 
they have existed for a long time), a unified management tool, and the KVM/QEMU 
hypervisor are all things that PVE has had for quite some time.


So what do you mean exactly when writing "adding OpenVZ back to PVE"?

cheers,
Thomas


___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] OpenVZ 7

2016-07-26 Thread Alex Wacker
Hello,

With openvz 7 just being released 
(https://lists.openvz.org/pipermail/announce/2016-July/000664.html), are there 
any possible plans to add openvz back into the latest proxmox versions?


-- 
Alex Wacker

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating writeback cache.

2016-07-26 Thread Alexandre DERUMIER
I wonder if this couldn't be fixed directly in rbd.c block driver


currently:

block/rbd.c

if (flags & BDRV_O_NOCACHE) {
rados_conf_set(s->cluster, "rbd_cache", "false");
} else {
rados_conf_set(s->cluster, "rbd_cache", "true");
}


and in block.c

int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough)
{
*flags &= ~BDRV_O_CACHE_MASK;

if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
*writethrough = false;
*flags |= BDRV_O_NOCACHE;
} else if (!strcmp(mode, "directsync")) {
*writethrough = true;
*flags |= BDRV_O_NOCACHE;
} else if (!strcmp(mode, "writeback")) {
*writethrough = false;
} else if (!strcmp(mode, "unsafe")) {
*writethrough = false;
*flags |= BDRV_O_NO_FLUSH;
} else if (!strcmp(mode, "writethrough")) {
*writethrough = true;
} else {
return -1;
}

return 0;
}



As in qemu-img convert we don't send flushes, and in vma restore too we use 
BDRV_O_NO_FLUSH.


I think we should add in rbd.c something like

   if (flags & BDRV_O_NOCACHE) {
   rados_conf_set(s->cluster, "rbd_cache", "false");
   } else {
   rados_conf_set(s->cluster, "rbd_cache", "true");
   }

+   if (flags & BDRV_O_NO_FLUSH) {
+       rados_conf_set(s->cluster, "rbd_cache_writethrough_until_flush", "false");
+   }


I really think it's the right way (otherwise it's impossible to use the unsafe cache 
option), and it could be pushed to qemu upstream.
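
For illustration, a small self-contained toy follows. It is not QEMU code: the BDRV_O_* values are made up and parse_cache_mode() is only a simplified stand-in for bdrv_parse_cache_mode(). It just shows which librbd settings each cache= mode would end up with once such a BDRV_O_NO_FLUSH check is added to qemu_rbd_open().

/*
 * Toy illustration only -- not QEMU code. Flag values are arbitrary and
 * parse_cache_mode() is a simplified stand-in for bdrv_parse_cache_mode().
 */
#include <stdio.h>
#include <string.h>

#define BDRV_O_NOCACHE  (1 << 0)   /* bypass host page cache */
#define BDRV_O_NO_FLUSH (1 << 1)   /* guest flushes are ignored (cache=unsafe) */

static int parse_cache_mode(const char *mode, int *flags)
{
    *flags = 0;
    if (!strcmp(mode, "off") || !strcmp(mode, "none") ||
        !strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_NO_FLUSH;
    } else if (strcmp(mode, "writeback") && strcmp(mode, "writethrough")) {
        return -1;                 /* unknown mode */
    }
    return 0;
}

int main(void)
{
    const char *modes[] = { "none", "writethrough", "writeback", "unsafe" };

    for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
        int flags;
        if (parse_cache_mode(modes[i], &flags) < 0)
            continue;
        /* With the proposed check, only cache=unsafe turns the
         * writethrough-until-flush behaviour off up front. */
        printf("cache=%-12s -> rbd_cache=%-5s rbd_cache_writethrough_until_flush=%s\n",
               modes[i],
               (flags & BDRV_O_NOCACHE) ? "false" : "true",
               (flags & BDRV_O_NO_FLUSH) ? "false" : "true (librbd default)");
    }
    return 0;
}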


- Mail original -
De: "Eneko Lacunza" 
À: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 16:05:52
Objet: Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, 
activating writeback cache.

Hi, 

I'm unsure about whether it would be better to have one patch or two 
independent patches? 

El 26/07/16 a las 16:02, Alexandre DERUMIER escribió: 
> Hi, 
> 
> I think that qemu-img convert have the same problem 
> 
> I have found a commit from 2015 
> https://git.greensocs.com/fkonrad/mttcg/commit/80ccf93b884a2edab5ec62634758e942bba81b7c
>  
> 
> By default it don't send flush (cache=unsafe), and the commit add a flush on 
> image closing, because some storage like sheepdog, 
> can't do the flush by themself. 
> 
> maybe can we add the same flush just after the opening in convert. 
> 
> 
> 
> - Mail original - 
> De: "Eneko Lacunza"  
> À: "pve-devel"  
> Cc: "Eneko Lacunza"  
> Envoyé: Mardi 26 Juillet 2016 15:18:55 
> Objet: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating 
> writeback cache. 
> 
> From: Eneko Lacunza  
> 
> Signed-off-by: Eneko Lacunza  
> --- 
> .../0054-vma-force-enable-rbd-cache-for-qmrestore.patch | 17 
> + 
> debian/patches/series | 1 + 
> 2 files changed, 18 insertions(+) 
> create mode 100644 
> debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> 
> diff --git 
> a/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> new file mode 100644 
> index 000..d9722c7 
> --- /dev/null 
> +++ b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> @@ -0,0 +1,17 @@ 
> +Issue a bogus flush so that Ceph activates rbd cache, accelerating qmrestore 
> to RBD. 
> +--- 
> +Index: b/vma.c 
> +=== 
> +--- a/vma.c 
>  b/vma.c 
> +@@ -335,6 +335,9 @@ static int extract_content(int argc, cha 
> + 
> + BlockDriverState *bs = blk_bs(blk); 
> + 
> ++ /* This is needed to activate rbd cache (writeback/coalesce) */ 
> ++ bdrv_flush(bs); 
> ++ 
> + if (vma_reader_register_bs(vmar, i, bs, write_zero, ) < 0) { 
> + g_error("%s", error_get_pretty(errp)); 
> + } 
> + 
> diff --git a/debian/patches/series b/debian/patches/series 
> index 3614309..c858a30 100644 
> --- a/debian/patches/series 
> +++ b/debian/patches/series 
> @@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch 
> pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch 
> pve/0052-vnc-refactor-to-QIOChannelSocket.patch 
> pve/0053-vma-use-BlockBackend-on-extract.patch 
> +pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
> #see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all 
> extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch 
> extra/0001-i386-kvmvapic-initialise-imm32-variable.patch 


-- 
Zuzendari Teknikoa / Director Técnico 
Binovo IT Human Project, S.L. 
Telf. 943493611 
943324914 
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 
www.binovo.es 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 


Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating writeback cache.

2016-07-26 Thread Eneko Lacunza

Hi,

I'm unsure about whether it would be better to have one patch or two 
independent patches?


El 26/07/16 a las 16:02, Alexandre DERUMIER escribió:

Hi,

I think that qemu-img convert have the same problem

I have found a commit from 2015
https://git.greensocs.com/fkonrad/mttcg/commit/80ccf93b884a2edab5ec62634758e942bba81b7c

By default it don't send flush (cache=unsafe), and the commit add a flush on 
image closing, because some storage like sheepdog,
can't do the flush by themself.

maybe can we add the same flush just after the opening in convert.



- Mail original -
De: "Eneko Lacunza" 
À: "pve-devel" 
Cc: "Eneko Lacunza" 
Envoyé: Mardi 26 Juillet 2016 15:18:55
Objet: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD,   
activating writeback cache.

From: Eneko Lacunza 

Signed-off-by: Eneko Lacunza 
---
.../0054-vma-force-enable-rbd-cache-for-qmrestore.patch | 17 +
debian/patches/series | 1 +
2 files changed, 18 insertions(+)
create mode 100644 
debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch

diff --git 
a/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
new file mode 100644
index 000..d9722c7
--- /dev/null
+++ b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
@@ -0,0 +1,17 @@
+Issue a bogus flush so that Ceph activates rbd cache, accelerating qmrestore 
to RBD.
+---
+Index: b/vma.c
+===
+--- a/vma.c
 b/vma.c
+@@ -335,6 +335,9 @@ static int extract_content(int argc, cha
+
+ BlockDriverState *bs = blk_bs(blk);
+
++ /* This is needed to activate rbd cache (writeback/coalesce) */
++ bdrv_flush(bs);
++
+ if (vma_reader_register_bs(vmar, i, bs, write_zero, ) < 0) {
+ g_error("%s", error_get_pretty(errp));
+ }
+
diff --git a/debian/patches/series b/debian/patches/series
index 3614309..c858a30 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch
pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch
pve/0052-vnc-refactor-to-QIOChannelSocket.patch
pve/0053-vma-use-BlockBackend-on-extract.patch
+pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
#see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all
extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch
extra/0001-i386-kvmvapic-initialise-imm32-variable.patch



--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating writeback cache.

2016-07-26 Thread Alexandre DERUMIER
Hi,

I think that qemu-img convert has the same problem.

I have found a commit from 2015
https://git.greensocs.com/fkonrad/mttcg/commit/80ccf93b884a2edab5ec62634758e942bba81b7c

By default it doesn't send flushes (cache=unsafe), and the commit adds a flush on 
image closing, because some storages like sheepdog
can't do the flush by themselves.

Maybe we can add the same flush just after the opening in convert.



- Mail original -
De: "Eneko Lacunza" 
À: "pve-devel" 
Cc: "Eneko Lacunza" 
Envoyé: Mardi 26 Juillet 2016 15:18:55
Objet: [pve-devel] [PATCH] Add patch to improve qmrestore to RBD,   
activating writeback cache.

From: Eneko Lacunza  

Signed-off-by: Eneko Lacunza  
--- 
.../0054-vma-force-enable-rbd-cache-for-qmrestore.patch | 17 + 
debian/patches/series | 1 + 
2 files changed, 18 insertions(+) 
create mode 100644 
debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 

diff --git 
a/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
new file mode 100644 
index 000..d9722c7 
--- /dev/null 
+++ b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
@@ -0,0 +1,17 @@ 
+Issue a bogus flush so that Ceph activates rbd cache, accelerating qmrestore 
to RBD. 
+--- 
+Index: b/vma.c 
+=== 
+--- a/vma.c 
 b/vma.c 
+@@ -335,6 +335,9 @@ static int extract_content(int argc, cha 
+ 
+ BlockDriverState *bs = blk_bs(blk); 
+ 
++ /* This is needed to activate rbd cache (writeback/coalesce) */ 
++ bdrv_flush(bs); 
++ 
+ if (vma_reader_register_bs(vmar, i, bs, write_zero, ) < 0) { 
+ g_error("%s", error_get_pretty(errp)); 
+ } 
+ 
diff --git a/debian/patches/series b/debian/patches/series 
index 3614309..c858a30 100644 
--- a/debian/patches/series 
+++ b/debian/patches/series 
@@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch 
pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch 
pve/0052-vnc-refactor-to-QIOChannelSocket.patch 
pve/0053-vma-use-BlockBackend-on-extract.patch 
+pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
#see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all 
extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch 
extra/0001-i386-kvmvapic-initialise-imm32-variable.patch 
-- 
2.1.4 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH] Add patch to improve qmrestore to RBD, activating writeback cache.

2016-07-26 Thread elacunza
From: Eneko Lacunza 

Signed-off-by: Eneko Lacunza 
---
 .../0054-vma-force-enable-rbd-cache-for-qmrestore.patch | 17 +
 debian/patches/series   |  1 +
 2 files changed, 18 insertions(+)
 create mode 100644 
debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch

diff --git 
a/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch 
b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
new file mode 100644
index 000..d9722c7
--- /dev/null
+++ b/debian/patches/pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
@@ -0,0 +1,17 @@
+Issue a bogus flush so that Ceph activates rbd cache, accelerating qmrestore 
to RBD.
+---
+Index: b/vma.c
+===
+--- a/vma.c
++++ b/vma.c
+@@ -335,6 +335,9 @@ static int extract_content(int argc, cha
+
+BlockDriverState *bs = blk_bs(blk);
+
++/* This is needed to activate rbd cache (writeback/coalesce) */
++bdrv_flush(bs);
++
+ if (vma_reader_register_bs(vmar, i, bs, write_zero, &errp) < 0) {
+ g_error("%s", error_get_pretty(errp));
+ }
+
diff --git a/debian/patches/series b/debian/patches/series
index 3614309..c858a30 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -51,6 +51,7 @@ pve/0050-fix-possible-unitialised-return-value.patch
 pve/0051-net-NET_CLIENT_OPTIONS_KIND_MAX-changed.patch
 pve/0052-vnc-refactor-to-QIOChannelSocket.patch
 pve/0053-vma-use-BlockBackend-on-extract.patch
+pve/0054-vma-force-enable-rbd-cache-for-qmrestore.patch
 #see https://bugs.launchpad.net/qemu/+bug/1488363?comments=all
 extra/0001-Revert-target-i386-disable-LINT0-after-reset.patch
 extra/0001-i386-kvmvapic-initialise-imm32-variable.patch
-- 
2.1.4

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] Force enable rbd cache for qmrestore v3

2016-07-26 Thread elacunza
This time this is generated against the git repo.
The patch issues a "bogus" flush after opening the restore destination device to enable 
the rbd cache (writeback/coalescing).
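
For intuition, here is a minimal toy model — not librbd code; the struct and behaviour below are simplified assumptions based on how rbd_cache_writethrough_until_flush is described in this thread — of why that single early flush changes restore speed:

/*
 * Toy model only -- not librbd. It mimics the documented
 * rbd_cache_writethrough_until_flush behaviour: the cache stays in
 * writethrough mode until the client issues its first flush.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_rbd_cache {
    bool writethrough_until_flush;  /* librbd default: true */
    bool seen_flush;
};

static const char *write_mode(const struct toy_rbd_cache *c)
{
    if (c->writethrough_until_flush && !c->seen_flush)
        return "writethrough (every write waits for the OSDs -- slow restore)";
    return "writeback (writes are coalesced -- fast restore)";
}

int main(void)
{
    struct toy_rbd_cache c = { .writethrough_until_flush = true, .seen_flush = false };

    /* A restore never flushes, so without help it stays in the slow mode. */
    printf("plain restore:       %s\n", write_mode(&c));

    /* This patch: one "bogus" flush right after open flips the mode. */
    c.seen_flush = true;
    printf("after initial flush: %s\n", write_mode(&c));

    /* Alternative discussed for cache=unsafe: turn the option off up front. */
    struct toy_rbd_cache unsafe = { .writethrough_until_flush = false, .seen_flush = false };
    printf("option disabled:     %s\n", write_mode(&unsafe));
    return 0;
}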
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH WIP/test lxc 0/4] update to 2.0.3

2016-07-26 Thread Wolfgang Bumiller
This seems to work. I also just tested it with cgmanager disabled and
using cgroup namespaces. Seems to be functioning so far.

With cgroup namespaces however, manual intervention is required for
people who use custom apparmor profiles, because they must be based on
lxc-container-default-cgns instead of just lxc-container-default.

I think we should go push this set + --disable-cgmanager to
staging/testing soon.

On Tue, Jul 12, 2016 at 09:27:41AM +0200, Dominik Csapak wrote:
> this patch series updates to lxc 2.0.3
> just for testing purposes
> do a "make download" before building
> to get the latest source from github
> 
> Dominik Csapak (4):
>   update to 2.0.3
>   rebase systemd service patch and var lib vz patch
>   drop patches applied upstream
>   update changelog and pkg version
> 
>  Makefile   |   6 +-
>  debian/changelog   |   6 +
>  ...rmor-add-make-rslave-to-usr.bin.lxc-start.patch |  32 ---
>  debian/patches/0001-added-stop-hook-entries.patch  |  72 --
>  ...armor-allow-binding-run-lock-var-run-lock.patch |  32 ---
>  .../patches/0002-Added-lxc.monitor.unshare.patch   | 131 ---
>  ...-hook-between-STOPPING-and-STOPPED-states.patch |  27 ---
>  ...3-pass-namespace-handles-to-the-stop-hook.patch |  53 -
>  debian/patches/0004-document-the-stop-hook.patch   |  60 -
>  .../0005-added-the-unmount-namespace-hook.patch| 250 
> -
>  ...oks-put-binary-hooks-in-usr-lib-lxc-hooks.patch |  44 
>  debian/patches/fix-systemd-service-depends.patch   |   2 +-
>  debian/patches/series  |  10 -
>  debian/patches/use-var-lib-vz-as-default-dir.patch |  16 +-
>  14 files changed, 17 insertions(+), 724 deletions(-)
>  delete mode 100644 
> debian/patches/0001-AppArmor-add-make-rslave-to-usr.bin.lxc-start.patch
>  delete mode 100644 debian/patches/0001-added-stop-hook-entries.patch
>  delete mode 100644 
> debian/patches/0001-apparmor-allow-binding-run-lock-var-run-lock.patch
>  delete mode 100644 debian/patches/0002-Added-lxc.monitor.unshare.patch
>  delete mode 100644 
> debian/patches/0002-run-stop-hook-between-STOPPING-and-STOPPED-states.patch
>  delete mode 100644 
> debian/patches/0003-pass-namespace-handles-to-the-stop-hook.patch
>  delete mode 100644 debian/patches/0004-document-the-stop-hook.patch
>  delete mode 100644 debian/patches/0005-added-the-unmount-namespace-hook.patch
>  delete mode 100644 
> debian/patches/0006-hooks-put-binary-hooks-in-usr-lib-lxc-hooks.patch
> 
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH access-control] don't import 'RFC' from MIME::Base32

2016-07-26 Thread Fabian Grünbichler
applied

On Mon, Jul 25, 2016 at 08:33:29AM +0200, Wolfgang Bumiller wrote:
> call encode_rfc3548 explicitly instead as newer versions of
> the base32 package will drop this import scheme (stretch)
> ---
> One less breakage to worry about when we move to newer debian
> releases in the future.
> 
> Note that in the code in PVE/AccessControl.pm I had already used the
> explicit call, so only the import line was updated in that file.
> 
> Tested successfully with both jessie's libmime-base32-perl=1.02a-1
> as well as stretch's version 1.301-1.
> 
>  PVE/AccessControl.pm | 2 +-
>  oathkeygen   | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/PVE/AccessControl.pm b/PVE/AccessControl.pm
> index 0b64374..ea4245c 100644
> --- a/PVE/AccessControl.pm
> +++ b/PVE/AccessControl.pm
> @@ -8,7 +8,7 @@ use Crypt::OpenSSL::RSA;
>  use Net::SSLeay;
>  use Net::IP;
>  use MIME::Base64;
> -use MIME::Base32 qw(RFC); #libmime-base32-perl
> +use MIME::Base32; #libmime-base32-perl
>  use Digest::SHA;
>  use URI::Escape;
>  use LWP::UserAgent;
> diff --git a/oathkeygen b/oathkeygen
> index 84b6441..89e385a 100755
> --- a/oathkeygen
> +++ b/oathkeygen
> @@ -2,10 +2,10 @@
>  
>  use strict;
>  use warnings;
> -use MIME::Base32 qw(RFC); #libmime-base32-perl
> +use MIME::Base32; #libmime-base32-perl
>  
>  my $test;
>  open(RND, "/dev/urandom");
>  sysread(RND, $test, 10) == 10 || die "read randon data failed\n";
> -print MIME::Base32::encode($test) . "\n";
> +print MIME::Base32::encode_rfc3548($test) . "\n";
>  
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH container] add status call to pct

2016-07-26 Thread Fabian Grünbichler
applied

On Mon, Jul 18, 2016 at 01:47:53PM +0200, Dominik Csapak wrote:
> mostly copied from QemuServer
> 
> Signed-off-by: Dominik Csapak 
> ---
>  src/PVE/CLI/pct.pm | 40 
>  1 file changed, 40 insertions(+)
> 
> diff --git a/src/PVE/CLI/pct.pm b/src/PVE/CLI/pct.pm
> index 3e99313..cf4a014 100755
> --- a/src/PVE/CLI/pct.pm
> +++ b/src/PVE/CLI/pct.pm
> @@ -32,6 +32,45 @@ my $upid_exit = sub {
>  exit($status eq 'OK' ? 0 : -1);
>  };
>  
> +__PACKAGE__->register_method ({
> +name => 'status',
> +path => 'status',
> +method => 'GET',
> +description => "Show CT status.",
> +parameters => {
> + additionalProperties => 0,
> + properties => {
> + vmid => get_standard_option('pve-vmid', { completion => 
> \::LXC::complete_ctid }),
> + verbose => {
> + description => "Verbose output format",
> + type => 'boolean',
> + optional => 1,
> + }
> + },
> +},
> +returns => { type => 'null'},
> +code => sub {
> + my ($param) = @_;
> +
> + # test if CT exists
> + my $conf = PVE::LXC::Config->load_config ($param->{vmid});
> +
> + my $vmstatus = PVE::LXC::vmstatus($param->{vmid});
> + my $stat = $vmstatus->{$param->{vmid}};
> + if ($param->{verbose}) {
> + foreach my $k (sort (keys %$stat)) {
> + my $v = $stat->{$k};
> + next if !defined($v);
> + print "$k: $v\n";
> + }
> + } else {
> + my $status = $stat->{status} || 'unknown';
> + print "status: $status\n";
> + }
> +
> + return undef;
> +}});
> +
>  sub read_password {
>  my $term = new Term::ReadLine ('pct');
>  my $attribs = $term->Attribs;
> @@ -655,6 +694,7 @@ our $cmddef = {
>  clone => [ "PVE::API2::LXC", 'clone_vm', ['vmid', 'newid'], { node => 
> $nodename }, $upid_exit ],
>  migrate => [ "PVE::API2::LXC", 'migrate_vm', ['vmid', 'target'], { node 
> => $nodename }, $upid_exit],
>  
> +status => [ __PACKAGE__, 'status', ['vmid']],
>  console => [ __PACKAGE__, 'console', ['vmid']],
>  enter => [ __PACKAGE__, 'enter', ['vmid']],
>  unlock => [ __PACKAGE__, 'unlock', ['vmid']],
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH qemu-server] fix verbose qm status output

2016-07-26 Thread Fabian Grünbichler
applied

On Mon, Jul 18, 2016 at 10:50:31AM +0200, Dominik Csapak wrote:
> we did not check if some values were hash refs in
> the verbose output.
> 
> this patch adds a recursive hash print sub and uses it
> 
> Signed-off-by: Dominik Csapak 
> ---
>  PVE/CLI/qm.pm | 29 +++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> index d0d7a6c..e513f33 100755
> --- a/PVE/CLI/qm.pm
> +++ b/PVE/CLI/qm.pm
> @@ -75,6 +75,32 @@ sub run_vnc_proxy {
>  exit(0);
>  }
>  
> +sub print_recursive_hash {
> +my ($prefix, $hash, $key) = @_;
> +
> +if (ref($hash) eq 'HASH') {
> + if (defined($key)) {
> + print "$prefix$key:\n";
> + }
> + foreach my $itemkey (keys %$hash) {
> + print_recursive_hash("\t$prefix", $hash->{$itemkey}, $itemkey);
> + }
> +} elsif (ref($hash) eq 'ARRAY') {
> + if (defined($key)) {
> + print "$prefix$key:\n";
> + }
> + foreach my $item (@$hash) {
> + print_recursive_hash("\t$prefix", $item);
> + }
> +} elsif (!ref($hash) && defined($hash)) {
> + if (defined($key)) {
> + print "$prefix$key: $hash\n";
> + } else {
> + print "$prefix$hash\n";
> + }
> +}
> +}
> +
>  __PACKAGE__->register_method ({
>  name => 'showcmd',
>  path => 'showcmd',
> @@ -125,8 +151,7 @@ __PACKAGE__->register_method ({
>   foreach my $k (sort (keys %$stat)) {
>   next if $k eq 'cpu' || $k eq 'relcpu'; # always 0
>   my $v = $stat->{$k};
> - next if !defined($v);
> - print "$k: $v\n";
> + print_recursive_hash("", $v, $k);
>   }
>   } else {
>   my $status = $stat->{qmpstatus} || 'unknown';
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH qemu-server] Fix #1057: make protection a fast-plug option

2016-07-26 Thread Fabian Grünbichler
applied

On Tue, Jul 19, 2016 at 09:17:36AM +0200, Wolfgang Bumiller wrote:
> Otherwise you need to shutdown a VM to disable protection,
> which is inconvenient for a few tasks such as for instance
> deleting an unused disk.
> ---
> This is already the case for containers btw.
> 
>  PVE/QemuServer.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index fb91862..7778fb8 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -3871,6 +3871,7 @@ my $fast_plug_option = {
>  'shares' => 1,
>  'startup' => 1,
>  'description' => 1,
> +'protection' => 1,
>  };
>  
>  # hotplug changes in [PENDING]
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [pve-manager] prevent KRDB and Monitor(s) being localized

2016-07-26 Thread Fabian Grünbichler
(rebased and) applied

On Tue, Jul 12, 2016 at 12:14:01PM +0200, Emmanuel Kasper wrote:
> Both terms are rather domain specific and should not be translated.
> See http://pve.proxmox.com/pipermail/pve-devel/2016-July/021975.html
> for the problems of Monitor Host being wrongly translated
> ---
>  www/manager6/storage/RBDEdit.js | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/www/manager6/storage/RBDEdit.js b/www/manager6/storage/RBDEdit.js
> index a38ea52..5e5c802 100644
> --- a/www/manager6/storage/RBDEdit.js
> +++ b/www/manager6/storage/RBDEdit.js
> @@ -40,7 +40,7 @@ Ext.define('PVE.storage.RBDInputPanel', {
>   xtype: me.create ? 'textfield' : 'displayfield',
>   name: 'monhost',
>   value: '',
> - fieldLabel: gettext('Monitor Host'),
> + fieldLabel: 'Monitor(s)',
>   allowBlank: false
>   },
>   {
> @@ -76,7 +76,7 @@ Ext.define('PVE.storage.RBDInputPanel', {
>   xtype: 'pvecheckbox',
>   name: 'krbd',
>   uncheckedValue: 0,
> - fieldLabel: gettext('KRBD')
> + fieldLabel: 'KRBD'
>   }
>   ];
>   /*jslint confusion: false*/
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH v2 manager] Add mac prefix to the datacenter options

2016-07-26 Thread Fabian Grünbichler
applied with whitespace cleanup

On Fri, Jul 15, 2016 at 10:34:37AM +0200, Wolfgang Bumiller wrote:
> ---
>   * anchored the regex
>   * added regexText with an example
>   * changed the empty text to PVE.Util.noneText
> 
>  www/manager6/dc/OptionView.js | 37 +
>  1 file changed, 37 insertions(+)
> 
> diff --git a/www/manager6/dc/OptionView.js b/www/manager6/dc/OptionView.js
> index 2149cb0..f02b870 100644
> --- a/www/manager6/dc/OptionView.js
> +++ b/www/manager6/dc/OptionView.js
> @@ -99,6 +99,32 @@ Ext.define('PVE.dc.EmailFromEdit', {
>  }
>  });
>  
> +Ext.define('PVE.dc.MacPrefixEdit', {
> +extend: 'PVE.window.Edit',
> +
> +initComponent : function() {
> + var me = this;
> +
> + Ext.applyIf(me, {
> + subject: gettext('MAC address prefix'),
> + items: {
> + xtype: 'pvetextfield',
> + name: 'mac_prefix',
> + regex: /^[a-f0-9]{2}(?::[a-f0-9]{2}){0,2}:?$/i,
> + regexText: gettext('Example') + ': 02:8f',
> + emptyText: PVE.Utils.noneText,
> + deleteEmpty: true,
> + value: '',
> + fieldLabel: gettext('MAC address prefix')
> + }
> + });
> +
> + me.callParent();
> +
> + me.load();
> +}
> +});
> +
>  Ext.define('PVE.dc.OptionView', {
>  extend: 'PVE.grid.ObjectGrid',
>  alias: ['widget.pveDcOptionView'],
> @@ -146,6 +172,17 @@ Ext.define('PVE.dc.OptionView', {
>   }
>   return value;
>   }
> + },
> + mac_prefix: {
> + header: gettext('MAC address prefix'),
> + editor: 'PVE.dc.MacPrefixEdit', 
> + required: true,
> + renderer: function(value) {
> + if (!value) {
> + return PVE.Utils.noneText;
> + }
> + return value;
> + }
>   }
>   };
>  
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore v2

2016-07-26 Thread Thomas Lamprecht

Hi Eneko,


On 07/26/2016 02:05 PM, Eneko Lacunza wrote:

Hi Thomas,

El 26/07/16 a las 13:55, Thomas Lamprecht escribió:
Hi, first thanks for the contribution! Not commenting on the code 
itself but we need a CLA for being able to add your contributions,

we use the Harmony CLA, a community-centered CLA for FOSS projects,
see 
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Thanks, will send this shortly.


Perfect, thanks!



Also I would suggest using git format-patch and git send-email for 
sending the patch to the mailing list, so it is ensured that it can 
be applied here without problems.
A small example of the necessary commands can be found at 
https://pve.proxmox.com/wiki/Developer_Documentation#Git_commands_summary


I saw that, but in this case the modified vma.c file is in a tarball 
in the git repository — how should I proceed in this case? The output I sent 
was created by the quilt diff command. :)




Ah okay, I understand. I'd do the following: the patches from our side 
which turn upstream qemu into PVE qemu reside in 
/debian/patches/
The "pve" folder holds feature and fix patches from us, and the 
"extra" folder holds mostly security fixes which aren't in upstream yet.


So your patch would go into the pve folder: add a patch file there and 
add it to the /debian/patches/series file, e.g.:


if you add pve/force-enable-rbd-cache-for-qmrestore.patch

a line with:

pve/force-enable-rbd-cache-for-qmrestore.patch

should be added to the "series" file.

It's a bit complicated, I know...
Personally I do not use quilt at all but use a standard qemu repo with 
our patches applied on top in separate branches; this way I can do all 
the patchwork with git (which I'm used to).

But I do not think that that would make your case easier... :)

If you say it's just too tiresome/complicated for now, I can maybe help 
you and add a commit with a "Signed-off-by: Eneko Lacunza" line in there, 
so that copyright and commit attribution are correctly handled :)


cheers,
Thomas

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] 3 numa topology issues

2016-07-26 Thread Alexandre DERUMIER
> >>Issue #1: The above code currently does not honor our 'hostnodes' option 
> >>and breaks when trying to use them together. 

Also I need to check how to allocate hugepages when hostnodes is defined with 
a range like "hostnodes:0-1".





>>Useless, yes, which is why I'm wondering whether this should be 
>>supported/warned about/error... 

I think we could force "hostnodes" to be defined.
I don't know if a lot of people already use the numaX option, but as we never 
exposed it in the GUI, I don't think it would break the setup of too many people.





- Mail original -
De: "Wolfgang Bumiller" 
À: "aderumier" 
Cc: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 13:59:42
Objet: Re: 3 numa topology issues

On Tue, Jul 26, 2016 at 01:35:50PM +0200, Alexandre DERUMIER wrote: 
> Hi Wolfgang, 
> 
> I just come back from holiday. 

Hope you had a good time :-) 

> 
> 
> 
> >>Issue #1: The above code currently does not honor our 'hostnodes' option 
> >>and breaks when trying to use them together. 
> 
> mmm indeed. I think this can be improved. I'll try to check that next week. 
> 
> 
> 
> >>Issue #2: We create one node per *virtual* socket, which means enabling 
> >>hugepages with more virtual sockets than physical numa nodes will die 
> >>with the error that the numa node doesn't exist. This should be fixable 
> >>as far as I can tell, as nothing really prevents us from putting them on 
> >>the same node? At least this used to work and I've already asked this 
> >>question at some point. You said the host kernel will try to map them, 
> >>yet it worked without issues before, so I'm still not sure about this. 
> >>Here's the conversation snippet: 
> 
> you can create more virtual numa node than physical, only if you don't define 
> "hostnodes" option. 
> 
> (from my point of view, it's totally useless, as the whole point of the numa 
> option is to map virtual nodes to physical nodes, to avoid memory access 
> bottlenecks) 

Useless, yes, which is why I'm wondering whether this should be 
supported/warned about/error... 

> 
> if hostnodes is defined, you need to have physical numa node available (vm 
> with 2 numa node need host with 2 numa node) 
> 
> With hugepage enabled, I have added a restriction to have hostnode defined, 
> because you want to be sure that memory is on same node. 
> 
> 
> # hostnodes 
> my $hostnodelists = $numa->{hostnodes}; 
> if (defined($hostnodelists)) { 
> my $hostnodes; 
> foreach my $hostnoderange (@$hostnodelists) { 
> my ($start, $end) = @$hostnoderange; 
> $hostnodes .= ',' if $hostnodes; 
> $hostnodes .= $start; 
> $hostnodes .= "-$end" if defined($end); 
> $end //= $start; 
> for (my $i = $start; $i <= $end; ++$i ) { 
> die "host NUMA node$i doesn't exist\n" if ! -d 
> "/sys/devices/system/node/node$i/"; 
> } 
> } 
> 
> # policy 
> my $policy = $numa->{policy}; 
> die "you need to define a policy for hostnode $hostnodes\n" if !$policy; 
> $mem_object .= ",host-nodes=$hostnodes,policy=$policy"; 
> } else { 
> die "numa hostnodes need to be defined to use hugepages" if 
> $conf->{hugepages}; 
> } 
> 
> 
> >>Issue #3: Actually just an extension to #2: we currently cannot enable 
> >>NUMA at all (even without hugepages) when there are more virtual sockets 
> >>than physical numa nodes, and this used to work. The big question is 
> >>now: does this even make sense? Or should we tell users not to do this? 
> 
> That's strange, it should work if you don't defined hugepages and hostnodes 
> option(in numaX) 

Actually this one was my own faulty configuration, sorry. 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore v2

2016-07-26 Thread Eneko Lacunza

Hi Thomas,

El 26/07/16 a las 13:55, Thomas Lamprecht escribió:
Hi, first thanks for the contribution! Not commenting on the code 
itself but we need a CLA for being able to add your contributions,

we use the Harmony CLA, a community-centered CLA for FOSS projects,
see 
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Thanks, will send this shortly.


Also I would suggest using git format-patch and git send-email for 
sending the patch to the mailing list, so it is ensured that it can be 
applied here without problems.
A small example of the necessary commands can be found at 
https://pve.proxmox.com/wiki/Developer_Documentation#Git_commands_summary


I saw that, but in this case the modified vma.c file is in a tarball in 
the git repository, how to proceed in this case? The output sent is 
created by quilt diff command. :)


Thanks
Eneko

--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH manager] fix influxdb field assignment and allow non integer field

2016-07-26 Thread Dominik Csapak




could you split this into two (or more) parts? mixing cosmetic changes
like variable renaming and style fixes with actual changes makes it hard
to read (and is also bad for git blameability ;)) if the two issues are
easily split into their own commits that might make sense as well



Mhmm, well, in this case this is not so easy, because I rewrote most of 
the sub.


So if I want to split it into two patches, I either have to fix code which 
then gets shifted and unindented (making the second patch no smaller), 
or I have to reintroduce the bug in the rewrite and then fix it...


Is this really desirable?

Sadly, even with --patience, git is not smart enough to see that the
function is almost completely different.

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] 3 numa topology issues

2016-07-26 Thread Wolfgang Bumiller
On Tue, Jul 26, 2016 at 01:35:50PM +0200, Alexandre DERUMIER wrote:
> Hi Wolfgang,
> 
> I just come back from holiday.

Hope you had a good time :-)

> 
> 
> 
> >>Issue #1: The above code currently does not honor our 'hostnodes' option 
> >>and breaks when trying to use them together. 
> 
> mmm indeed. I think this can be improved. I'll try to check that next week.
> 
> 
> 
> >>Issue #2: We create one node per *virtual* socket, which means enabling 
> >>hugepages with more virtual sockets than physical numa nodes will die 
> >>with the error that the numa node doesn't exist. This should be fixable 
> >>as far as I can tell, as nothing really prevents us from putting them on 
> >>the same node? At least this used to work and I've already asked this 
> >>question at some point. You said the host kernel will try to map them, 
> >>yet it worked without issues before, so I'm still not sure about this. 
> >>Here's the conversation snippet: 
> 
> you can create more virtual numa node than physical, only if you don't define 
> "hostnodes" option.
> 
> (from my point of view, it's totally useless, as the whole point of the numa 
> option is to map virtual nodes to physical nodes, to avoid memory access 
> bottlenecks)

Useless, yes, which is why I'm wondering whether this should be
supported/warned about/error...

> 
> if hostnodes is defined, you need to have physical numa node available (vm 
> with 2 numa node need host with 2 numa node)
> 
> With hugepage enabled, I have added a restriction to have hostnode defined, 
> because you want to be sure that memory is on same node.
> 
> 
> # hostnodes
> my $hostnodelists = $numa->{hostnodes};
> if (defined($hostnodelists)) {
> my $hostnodes;
> foreach my $hostnoderange (@$hostnodelists) {
> my ($start, $end) = @$hostnoderange;
> $hostnodes .= ',' if $hostnodes;
> $hostnodes .= $start;
> $hostnodes .= "-$end" if defined($end);
> $end //= $start;
> for (my $i = $start; $i <= $end; ++$i ) {
> die "host NUMA node$i doesn't exist\n" if ! -d 
> "/sys/devices/system/node/node$i/";
> }
> }
> 
> # policy
> my $policy = $numa->{policy};
> die "you need to define a policy for hostnode $hostnodes\n" 
> if !$policy;
> $mem_object .= ",host-nodes=$hostnodes,policy=$policy";
> } else {
> die "numa hostnodes need to be defined to use hugepages" if 
> $conf->{hugepages};
> }
> 
> 
> >>Issue #3: Actually just an extension to #2: we currently cannot enable 
> >>NUMA at all (even without hugepages) when there are more virtual sockets 
> >>than physical numa nodes, and this used to work. The big question is 
> >>now: does this even make sense? Or should we tell users not to do this? 
> 
> That's strange, it should work if you don't defined hugepages and hostnodes 
> option(in numaX)

Actually this one was my own faulty configuration, sorry.

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH manager] fix influxdb field assignment and allow non integer field

2016-07-26 Thread Fabian Grünbichler
On Tue, Jul 26, 2016 at 11:53:29AM +0200, Dominik Csapak wrote:
> this patch fixes an issue where we assemble the influxdb
> key value pairs to the wrong measurement
> 
> and also we did only allow integer fields,
> excluding all cpu,load and wait measurements
> 
> this patch fixes both issues with a rewrite of the
> recursive build_influxdb_payload sub
> 
> Signed-off-by: Dominik Csapak 
> ---

Could you split this into two (or more) parts? Mixing cosmetic changes
like variable renaming and style fixes with actual changes makes it hard
to read (and is also bad for git blameability ;)). If the two issues are
easily split into their own commits, that might make sense as well.

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore v2

2016-07-26 Thread Thomas Lamprecht
Hi, first, thanks for the contribution! I'm not commenting on the code itself, 
but we need a CLA to be able to add your contributions:

we use the Harmony CLA, a community-centered CLA for FOSS projects,
see 
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright


Also I would suggest using git format-patch and git send-email for 
sending the patch to the mailing list, so it is ensured that it can be 
applied here without problems.
A small example of the necessary commands can be found at 
https://pve.proxmox.com/wiki/Developer_Documentation#Git_commands_summary


Cheers,
Thomas


On 07/26/2016 01:29 PM, Eneko Lacunza wrote:

Hi,

I just tested that this patch works as well as the previous one. Instead 
of setting rbd_cache_writethrough_until_flush=false in devfn, it issues a 
bogus flush so that Ceph activates the rbd cache.


---
Index: b/vma.c
===
--- a/vma.c
+++ b/vma.c
@@ -335,6 +335,9 @@ static int extract_content(int argc, cha

BlockDriverState *bs = blk_bs(blk);

+/* This is needed to activate rbd cache 
(writeback/coalesce) */

+bdrv_flush(bs);
+
 if (vma_reader_register_bs(vmar, i, bs, write_zero, 
) < 0) {

 g_error("%s", error_get_pretty(errp));
 }





___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>This is how it works right now ;) - not flushing doesn't mean system 
>>won't write data; it can just do so when it thinks is a good time.

I think this is true with filesystems (the fs will try to flush at regular intervals), 
but I'm not sure when you write to a block device without doing any flush.

I remember in the past, using cache=unsafe with qemu and a scsi drive, I was able 
to write gigabytes of data into host memory
without any flush occurring.



- Mail original -
De: "Eneko Lacunza" 
À: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 13:19:28
Objet: Re: [pve-devel] Speed up PVE Backup

Hi, 

El 26/07/16 a las 10:32, Alexandre DERUMIER escribió: 
>>> There is no reason to flush a restored disk until just the end, really. 
>>> Issuing flushes every x MB could hurt other storages without need. 
> I'm curious to see host memory usage of a big local file storage restore 
> (100GB), with writeback without any flush ? 
This is how it works right now ;) - not flushing doesn't mean system 
won't write data; it can just do so when it thinks is a good time. 

Cheers 


-- 
Zuzendari Teknikoa / Director Técnico 
Binovo IT Human Project, S.L. 
Telf. 943493611 
943324914 
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 
www.binovo.es 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] 3 numa topology issues

2016-07-26 Thread Alexandre DERUMIER
Hi Wolfgang,

I just came back from holiday.



>>Issue #1: The above code currently does not honor our 'hostnodes' option 
>>and breaks when trying to use them together. 

mmm indeed. I think this can be improved. I'll try to check that next week.



>>Issue #2: We create one node per *virtual* socket, which means enabling 
>>hugepages with more virtual sockets than physical numa nodes will die 
>>with the error that the numa node doesn't exist. This should be fixable 
>>as far as I can tell, as nothing really prevents us from putting them on 
>>the same node? At least this used to work and I've already asked this 
>>question at some point. You said the host kernel will try to map them, 
>>yet it worked without issues before, so I'm still not sure about this. 
>>Here's the conversation snippet: 

You can create more virtual numa nodes than physical ones, but only if you don't define 
the "hostnodes" option.

(from my point of view, it's totally useless, as the whole point of the numa option 
is to map virtual nodes to physical nodes, to avoid memory access bottlenecks)

If hostnodes is defined, you need to have the physical numa nodes available (a VM with 
2 numa nodes needs a host with 2 numa nodes).

With hugepages enabled, I have added a restriction that hostnodes must be defined, 
because you want to be sure that memory is on the same node.


    # hostnodes
    my $hostnodelists = $numa->{hostnodes};
    if (defined($hostnodelists)) {
        my $hostnodes;
        foreach my $hostnoderange (@$hostnodelists) {
            my ($start, $end) = @$hostnoderange;
            $hostnodes .= ',' if $hostnodes;
            $hostnodes .= $start;
            $hostnodes .= "-$end" if defined($end);
            $end //= $start;
            for (my $i = $start; $i <= $end; ++$i ) {
                die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
            }
        }

        # policy
        my $policy = $numa->{policy};
        die "you need to define a policy for hostnode $hostnodes\n" if !$policy;
        $mem_object .= ",host-nodes=$hostnodes,policy=$policy";
    } else {
        die "numa hostnodes need to be defined to use hugepages" if $conf->{hugepages};
    }


>>Issue #3: Actually just an extension to #2: we currently cannot enable 
>>NUMA at all (even without hugepages) when there are more virtual sockets 
>>than physical numa nodes, and this used to work. The big question is 
>>now: does this even make sense? Or should we tell users not to do this? 

That's strange, it should work if you don't define the hugepages and hostnodes 
options (in numaX).

(but without hostnodes, I don't see any reason to do this)


- Mail original -
De: "Wolfgang Bumiller" 
À: "aderumier" 
Cc: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 12:36:44
Objet: 3 numa topology issues

Currently we have the following code in hugepages_topology(): 

| for (my $i = 0; $i < $MAX_NUMA; $i++) { 
| next if !$conf->{"numa$i"}; 
| my $numa = PVE::QemuServer::parse_numa($conf->{"numa$i"}); 
(...) 
| $hugepages_topology->{$hugepages_size}->{$i} += hugepages_nr($numa_memory, 
$hugepages_size); 
| } 

The way $hugepages_topology is used this means that numa node 0 will 
always allocate from the host's numa node 0, 1 from 1 and so on: 

From hugepages_allocate(): 

| my $nodes = $hugepages_topology->{$size}; 
| 
| foreach my $numanode (keys %$nodes) { 
(...) 
| my $path = 
"/sys/devices/system/node/node${numanode}/hugepages/hugepages-${hugepages_size}kB/";
 
(...) 
| } 

Issue #1: The above code currently does not honor our 'hostnodes' option 
and breaks when trying to use them together. 

Issue #2: We create one node per *virtual* socket, which means enabling 
hugepages with more virtual sockets than physical numa nodes will die 
with the error that the numa node doesn't exist. This should be fixable 
as far as I can tell, as nothing really prevents us from putting them on 
the same node? At least this used to work and I've already asked this 
question at some point. You said the host kernel will try to map them, 
yet it worked without issues before, so I'm still not sure about this. 
Here's the conversation snippet: 

| >>When adding more numaX entries to the VM's config than the host has this 
| >>now produces an 'Use of uninitialized value' error. 
| >>Better check for whether /sys/devices/system/node/node$numanode exists 
| >>and throw a useful error. 
| >>But should this even be fixed to host nodes? Without hugepages I was 
| >>able to provide more smaller numa nodes to the guest (iow. split one big 
| >>host numa node into multiple smaller virtual ones), should this not work 
| >>with hugepages, too? 
| 
| I need to check that. But you shouldn't be able to create more numa nodes 
| number in guest than host nodes number.
| (Because linux host kernel will try to map guest numa node to host numa node)

[pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore v2

2016-07-26 Thread Eneko Lacunza

Hi,

I just tested this patch; it works as well as the previous one. Instead of 
setting rbd_cache_writethrough_until_flush=false in devfn, it issues a bogus 
flush so that Ceph activates the rbd cache.


---
Index: b/vma.c
===
--- a/vma.c
+++ b/vma.c
@@ -335,6 +335,9 @@ static int extract_content(int argc, cha

BlockDriverState *bs = blk_bs(blk);

+/* This is needed to activate rbd cache (writeback/coalesce) */
+bdrv_flush(bs);
+
 if (vma_reader_register_bs(vmar, i, bs, write_zero, ) 
< 0) {

 g_error("%s", error_get_pretty(errp));
 }


--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Eneko Lacunza

Hi,

El 26/07/16 a las 10:32, Alexandre DERUMIER escribió:

>>> There is no reason to flush a restored disk until just the end, really.
>>> Issuing flushes every x MB could hurt other storages without need.
>
> I'm curious to see host memory usage of a big local file storage restore 
> (100GB), with writeback without any flush ?
This is how it works right now ;) - not flushing doesn't mean the system won't 
write data; it can just do so when it thinks it is a good time.


Cheers


--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore RFC

2016-07-26 Thread Eneko Lacunza

El 26/07/16 a las 13:15, Dietmar Maurer escribió:

>> Index: b/vma.c
>> ===
>> --- a/vma.c
>> +++ b/vma.c
>> @@ -328,6 +328,12 @@ static int extract_content(int argc, cha
>>   }
>>
>> +/* Force rbd cache */
>> +if (0 == strncmp(devfn, "rbd:", strlen("rbd:"))) {
>> +char *devfn_new =
>> g_strdup_printf("%s:rbd_cache_writethrough_until_flush=false", devfn);
>
> Would it be enough to do a single
>
>    bdrv_flush(bs)
>
> after blk_new_open() ?

Yes.


--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] 3 numa topology issues

2016-07-26 Thread Wolfgang Bumiller
Currently we have the following code in hugepages_topology():

|for (my $i = 0; $i < $MAX_NUMA; $i++) {
|next if !$conf->{"numa$i"};
|my $numa = PVE::QemuServer::parse_numa($conf->{"numa$i"});
(...)
|$hugepages_topology->{$hugepages_size}->{$i} += 
hugepages_nr($numa_memory, $hugepages_size);
|}

The way $hugepages_topology is used this means that numa node 0 will
always allocate from the host's numa node 0, 1 from 1 and so on:

From hugepages_allocate():

|   my $nodes = $hugepages_topology->{$size};
|
|   foreach my $numanode (keys %$nodes) {
(...)
|   my $path = 
"/sys/devices/system/node/node${numanode}/hugepages/hugepages-${hugepages_size}kB/";
(...)
|   }
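
As an illustration, for 2 MB pages on host node 0 that template would resolve
to the following (node number and page size here are just an example):

/sys/devices/system/node/node0/hugepages/hugepages-2048kB/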

Issue #1: The above code currently does not honor our 'hostnodes' option
and breaks when trying to use them together. 

Issue #2: We create one node per *virtual* socket, which means enabling
hugepages with more virtual sockets than physical numa nodes will die
with the error that the numa node doesn't exist. This should be fixable
as far as I can tell, as nothing really prevents us from putting them on
the same node? At least this used to work and I've already asked this
question at some point. You said the host kernel will try to map them,
yet it worked without issues before, so I'm still not sure about this.
Here's the conversation snippet:

| >>When adding more numaX entries to the VM's config than the host has this
| >>now produces an 'Use of uninitialized value' error.
| >>Better check for whether /sys/devices/system/node/node$numanode exists
| >>and throw a useful error.
| >>But should this even be fixed to host nodes? Without hugepages I was
| >>able to provide more smaller numa nodes to the guest (iow. split one big
| >>host numa node into multiple smaller virtual ones), should this not work
| >>with hugepages, too?
| 
| I need to check that. But you shouldn't be able to create more numa nodes
| number in guest than host nodes number.
| (Because linux host kernel will try to map guest numa node to host numa node)

In the worst case we could create one big node for all cpus?

Issue #3: Actually just an extension to #2: we currently cannot enable
NUMA at all (even without hugepages) when there are more virtual sockets
than physical numa nodes, and this used to work. The big question is
now: does this even make sense? Or should we tell users not to do this?
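
For concreteness, the kind of VM config that runs into #2 (and, without the
hugepages line, into #3) on a host with only 2 NUMA nodes would look roughly
like this (values are illustrative, not from a real setup):

sockets: 4
cores: 2
memory: 8192
numa: 1
hugepages: 2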

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH pve-qemu-kvm] Force enable rbd cache for qmrestore RFC

2016-07-26 Thread Eneko Lacunza

Hi all,

This is my first code contribution for Proxmox. Please correct my 
wrongdoings with patch creation/code style/solution etc. :-)


This small patch adds a flag to devfn to force rbd cache (writeback cache) 
activation for qmrestore, to improve performance when restoring to RBD. This 
follows our findings from last week.


Tested with a 10GB disk restore with 44% sparse bytes; in our office running 
cluster I get about 13.5MB/s without the patch and 29MB/s with the patch this 
morning.


---

Index: b/vma.c
===
--- a/vma.c
+++ b/vma.c
@@ -328,6 +328,12 @@ static int extract_content(int argc, cha
}


+/* Force rbd cache */
+if (0 == strncmp(devfn, "rbd:", strlen("rbd:"))) {
+char *devfn_new = 
g_strdup_printf("%s:rbd_cache_writethrough_until_flush=false", devfn);

+g_free(devfn);
+devfn = devfn_new;
+}
if (errp || !(blk = blk_new_open(devfn, NULL, options, 
flags, ))) {

 g_error("can't open file %s - %s", devfn,
 error_get_pretty(errp));


--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH manager] fix influxdb field assignment and allow non integer field

2016-07-26 Thread Dominik Csapak
this patch fixes an issue where we assembled the influxdb
key value pairs under the wrong measurement

we also only allowed integer fields,
excluding all cpu, load and wait measurements

this patch fixes both issues with a rewrite of the
recursive build_influxdb_payload sub
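
for reference, what ends up in $payload->{string} is InfluxDB line protocol,
i.e. one "measurement,tags fields timestamp" record per line, roughly like the
following (measurement, tag and field names here are made-up examples):

system,object=nodes,host=node1 cpu=0.05,iowait=0.01 1469527200000000000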

Signed-off-by: Dominik Csapak 
---
 PVE/Status/InfluxDB.pm | 73 --
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/PVE/Status/InfluxDB.pm b/PVE/Status/InfluxDB.pm
index 8300147..8130795 100644
--- a/PVE/Status/InfluxDB.pm
+++ b/PVE/Status/InfluxDB.pm
@@ -100,42 +100,63 @@ sub write_influxdb_hash {
 }
 
 sub build_influxdb_payload {
-my ($payload, $d, $ctime, $tags, $measurement, $depth) = @_;
+my ($payload, $data, $ctime, $tags, $measurement, $instance) = @_;
 
-$depth = 0 if !$depth;
 my @values = ();
 
-for my $key (keys %$d) {
+foreach my $key (sort keys %$data) {
+   my $value = $data->{$key};
+   next if !defined($value);
 
-my $value = $d->{$key};
-my $oldtags = $tags;
-   
-if ( defined $value ) {
-if ( ref $value eq 'HASH' ) {
+   if (!ref($value) && $value ne '') {
+   # value is scalar
 
-   if($depth == 0) {
-   $measurement = $key;
-   }elsif($depth == 1){
-   $tags .= ",instance=$key";
-   }
+   # if value is not just a number we
+   # have to replace " with \"
+   # and surround it with "
+   if ($value =~ m/[^\d\.]/) {
+   $value =~ s/\"/\\\"/g;
+   $value = "\"$value\"";
+   }
+   push @values, "$key=$value";
+   } elsif (ref($value) eq 'HASH') {
+   # value is a hash
 
-   $depth++;
-build_influxdb_payload($payload, $value, $ctime, $tags, 
$measurement, $depth);
-   $depth--;
-
-}elsif ($value =~ m/^\d+$/) {
-
-   $measurement = "system" if !$measurement && $depth == 0;
-   push(@values, "$key=$value");
-}
-}
-$tags = $oldtags;
+   if (!defined($measurement)) {
+   build_influxdb_payload($payload, $value, $ctime, $tags, $key);
+   } elsif(!defined($instance)) {
+   build_influxdb_payload($payload, $value, $ctime, $tags, 
$measurement, $key);
+   } else {
+   push @values, get_recursive_values($value);
+   }
+   }
 }
 
-if(@values > 0) {
+if (@values > 0) {
+   my $mm = $measurement // 'system';
+   my $tagstring = $tags;
+   $tagstring .= ",instance=$instance" if defined($instance);
my $valuestr =  join(',', @values);
-   $payload->{string} .= $measurement.",$tags $valuestr $ctime\n";
+   $payload->{string} .= "$mm,$tagstring $valuestr $ctime\n";
 }
 }
 
+sub get_recursive_values {
+my ($hash) = @_;
+
+my @values = ();
+
+foreach my $key (keys %$hash) {
+   my $value = $hash->{$key};
+   if(ref($value) eq 'HASH') {
+   push(@values, get_recursive_values($value));
+   } elsif (!ref($value) && $value ne '') {
+   $value = prepare_value($value);
+   push @values, "$key=$value";
+   }
+}
+
+return @values;
+}
+
 1;
-- 
2.1.4


___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH kvm 1/2] fix various CVEs

2016-07-26 Thread Thomas Lamprecht
For upstream commits 926cde5f3e4d2504ed161ed0 and
cc96677469388bad3d664793 no CVE number has been assigned yet.

Signed-off-by: Thomas Lamprecht 
---
Re-added CVE-2016-2391 and CVE-2016-5126

patch 0001-vga-add-sr_vbe-register-set.patch is only moved in the series file
to match the commit order of the git branches


 ...ke-cmdbuf-big-enough-for-maximum-CDB-size.patch | 88 ++
 .../extra/0002-scsi-esp-fix-migration.patch| 58 ++
 ...6-2391-usb-ohci-avoid-multiple-eof-timers.patch | 43 +++
 debian/patches/series  |  6 +-
 4 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 
debian/patches/extra/0001-scsi-esp-make-cmdbuf-big-enough-for-maximum-CDB-size.patch
 create mode 100644 debian/patches/extra/0002-scsi-esp-fix-migration.patch
 create mode 100644 
debian/patches/extra/CVE-2016-2391-usb-ohci-avoid-multiple-eof-timers.patch

diff --git 
a/debian/patches/extra/0001-scsi-esp-make-cmdbuf-big-enough-for-maximum-CDB-size.patch
 
b/debian/patches/extra/0001-scsi-esp-make-cmdbuf-big-enough-for-maximum-CDB-size.patch
new file mode 100644
index 000..5beeb50
--- /dev/null
+++ 
b/debian/patches/extra/0001-scsi-esp-make-cmdbuf-big-enough-for-maximum-CDB-size.patch
@@ -0,0 +1,88 @@
+From 0988f56451a246d5b72484e0c6dd37fe1bd69d12 Mon Sep 17 00:00:00 2001
+From: Prasad J Pandit 
+Date: Thu, 16 Jun 2016 00:22:35 +0200
+Subject: [PATCH 1/2] scsi: esp: make cmdbuf big enough for maximum CDB size
+
+While doing DMA read into ESP command buffer 's->cmdbuf', it could
+write past the 's->cmdbuf' area, if it was transferring more than 16
+bytes.  Increase the command buffer size to 32, which is maximum when
+'s->do_cmd' is set, and add a check on 'len' to avoid OOB access.
+
+Reported-by: Li Qiang 
+Signed-off-by: Prasad J Pandit 
+Signed-off-by: Paolo Bonzini 
+
+Conflicts:
+   hw/scsi/esp.c
+commit ff589551c8e8e9e95e211b9d8daafb4ed39f1aec
+scsi: esp: check TI buffer index before read/write
+
+added additional control variables to ESPState as ti_size
+wasn't enough, we thus ran in a conflict here, use only
+ti_size for now as conflict resolution.
+
+Signed-off-by: Thomas Lamprecht 
+---
+ hw/scsi/esp.c | 10 --
+ include/hw/scsi/esp.h |  3 ++-
+ 2 files changed, 10 insertions(+), 3 deletions(-)
+
+diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
+index 8961be2..e533522 100644
+--- a/hw/scsi/esp.c
 b/hw/scsi/esp.c
+@@ -243,6 +243,8 @@ static void esp_do_dma(ESPState *s)
+ len = s->dma_left;
+ if (s->do_cmd) {
+ trace_esp_do_dma(s->cmdlen, len);
++assert (s->cmdlen <= sizeof(s->cmdbuf) &&
++len <= sizeof(s->cmdbuf) - s->cmdlen);
+ s->dma_memory_read(s->dma_opaque, >cmdbuf[s->cmdlen], len);
+ s->ti_size = 0;
+ s->cmdlen = 0;
+@@ -342,7 +344,7 @@ static void handle_ti(ESPState *s)
+ s->dma_counter = dmalen;
+ 
+ if (s->do_cmd)
+-minlen = (dmalen < 32) ? dmalen : 32;
++minlen = (dmalen < ESP_CMDBUF_SZ) ? dmalen : ESP_CMDBUF_SZ;
+ else if (s->ti_size < 0)
+ minlen = (dmalen < -s->ti_size) ? dmalen : -s->ti_size;
+ else
+@@ -448,7 +450,11 @@ void esp_reg_write(ESPState *s, uint32_t saddr, uint64_t 
val)
+ break;
+ case ESP_FIFO:
+ if (s->do_cmd) {
+-s->cmdbuf[s->cmdlen++] = val & 0xff;
++if (s->cmdlen < ESP_CMDBUF_SZ) {
++s->cmdbuf[s->cmdlen++] = val & 0xff;
++} else {
++trace_esp_error_fifo_overrun();
++}
+ } else if (s->ti_size == TI_BUFSZ - 1) {
+ trace_esp_error_fifo_overrun();
+ } else {
+diff --git a/include/hw/scsi/esp.h b/include/hw/scsi/esp.h
+index 6c79527..d2c4886 100644
+--- a/include/hw/scsi/esp.h
 b/include/hw/scsi/esp.h
+@@ -14,6 +14,7 @@ void esp_init(hwaddr espaddr, int it_shift,
+ 
+ #define ESP_REGS 16
+ #define TI_BUFSZ 16
++#define ESP_CMDBUF_SZ 32
+ 
+ typedef struct ESPState ESPState;
+ 
+@@ -31,7 +32,7 @@ struct ESPState {
+ SCSIBus bus;
+ SCSIDevice *current_dev;
+ SCSIRequest *current_req;
+-uint8_t cmdbuf[TI_BUFSZ];
++uint8_t cmdbuf[ESP_CMDBUF_SZ];
+ uint32_t cmdlen;
+ uint32_t do_cmd;
+ 
+-- 
+2.1.4
+
diff --git a/debian/patches/extra/0002-scsi-esp-fix-migration.patch 
b/debian/patches/extra/0002-scsi-esp-fix-migration.patch
new file mode 100644
index 000..0ddaed0
--- /dev/null
+++ b/debian/patches/extra/0002-scsi-esp-fix-migration.patch
@@ -0,0 +1,58 @@
+From 10cf6bf50d000a1b0dad1d5f2b931d1d1b1ff7f3 Mon Sep 17 00:00:00 2001
+From: Paolo Bonzini 
+Date: Mon, 20 Jun 2016 16:32:39 +0200
+Subject: [PATCH 2/2] scsi: esp: fix migration
+
+Commit 926cde5 ("scsi: esp: make cmdbuf big enough for maximum CDB size",
+2016-06-16) changed the size of a migrated field.  Split it in two
+parts, 

[pve-devel] [PATCH kvm 2/2] disable libnfs and fdt when configuring the kvm build

2016-07-26 Thread Thomas Lamprecht
Otherwise they will be included if a build machine has the respective
packages installed.

Signed-off-by: Thomas Lamprecht 
---
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/rules b/debian/rules
index 66038de..7b9b732 100755
--- a/debian/rules
+++ b/debian/rules
@@ -33,7 +33,7 @@ endif
 config.status: configure
dh_testdir
# Add here commands to configure the package.
-   ./configure --with-confsuffix="/kvm" --target-list=x86_64-softmmu 
--prefix=/usr --datadir=/usr/share --docdir=/usr/share/doc/pve-qemu-kvm 
--sysconfdir=/etc --localstatedir=/var --disable-xen --enable-gnutls 
--enable-sdl --enable-uuid --enable-linux-aio --enable-rbd --enable-libiscsi 
--disable-smartcard --audio-drv-list="alsa" --enable-spice --enable-usb-redir 
--enable-glusterfs --enable-libusb --disable-gtk --enable-xfsctl --enable-numa 
--disable-strip --enable-jemalloc
+   ./configure --with-confsuffix="/kvm" --target-list=x86_64-softmmu 
--prefix=/usr --datadir=/usr/share --docdir=/usr/share/doc/pve-qemu-kvm 
--sysconfdir=/etc --localstatedir=/var --disable-xen --enable-gnutls 
--enable-sdl --enable-uuid --enable-linux-aio --enable-rbd --enable-libiscsi 
--disable-smartcard --audio-drv-list="alsa" --enable-spice --enable-usb-redir 
--enable-glusterfs --enable-libusb --disable-gtk --enable-xfsctl --enable-numa 
--disable-strip --enable-jemalloc --disable-libnfs --disable-fdt
 
 build: patch build-stamp
 
-- 
2.1.4


___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH qemu-server] add lock check for move_disk API call

2016-07-26 Thread Wolfgang Bumiller
applied

On Fri, Jul 22, 2016 at 07:53:53AM +0200, Fabian Grünbichler wrote:
> this API call changes the config quite drastically, and as
> such should not be possible while an operation that holds a
> lock is ongoing (e.g., migration, backup, snapshot).
> ---
>  PVE/API2/Qemu.pm | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index df0518d..f4304b8 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -2506,6 +2506,8 @@ __PACKAGE__->register_method({
>  
>   my $conf = PVE::QemuConfig->load_config($vmid);
>  
> + PVE::QemuConfig->check_lock($conf);
> +
>   die "checksum missmatch (file change by other user?)\n"
>   if $digest && $digest ne $conf->{digest};
>  
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH qemu-server 1/2] deactivate new volumes after clone to other node

2016-07-26 Thread Wolfgang Bumiller
applied both patches

On Wed, Jul 13, 2016 at 12:44:12PM +0200, Fabian Grünbichler wrote:
> this might otherwise lead to volumes activated on the
> source and target node, which is problematic for at least
> LVM and Ceph.
> ---
>  PVE/API2/Qemu.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 7337887..83611a6 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -2391,6 +2391,7 @@ __PACKAGE__->register_method({
>  if ($target) {
>   # always deactivate volumes - avoid lvm LVs to be 
> active on several nodes
>   PVE::Storage::deactivate_volumes($storecfg, $vollist, 
> $snapname) if !$running;
> + PVE::Storage::deactivate_volumes($storecfg, 
> $newvollist);
>  
>   my $newconffile = PVE::QemuConfig->config_file($newid, 
> $target);
>   die "Failed to move config to node '$target' - rename 
> failed: $!\n"
> -- 
> 2.1.4
> 
> 
> ___
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH kernel] update to Ubuntu 4.4.0-33.52

2016-07-26 Thread Wolfgang Bumiller
applied and amended the `make download`ed source archive and bump
message

On Mon, Jul 25, 2016 at 10:42:36AM +0200, Fabian Grünbichler wrote:
> ---
> Note: requires "make download" when applying
> 
>  ...470-KEYS-potential-uninitialized-variable.patch |  94 
>  ...synchronization-between-chunk-map_extend_.patch | 162 
> -
>  ...synchronization-between-synchronous-map-e.patch | 113 --
>  Makefile   |  11 +-
>  changelog.Debian   |   8 +
>  5 files changed, 12 insertions(+), 376 deletions(-)
>  delete mode 100644 CVE-2016-4470-KEYS-potential-uninitialized-variable.patch
>  delete mode 100644 
> CVE-2016-4794-1-percpu-fix-synchronization-between-chunk-map_extend_.patch
>  delete mode 100644 
> CVE-2016-4794-2-percpu-fix-synchronization-between-synchronous-map-e.patch
> 
> diff --git a/CVE-2016-4470-KEYS-potential-uninitialized-variable.patch 
> b/CVE-2016-4470-KEYS-potential-uninitialized-variable.patch
> deleted file mode 100644
> index 052436d..000
> --- a/CVE-2016-4470-KEYS-potential-uninitialized-variable.patch
> +++ /dev/null
> @@ -1,94 +0,0 @@
> -From edd3cde476d196ebdc771a8fa789d2f4de52ae72 Mon Sep 17 00:00:00 2001
> -From: Dan Carpenter 
> -Date: Wed, 13 Jul 2016 11:43:47 +0100
> -Subject: [PATCH] KEYS: potential uninitialized variable
> -
> -If __key_link_begin() failed then "edit" would be uninitialized.  I've
> -added a check to fix that.
> -
> -This allows a random user to crash the kernel, though it's quite
> -difficult to achieve.  There are three ways it can be done as the user
> -would have to cause an error to occur in __key_link():
> -
> - (1) Cause the kernel to run out of memory.  In practice, this is difficult
> - to achieve without ENOMEM cropping up elsewhere and aborting the
> - attempt.
> -
> - (2) Revoke the destination keyring between the keyring ID being looked up
> - and it being tested for revocation.  In practice, this is difficult to
> - time correctly because the KEYCTL_REJECT function can only be used
> - from the request-key upcall process.  Further, users can only make use
> - of what's in /sbin/request-key.conf, though this does including a
> - rejection debugging test - which means that the destination keyring
> - has to be the caller's session keyring in practice.
> -
> - (3) Have just enough key quota available to create a key, a new session
> - keyring for the upcall and a link in the session keyring, but not then
> - sufficient quota to create a link in the nominated destination keyring
> - so that it fails with EDQUOT.
> -
> -The bug can be triggered using option (3) above using something like the
> -following:
> -
> - echo 80 >/proc/sys/kernel/keys/root_maxbytes
> - keyctl request2 user debug:fred negate @t
> -
> -The above sets the quota to something much lower (80) to make the bug
> -easier to trigger, but this is dependent on the system.  Note also that
> -the name of the keyring created contains a random number that may be
> -between 1 and 10 characters in size, so may throw the test off by
> -changing the amount of quota used.
> -
> -Assuming the failure occurs, something like the following will be seen:
> -
> - kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
> - [ cut here ]
> - kernel BUG at ../mm/slab.c:2821!
> - ...
> - RIP: 0010:[] kfree_debugcheck+0x20/0x25
> - RSP: 0018:8804014a7de8  EFLAGS: 00010092
> - RAX: 0034 RBX: 6b6b6b6b6b6b6b68 RCX: 
> - RDX: 00040001 RSI: 00f6 RDI: 0300
> - RBP: 8804014a7df0 R08: 0001 R09: 
> - R10: 8804014a7e68 R11: 0054 R12: 0202
> - R13: 81318a66 R14:  R15: 0001
> - ...
> - Call Trace:
> -   kfree+0xde/0x1bc
> -   assoc_array_cancel_edit+0x1f/0x36
> -   __key_link_end+0x55/0x63
> -   key_reject_and_link+0x124/0x155
> -   keyctl_reject_key+0xb6/0xe0
> -   keyctl_negate_key+0x10/0x12
> -   SyS_keyctl+0x9f/0xe7
> -   do_syscall_64+0x63/0x13a
> -   entry_SYSCALL64_slow_path+0x25/0x25
> -
> -Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()')
> -Signed-off-by: Dan Carpenter 
> -Signed-off-by: David Howells 
> -cc: sta...@vger.kernel.org
> -Signed-off-by: Linus Torvalds 
> -(cherry picked from commit 38327424b40bcebe2de92d07312c89360ac9229a)
> -CVE-2016-4470
> -Signed-off-by: Luis Henriques 
> 
> - security/keys/key.c | 2 +-
> - 1 file changed, 1 insertion(+), 1 deletion(-)
> -
> -diff --git a/security/keys/key.c b/security/keys/key.c
> -index 2779d13..1d2d3a9 100644
>  a/security/keys/key.c
> -+++ b/security/keys/key.c
> -@@ -580,7 

Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>There is no reason to flush a restored disk until just the end, really. 
>>Issuing flushes every x MB could hurt other storages without need. 

I'm curious to see host memory usage of a big local file storage restore 
(100GB), with writeback without any flush ?


- Mail original -
De: "Eneko Lacunza" 
À: "pve-devel" 
Envoyé: Mardi 26 Juillet 2016 10:13:59
Objet: Re: [pve-devel] Speed up PVE Backup

Hi, 

El 26/07/16 a las 10:04, Alexandre DERUMIER escribió: 
>>> I think qmrestore isn't issuing any flush request (until maybe the end), 
> Need to be checked! (but if I think we open restore block storage with 
> writeback, so I hope we send flush) 
> 
>>> so for ceph storage backend we should set 
>>> rbd_cache_writethrough_until_flush=false for better performance. 
> I think it's possible to pass theses flag in qemu block driver option, when 
> opening the rbd storage 
> 
> 
> http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/ 
> 
> qemu-img {command} [options] 
> rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]
>  
> 
> 
> for qemu-img or with qemu drive option, I think it's possible to send as 
> option, ":rbd_cache_writethrough_until_flush=false" 
I developed a small patch to do this, waiting to test it in our setup 
(today or tomorrow) 
> But if missing flush if really the problem, it should be added to restore 
> command directly. (maybe 1 flush each 4MB for example) 
This flush is needed only for Ceph RBD, so I think using the flag above 
would be more correct. 

There is no reason to flush a restored disk until just the end, really. 
Issuing flushes every x MB could hurt other storages without need. 

In fact all this is because Ceph trying to "fix" broken virtio drivers... :) 

Thanks 
Eneko 

-- 
Zuzendari Teknikoa / Director Técnico 
Binovo IT Human Project, S.L. 
Telf. 943493611 
943324914 
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 
www.binovo.es 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Eneko Lacunza

Hi,

El 26/07/16 a las 10:04, Alexandre DERUMIER escribió:

>>> I think qmrestore isn't issuing any flush request (until maybe the end),
> Need to be checked! (but if I think we open restore block storage with 
> writeback, so I hope we send flush)

>>> so for ceph storage backend we should set
>>> rbd_cache_writethrough_until_flush=false for better performance.
> I think it's possible to pass theses flag in qemu block driver option, when 
> opening the rbd storage
>
> http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/
>
> qemu-img {command} [options] 
> rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]
>
> for qemu-img or with qemu drive option, I think it's possible to send as option, 
> ":rbd_cache_writethrough_until_flush=false"
I developed a small patch to do this, waiting to test it in our setup 
(today or tomorrow)

> But if missing flush if really the problem, it should be added to restore 
> command directly. (maybe 1 flush each 4MB for example)
This flush is needed only for Ceph RBD, so I think using the flag above 
would be more correct.


There is no reason to flush a restored disk until just the end, really. 
Issuing flushes every x MB could hurt other storages without need.


In fact all this is because Ceph trying to "fix" broken virtio drivers... :)

Thanks
Eneko

--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
  943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>I think qmrestore isn't issuing any flush request (until maybe the end), 
Need to be checked! (but I think we open the restore block storage with 
writeback, so I hope we send a flush)

>>so for ceph storage backend we should set 
>>rbd_cache_writethrough_until_flush=false for better performance. 

I think it's possible to pass these flags in the qemu block driver options when 
opening the rbd storage


http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/

qemu-img {command} [options] 
rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]


for qemu-img, or with the qemu drive option, I think it's possible to pass 
":rbd_cache_writethrough_until_flush=false" as an option


But if the missing flush is really the problem, it should be added to the 
restore command directly (maybe 1 flush every 4MB, for example)



- Mail original -
De: "Eneko Lacunza" 
À: "dietmar" , "pve-devel" 
Envoyé: Jeudi 21 Juillet 2016 13:19:10
Objet: Re: [pve-devel] Speed up PVE Backup

Hi, 

El 21/07/16 a las 09:34, Dietmar Maurer escribió: 
> 
>>> But you can try to assemble larger blocks, and write them once you get 
>>> an out of order block... 
>> Yes, this is the plan. 
>>> I always thought the ceph libraries does (or should do) that anyways? 
>>> (write combining) 
>> Reading the docs: 
>> http://docs.ceph.com/docs/hammer/rbd/rbd-config-ref/ 
>> 
>> It should be true when write-back rbd cache is activated. This seems to 
>> be the default, but maybe we're using disk cache setting on restore too? 
>> 
>> I'll try to change the disk cache setting and will report the results. 
> thanks! 
> 
Looking at more docs: 
http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/ 

This says: 
" 
QEMU’s cache settings override Ceph’s default settings (i.e., settings 
that are not explicitly set in the Ceph configuration file). If you 
explicitly set RBD Cache 
 settings in your 
Ceph configuration file, your Ceph settings override the QEMU cache 
settings. If you set cache settings on the QEMU command line, the QEMU 
command line settings override the Ceph configuration file settings. 
" 
I have been doing tests all morning with a different backup (only one 
10GB disk) so that I could perform tests faster. 

I thought maybe we were restoring without writeback cache (rbd cache), 
but have tried the following ceph.conf tweaks and conclude that rbd 
cache is enabled: 

1. If I set rbd cache = true I get the same performance. 
2. If I set rbd cache writethrough until flush = false (rbd cache = true 
not necessary), I get x2-x3 the restore performance. The 
writethrough-until-flush default is a safety measure for non-flushing virtio 
drivers (no writeback until a flush is detected), but disabling it is safe 
for a restore. 

I think qmrestore isn't issuing any flush request (until maybe the end), 
so for ceph storage backend we should set 
rbd_cache_writethrough_until_flush=false for better performance. 
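
A minimal sketch of the ceph.conf tweak I'm talking about, on the client side 
(the exact section/placement may vary with your setup): 

[client] 
    rbd cache = true 
    rbd cache writethrough until flush = false 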

Restore is happening at about 30-45MB/s vs 15MB/s before, but all this 
may be affected by a slow OSD, so I don't think my absolute figures are 
good, only the fact that there is a noticeable improvement. (we'll have 
this fixed next week). 

If someone can test and confirm this, it should be quite easy to patch 
qmrestore... 

Thanks 

Eneko 

-- 
Zuzendari Teknikoa / Director Técnico 
Binovo IT Human Project, S.L. 
Telf. 943493611 
943324914 
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 
www.binovo.es 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Speed up PVE Backup

2016-07-26 Thread Alexandre DERUMIER
>>But you can try to assemble larger blocks, and write them once you get 
>>an out of order block... 

>>I always thought the ceph libraries does (or should do) that anyways? 
>>(write combining) 

librbd is doing this if writeback is enabled (it merges/coalesces blocks).
But I'm not sure (I don't remember exactly, it needs to be verified) that it works
fine with the current backup restore or offline disk cloning.
(maybe there are some fsyncs after each 64k block)


- Mail original -
De: "dietmar" 
À: "pve-devel" , "Eneko Lacunza" 
Envoyé: Mercredi 20 Juillet 2016 17:46:12
Objet: Re: [pve-devel] Speed up PVE Backup

> This is called from restore_extents, where a comment precisely says "try 
> to write whole clusters to speedup restore", so this means we're writing 
> 64KB-8Byte chunks, which is giving a hard time to Ceph-RBD because this 
> means lots of ~64KB IOPS. 
> 
> So, I suggest the following solution to your consideration: 
> - Create a write buffer on startup (let's asume it's 4MB for example, a 
> number ceph rbd would like much more than 64KB). This could even be 
> configurable and skip the buffer altogether if buffer_size=cluster_size 
> - Wrap current "restore_write_data" with a 
> "restore_write_data_with_buffer", that does a copy to the 4MB buffer, 
> and only calls "restore_write_data" when it's full. 
> * Create a new "flush_restore_write_data_buffer" to flush the write 
> buffer when device restore reading is complete. 
> 
> Do you think this is a good idea? If so I will find time to implement 
> and test this to check whether restore time improves. 

We store those 64KB blocks out of order, so your suggestion will not work 
in general. 

But you can try to assemble larger blocks, and write them once you get 
an out of order block... 

I always thought the ceph libraries does (or should do) that anyways? 
(write combining) 

___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel