Re: [CentOS-virt] lvm cache + qemu-kvm stops working after about 20GB of writes

2017-05-26 Thread Richard Landsman - Rimote

Hello everyone,

I've tried just about everything over the last few weeks, and my findings are:

- The problem with LVM cache does NOT seem to be caused by KVM/qemu, but it 
/seems/ to be more noticeable inside a KVM guest. So the slowdown of the 
cache also happens on the HW node, but you have to push it seriously before 
you notice it.


- Not a single time did I succeed in creating a well-working cache using 
LVM2 cache. In artificial / KVM setups and with *small* devices it works 
(dm-testsuite etc.). But in a real-life scenario with *fully populated PVs* 
on 2 TB HDDs and 250 GB SSDs (both RAID 1), the cache stopped working after 
20 - 50 GB of writes, although the cache is 150+ GB large. Please use the 
fio examples below and always use new filenames so the same blocks are not 
overwritten.
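
(The original fio job files are not included in this excerpt; the following 
is only a sketch of the kind of run meant, with the target directory, file 
size and runtime as assumptions. The point is that every run writes a brand 
new file, so the cache cannot simply serve blocks it already holds.)

# 4k random-write test against a filesystem on the cached LV (path assumed);
# use a NEW filename every run so fio does not rewrite blocks that are
# already promoted to the cache.
fio --name=randwrite-test \
    --directory=/mnt/cachedlv --filename=testfile-$(date +%s) \
    --rw=randwrite --bs=4k --size=4G \
    --ioengine=libaio --direct=1 --iodepth=32 \
    --runtime=120 --time_based

Watching iostat -x -m 2 on the host during such a run (as shown further 
down) makes it obvious whether the writes land on the SSDs or fall through 
to the HDDs.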


The poor performance persisted most of the time, even when all blocks had 
been flushed. Very unpredictable cache performance / behavior.


I finally decided to go for dm-writeboost instead of lvm2 cache 
(dm-cache). This was the only way to create a well-working cache, and it 
keeps working until it is about 95% full. But of course it would be nicer 
to have something generally more stable like LVM2.
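
(Not part of the original post: a minimal sketch of what a dm-writeboost 
setup looks like, assuming /dev/md0 is the HDD RAID 1 and /dev/md1 the SSD 
RAID 1. The table syntax has changed between dm-writeboost releases, so 
check the project README for the version you build.)

# dm-writeboost is an out-of-tree device-mapper target; build and load it first.
modprobe dm-writeboost

# Size of the backing (HDD) device in 512-byte sectors.
SZ=$(blockdev --getsz /dev/md0)

# Create the cached device; recent releases use the simple
# "0 <size> writeboost <backing dev> <cache dev>" table format
# (older releases take extra arguments -- verify against your version).
dmsetup create wbdev --table "0 $SZ writeboost /dev/md0 /dev/md1"

# Put the filesystem / VM images on /dev/mapper/wbdev.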


I guess that, as far as this mailing list is concerned, this issue is 
resolved, because it does not seem to belong here.


--
Kind regards,

Richard Landsman
http://rimote.nl

T: +31 (0)50 - 763 04 07
(Mon-Fri 9:00 to 18:00)

24/7 in case of outages:
+31 (0)6 - 4388 7949
@RimoteSaS (Twitter service notices/security updates)

On 04/20/2017 04:23 PM, Sandro Bonazzola wrote:



On Thu, Apr 20, 2017 at 12:32 PM, Richard Landsman - Rimote 
<rich...@rimote.nl> wrote:


Hello everyone,

Has anybody had the chance to test out this setup and reproduce the
problem? I assumed it would be something that's used often these
days and that a solution would benefit a lot of users. If I can be of any
assistance, please contact me.

I haven't seen any additional reports of this happening. Can you please 
try to reproduce it with the new qemu-kvm-ev-2.6.0-28.el7_3.9.1 currently 
in testing?
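
(Not part of the original mail: installing that build on CentOS 7 roughly 
means pulling qemu-kvm-ev from the Virt SIG repositories. The exact id of 
the testing repository is not given here, so treat the commands below as a 
sketch and check the SIG documentation.)

# Release package that configures the Virt SIG qemu-ev repository.
yum install -y centos-release-qemu-ev
# The 2.6.0-28.el7_3.9.1 build was in the SIG *testing* repository at the
# time; enable that repository as documented by the Virt SIG, then update.
yum update qemu-kvm-ev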


-- 
Kind regards,

Richard Landsman
http://rimote.nl

T: +31 (0)50 - 763 04 07
(Mon-Fri 9:00 to 18:00)

24/7 in case of outages:
+31 (0)6 - 4388 7949
@RimoteSaS (Twitter service notices/security updates)




--

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R

Red Hat EMEA <https://www.redhat.com/>




___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt




Re: [CentOS-virt] lvm cache + qemu-kvm stops working after about 20GB of writes

2017-04-20 Thread Richard Landsman - Rimote

Hello everyone,

Has anybody had the chance to test out this setup and reproduce the problem? 
I assumed it would be something that's used often these days and that a 
solution would benefit a lot of users. If I can be of any assistance, 
please contact me.


--
Kind regards,

Richard Landsman
http://rimote.nl

T: +31 (0)50 - 763 04 07
(Mon-Fri 9:00 to 18:00)

24/7 in case of outages:
+31 (0)6 - 4388 7949
@RimoteSaS (Twitter service notices/security updates)

On 04/10/2017 10:08 AM, Sandro Bonazzola wrote:

Adding Paolo and Miroslav.

On Sat, Apr 8, 2017 at 4:49 PM, Richard Landsman - Rimote 
<rich...@rimote.nl> wrote:


Hello,

I would really appreciate some help/guidance with this problem.
First of all, sorry for the long message. I would file a bug, but I
do not know whether it is my fault, dm-cache, qemu or (probably) a
combination of them. And I can imagine some of you have this setup
up and running without problems (or maybe you think it works, just
like I did, but it does not):

PROBLEM
LVM cache writeback stops working as expected after a while with a
qemu-kvm VM. A 100% working setup would be the holy grail in my
opinion... and I must say the performance of KVM/qemu is great in
the beginning.

DESCRIPTION

When using software RAID 1 (2x HDD) + software RAID 1 (2x SSD) and
creating a cached LV out of them, the VM initially performs great
(at least 40,000 IOPS on 4k random read/write)! But after a while
(and a lot of random IO, ca. 10 - 20 GB) it effectively turns into a
writethrough cache, although there is plenty of space left on the
cached LV.


When working as expected, on the KVM host all writes go to the SSDs:

iostat -x -m 2

Device:  rrqm/s   wrqm/s     r/s       w/s    rMB/s     wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm   %util
sda        0.00   324.50    0.00     22.00     0.00     14.94  1390.57     1.90   86.39    0.00   86.39   5.32   11.70
sdb        0.00   324.50    0.00     22.00     0.00     14.94  1390.57     2.03   92.45    0.00   92.45   5.48   12.05
sdc        0.00  3932.00    0.00 *2191.50*     0.00 *270.07*    252.39    37.83   17.55    0.00   17.55   0.36  *78.05*
sdd        0.00  3932.00    0.00 *2197.50*     0.00 *271.01*    252.57    38.96   18.14    0.00   18.14   0.36  *78.95*



When not working as expected, on the KVM host all writes go through the
SSDs on to the HDDs (effectively disabling writeback, so it behaves like
a writethrough cache):

Device:  rrqm/s   wrqm/s     r/s       w/s    rMB/s     wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
sda        0.00     7.00  234.50  *173.50*     0.92    *1.95*     14.38    29.27   71.27  111.89   16.37   2.45 *100.00*
sdb        0.00     3.50  212.00  *177.50*     0.83    *1.95*     14.60    35.58   91.24  143.00   29.42   2.57 *100.10*
sdc        2.50     0.00  566.00  *199.00*     2.69     0.78       9.28     0.08    0.11    0.13    0.04   0.10   *7.70*
sdd        1.50     0.00   76.00  *199.00*     0.65     0.78      10.66     0.02    0.07    0.16    0.04   0.07   *1.85*



Stuff I've checked/tried:

- The data in the cached LV has at that point not even exceeded half of
the space, so this should not happen. It even happens when only 20% of
the cache data is used.
- It seems to be triggered most of the time when the Cpy%Sync column
of `lvs -a` is at about 30%. But this is not always the case!
- Switching the cache policy from smq to cleaner, waiting for the flush
to finish (check with lvs -a) and then switching back to smq seems to
help /sometimes/! But not always...

lvchange --cachepolicy cleaner /dev/mapper/XXX-cachedlv

lvs -a

lvchange --cachepolicy smq /dev/mapper/XXX-cachedlv

- *When mounting the LV inside the host, this does not seem to
happen!!* So it looks like a qemu-kvm / dm-cache combination
issue. The only difference is that inside the host I run mkfs directly
instead of LVM inside the VM (so it could also be a problem with LVM
inside the VM on top of LVM on the KVM host? Probably a small chance,
because for the first 10 - 20 GB it works great!)

- Tried disabling SELinux, upgrading to the newest kernels (elrepo ml
and lt), played around with dirty-cache tunables like
/proc/sys/vm/dirty_writeback_centisecs,
/proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_ratio,
the migration threshold of dmsetup, and other probably unimportant
stuff like vm.dirty_bytes.

- When in the "slow state", the system's kworkers use IO excessively
(10 - 20 MB per kworker process). This seems to be the writeback
process (Cpy%Sync), because the cache wants to flush to the HDDs. But
the strange thing is that after a complete sync (0% left), the disk
may become slow again after a few MBs of data. A reboot sometimes
helps.

- Have tried iothreads, virtio-scsi, the vcpu driver setting on the
virtio-scsi controller, cache settings, disk 

[CentOS-virt] lvm cache + qemu-kvm stops working after about 20GB of writes

2017-04-08 Thread Richard Landsman - Rimote

Hello,

I would really appreciate some help/guidance with this problem. First of 
all, sorry for the long message. I would file a bug, but I do not know 
whether it is my fault, dm-cache, qemu or (probably) a combination of them. 
And I can imagine some of you have this setup up and running without 
problems (or maybe you think it works, just like I did, but it does not):


PROBLEM
LVM cache writeback stops working as expected after a while with a 
qemu-kvm VM. A 100% working setup would be the holy grail in my 
opinion... and I must say the performance of KVM/qemu is great in the 
beginning.


DESCRIPTION

When using software RAID 1 (2x HDD) + software RAID 1 (2x SSD) and creating 
a cached LV out of them, the VM initially performs great (at least 
40,000 IOPS on 4k random read/write)! But after a while (and a lot of 
random IO, ca. 10 - 20 GB) it effectively turns into a writethrough cache, 
although there is plenty of space left on the cached LV.
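
(Not from the original post, but for context: a sketch of how a layout like 
the one described above is typically built. Device names, VG name and sizes 
are assumptions.)

# Two RAID 1 sets: HDDs for the data, SSDs for the cache (device names assumed).
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb   # 2x 2TB HDD
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd   # 2x 250GB SSD

# One volume group spanning both arrays.
pvcreate /dev/md0 /dev/md1
vgcreate vg0 /dev/md0 /dev/md1

# Origin LV on the HDD array, cache pool on the SSD array.
lvcreate -n cachedlv -L 1.8T vg0 /dev/md0
lvcreate --type cache-pool -n cachepool -L 150G vg0 /dev/md1

# Attach the pool as a writeback cache in front of the origin LV.
lvconvert --type cache --cachemode writeback --cachepool vg0/cachepool vg0/cachedlv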



When working as expected, on the KVM host all writes go to the SSDs:

iostat -x -m 2

Device:  rrqm/s   wrqm/s     r/s       w/s    rMB/s     wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm   %util
sda        0.00   324.50    0.00     22.00     0.00     14.94  1390.57     1.90   86.39    0.00   86.39   5.32   11.70
sdb        0.00   324.50    0.00     22.00     0.00     14.94  1390.57     2.03   92.45    0.00   92.45   5.48   12.05
sdc        0.00  3932.00    0.00 *2191.50*     0.00 *270.07*    252.39    37.83   17.55    0.00   17.55   0.36  *78.05*
sdd        0.00  3932.00    0.00 *2197.50*     0.00 *271.01*    252.57    38.96   18.14    0.00   18.14   0.36  *78.95*



When not working as expected, on the KVM host all writes go through the 
SSDs on to the HDDs (effectively disabling writeback, so it behaves like a 
writethrough cache):


Device:  rrqm/s   wrqm/s     r/s       w/s    rMB/s     wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
sda        0.00     7.00  234.50  *173.50*     0.92    *1.95*     14.38    29.27   71.27  111.89   16.37   2.45 *100.00*
sdb        0.00     3.50  212.00  *177.50*     0.83    *1.95*     14.60    35.58   91.24  143.00   29.42   2.57 *100.10*
sdc        2.50     0.00  566.00  *199.00*     2.69     0.78       9.28     0.08    0.11    0.13    0.04   0.10   *7.70*
sdd        1.50     0.00   76.00  *199.00*     0.65     0.78      10.66     0.02    0.07    0.16    0.04   0.07   *1.85*



Stuff I've checked/tried:

- The data in the cached LV has at that point not even exceeded half of the 
space, so this should not happen. It even happens when only 20% of the 
cache data is used.
- It seems to be triggered most of the time when the Cpy%Sync column of `lvs 
-a` is at about 30%. But this is not always the case!
- Switching the cache policy from smq to cleaner, waiting for the flush to 
finish (check with lvs -a) and then switching back to smq seems to help 
/sometimes/! But not always...


lvchange --cachepolicy cleaner /dev/mapper/XXX-cachedlv

lvs -a

lvchange --cachepolicy smq /dev/mapper/XXX-cachedlv

- *When mounting the LV inside the host, this does not seem to happen!!* 
So it looks like a qemu-kvm / dm-cache combination issue. The only 
difference is that inside the host I run mkfs directly instead of LVM 
inside the VM (so it could also be a problem with LVM inside the VM on top 
of LVM on the KVM host? Probably a small chance, because for the first 
10 - 20 GB it works great!)


- Tried disabling SELinux, upgrading to the newest kernels (elrepo ml and 
lt), played around with dirty-cache tunables like 
/proc/sys/vm/dirty_writeback_centisecs, 
/proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_ratio, the 
migration threshold of dmsetup, and other probably unimportant stuff 
like vm.dirty_bytes.


- When in the "slow state", the system's kworkers use IO excessively (10 
- 20 MB per kworker process). This seems to be the writeback process 
(Cpy%Sync), because the cache wants to flush to the HDDs. But the strange 
thing is that after a complete sync (0% left), the disk may become slow 
again after a few MBs of data. A reboot sometimes helps.


- Have tried iothreads, virtio-scsi, the vcpu driver setting on the 
virtio-scsi controller, cache settings, disk schedulers etc. Nothing 
helped. (A rough example of such a libvirt configuration is sketched right 
after this list.)


- The new Samsung 950 PRO SSDs have HPA enabled (30%!!); the machine is an 
AMD FX(tm)-8350 with 16 GB RAM.
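
(Not the actual configuration from this report, which is not included here; 
only a sketch of the libvirt domain XML fragment that the iothread / 
virtio-scsi / cache-setting experiments above refer to. The iothread count, 
device path and cache/io settings are assumptions.)

<iothreads>1</iothreads>
<devices>
  <!-- virtio-scsi controller bound to an iothread -->
  <controller type='scsi' index='0' model='virtio-scsi'>
    <driver iothread='1'/>
  </controller>
  <!-- guest disk backed directly by the cached LV -->
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source dev='/dev/mapper/XXX-cachedlv'/>
    <target dev='sda' bus='scsi'/>
  </disk>
</devices>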


It feels like the LVM cache has a threshold (about 20 GB of dirty data) at 
which it stops allowing the qemu-kvm process to use writeback caching (root 
use inside the host does not seem to have this limitation). It starts 
flushing, but only to a certain point. After a few MBs of data it is right 
back in the slow spot again. The only solution is waiting for a long time 
(independent of Cpy%Sync) or sometimes changing the cache policy and forcing 
a flush. For me this rules out production use of this system. But it's so 
promising, so I hope somebody can help.
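
(A possible way to watch that forced flush, not taken from the original 
post: the cache reporting fields below are standard lvs fields, and the VG 
name reuses the XXX placeholder from above.)

# Switch to the cleaner policy, then watch the dirty blocks drain.
lvchange --cachepolicy cleaner XXX/cachedlv
lvs -a -o lv_name,cache_policy,copy_percent,cache_dirty_blocks,cache_used_blocks,cache_total_blocks XXX
# Once cache_dirty_blocks reaches 0 the flush is complete; switch back.
lvchange --cachepolicy smq XXX/cachedlv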


Desired state: doing the FIO test (described in the reproduce section) 
repeatedly should stay fast until the cached LV is more or less full. If 
resyncing back to disc causes this degradation, it should 

Re: [CentOS-virt] Network isolation for KVM guests

2017-03-31 Thread Richard Landsman - Rimote

Hi,

I don't see why this should not work with the given solutions, but I'm 
relatively new to KVM / libvirt. An alternative:


Personally I use Shorewall (Shoreline Firewall) and bridge setups (this also 
works with a bonding interface). This way you can create zones, interfaces, 
addresses, forwarding rules etc. and give each VM permission to, say, only 
use a certain IP, only access certain parts of the network, talk to a 
limited list of IPs, and so on. I cannot imagine that you can't build what 
you want with Shorewall. It looks complicated, but it is actually very 
intuitive if you give it some time and effort.
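
(Not from the original mail: a very small sketch of what that looks like in 
Shorewall's configuration files, assuming a bridge br0 towards the VMs and 
bond0 towards the Internet; zone names, interface names and addresses are 
made up.)

# /etc/shorewall/zones
fw    firewall
net   ipv4
vms   ipv4

# /etc/shorewall/interfaces
net   bond0
vms   br0

# /etc/shorewall/policy  (default: drop everything, log it)
vms   net   DROP    info
net   all   DROP    info
all   all   REJECT  info

# /etc/shorewall/rules   (allow a single VM out on HTTP/HTTPS only)
ACCEPT   vms:192.168.122.10   net   tcp   80,443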


Please feel free to provide a better description of what you want to 
accomplish. Maybe I misunderstand what you want to achieve.


--
Kind regards,

Richard Landsman
http://rimote.nl

T: +31 (0)50 - 763 04 07
(Mon-Fri 9:00 to 18:00)

24/7 in case of outages:
+31 (0)6 - 4388 7949
@RimoteSaS (Twitter service notices/security updates)

On 03/31/2017 11:56 AM, C. L. Martinez wrote:

On Thu, Mar 30, 2017 at 06:15:28PM +0100, Nux! wrote:

Use libvirt with mac/ip spoofing enabled.

https://libvirt.org/formatnwfilter.html

https://libvirt.org/firewall.html

--
Sent from the Delta quadrant using Borg technology!


Thanks Nux and Kristian, but I don't see how these solutions would really be 
effective in my environment. Let me explain. In this host I have three physical 
interfaces: eth0, eth1 and wlan0.

  eth0 is connected to my internal network. eth1 is connected to a public 
router and wlan0 is connected to another public router. wlan0 and eth1 are 
bonded to provide failover Internet connections. The CPU doesn't support PCI 
passthrough (PCI passthrough would solve my problems).

  I need to deploy a firewall VM to control traffic between the internal and 
external interfaces. In BSD systems you can segregate all IP addresses and route 
tables from the principal routing table. That is the same effect I would like to 
achieve on this host.

  And I don't see how to implement that using CentOS (or another Linux distro).



___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt