On 26/05/17 18:19, Ian Jackson wrote:
> Juergen Gross writes ("HVM guest performance regression"):
>> Looking for the reason of a performance regression of HVM guests under
>> Xen 4.7 against 4.5 I found the reason to be commit
>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
>> in Xen 4.6.
>>
>> The problem occurred when dom0 had to be ballooned down when starting
>> the guest. The performance of some micro benchmarks dropped by about
>> a factor of 2 with above commit.
>>
>> Interesting point is that the performance of the guest will depend on
>> the amount of free memory being available at guest creation time.
>> When there was barely enough memory available for starting the guest
>> the performance will remain low even if memory is being freed later.
>>
>> I'd like to suggest we either revert the commit or have some other
>> mechanism to try to have some reserve free memory when starting a
>> domain.
>
> Oh, dear. The memory accounting swamp again. Clearly we are not
> going to drain that swamp now, but I don't like regressions.
>
> I am not opposed to reverting that commit. I was a bit iffy about it
> at the time; and according to the removal commit message, it was
> basically removed because it was a piece of cargo cult for which we
> had no justification in any of our records.
>
> Indeed I think fixing this is a candidate for 4.9.
>
> Do you know the mechanism by which the freemem slack helps ? I think
> that would be a prerequisite for reverting this. That way we can have
> an understanding of why we are doing things, rather than just
> flailing at random...

I wish I understood it.

One candidate would be that 2M/1G pages become possible with enough
free memory, but I haven't verified this yet. I can test it by
disabling big pages in the hypervisor.

What makes the whole problem even more mysterious is that the
regression was first detected with SLE12 SP3 (guest and dom0, Xen 4.9
and Linux 4.4) compared to older systems (guest and dom0). While
trying to find out whether the guest or the Xen version is the
culprit, I found that the old guest (based on kernel 3.12) showed the
mentioned performance drop with the above commit. The new guest (based
on kernel 4.4) shows the same bad performance regardless of the Xen
version or the amount of free memory. I haven't yet found the Linux
kernel commit responsible for that performance drop.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel