>> Having a virtual disk's contents *also* cached inside the VM is
>> redundant, wastes memory and CPU, and makes the whole memory
>> management thing more awkward. Having a block from root.img cached
>> in dom0, as well as in every domU that uses that template, is nuts.
>
> I have long wondered about this, but assumed that the Xen project
> would have worked out the right balance long ago.
Well, I think the Xen project has its preferred way of doing this with LVM, and Qubes broke away from that a bit by using .img files instead of LVM volumes to host the VMs. But 4.0 is going to require thin LVM, no? Or is that just for dom0?

I'm a bit of an unusual case, working on a somewhat underpowered machine. But that is not necessarily a bad thing; stuff that breaks more easily when memory or other resources are low should be spotted and fixed, as it is broken regardless and will likely bite others at some point (and possibly more intermittently). Low memory brings out bugs.

Qubes dom0 gets *very* cranky if it doesn't have enough memory. It doesn't handle low memory gracefully at all. In the memory manager you see a fudge-factor percentage added to each VM, then a constant fudge factor added on top, a minimum VM size enforced, and then another fudge-factor boost added to dom0, and so on. It almost feels as though the numbers (the extra dom0 memory boost, for example) were chosen by cranking them up until things worked well, rather than by understanding why things stumbled with a bit less memory. That's just my opinion from running a low-memory system and poking around the source a bit. I found places where I could save a lot of memory (a cache-margin of 1.1 instead of 1.3) without any ill effect, and other areas where, if you tweaked things (like a very low dom0 memory boost), the system just stumbled and became unusable.

> However, there is a trade-off. Relying more on dom0 for caching makes
> the caching slower, as it's now copied between VMs.

You'd think so, wouldn't you? But in practice (well, in 10 minutes of testing), the dom0 cache was as fast as, or faster than, the VM cache. Perhaps more layers of caching mean more administrative code to run, and maybe the VBD layer is efficient enough that the additional calls aren't that big a deal.

> Conversely, swapping in domUs is _extra_ expensive.

Swapping is way worse than anything related to cache/buffers, for sure. Swapping in a VM hurts way more than (modest) dom0 swapping. But yeah, when dom0 gets to swapping too much, it's a bad thing. Important requests (such as for more memory, or keyboard/mouse I/O) can get held up behind the swapping, and the whole system feels terrible.

> Adding to the complexity of the situation is that root.img and
> private.img will respond very differently to domU-vs-dom0 caching,
> where private.img access benefits more from domU cache.

The snapshot arrangement adds yet another layer of complexity as well. Every VM's volatile.img is (in its copy-on-write snapshot role) keeping track of the differences between the template's root.img and the current state of that VM's root file system. Say you have five VMs all opened from debian-8's root.img: they're all tracking diffs against (a snapshot of) root.img. Thankfully the root shouldn't change much during normal operation.

>> On a virtual machine system, swapping makes very little sense. It's
>> a major performance killer, especially inside a VM.
>
> I'm glad all this is being discussed.
>
> But I think limited swapping in vms can be good--if it frees up
> memory for running processes in other vms.

I've kind of bounced around on the issue. I went from no-swapping-allowed! to trying heavy swapping; hey, if it ain't being used, don't keep it in memory. But personally, my VMs aren't long-lived enough to let things swap out and stabilize. For someone who keeps VMs open for long periods, the benefit of a higher vm.swappiness could be greater.
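If you want to play with that knob, it's just the standard Linux vm.swappiness sysctl, set inside the TemplateVM so the AppVMs based on it inherit the value once the template's changes are saved and they restart. Here's a minimal sketch; the drop-in file name is my own choice, nothing Qubes-specific about it:

#!/usr/bin/env python
# Minimal sketch: check and pin vm.swappiness inside a TemplateVM.
# Run as root in the template. The sysctl paths are standard Linux;
# the drop-in file name below is just my own choice.

SWAPPINESS = "/proc/sys/vm/swappiness"
DROP_IN = "/etc/sysctl.d/80-swappiness.conf"

def current():
    with open(SWAPPINESS) as f:
        return int(f.read().strip())

def set_now(value):
    # Takes effect immediately, for this boot only.
    with open(SWAPPINESS, "w") as f:
        f.write(str(value))

def persist(value):
    # Read by systemd-sysctl at boot, so the value sticks across restarts.
    with open(DROP_IN, "w") as f:
        f.write("vm.swappiness = %d\n" % value)

if __name__ == "__main__":
    print("swappiness is currently %d" % current())
    set_now(0)    # 0 = swap only when there is really no other choice
    persist(0)

Zero tells the guest kernel to avoid swapping as much as possible; higher values let it swap idle pages out more eagerly, which is the trade-off discussed above.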
(There's something about the templates being so clean and shiny that makes frequent VM reboots soothing to me, from a security standpoint. That may just be my autistic side, tho.)

>> But unless you're intentionally running very bloated AppVMs that
>> start a bunch of unnecessary stuff (not a great idea), swapping out
>> unused stuff buys you very little.
>
> This is where I have a problem. Qubes is a desktop system where most
> users aren't expected to be manually ensuring that all their running
> code is "tight". There will have to be plenty of niceties on hand,
> and techies often refer to this as "bloat".

I hear ya, for sure. The great thing about Qubes is that it makes setting up and configuring things a relative breeze. I had never successfully configured a Xen VM until I started using Qubes. (Now I'm crafting my own, but I'd never be here without Qubes. Xen has a big learning curve on its own.) So yes, users shouldn't in general have to worry about how much memory is going where and such. As simplistic as qmemman's algorithm is, it serves exactly that purpose: shuffling memory around to wherever the user's applications require it.

However, there are some things we can do to be smart. For example, a little sound-toggle icon in the manager next to each VM (or in a config page) would let the user pick which VMs need sound. Most of them don't, and it's a huge security risk (IMHO) to allow it by default. Same with cups and samba and the rest: they're all installed and running by default, presenting a security risk (or at least a greater attack surface), and they're not easy to turn off unless you know what you're doing.

My homebrew "Qubes-Lite" uses very small Linux installations for the service VMs, which I think is reasonable given their nature. The template-VM (equivalent) is stocked up with many, many packages, so I can run the programs I want when I want, but I turn off most systemctl startups. I don't need all those daemons running in every child VM; it's a waste of memory and a security risk.

>> That keeps the service VMs from grabbing/releasing memory which only
>> gets used as a cache anyway. Redundant, as mentioned, due to dom0's
>> caching, but also mostly unused since sys-net and friends don't do a
>> lot of disk activity.
>
> I do this, too. I would even recommend restricting netvms and
> proxyvms to 250MB or so.

Agreed. 256MB is what I use if the VM is going to have X running; 128MB works fine if not. (But if X then somehow starts up anyway, it gets unpleasant.)

(One of the few reasons you'd want X in sys-net or sys-firewall might be the NetworkManager applet, nm-applet. If you're a command-line guy like me and don't want the GUI but do want the NetworkManager functionality, check out "nmcli". It's a real gem.)

>> That did result in more memory for active dom0 caching and for VM
>> usage, which was a performance boost in general. It even let me
>> start more VMs than I used to be able to. (Although changing the
>> cache-margin-factor from 1.3 to 1.1 was far more helpful for
>> creating more VMs.)
>
> Hmmm.... Where exactly? Can you post a howto?

I've been meaning to, on a few different topics. But for now, the short version. Obviously, you're voiding your warranty by tampering with these parameters, so if your system gets stupid, don't complain on the list until you've tried it with the settings changed back. :)

In /etc/qubes/qmemman.conf, change "cache-margin-factor = 1.3" to "1.1" instead. Or whatever.
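To see why that one number has such an outsized effect, here's a rough, back-of-the-envelope illustration of the kind of arithmetic those knobs feed into. This is *not* qmemman's actual code (the real logic lives in the qmemman source); the function name and the boost/minimum constants are stand-ins I made up for the fudge factors discussed above:

# Rough illustration only (not qmemman's real code) of how the knobs
# in /etc/qubes/qmemman.conf shape each domain's memory target.
# The constants below are stand-ins chosen for the example.

CACHE_MARGIN_FACTOR = 1.3   # the value I drop to 1.1
DOM0_MEM_BOOST_MB = 350     # stand-in for the extra dom0 boost
VM_MIN_MEM_MB = 200         # stand-in for the per-VM minimum

def preferred_mem_mb(used_mb, factor=CACHE_MARGIN_FACTOR, is_dom0=False):
    # "used" means memory in real use (not cache/buffers), roughly what
    # the in-VM meminfo reporting shows.
    pref = used_mb * factor
    if is_dom0:
        pref += DOM0_MEM_BOOST_MB
    return max(pref, VM_MIN_MEM_MB)

# The whole "more VMs fit" effect of 1.1 vs. 1.3 is just this margin:
for used in (200, 800, 2000):
    print("used %4d MB -> target %4.0f MB at 1.3, %4.0f MB at 1.1"
          % (used, preferred_mem_mb(used, 1.3), preferred_mem_mb(used, 1.1)))

Trimming each domain's target by roughly 15% like that, across half a dozen running VMs, is where the room for an extra VM or two on a small machine comes from.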
You can restart the memory manager to have this take effect, with "sudo systemctl restart qubes-qmemman". Doing stupid things like setting the value below 1.0 and turning off swap will guarantee you ill effects.

The Qubes manager only shows the memory actually allocated to each VM, not how much it has requested/needs. I have a text utility which shows all of these, and I've started turning it into a live dom0/domU memory monitor. I'll try to post at least the text version in the next day or two. (I was wrong about that value being hard-coded elsewhere; it was just that when I used the same Qubes Python API to get at the memory information, the config value wasn't loaded from the file by the library; I had to load it myself, like qmemman_server.py does.)

>> Having swappiness at anything other than 0 for a VM (set in the
>> template) doesn't make a lot of sense. If you have large but unused
>> programs/data in your VMs, you're doing it wrong.
>
> "Unused" by what metric? If something is executed twice in a 10min
> session, what then?

I guess it depends upon the person and the system. If they don't mind a slight delay when they do something every ten minutes, crank up the swappiness. If you only want interface pauses (from swapping back in) when absolutely necessary, crank swappiness down, though you might not be able to launch as many VMs. It's all a balance. :) I know I have limited memory, so in order to do more simultaneous things, I'm going to have to accept that infrequently used things might suffer the odd delay.

In digging through the memory manager, I actually want it to do *more* for the user, not less (while being more flexible), so I think we're in agreement about keeping the user experience as simple as possible.

>> I've done some performance testing of buffered vs. non-buffered disk
>> access in a VM (hdparm), and there is no performance gain from the
>> buffers/cache inside the VM. In fact, some tests showed the cached
>> situation slower than non-cached, sometimes significantly so. Let
>> dom0 do the caching.
>
> I don't believe this would hold true for private.img.

Fair enough. But the penalty for dom0 caching even in that case is small enough that I think it may warrant the simplicity of keeping caching out of the VMs. An unshared cache page in dom0 takes no more resources than an unshared cache page in a VM. In fact, dom0 is gonna cache it anyways, so why duplicate the effort? (Unless you disable dom0's caching, but that seems like a terrible, terrible idea. :) ) If you want to poke at this yourself without hdparm, there's a rough read-timing sketch a bit further down.

>> I think there are Xen features (such as Intellicache?) for doing
>> domUs' caching in a shared way in dom0. That seems like a win-win.
>> Has Qubes considered looking at this feature?
>
> Would be prudent to search for security risks associated with that.

That was the first thing that crossed my mind: a shared memory page accessible by dom0 and one or more VMs just opens the door for more potential border cases and buffer overflows. I imagine it also adds complexity and configuration issues and so on. On the other hand, Qubes (with Xen's help) is solidly in the business of shuffling memory around between VMs and making sure it does so securely. And it already hands out shared pages for the X server to blast bits to the GUI daemons. Adding shared cache pages might not be a huge stretch if done carefully and smartly. I'll have to learn more about Intellicache and the other options.
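Here's that read-timing sketch: a crude, hdparm-free way to compare a cold guest cache against a warm one from inside an AppVM. It writes a scratch file, drops only the *guest's* page cache (dom0's cache of the backing .img is untouched), and times two sequential reads. The file path and size are arbitrary choices of mine.

#!/usr/bin/env python
# Crude stand-in for the hdparm comparison above: time a sequential
# read of a scratch file with a cold guest page cache, then again warm.
# Run as root inside an AppVM. Path and size are arbitrary.
import os
import time

TEST_FILE = "/var/tmp/cache-test.bin"
SIZE_MB = 256
CHUNK = 1024 * 1024

def make_file():
    with open(TEST_FILE, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(os.urandom(CHUNK))

def drop_guest_cache():
    # Flush dirty pages, then ask the guest kernel to drop its caches.
    # This does nothing to dom0's cache of the backing image file.
    os.system("sync")
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

def timed_read():
    start = time.time()
    with open(TEST_FILE, "rb") as f:
        while f.read(CHUNK):
            pass
    return time.time() - start

if __name__ == "__main__":
    make_file()
    drop_guest_cache()
    cold = timed_read()   # served by dom0's cache and/or the real disk
    warm = timed_read()   # served by the guest's own page cache
    print("cold read: %.2fs   warm read: %.2fs" % (cold, warm))
    os.unlink(TEST_FILE)

If the two numbers come out close, the guest's own cache isn't buying much, which is the same thing I was seeing with hdparm.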
Qubes devs "rolled their own" (God love them) when it came to a GUI system, sound, remote execution, rpc, file copying, and even created the underlying vchan mechanism they're all based upon. So even that's a possibility. I'd love to see code pages shared between AppVm's instances. Even the kernel (especially the kernel) could benefit hugely from that. If done carfefully. Basing all of its service upon its vchan is a key part of Qubes' security win, using a Microkernel-ish IPC mechanism (mostly with fixed-sized messages) instead of the TCP/IP stack. It's what lets dom0 go networkless. At the end of the day, it's kind of sad how we got here. Vchan is a win because we can't trust our networking stacks, or the operating systems they run upon, or the hardware that it uses, from either bugs or compromise. Virtualization and isolation of device drivers and applications is a win for the same reasons, as well as not being able to trust the applications themselves from bugs, inappropriate permissions, or intentional compromise. If the hardware, drivers, operating systems, and applications were all designed properly, validated, signed, yadda yadda yadda, Qubes would be unnecessary, and just a curiosity. Ideally, it'd be nice to have a secure 64-bit Microkernel operating that could run all applications you need and support all the hardware you need it to. Qubes, kinda, sorta, is that. Or as close as we can come today. I could see it evolving towards eliminating some of the awkward layers (such as whole operating system instances to isolate a questionable app or device driver) and become more and more of a pure microkernel. Especially if it ever makes the jump (or forks to) something like L4. I'm not sure I'll ever have full confidence in anything as complex as Xen. -d -- You received this message because you are subscribed to the Google Groups "qubes-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to qubes-devel+unsubscr...@googlegroups.com. To post to this group, send email to qubes-devel@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/qubes-devel/c086d89a2ba0ce536c5f3b5e12394344.webmail%40localhost. For more options, visit https://groups.google.com/d/optout.