>> Having a virtual disk's contents *also* cached inside the VM is
>> redundant, wastes memory and CPU, and makes the whole memory
>> management thing more awkward. Having a block from root.img cached
>> in dom0, as well as in every domU that uses that template, is nuts.
>
> I have long wondered about this, but assumed that the Xen project
> would have worked out the right balance long ago.
Well, I think the Xen project has its preferred way of doing this with LVM, and Qubes broke away from that a bit by using .img files instead of LVM volumes to host the VMs. But 4.0 is going to require thin LVM, no? Or is that just for dom0?

I'm a bit of an unusual case, working on a somewhat underpowered machine. But that is not necessarily a bad thing; stuff that breaks more easily when memory or other resources are low should be spotted and fixed, as it is broken regardless and will likely bite others at some point (and possibly more intermittently). Low memory brings out bugs.

Qubes dom0 gets *very* cranky if it doesn't have enough memory. It doesn't handle low memory gracefully at all. In the memory manager you see a fudge-factor percentage added to each VM, then a constant fudge factor added on top, a minimum VM size enforced, and then another fudge-factor boost added to dom0, and so on. It almost feels as though the numbers (the extra dom0 memory boost, for example) were chosen by cranking them up until things worked well, rather than by understanding why things stumbled with a bit less memory. That's just my opinion from running a low-memory system and poking around the source a bit. I found places where I could save a lot of memory (a cache-margin of 1.1 instead of 1.3) without any ill effect, and other areas where, if you tweaked things (like a very low dom0 memory boost), the system just stumbled and became unusable.

> However, there is a trade-off. Relying more on dom0 for caching makes
> the caching slower, as it's now copied between VMs.

You'd think so, wouldn't you? But in practice (well, in 10 minutes of testing), the dom0 cache was as fast as, or faster than, the VM cache. Perhaps more layers of caching mean more administrative code to run, and maybe the VBD layer is efficient enough that the additional calls aren't that big a deal.

> Conversely, swapping in domUs is _extra_ expensive.

Swapping is way worse than anything related to cache/buffers, for sure. Swapping in a VM hurts way more than (modest) dom0 swapping. But yeah, when dom0 gets to swapping too much, it's a bad thing. Important requests (such as for more memory, or keyboard/mouse I/O) can get held up behind the swapping, and the whole system feels terrible.

> Adding to the complexity of the situation is that root.img and
> private.img will respond very differently to domU-vs-dom0 caching,
> where private.img access benefits more from domU cache.

The snapshot arrangement adds yet another layer of complexity as well. Every VM's volatile.img is (in its copy-on-write snapshot role) keeping track of the differences between the template's root.img and the current state of that VM's root file system. Say you have five VMs all opened from debian-8's root.img: they're all tracking diffs against (a snapshot of) root.img. Thankfully the root shouldn't change much during normal operation.

>> On a virtual machine system, swapping makes very little sense. It's
>> a major performance killer, especially inside a VM.
>
> I'm glad all this is being discussed.
>
> But I think limited swapping in vms can be good--if it frees up
> memory for running processes in other vms.

I've kind of bounced around on the issue. I went from no-swapping-allowed! to trying heavy swapping; hey, if it ain't being used, don't keep it in memory. But personally, my VMs aren't long-lived enough to let things swap out and stabilize. For someone who keeps VMs open for long periods, the benefit of a higher vm.swappiness could be greater.
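If you want to play with that knob, it's just the standard Linux vm.swappiness sysctl, set inside the TemplateVM so the AppVMs based on it inherit the value once the template's changes are saved and they restart. Here's a minimal sketch; the drop-in file name is my own choice, nothing Qubes-specific about it:

#!/usr/bin/env python
# Minimal sketch: check and pin vm.swappiness inside a TemplateVM.
# Run as root in the template. The sysctl paths are standard Linux;
# the drop-in file name below is just my own choice.

SWAPPINESS = "/proc/sys/vm/swappiness"
DROP_IN = "/etc/sysctl.d/80-swappiness.conf"

def current():
    with open(SWAPPINESS) as f:
        return int(f.read().strip())

def set_now(value):
    # Takes effect immediately, for this boot only.
    with open(SWAPPINESS, "w") as f:
        f.write(str(value))

def persist(value):
    # Read by systemd-sysctl at boot, so the value sticks across restarts.
    with open(DROP_IN, "w") as f:
        f.write("vm.swappiness = %d\n" % value)

if __name__ == "__main__":
    print("swappiness is currently %d" % current())
    set_now(0)    # 0 = swap only when there is really no other choice
    persist(0)

Zero tells the guest kernel to avoid swapping as much as possible; higher values let it swap idle pages out more eagerly, which is the trade-off discussed above.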
(There's something about the templates being so clean and shiny that makes frequent VM reboots soothing to me, from a security standpoint. That may just be my autistic side, tho.)

>> But unless you're intentionally running very bloated AppVMs that
>> start a bunch of unnecessary stuff (not a great idea), swapping out
>> unused stuff buys you very little.
>
> This is where I have a problem. Qubes is a desktop system where most
> users aren't expected to be manually ensuring that all their running
> code is "tight". There will have to be plenty of niceties on hand,
> and techies often refer to this as "bloat".

I hear ya, for sure. The great thing about Qubes is that it makes setting up and configuring things a relative breeze. I had never successfully configured a Xen VM until I started using Qubes. (Now I'm crafting my own, but I'd never be here without Qubes. Xen has a big learning curve on its own.) So yes, users shouldn't in general have to worry about how much memory is going where and such. As simplistic as qmemman's algorithm is, it serves exactly that purpose: shuffling memory around to wherever the user's applications require it.

However, there are some things we can do to be smart. For example, a little sound-toggle icon in the manager next to each VM (or in a config page) would let the user pick which VMs need sound. Most of them don't, and it's a huge security risk (IMHO) to allow it by default. Same with cups and samba and the rest: they're all installed and running by default, presenting a security risk (or at least a greater attack surface), and they're not easy to turn off unless you know what you're doing.

My homebrew "Qubes-Lite" uses very small Linux installations for the service VMs, which I think is reasonable given their nature. The template-VM (equivalent) is stocked up with many, many packages, so I can run the programs I want when I want, but I turn off most systemctl startups. I don't need all those daemons running in every child VM; it's a waste of memory and a security risk.

>> That keeps the service VMs from grabbing/releasing memory which only
>> gets used as a cache anyway. Redundant, as mentioned, due to dom0's
>> caching, but also mostly unused since sys-net and friends don't do a
>> lot of disk activity.
>
> I do this, too. I would even recommend restricting netvms and
> proxyvms to 250MB or so.

Agreed. 256MB is what I use if the VM is going to have X running; 128MB works fine if not. (But if X then somehow starts up anyway, it gets unpleasant.)

(One of the few reasons you'd want X in sys-net or sys-firewall might be the NetworkManager applet, nm-applet. If you're a command-line guy like me and don't want the GUI but do want the NetworkManager functionality, check out "nmcli". It's a real gem.)

>> That did result in more memory for active dom0 caching and for VM
>> usage, which was a performance boost in general. It even let me
>> start more VMs than I used to be able to. (Although changing the
>> cache-margin-factor from 1.3 to 1.1 was far more helpful for
>> creating more VMs.)
>
> Hmmm.... Where exactly? Can you post a howto?

I've been meaning to, on a few different topics. But for now, the short version. Obviously, you're voiding your warranty by tampering with these parameters, so if your system gets stupid, don't complain on the list until you've tried it with the settings changed back. :)

In /etc/qubes/qmemman.conf, change "cache-margin-factor = 1.3" to "1.1" instead. Or whatever.
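To see why that one number has such an outsized effect, here's a rough, back-of-the-envelope illustration of the kind of arithmetic those knobs feed into. This is *not* qmemman's actual code (the real logic lives in the qmemman source); the function name and the boost/minimum constants are stand-ins I made up for the fudge factors discussed above:

# Rough illustration only (not qmemman's real code) of how the knobs
# in /etc/qubes/qmemman.conf shape each domain's memory target.
# The constants below are stand-ins chosen for the example.

CACHE_MARGIN_FACTOR = 1.3   # the value I drop to 1.1
DOM0_MEM_BOOST_MB = 350     # stand-in for the extra dom0 boost
VM_MIN_MEM_MB = 200         # stand-in for the per-VM minimum

def preferred_mem_mb(used_mb, factor=CACHE_MARGIN_FACTOR, is_dom0=False):
    # "used" means memory in real use (not cache/buffers), roughly what
    # the in-VM meminfo reporting shows.
    pref = used_mb * factor
    if is_dom0:
        pref += DOM0_MEM_BOOST_MB
    return max(pref, VM_MIN_MEM_MB)

# The whole "more VMs fit" effect of 1.1 vs. 1.3 is just this margin:
for used in (200, 800, 2000):
    print("used %4d MB -> target %4.0f MB at 1.3, %4.0f MB at 1.1"
          % (used, preferred_mem_mb(used, 1.3), preferred_mem_mb(used, 1.1)))

Trimming each domain's target by roughly 15% like that, across half a dozen running VMs, is where the room for an extra VM or two on a small machine comes from.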
You can restart the memory manager to have this take effect, with "sudo systemctl restart qubes-qmemman". Doing stupid things like setting the value below 1.0 and turning off swap will guarantee you ill effects.

The Qubes manager only shows the memory actually allocated to each VM, not how much it has requested/needs. I have a text utility which shows all of these, and I've started turning it into a live dom0/domU memory monitor. I'll try to post at least the text version in the next day or two. (I was wrong about that value being hard-coded elsewhere; it was just that when I used the same Qubes Python API to get at the memory information, the config value wasn't loaded from the file by the library; I had to load it myself, like qmemman_server.py does.)

>> Having swappiness at anything other than 0 for a VM (set in the
>> template) doesn't make a lot of sense. If you have large but unused
>> programs/data in your VMs, you're doing it wrong.
>
> "Unused" by what metric? If something is executed twice in a 10min
> session, what then?

I guess it depends upon the person and the system. If they don't mind a slight delay when they do something every ten minutes, crank up the swappiness. If you only want interface pauses (from swapping back in) when absolutely necessary, crank swappiness down, though you might not be able to launch as many VMs. It's all a balance. :) I know I have limited memory, so in order to do more simultaneous things, I'm going to have to accept that infrequently used things might suffer the odd delay.

In digging through the memory manager, I actually want it to do *more* for the user, not less (while being more flexible), so I think we're in agreement about keeping the user experience as simple as possible.

>> I've done some performance testing of buffered vs. non-buffered disk
>> access in a VM (hdparm), and there is no performance gain from the
>> buffers/cache inside the VM. In fact, some tests showed the cached
>> situation slower than non-cached, sometimes significantly so. Let
>> dom0 do the caching.
>
> I don't believe this would hold true for private.img.

Fair enough. But the penalty for dom0 caching even in that case is small enough that I think it may warrant the simplicity of keeping caching out of the VMs. An unshared cache page in dom0 takes no more resources than an unshared cache page in a VM. In fact, dom0 is gonna cache it anyways, so why duplicate the effort? (Unless you disable dom0's caching, but that seems like a terrible, terrible idea. :) ) If you want to poke at this yourself without hdparm, there's a rough read-timing sketch a bit further down.

>> I think there are Xen features (such as Intellicache?) for doing
>> domUs' caching in a shared way in dom0. That seems like a win-win.
>> Has Qubes considered looking at this feature?
>
> Would be prudent to search for security risks associated with that.

That was the first thing that crossed my mind: a shared memory page accessible by dom0 and one or more VMs just opens the door for more potential border cases and buffer overflows. I imagine it also adds complexity and configuration issues and so on. On the other hand, Qubes (with Xen's help) is solidly in the business of shuffling memory around between VMs and making sure it does so securely. And it already hands out shared pages for the X server to blast bits to the GUI daemons. Adding shared cache pages might not be a huge stretch if done carefully and smartly. I'll have to learn more about Intellicache and the other options.
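Here's that read-timing sketch: a crude, hdparm-free way to compare a cold guest cache against a warm one from inside an AppVM. It writes a scratch file, drops only the *guest's* page cache (dom0's cache of the backing .img is untouched), and times two sequential reads. The file path and size are arbitrary choices of mine.

#!/usr/bin/env python
# Crude stand-in for the hdparm comparison above: time a sequential
# read of a scratch file with a cold guest page cache, then again warm.
# Run as root inside an AppVM. Path and size are arbitrary.
import os
import time

TEST_FILE = "/var/tmp/cache-test.bin"
SIZE_MB = 256
CHUNK = 1024 * 1024

def make_file():
    with open(TEST_FILE, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(os.urandom(CHUNK))

def drop_guest_cache():
    # Flush dirty pages, then ask the guest kernel to drop its caches.
    # This does nothing to dom0's cache of the backing image file.
    os.system("sync")
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

def timed_read():
    start = time.time()
    with open(TEST_FILE, "rb") as f:
        while f.read(CHUNK):
            pass
    return time.time() - start

if __name__ == "__main__":
    make_file()
    drop_guest_cache()
    cold = timed_read()   # served by dom0's cache and/or the real disk
    warm = timed_read()   # served by the guest's own page cache
    print("cold read: %.2fs   warm read: %.2fs" % (cold, warm))
    os.unlink(TEST_FILE)

If the two numbers come out close, the guest's own cache isn't buying much, which is the same thing I was seeing with hdparm.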
Qubes devs "rolled their own" (God love them) when it came to a GUI system, sound, remote execution, rpc, file copying, and even created the underlying vchan mechanism they're all based upon. So even that's a possibility. I'd love to see code pages shared between AppVm's instances. Even the kernel (especially the kernel) could benefit hugely from that. If done carfefully. Basing all of its service upon its vchan is a key part of Qubes' security win, using a Microkernel-ish IPC mechanism (mostly with fixed-sized messages) instead of the TCP/IP stack. It's what lets dom0 go networkless. At the end of the day, it's kind of sad how we got here. Vchan is a win because we can't trust our networking stacks, or the operating systems they run upon, or the hardware that it uses, from either bugs or compromise. Virtualization and isolation of device drivers and applications is a win for the same reasons, as well as not being able to trust the applications themselves from bugs, inappropriate permissions, or intentional compromise. If the hardware, drivers, operating systems, and applications were all designed properly, validated, signed, yadda yadda yadda, Qubes would be unnecessary, and just a curiosity. Ideally, it'd be nice to have a secure 64-bit Microkernel operating that could run all applications you need and support all the hardware you need it to. Qubes, kinda, sorta, is that. Or as close as we can come today. I could see it evolving towards eliminating some of the awkward layers (such as whole operating system instances to isolate a questionable app or device driver) and become more and more of a pure microkernel. Especially if it ever makes the jump (or forks to) something like L4. I'm not sure I'll ever have full confidence in anything as complex as Xen. -d -- You received this message because you are subscribed to the Google Groups "qubes-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to qubes-devel+unsubscr...@googlegroups.com. To post to this group, send email to qubes-devel@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/qubes-devel/c086d89a2ba0ce536c5f3b5e12394344.webmail%40localhost. For more options, visit https://groups.google.com/d/optout.