Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
- Forwarded message -

‐‐‐ Original Message ‐‐‐
On Wednesday, July 8, 2020 7:53 AM, Zenaan Harkness wrote:

> Anyone here able to answer this annoying buffer issue on bulk copies?

[ paraphrasing: page cache gets swamped by bulk copy, driving
interactive desktop applications to heavy latency... ]

Use an LD_PRELOAD hack to force mlock() on the apps you want to keep
interactive:

https://stackoverflow.com/questions/37818335/mlock-a-program-from-a-wrapper

---

If you have the sources for the program, add a command-line option so
that the program calls mlockall(MCL_CURRENT | MCL_FUTURE) at some point.
That locks it in memory.

If you want to control the address spaces the kernel loads the program
into, you need to delve into kernel internals. Most likely, there is no
reason to do so; only people with really funky hardware would.

If you don't have the sources, or don't want to recompile the program,
then you can create a dynamic library that executes the command, and
inject it into the process via LD_PRELOAD. Save the following as
lockall.c:

    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/mman.h>

    static void wrerr(const char *p)
    {
        if (p) {
            const char *q = p + strlen(p);
            ssize_t n;

            while (p < q) {
                n = write(STDERR_FILENO, p, (size_t)(q - p));
                if (n > 0)
                    p += n;
                else if (n != -1 || errno != EINTR)
                    return;
            }
        }
    }

    static void init(void) __attribute__((constructor));
    static void init(void)
    {
        int saved_errno = errno;

        if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
            const char *errmsg = strerror(errno);
            wrerr("Cannot lock all memory: ");
            wrerr(errmsg);
            wrerr(".\n");
            exit(127);
        } else
            wrerr("All memory locked.\n");

        errno = saved_errno;
    }

Compile it to a dynamic library liblockall.so using

    gcc -Wall -O2 -fPIC -shared lockall.c -Wl,-soname,liblockall.so -o liblockall.so

Install the library somewhere typical, for example

    sudo install -o 0 -g 0 -m 0664 liblockall.so /usr/lib/

so you can run any binary, and lock it into memory, using

    LD_PRELOAD=liblockall.so binary arguments..
If you install the library somewhere else (not listed in
/etc/ld.so.conf), you'll need to specify the path to the library, like

    LD_PRELOAD=/usr/lib/liblockall.so binary arguments..

Typically, you'll see the message

    Cannot lock all memory: Cannot allocate memory.

printed by the interposed library, when running commands as a normal
user. (The superuser, or root, typically has no such limit.) This is
because, for obvious reasons, most Linux distributions limit the amount
of memory an unprivileged user can lock into memory; this is the
RLIMIT_MEMLOCK resource limit. Run ulimit -l to see the per-process
resource limits currently set (for the current user, obviously).

I suggest you set a suitable limit on how much memory the process can
lock, running e.g. the bash built-in

    ulimit -l 16384

before executing the program (to set the limit to 16384*1024 bytes, or
16 MiB), if running as superuser (root). If the process leaks memory,
instead of crashing your machine (because it locked all available
memory), the process will die (from SIGSEGV) if it exceeds the limit.
That is, you'd start your process using

    ulimit -l 16384
    LD_PRELOAD=/usr/lib/liblockall.so binary arguments..

if using Bash or dash shell.

If running as a dedicated user, most distributions use the
pam_limits.so PAM module to set the resource limits "automatically".
The limits are listed either in the /etc/security/limits.conf file, or
in a file in the /etc/security/limits.d/ subdirectory, using this
format; the memlock item specifies the amount of memory each process
can lock, in units of 1024 bytes. So, if your service runs as user
mydev, and you wish to allow the user to lock up to 16 megabytes =
16384*1024 bytes per process, then add the line

    mydev - memlock 16384

into /etc/security/limits.conf or /etc/security/limits.d/mydev.conf,
whichever your Linux distribution prefers/suggests.

Prior to PAM, shadow-utils were used to control the resource limits.
The memlock resource limit is specified in units of 1024 bytes; a limit
of 16 megabytes would be set using M16384. So, if using shadow-utils
instead of PAM, adding the line

    mydev M16384

(followed by whatever other limits you wish to specify) to /etc/limits
should do the trick.

---end-cut---

best regards,

- End forwarded message -
Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
On Thu, Jul 09, 2020 at 12:02:39AM +1000, Zenaan Harkness wrote:
> What I imagine, and surely hope is that this softlimit program works as
> advertised.

It's hardly unique. You could do the same thing with a shell script
wrapper that calls ulimit and then exec's the target program.

> [stuff about buffers]
> Since if it's not written in a totally insane way, cp (and to a lesser
> extent rsync) should work like this already, what we really need to
> limit is that pig with the strange "Linux" name

rsync uses a shitload of memory not because it's copying a file with an
enormous buffer size, but because it's tracking *all* of the file
metadata in the entire hierarchy that you feed to it.

> What has been happening since the dawn of time is that this kernel
> caches _every_ disk page read by a program such as cp [...]

Well, prematurely killing rsync is not going to affect the kernel's
retention or discarding of specific disk cache pages that are more
precious to you.
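Such a "ulimit then exec" wrapper could look something like this (a
sketch only; the function name and the limit value are made up for
illustration):

```shell
# Sketch of a wrapper that sets a resource limit, then exec's the target.
# The function name (limited_run) and example limit are illustrative.
limited_run() {
    kb="$1"; shift
    # Subshell so the lowered limit doesn't stick to the calling shell;
    # ulimit -S -v sets the *soft* virtual-memory limit, in KiB.
    ( ulimit -S -v "$kb" && exec "$@" )
}

# e.g. allow the copy roughly 1 GiB of address space:
# limited_run 1048576 rsync -av /src/ /dst/
```

As Greg notes below, exceeding the limit makes the program fail with an
out-of-memory error rather than teaching it better behaviour.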
Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
On Wed, Jul 08, 2020 at 08:05:23AM -0400, Greg Wooledge wrote:
> On Wed, Jul 08, 2020 at 08:00:38AM -0400, Dan Ritter wrote:
> > Zenaan Harkness wrote:
> > > Seriously, is there a way to stop bulk copies from eternally flushing my
> > > $Desktop's cached pages down the drain minute after blessful minute,
> > > hour after gratitude filled hour?
> >
> > softlimit is packaged in daemontools.
> >
> > NAME
> >     softlimit - runs another program with new resource limits.
>
> I haven't been following this thread because I don't know anything
> about the kernel's "swappiness" settings.
>
> But I'm a little bit confused about the intent here -- setting a
> resource limit on memory is just going to make the program crash
> with an "out of memory" error when it tries to exceed the limit.
> It's not going to discipline a memory hog program into using
> different algorithms.

I also know nothing about swappiness. But this soft limit tool sounds
exceedingly powerful () Hmm. Excuse me ... where was I? Oh yes,
powERRR!

What I imagine, and surely hope is that this softlimit program works as
advertised.

You see, the copy program (normally aliased as `cp`) when copying say
900 Gigabytes, does not, or certainly should not, consume more than say
1 Megabyte of memory as a read buffer. Of course _some_ buffer is
needed, since otherwise each read call reads only one byte, which would
result in a ridiculous number of kernel calls and the performance hit
that doing so entails.

But half a MiB being written to the destination, whilst another half
MiB is being read from the source "should" be more than enough, since,
roughly speaking, those buffers can simply swap over when they're both
ready for another round.

Since if it's not written in a totally insane way, cp (and to a lesser
extent rsync) should work like this already, what we really need to
limit is that pig with the strange "Linux" name (I think it's called
fake lipschtick on an even-toed ungulate family Suidae, I mean kernel).
You see, Linux in its eternal and fruitless desire to make users happy,
gladly tells cp's read side that reading is finished, and also tells
its write side that writing is finished, and so cp races as fast as it
possibly can even if, and especially if, one of the drives is a
different speed than the other (almost always the case) - and the
bigger the difference, the quicker RAM is filled up, and Firefox's tabs
evicted from memory.

What has been happening since the dawn of time is that this kernel
caches _every_ disk page read by a program such as cp (I know, I know,
this is previously unheard of and amazing information I'm leaking) into
something Linux calls the 'page cache' (totally bizarre thing to call
an in-memory cache of disk pages, but hey, what do I know...).

And Linux, ever generous with other people's resources, hands out your
RAM, ballooning the page cache as though the world will end if it does
not do this to the greatest extent possible "because you might refer to
one of those pages in the near future" - and of course you do, when you
write it to the destination disk, at which point, like a pregnant sow
on meth, Linux almost faints in excitement as its prediction that you
would use that page again comes true.

And so now Linux is ecstatically hyping the utility of all these "read
once, write once" pages in the vain hope that evicting all your Firefox
tabs' cached memory pages is The Right Thing To Do ™©®. (And Firefox,
unlike cp and rsync, is helpful enough to tag most of its RAM pages
with "Totally ephemeral dude, feel free to dump to disk or even
completely destroy at any time".) Firefox is super helpful like that :(

And cp and rsync are evidently nowhere near as helpful as Firefox is to
Linux's desire to please the user, and so your desktop experience goes
straight down the drain.
This has been happening since I discovered Linux ~20 years ago - back
then, the symptom was the entire desktop stuttering, the X cursor
jumping with a latency of 10 seconds, if you were lucky.

Alas, every one of the 4000+ kernel developers works for Google or
Amazon ECS, and throughput is the only thing the datacenter needs -
that, and the ability to compile Linux kernels.
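For what it's worth, GNU dd already grew a partial answer to this
complaint: its "nocache" flag makes dd call posix_fadvise(DONTNEED) on
regions it has finished with, so a bulk copy's pages are dropped from
the page cache as it goes instead of evicting the desktop's working
set. A sketch, with throwaway temp files standing in for the real
multi-gigabyte source and destination:

```shell
# Copy a file while telling the kernel not to keep its pages cached:
# iflag=nocache / oflag=nocache make dd advise the kernel to drop the
# copied regions from the page cache (posix_fadvise DONTNEED).
# Temp files stand in for real source/destination paths here.
src=$(mktemp) && dst=$(mktemp)
printf 'bulk data stand-in\n' > "$src"
dd if="$src" of="$dst" bs=1M iflag=nocache oflag=nocache status=none
```

Whether this fully tames the eviction behaviour for a given workload is
something to measure, not assume; but it is the closest thing coreutils
offers to a "don't cache this copy" switch.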
Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
On Wed, Jul 08, 2020 at 08:00:38AM -0400, Dan Ritter wrote:
> Zenaan Harkness wrote:
> > Seriously, is there a way to stop bulk copies from eternally flushing my
> > $Desktop's cached pages down the drain minute after blessful minute,
> > hour after gratitude filled hour?
>
> softlimit is packaged in daemontools.
>
> NAME
>     softlimit - runs another program with new resource limits.

I haven't been following this thread because I don't know anything
about the kernel's "swappiness" settings.

But I'm a little bit confused about the intent here -- setting a
resource limit on memory is just going to make the program crash with
an "out of memory" error when it tries to exceed the limit. It's not
going to discipline a memory hog program into using different
algorithms.
Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
Zenaan Harkness wrote:
>
> Seriously, is there a way to stop bulk copies from eternally flushing my
> $Desktop's cached pages down the drain minute after blessful minute,
> hour after gratitude filled hour?

softlimit is packaged in daemontools.

NAME
    softlimit - runs another program with new resource limits.

SYNOPSIS
    softlimit [ opts ] child

DESCRIPTION
    opts is a series of getopt-style options. child consists of one or
    more arguments. softlimit sets soft resource limits as specified by
    opts. It then runs child.

OPTIONS
    In each of the following opts, n may be =, indicating that the soft
    limit should be set equal to the hard limit.

    opts controlling memory use:

    -m n  Same as -d n -s n -l n -a n.
    -d n  Limit the data segment per process to n bytes.
    -s n  Limit the stack segment per process to n bytes.
    -l n  Limit the locked physical pages per process to n bytes. This
          option has no effect on some operating systems.
    -a n  Limit the total of all segments per process to n bytes. This
          option has no effect on some operating systems.

and so on for other resources to limit.

-dsr-
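A hedged example of putting softlimit to use on the bulk copy (sizes
made up for illustration; daemontools must be installed for softlimit
itself, and a rough shell-builtin equivalent is shown for comparison):

```shell
# Cap rsync at ~200 MB across all segments; it will die with an
# out-of-memory error rather than balloon (illustrative size,
# requires daemontools):
#   softlimit -m 200000000 rsync -av /src/ /dst/

# Rough equivalent using only the shell's ulimit built-in
# (204800 KiB = 200 MiB soft virtual-memory limit):
sh -c 'ulimit -S -v 204800 && exec "$@"' sh echo would-run-rsync-here
```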
Re: minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
Hi.

On Wed, Jul 08, 2020 at 05:51:54PM +1000, Zenaan Harkness wrote:
> What needs to happen ™©®
>
> What is needed is an option somewhere, strictly abided by, where "For
> the following command" only say 1MiB of buffer and Page Cache, in TOTAL,
> is able to be used by that command ever, until its completion.

This works for me in stable, it's systemwide though.

    sysctl -w vm.dirty_bytes=16777216
    sysctl -w vm.dirty_background_bytes=16777216

Reco
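To make Reco's settings survive a reboot, the same values can go in a
sysctl drop-in (the file name below is illustrative):

```
# /etc/sysctl.d/90-dirty-bytes.conf  (file name illustrative)
# Cap dirty page-cache data at 16 MiB before writeback must kick in.
vm.dirty_bytes = 16777216
vm.dirty_background_bytes = 16777216
```

Applied at boot, or immediately with `sysctl --system`.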
minimizing buffer size (i.e. page cache) for bulk copy/rsync -- Re: Swappiness in Buster
On Wed, Jul 08, 2020 at 08:26:18AM +0200, Martin Reissner wrote:
> For the sake of completeness, I seem to have solved it after some more
> research and it turned out to be systemd, as the following bug that was
> reported for CentOS 7 seems to apply to Debian Buster as well:
>
> https://github.com/systemd/systemd/issues/9276
>
> Luckily the workaround mentioned in:
>
> https://github.com/systemd/systemd/issues/9276#issuecomment-442514543
>
> applies here as well and by setting this to the desired swappiness value
> and rebooting the system so far it seems to work as before and swapping
> out is only done if it can't be avoided.

Your thread brings to mind an "$AGE_OF_LINUX_KERNEL + 1 year" years old
Linux kernel bona fydee, bulk copy bug bear:

Doing a bulk copy e.g. using cp or rsync, to copy GiBs far greater than
available RAM from one disk to another, results in ...

Total Annihilation of Desktop Latency

This happens to this day even on the latest Linux kernels on Sid such
as 5.7.0-1-amd64 to pick an entirely random example... and has been
literally bugging me since the beginning of my Debian journey ~20 years
ago.

Note that in this extremely humble user's experience, even

    ionice --class Idle ...

is not even able to quell the Memory Beast's all-consuming hunger games
for all possible RAM and then some, in its insatiable appetite for all
the page cache in...

The Entire Universe

What needs to happen ™©®

What is needed is an option somewhere, strictly abided by, where "For
the following command" only say 1MiB of buffer and Page Cache, in
TOTAL, is able to be used by that command ever, until its completion.
In this way my 16GiBs of RAM will __finally__ be left to its rightful
owners Firefox, mutt, and my XOrg's cursor, since to me as an extremely
humble desktop user, these are the only things in the world that I ever
care about, and I esPECIALLY want to NOT care that a 1,700 GiB movie
disk backup takes an extra 5 or 10% longer than the NINE hours it
already takes across its USB 2 connection!

Seriously, is there a way to stop bulk copies from eternally flushing
my $Desktop's cached pages down the drain minute after blessful minute,
hour after gratitude filled hour?

The heavens will truly be pleased ..
Re: Swappiness in Buster
For the sake of completeness, I seem to have solved it after some more
research and it turned out to be systemd, as the following bug that was
reported for CentOS 7 seems to apply to Debian Buster as well:

https://github.com/systemd/systemd/issues/9276

Luckily the workaround mentioned in:

https://github.com/systemd/systemd/issues/9276#issuecomment-442514543

applies here as well; by setting this to the desired swappiness value
and rebooting the system, so far it seems to work as before and
swapping out is only done if it can't be avoided.

Martin
Re: Swappiness in Buster
On 06/07/2020 23:27, deloptes wrote:
> May be look deeper in documentation - I recall asking few years ago and was
> answered that now it would cache whatever it can and will free on demand.
> swap is done only if memory is really insufficient.
>
> I don't recall when or where I asked/read this

I probably wouldn't understand much delving deeper into the
documentation or even code^^

Using up all the available RAM for disk cache and freeing cache when
memory is needed is what Linux did for as long as I can remember and
still does. Swapping being only done when there is no memory available
is exactly what the "swappiness" parameter did when set to "1" in
Stretch, but this doesn't seem to work in Buster anymore as it swaps
out even when there is memory available for the disk cache.
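One way to see *what* actually got swapped out (process pages, as
opposed to the kernel merely shrinking cache) is to sum the per-process
VmSwap fields from /proc; a quick sketch:

```shell
# Total swap charged to processes, from VmSwap in /proc/<pid>/status.
# (cat 2>/dev/null: some processes may exit while we read /proc.)
cat /proc/[0-9]*/status 2>/dev/null |
    awk '/^VmSwap:/ { kb += $2 } END { print kb+0 " kB swapped by processes" }'
```

If this stays near zero while free reports swap in use, the swapped
pages likely belong to tmpfs/shmem rather than running processes'
anonymous memory.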
Re: Swappiness in Buster
Martin Reissner wrote:
> Yeah, only talking about server and mostly database applications. I
> usually set it to 1, but even tried 0 which disabled swap completely on
> Stretch but on Buster it didn't make a difference at all, the setting
> seems to be ignored while using default swappiness (60?) since Buster.
>
> Following output is from a mariadb server that didn't swap at all
> running Stretch.
>
> root@server:~# cat /proc/sys/vm/swappiness
> 1
> root@server:~# free -tm
>               total        used        free      shared  buff/cache   available
> Mem:          48284        9763         707          59       37814       38147
> Swap:         11443        5020        6423
> Total:        59728       14783        7131

May be look deeper in documentation - I recall asking few years ago and
was answered that now it would cache whatever it can and will free on
demand. swap is done only if memory is really insufficient.

I don't recall when or where I asked/read this
Re: Swappiness in Buster
On 06/07/2020 18:11, songbird wrote:
> Martin Reissner wrote:
>
>> Hello,
>>
>> ever since upgrading machines to Buster the vm.swappiness sysctl
>> parameter doesn't seem to do anything anymore and regardless of how I
>> set it via sysctl or directly in /proc the system behaves as if it had
>> a pretty high swappiness and thus is swapping out quite a bit under
>> load, using the memory mostly for buff/cache.
>>
>> To make this clear, I'm seeing no performance degradation and there is
>> not much swapping in done, I'm merely wondering why this behaviour has
>> changed and if there's a way to make it work as before again. Also I'd
>> be happy to know if anybody else is experiencing this as Buster has been
>> out for a while now and I found some information on this, i.e. other
>> users reporting the same issue, but not really a lot.
>>
>> Now if you're asking why I would configure swap if I don't want it to be
>> used, the answer is monitoring and failsafe. I usually set the
>> swappiness to 1 so as to only swap if the box runs out of memory and
>> then trigger an alarm to let us know something is up. This is much more
>> convenient than the oomkiller striking down a mysqld process just
>> because it was allowed to use a bit too much memory.
>
> i haven't noticed any issues with my desktop system set
> to 5, but i rarely if ever get much swapping going on
> anyways. i'm not using buster but using testing.
>
> if you are running a database oriented server i think
> the recommendation would be to set it to 0 or 1.
>
> songbird

Yeah, only talking about server and mostly database applications. I
usually set it to 1, but even tried 0 which disabled swap completely on
Stretch, but on Buster it didn't make a difference at all; the setting
seems to be ignored in favour of the default swappiness (60?) since
Buster.

Following output is from a mariadb server that didn't swap at all
running Stretch.
root@server:~# cat /proc/sys/vm/swappiness
1
root@server:~# free -tm
              total        used        free      shared  buff/cache   available
Mem:          48284        9763         707          59       37814       38147
Swap:         11443        5020        6423
Total:        59728       14783        7131
Re: Swappiness in Buster
Martin Reissner wrote:
> Hello,
>
> ever since upgrading machines to Buster the vm.swappiness sysctl
> parameter doesn't seem to do anything anymore and regardless of how I
> set it via sysctl or directly in /proc the system behaves as if it had
> a pretty high swappiness and thus is swapping out quite a bit under
> load, using the memory mostly for buff/cache.
>
> To make this clear, I'm seeing no performance degradation and there is
> not much swapping in done, I'm merely wondering why this behaviour has
> changed and if there's a way to make it work as before again. Also I'd
> be happy to know if anybody else is experiencing this as Buster has
> been out for a while now and I found some information on this, i.e.
> other users reporting the same issue, but not really a lot.
>
> Now if you're asking why I would configure swap if I don't want it to
> be used, the answer is monitoring and failsafe. I usually set the
> swappiness to 1 so as to only swap if the box runs out of memory and
> then trigger an alarm to let us know something is up. This is much
> more convenient than the oomkiller striking down a mysqld process just
> because it was allowed to use a bit too much memory.

i haven't noticed any issues with my desktop system set
to 5, but i rarely if ever get much swapping going on
anyways. i'm not using buster but using testing.

if you are running a database oriented server i think
the recommendation would be to set it to 0 or 1.

songbird