Launchpad has imported 38 comments from the remote bug at https://bugzilla.kernel.org/show_bug.cgi?id=196729.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2017-08-22T11:17:08+00:00 netwiz wrote:

I have 10Gb of RAM in this system and run Fedora 26. If I launch Cities: Skylines with no swap space, things run well performance wise until I get an OOM - and it all dies - which is expected.

When I turn on swap to /dev/sda2 which resides on an SSD, I get complete system freezes while swap is being accessed.

The first swap was after loading a saved game, then launching kmail in the background. This caused ~500Mb to be swapped to /dev/sda2 on an SSD. The system froze for about 8 minutes - barely being able to move the mouse. The HDD LED was on constantly during the entire time.

To hopefully rule out the above glibc issue, I started the game via jemalloc - but experienced even more severe freezes while swapping. I gave up waiting after 13 minutes of non-responsiveness - not even being able to move the mouse properly.

During these hangs, I could type into a Konsole window, and some of the typing took 3+ minutes to display on the screen (yay for buffers?).

I have tested this with both the default vm.swappiness values, as well as the following:
vm.swappiness = 1
vm.min_free_kbytes = 32768
vm.vfs_cache_pressure = 60

I noticed that when I do eventually get screen updates, all 8 cpus (4 cores / 2 threads) show 100% CPU usage - and kswapd is right up there in the process list for CPU usage. Sadly I haven't been able to capture this information fully yet due to said unresponsiveness.
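A minimal sketch, assuming a Linux procfs layout, of how the three tunables listed above could be read back to confirm what is actually in effect on a test machine. The function name `show_vm_tunables` and its optional directory argument are hypothetical and only exist to make the sketch testable.

```shell
#!/bin/sh
# Print the three VM tunables mentioned in the report.
# show_vm_tunables and its directory argument are assumptions for this sketch;
# on a real system the values live under /proc/sys/vm.
show_vm_tunables() {
    dir="${1:-/proc/sys/vm}"
    for name in swappiness min_free_kbytes vfs_cache_pressure; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'vm.%s = (unreadable)\n' "$name"
        fi
    done
}

# Usage: show_vm_tunables
```

Recording this output next to each vmstat log makes it easier to tell which settings a given run used.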
(more to come in comments & attachments)

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/0

------------------------------------------------------------------------
On 2017-08-22T11:18:57+00:00 netwiz wrote:

First - using kernel 4.10.17 - which does not show any issues in swapping:

I tried doing: swapoff /dev/sda2

Attached output as vmstat-4.10.17-10Gb-noswap.log

18:27:00 - Launched Cities: Skylines
18:27:30 - Started loading the saved game
18:28:25 - About this time the game started doing its thing. Started scrolling around.
18:28:47 - System stopped responding and then the C:S was killed by the OOM handler

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/1

------------------------------------------------------------------------
On 2017-08-22T11:19:41+00:00 netwiz wrote:

Created attachment 258045
vmstat-4.10.17-10Gb-noswap.log (OK - OOM running)

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/2

------------------------------------------------------------------------
On 2017-08-22T11:21:32+00:00 netwiz wrote:

Created attachment 258047
vmstat-4.10.17-10Gb.log (OK with swapping)

Second test, same kernel with swap turned on:

I have attached the vmstat output that goes with the following timestamps for system utilisation:

15:32:30 - Launch Skylines
15:33:00 - Load the saved game
15:34:11 - Saved game loaded ok.
15:35:00 - Launch Chrome.
15:35:36 - Chrome launched - System responding ok.
15:36:00 - Browsing a few web sites
15:36:50 - Exit Chrome
15:37:30 - Exit Cities: Skylines.
You'll note that there are very few missing vmstat lines - however I did notice the following missing:
15:35:10
15:35:12
15:35:15
15:35:20
15:35:26
15:35:29
15:35:30

Attachment is vmstat-4.10.17-10Gb.log

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/3

------------------------------------------------------------------------
On 2017-08-22T11:24:02+00:00 netwiz wrote:

Created attachment 258049
vmstat-20Gb.log (OK - all in RAM)

Now using kernel 4.11.x (same happens with 4.12.x) - and testing with 20Gb of RAM in the system - meaning no swapping.

Attached as: vmstat-20Gb.log

Timestamps of events:
21:57 - launch the game from within Steam.
21:58:00 - Load the saved game.
21:58:48 - Saved game is loaded and I'm scrolling around in the map.
22:00:00 - Hit the quit to desktop button.
22:00:31 - Am back to desktop with all RAM free again.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/4

------------------------------------------------------------------------
On 2017-08-22T11:25:59+00:00 netwiz wrote:

Created attachment 258051
vmstat-10Gb.log (NOT OK - System Unresponsive)

I now drop back to 10Gb of RAM to test the swapping under 4.11.x kernel.

Log attached as vmstat-10Gb.log

Timestamps:
22:10:00 - Launched the game from within Steam
22:11:00 - Load the same saved game from the previous log
22:12:01 - Saved game is loaded and I can scroll around. Noted a slight pause when swapd went to 256 - but otherwise all is well.
22:13:00 - Launched Google Chrome browser to make the system swap.

After this point, the whole system went to hell. You'll note many missing vmstat entries up until around 22:22 when I managed to exit from the game back to desktop via the normal means (and not getting annoyed and doing a pkill from tty2).

As such, the system went nuts for ~9 minutes until I was able to exit the game to stop things going nuts.
I note that with 20Gb RAM - as the system never touches swap, I can still play the game, browse the web with the Chrome browser, read / write email, and even watch a DVB-T broadcast in VLC without having any more than a minor pause in the game for less than a second.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/5

------------------------------------------------------------------------
On 2017-08-22T11:29:50+00:00 netwiz wrote:

So overall, this seems to indicate a regression between kernel 4.10.x (I'm pretty sure I tested all ok with 4.10.15?) and the newer 4.11 and 4.12 builds.

I made contact with Rik van Riel and Ying Huang (whom I will attempt to add to this as a CC for comment?) - they don't believe it is a swapping issue - however Rik seems to believe that:

> > There is ZERO swap space in use.
> >
> > In other words, it is not actually swapping,
> > but thrashing through the page cache.

You may want to email the people who worked on page cache replacement stuff recently, and the linux-mm mailing list as well.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/6

------------------------------------------------------------------------
On 2017-08-22T22:56:22+00:00 akpm wrote:

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Tue, 22 Aug 2017 11:17:08 +0000 bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=196729
>
>             Bug ID: 196729
>            Summary: System becomes unresponsive when swapping - Regression
>                     since 4.10.x
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.11.x / 4.12.x
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: a...@linux-foundation.org
>           Reporter: net...@crc.id.au
>         Regression: No

So it's "Regression: yes". More info at the bugzilla link.

> I have 10Gb of RAM in this system and run Fedora 26.
If I launch Cities: > Skylines with no swap space, things run well performance wise until I get an > OOM - and it all dies - which is expected. > > When I turn on swap to /dev/sda2 which resides on an SSD, I get complete > system freezes while swap is being accessed. > > The first swap was after loading a saved game, then launching kmail in the > background. This caused ~500Mb to be swapped to /dev/sda2 on an SSD. The > system froze for about 8 minutes - barely being able to move the mouse. The > HDD LED was on constantly during the entire time. > > To hopefully rule out the above glibc issue, I started the game via jemalloc > - > but experienced even more severe freezes while swapping. I gave up waiting > after 13 minutes of non-responsiveness - not even being able to move the > mouse > properly. > > During these hangs, I could typed into a Konsole window, and some of the > typing took 3+ minutes to display on the screen (yay for buffers?). > > I have tested this with both the default vm.swappiness values, as well as the > following: > vm.swappiness = 1 > vm.min_free_kbytes = 32768 > vm.vfs_cache_pressure = 60 > > I noticed that when I do eventually get screen updates, all 8 cpus (4 cores / > 2 threads) show 100% CPU usage - and kswapd is right up there in the process > list for CPU usage. Sadly I haven't been able to capture this information > fully yet due to said unresponsiveness. > > (more to come in comments & attachments) > > -- > You are receiving this mail because: > You are the assignee for the bug. Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/7 ------------------------------------------------------------------------ On 2017-08-23T13:54:38+00:00 mhocko wrote: Created attachment 258067 read_vmstat.c On Tue 22-08-17 15:55:30, Andrew Morton wrote: > > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). 
>
> On Tue, 22 Aug 2017 11:17:08 +0000 bugzilla-dae...@bugzilla.kernel.org wrote:
[...]
> Sadly I haven't been able to capture this information
> > fully yet due to said unresponsiveness.

Please try to collect /proc/vmstat in the background and provide the collected data. Something like

while true
do
	cat /proc/vmstat > vmstat.$(date +%s)
	sleep 1s
done

If the system turns out so busy that it won't be able to fork a process or write the output (which you will see by checking timestamps of files and looking for holes) then you can try the attached proggy
./read_vmstat output_file timeout output_size

Note you might need to increase the mlock rlimit to lock everything into memory.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/8

------------------------------------------------------------------------
On 2017-08-23T14:41:38+00:00 netwiz wrote:

Created attachment 258069
8Gb-noswap.tar.gz

On Wednesday, 23 August 2017 11:38:48 PM AEST Michal Hocko wrote:
> On Tue 22-08-17 15:55:30, Andrew Morton wrote:
> > (switched to email. Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
>
> > On Tue, 22 Aug 2017 11:17:08 +0000 bugzilla-dae...@bugzilla.kernel.org wrote:
> [...]
> > > Sadly I haven't been able to capture this information
> > >
> > fully yet due to said unresponsiveness.
>
> Please try to collect /proc/vmstat in the bacground and provide the
> collected data. Something like
>
> while true
> do
> cp /proc/vmstat > vmstat.$(date +%s)
> sleep 1s
> done
>
> If the system turns out so busy that it won't be able to fork a process
> or write the output (which you will see by checking timestamps of files
> and looking for holes) then you can try the attached proggy
> ./read_vmstat output_file timeout output_size
>
> Note you might need to increase the mlock rlimit to lock everything into
> memory.
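A minimal sketch of the collection loop requested above, plus a helper for the "looking for holes" step. The function names `collect_vmstat` and `find_vmstat_gaps`, their arguments, and the 2-second default threshold are hypothetical choices for this sketch, not part of the attached read_vmstat program. Each snapshot is written with `cat` and a redirect, because a `cp` given only a source operand and a redirect would not copy anything.

```shell
#!/bin/sh
# Snapshot a vmstat-style file once per second into vmstat.<epoch> files.
# collect_vmstat and its arguments are assumptions made for this sketch.
collect_vmstat() {
    src="${1:-/proc/vmstat}"   # file to snapshot
    outdir="${2:-.}"           # directory that receives the snapshots
    count="${3:-0}"            # number of snapshots; 0 means run until interrupted
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        cat "$src" > "$outdir/vmstat.$(date +%s)"
        i=$((i + 1))
        sleep 1
    done
}

# List gaps longer than max_gap seconds between consecutive snapshots,
# based on the epoch suffix of each vmstat.<epoch> file name.
find_vmstat_gaps() {
    dir="${1:-.}"
    max_gap="${2:-2}"
    ls "$dir" | sed -n 's/^vmstat\.\([0-9][0-9]*\)$/\1/p' | sort -n |
    awk -v max="$max_gap" '
        NR > 1 && $1 - prev > max { printf "%s -> %s (%d s)\n", prev, $1, $1 - prev }
        { prev = $1 }'
}

# Usage: collect_vmstat /proc/vmstat ./logs &   then later: find_vmstat_gaps ./logs
```

The "mlock rlimit" mentioned above is the per-process limit on locked memory (RLIMIT_MEMLOCK); in most shells `ulimit -l` prints its current value.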
Thanks Michal,

I have upgraded PCs since I initially put together this data - however I was able to get strange behaviour by pulling out an 8Gb RAM stick in my new system - leaving it with only 8Gb of RAM.

All these tests are performed with Fedora 26 and kernel 4.12.8-300.fc26.x86_64

I have attached 3 files with output.

8Gb-noswap.tar.gz contains the output of /proc/vmstat running on 8Gb of RAM with no swap. Under this scenario, I was expecting the OOM reaper to just kill the game when memory allocated became too high for the amount of physical RAM. Interestingly, you'll notice a massive hang in the output before the game is terminated. I didn't see this before.

8Gb-swap-on-file.tar.gz contains the output of /proc/vmstat still with 8Gb of RAM - but creating a file with swap on the PCIe SSD /swapfile with size 8Gb via:
# dd if=/dev/zero of=/swapfile bs=1G count=8
# mkswap /swapfile
# swapon /swapfile

Some times (all in UTC+10):
23:58:30 - Start loading the saved game
23:59:38 - Load ok, all running fine
00:00:15 - Load Chrome
00:01:00 - Quit the game

The game seemed to run ok with no real issue - and a lot was swapped to the swap file. I'm wondering if it was purely the speed of the PCIe SSD that caused this appearance - as the creation of the file with dd completed at ~1.4GB/sec.

8Gb-swap-on-ssd.tar.gz contains the output after adding a 32Gb SATA based SSD to the system and using the entire block device as swap via:
# mkswap -f /dev/sda
# swapon /dev/sda

There are many pauses and unresponsiveness issues while this was loading - however we eventually got there.
Some timings (all in UTC+10 again):
00:06:33 - Load the saved game
00:11:22 - Saved game loaded - somewhat responsive
00:12:00 - Load Chrome
00:13:07 - Quit the game + chrome

For the sake of information, the following is a speed test on the SSD in question:
# dd if=/dev/zero of=/dev/sda bs=1M count=8192 conv=fsync
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 44.923 s, 191 MB/s

# dd if=/dev/sda of=/dev/null bs=1M count=8192 conv=fsync
dd: fsync failed for '/dev/null': Invalid argument
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 30.7414 s, 279 MB/s

Running the game on the exact same system with 16Gb of RAM and no swap works perfectly - even with multitasking - as we never end up filling physical RAM.

As there is some data missing though, should I still attempt to compile + run the program provided? I'm not quite clear on the mlock rlimit mention - I haven't really had to debug anything like this before.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/9

------------------------------------------------------------------------
On 2017-08-23T14:41:39+00:00 netwiz wrote:

Created attachment 258071
8Gb-swap-on-file.tar.gz

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/10

------------------------------------------------------------------------
On 2017-08-23T14:41:39+00:00 netwiz wrote:

Created attachment 258073
8Gb-swap-on-ssd.tar.gz

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/11

------------------------------------------------------------------------
On 2017-08-23T14:41:39+00:00 netwiz wrote:

Created attachment 258075
signature.asc

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/12

------------------------------------------------------------------------
On 2017-08-24T12:41:44+00:00 mhocko wrote:

On Thu 24-08-17 00:30:40, Steven Haigh wrote:
> On Wednesday, 23 August 2017
11:38:48 PM AEST Michal Hocko wrote:
> > On Tue 22-08-17 15:55:30, Andrew Morton wrote:
> > > (switched to email. Please respond via emailed reply-to-all, not via the
> > > bugzilla web interface).
> >
> > > On Tue, 22 Aug 2017 11:17:08 +0000 bugzilla-dae...@bugzilla.kernel.org
> wrote:
> > [...]
> > > > > Sadly I haven't been able to capture this information
> > > >
> > > fully yet due to said unresponsiveness.
> >
> > Please try to collect /proc/vmstat in the bacground and provide the
> > collected data. Something like
> >
> > while true
> > do
> > cp /proc/vmstat > vmstat.$(date +%s)
> > sleep 1s
> > done
> >
> > If the system turns out so busy that it won't be able to fork a process
> > or write the output (which you will see by checking timestamps of files
> > and looking for holes) then you can try the attached proggy
> > ./read_vmstat output_file timeout output_size
> >
> > Note you might need to increase the mlock rlimit to lock everything into
> > memory.
>
> Thanks Michal,
>
> I have upgraded PCs since I initially put together this data - however I was
> able to get strange behaviour by pulling out an 8Gb RAM stick in my new
> system
> - leaving it with only 8Gb of RAM.
>
> All these tests are performed with Fedora 26 and kernel
> 4.12.8-300.fc26.x86_64
>
> I have attached 3 files with output.
>
> 8Gb-noswap.tar.gz contains the output of /proc/vmstat running on 8Gb of RAM
> with no swap. Under this scenario, I was expecting the OOM reaper to just
> kill
> the game when memory allocated became too high for the amount of physical
> RAM.
> Interestingly, you'll notice a massive hang in the output before the game is
> terminated. I didn't see this before.

I have checked few gaps. E.g. vmstat.1503496391 vmstat.1503496451 which is one minute. The most notable thing is that there are only very few pagecache pages
                        [base]  [diff]
nr_active_file          1641    3345
nr_inactive_file        1630    4787

So there is not much to reclaim without swap.
The more important thing is that we keep reclaiming and refaulting that memory

workingset_activate     5905591         1616391
workingset_refault      33412538        10302135
pgactivate              42279686        13219593
pgdeactivate            48175757        14833350

pgscan_kswapd           379431778       126407849
pgsteal_kswapd          49751559        13322930

so we are effectively thrashing over the very small amount of reclaimable memory. This is something that we cannot detect right now. It is even questionable whether the OOM killer would be an appropriate action. Your system has recovered and then it is always hard to decide whether a disruptive action is more appropriate. One minute of unresponsiveness is certainly annoying though. Your system is obviously underprovisioned for the load you want to run.

It is quite interesting to see that we do not really have too many direct reclaimers during this time period
allocstall_normal       30      1
allocstall_movable      490     88
pgscan_direct_throttle  0       0
pgsteal_direct          24434   4069
pgscan_direct           38678   5868

> 8Gb-swap-on-file.tar.gz contains the output of /proc/vmstat still with 8Gb of
> RAM - but creating a file with swap on the PCIe SSD /swapfile with size 8Gb
> via:
> # dd if=/dev/zero of=/swapfile bs=1G count=8
> # mkswap /swapfile
> # swapon /swapfile
>
> Some times (all in UTC+10):
> 23:58:30 - Start loading the saved game
> 23:59:38 - Load ok, all running fine
> 00:00:15 - Load Chrome
> 00:01:00 - Quit the game
>
> The game seemed to run ok with no real issue - and a lot was swapped to the
> swap file. I'm wondering if it was purely the speed of the PCIe SSD that
> caused this appearance - as the creation of the file with dd completed at
> ~1.4GB/sec.

Swap IO tends to be really scattered and the IO performance is not really great even on fast storage AFAIK.

Anyway your original report sounded like a regression. Were you able to run the _same_ workload on an older kernel without these issues?
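A minimal sketch of how the base/diff comparison above could be reproduced from two of the collected snapshot files. It assumes the second column in the tables is the later value minus the earlier one; the function names `vmstat_diff` and `kswapd_efficiency` are hypothetical, and "efficiency" here simply means pages stolen divided by pages scanned by kswapd over the interval.

```shell
#!/bin/sh
# Print "name base diff" for selected counters from two vmstat snapshots.
# vmstat_diff and its argument order (older file first) are assumptions.
vmstat_diff() {
    old="$1"; new="$2"; shift 2
    # Remaining arguments are counter names; default to the reclaim counters discussed above.
    [ "$#" -gt 0 ] || set -- pgscan_kswapd pgsteal_kswapd workingset_refault workingset_activate
    for name in "$@"; do
        awk -v n="$name" '
            FNR == NR { if ($1 == n) base = $2; next }   # first file: remember the base value
            $1 == n   { printf "%s %s %.0f\n", n, base, $2 - base }
        ' "$old" "$new"
    done
}

# Fraction of pages scanned by kswapd that were actually reclaimed in the interval.
kswapd_efficiency() {
    vmstat_diff "$1" "$2" pgscan_kswapd pgsteal_kswapd |
    awk '
        $1 == "pgscan_kswapd"  { scan = $3 }
        $1 == "pgsteal_kswapd" { steal = $3 }
        END { if (scan > 0) printf "%.3f\n", steal / scan; else print "n/a" }'
}

# Usage: vmstat_diff vmstat.1503496391 vmstat.1503496451
```

Under the same assumption, the figures quoted above (13322930 pages stolen against 126407849 scanned) correspond to roughly one page reclaimed for every ten scanned, which fits the description of reclaim cycling over a very small pool of reclaimable memory.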
Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/13 ------------------------------------------------------------------------ On 2017-08-24T14:20:08+00:00 netwiz wrote: Created attachment 258079 signature.asc On Thursday, 24 August 2017 10:41:39 PM AEST Michal Hocko wrote: > On Thu 24-08-17 00:30:40, Steven Haigh wrote: > > On Wednesday, 23 August 2017 11:38:48 PM AEST Michal Hocko wrote: > > > On Tue 22-08-17 15:55:30, Andrew Morton wrote: > > > > (switched to email. Please respond via emailed reply-to-all, not via > > > > the > > > > bugzilla web interface). > > > > > > > > On Tue, 22 Aug 2017 11:17:08 +0000 bugzilla-dae...@bugzilla.kernel.org > > > > wrote: > > > [...] > > > > > > > Sadly I haven't been able to capture this information > > > > > > > > > fully yet due to said unresponsiveness. > > > > > > Please try to collect /proc/vmstat in the bacground and provide the > > > collected data. Something like > > > > > > while true > > > do > > > > > > cp /proc/vmstat > vmstat.$(date +%s) > > > sleep 1s > > > > > > done > > > > > > If the system turns out so busy that it won't be able to fork a process > > > or write the output (which you will see by checking timestamps of files > > > and looking for holes) then you can try the attached proggy > > > ./read_vmstat output_file timeout output_size > > > > > > Note you might need to increase the mlock rlimit to lock everything into > > > memory. > > > > Thanks Michal, > > > > I have upgraded PCs since I initially put together this data - however I > > was able to get strange behaviour by pulling out an 8Gb RAM stick in my > > new system - leaving it with only 8Gb of RAM. > > > > All these tests are performed with Fedora 26 and kernel > > 4.12.8-300.fc26.x86_64 > > > > I have attached 3 files with output. > > > > 8Gb-noswap.tar.gz contains the output of /proc/vmstat running on 8Gb of > > RAM > > with no swap. 
Under this scenario, I was expecting the OOM reaper to just > > kill the game when memory allocated became too high for the amount of > > physical RAM. Interestingly, you'll notice a massive hang in the output > > before the game is terminated. I didn't see this before. > > I have checked few gaps. E.g. vmstat.1503496391 vmstat.1503496451 which > is one minute. The most notable thing is that there are only very few > pagecache pages > [base] [diff] > nr_active_file 1641 3345 > nr_inactive_file 1630 4787 > > So there is not much to reclaim without swap. The more important thing > is that we keep reclaiming and refaulting that memory > > workingset_activate 5905591 1616391 > workingset_refault 33412538 10302135 > pgactivate 42279686 13219593 > pgdeactivate 48175757 14833350 > > pgscan_kswapd 379431778 126407849 > pgsteal_kswapd 49751559 13322930 > > so we are effectivelly trashing over the very small amount of > reclaimable memory. This is something that we cannot detect right now. > It is even questionable whether the OOM killer would be an appropriate > action. Your system has recovered and then it is always hard to decide > whether a disruptive action is more appropriate. One minute of > unresponsiveness is certainly annoying though. Your system is obviously > under provisioned to load you want to run obviously. 
>
> It is quite interesting to see that we do not really have too many
> direct reclaimers during this time period
> allocstall_normal 30 1
> allocstall_movable 490 88
> pgscan_direct_throttle 0 0
> pgsteal_direct 24434 4069
> pgscan_direct 38678 5868

Yes, I understand that the system is really not suitable - however I believe the test is useful - even from an informational point of view :)

> > 8Gb-swap-on-file.tar.gz contains the output of /proc/vmstat still with 8Gb
> > of RAM - but creating a file with swap on the PCIe SSD /swapfile with
> > size 8Gb>
> > via:
> > # dd if=/dev/zero of=/swapfile bs=1G count=8
> > # mkswap /swapfile
> > # swapon /swapfile
> >
> > Some times (all in UTC+10):
> > 23:58:30 - Start loading the saved game
> > 23:59:38 - Load ok, all running fine
> > 00:00:15 - Load Chrome
> > 00:01:00 - Quit the game
> >
> > The game seemed to run ok with no real issue - and a lot was swapped to
> > the
> > swap file. I'm wondering if it was purely the speed of the PCIe SSD that
> > caused this appearance - as the creation of the file with dd completed at
> > ~1.4GB/sec.
>
> Swap IO tends to be really scattered and the IO performance is not really
> great even on a fast storage AFAIK.
>
> Anyway your original report sounded like a regression. Were you able to
> run the _same_ workload on an older kernel without these issues?

When I try the same tests with swap on an SSD under kernel 4.10.x (I believe the latest I tried was 4.10.25?) - then swap using the SSD did not cause any issues or periods of system unresponsiveness.

The file attached in the original bug report "vmstat-4.10.17-10Gb.log" was taken on my old system with 10Gb of RAM - and there were no significant pauses while swapping.

I do find it interesting that the newer '8Gb-swap-on-file.tar.gz' does not show any issues.
I wonder if it would be helpful to attempt the same using a file on the SSD that was a swap disk in the '8Gb-swap-on-ssd.tar.gz' so we have a constant device - but with a file on the SSD instead of the entire block device. That would at least expose any issues on the same device in file vs block mode? Or maybe even if there's a difference just having the file on a much (much!) faster drive?

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/14

------------------------------------------------------------------------
On 2017-08-28T07:14:14+00:00 ying.huang wrote:

Compared with

a) vmstat-4.10.17-10Gb.log (OK with swapping)

and

b) vmstat-10Gb.log (NOT OK - System Unresponsive)

The si/so is low in both files. And si/so in a) is higher than in b), so the problem may be that we swap less than before? The bi is kept high in b). I guess we encountered thrashing for file pages.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/15

------------------------------------------------------------------------
On 2017-11-27T02:37:31+00:00 netwiz wrote:

To give this a bit of a nudge, I've been seeing reports of others having similar issues.

See:
https://www.reddit.com/r/Fedora/comments/7f0dht/system_freezes_for_45min_in_lowmemory_conditions/

Also lodged on the RH BZ a while ago:
https://bugzilla.redhat.com/show_bug.cgi?id=1472336

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/16

------------------------------------------------------------------------
On 2018-04-02T01:47:19+00:00 code wrote:

Hi,

I’ve experienced what I believe is the same problem. The problem has gone away completely for me after I bumped vm.min_free_kbytes way up to 393216.

As soon as the system ran out of physical memory, the system would freeze for at least 2 minutes and often up to 45 minutes. GNOME desktop would stop.
I could move the mouse cursor, and ping the system from a remote computer; but not connect over SSH or do anything other than wave the mouse about. The system clock on the top of GNOME would stop updating for 45 minutes. (Maaaybe it would move forward 1 minute after 20 minutes and still be 19 minutes out of sync.)

I've been having this issue for years on multiple different computer configurations with 8+ GiB of memory and large SWAP partitions. I never saw more than maybe 5 MiB in use on the SWAP partition. After tuning min_free_kbytes, the SWAP partition is now being used properly and the system only does the occasional (and expected) 1 second stutter when running low on physical memory.

I also run Fedora and have kept up with the latest stable release.

Aside 1: The issue would persist with SWAPOFF, just like Steven Haigh describes.

Aside 2: The problem happened much more frequently when I used BtrFS. After switching to XFS, this happens less frequently (weekly instead of daily).

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/17

------------------------------------------------------------------------
On 2018-05-07T00:49:26+00:00 ultra10e wrote:

Please refer also to this bug report. It is the same problem and has existed for eleven (11!) years if one can believe that.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/159356

I personally experience this on two 4GB laptops running live versions (no swap) of Debian 8.6 - 9.4, Fedora 25 - 28, Ubuntu, with a myriad of shells; Gnome, Mate, Cinnamon, KDE, etc. (One laptop I expanded to 8GB now but it doesn't matter - it just takes a little longer to freeze the system.)
Various browsers from Firefox52 to current Developers 60, Chrome and Chromium.

Certain combos eat up memory faster (Gnome has a memory leak for example in which it consumes memory for every window drawn and NEVER relinquishes that memory, without a restart to gnome-shell), new Firefox or Chrom* vs older ESR versions (54 and under) of FF eat up memory much more quickly.

Under the best combo/circumstances, I can open up 25-30 FF tabs before the system SUDDENLY SEIZES (observe the "USB live stick" light flashing non-stop, as if swapping, even tho no swap on Live versions). If not caught within literal seconds to Ctrl-Alt-F5 to an opened root console where I can kill the FF ps and save this "live" session, the computer is entirely unresponsive and requires power cycling.

In rare instances, some 10's of minutes later or even hours (4, 8, 12) later, the system *might* finally respond to the request to drop to the console. Keystrokes to issue the kill command can take minutes per key, but if successful, I've seen the load reported after the kill as high as 75. Truly amazing.

It's difficult to fathom this critical a bug in memory management has gone un-addressed/un-noticed for so long but alas. I can't recall but I've read this behavior ONLY occurs on 64-bit kernels, and is un-reproducible on 32-bit kernels.

Also, on non-"live" installs, with swap configured, one can watch the hard drive light come on and remain solid to the same effect. Power cycle time. I've read from others, that they've determined swap isn't even really being used, so not sure what the "read" thrashing going on is (and it must be read thrashing because on Live versions there's no swap and the USB drive light is steady active also).

I just run Linux to not run Windows. Basic browsing, text document editing, file management, a few cli's and an instant messaging program typically opened simultaneously. Nothing computationally heavy but memory intensive (at least for the web browser) for sure.
STILL - difficult to believe the OS cannot handle this situation with some sort of message, or killing a window/throwing an error about an opened Firefox tab or something - rather, it simply fills up the memory (I watch on gnome-system-status/Resources tab now) to 99% and then it's too late.

I really don't know technically how the Out Of Memory killer works/is supposed to work, but it sure isn't doing anything here.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/18

------------------------------------------------------------------------
On 2018-05-18T01:41:55+00:00 korbin.freedman wrote:

I experience this too. I've tested using kernel versions 4.17 rc8, 4.16.8, 4.12.8, and 4.14 across Manjaro Linux, Ubuntu Linux, Opensuse leap 15 and Fedora 28.

Steps to trigger:
-Open firefox with many tabs, or any other high memory usage program
-Wait a second
-System freezes. Sometimes the only fix is a hard reboot

Other findings:
-I notice really high cpu load averages if the system unfreezes
-If the system is not frozen, it is highly unresponsive on high memory usage when swapping
-Hard drive indicator light stays solidly on when system is frozen (excessive hard disk use)
-The reason the system freezes is because it is swapping

Tested on:
Intel i5 520m with 4gb ram/ 4gb swap (Lenovo t410)
Intel E6400 with 3gb ram/ 3gb swap

This bug is really hard to deal with because it usually requires a hard restart.
Please fix ASAP if possible

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/19

------------------------------------------------------------------------
On 2018-05-18T01:44:50+00:00 korbin.freedman wrote:

My fedora report here: https://bugzilla.redhat.com/show_bug.cgi?id=1577528

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/20

------------------------------------------------------------------------
On 2018-05-18T01:51:18+00:00 korbin.freedman wrote:

If someone gives me some debugging instructions, I'll gladly follow them

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/21

------------------------------------------------------------------------
On 2018-05-19T01:36:26+00:00 korbin.freedman wrote:

https://askubuntu.com/questions/41778/computer-freezing-on-almost-full-ram-possibly-disk-cache-problem

https://unix.stackexchange.com/questions/28175/system-hanging-when-it-runs-out-of-memory

https://bbs.archlinux.org/viewtopic.php?id=231087

Over 5 (Non bug reporting) websites report this bug, with recent posts up to 2018. This memory issue is FATAL to people who rely on linux for gaming/working. Please don't let this issue go unnoticed for 11 more years. It is critical to system function that memory management is good.

The issue actually doesn't only occur when swapping. Swapping however greatly aggravates this bug

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/22

------------------------------------------------------------------------
On 2018-05-19T20:32:29+00:00 ultra10e wrote:

@SlayerProof32 #19

> -The reason the system freezes is because it is swapping

I don't think this is the case, at least not in the traditional sense. If you read my comment above yours, I'm running Live versions of Linux where no swap is configured. I suppose it could be swapping portions of code into and out of regular memory, but there's no hard disk writes.
On my USB sticks, I note the light comes on steady when the system freezes. It's reading (something). I also noted the bug doesn't show up on 32-bit systems (I'd read that previously somewhere). Eleven years. Shocking.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/23

------------------------------------------------------------------------
On 2018-05-19T21:45:20+00:00 korbin.freedman wrote:

@lou #20 I have confirmed that swapping is not the issue and you are correct. On my Linux system, I tried swapoff -a and, to no surprise, it still crashed on high memory usage, and there was still excessive drive usage. Thanks for the insight

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/24

------------------------------------------------------------------------
On 2018-05-19T21:45:55+00:00 korbin.freedman wrote:

https://bugzilla.kernel.org/show_bug.cgi?id=199763 I filed another report.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/25

------------------------------------------------------------------------
On 2018-06-18T17:01:18+00:00 iive wrote:

I'm just a user, haunted by similar issues.

1. If you have zswap compiled and enabled, try to disable it. Zswap uses some RAM in order to compress pages and hide them there, e.g. it reserves 500MB with the idea that it could put 1000MB of swap in there.

2. I think that there is something very broken in "Transparent HugePage Support" or THP, especially in kernel 4.15 and later. A bug in 4.16 memory compaction led to very noticeable swap pressure. The issue has been fixed, but disabling THP was also able to work around it. If you have major issues, try to build a kernel with that option disabled and see if the problem remains as severe.
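
Before rebuilding a kernel, it may help to check which THP mode is currently active. The following is a minimal sketch, not part of the original comment: it assumes the kernel exposes the usual sysfs file /sys/kernel/mm/transparent_hugepage/enabled, whose content looks like "always [madvise] never" with the active mode in brackets. The function names are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location of the THP switch; absent on kernels built without THP.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from a line such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the active THP mode, or return None if the sysfs file does not exist."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

If this reports "always" or "madvise", writing "never" to the same file as root should disable THP until the next reboot, which is a quicker experiment than a custom kernel build.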
Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/26

------------------------------------------------------------------------
On 2018-06-19T16:25:32+00:00 korbin.freedman wrote:

What happens for me, at least with basically any system I use, is that Linux runs out of memory first, then SLAMS my swap disk with writes, which I believe overloads my system’s I/O subsystem and causes a complete system freeze. I have swappiness set to 60.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/27

------------------------------------------------------------------------
On 2018-09-23T08:21:34+00:00 ultra10e wrote:

Update: I'd commented in detail about this bug (comment 18.23 above). I run the live versions of Linux on a 4GB Core-i5 laptop (and another 4GB Pentium laptop also.)

Just wanted to add: I've added 4GB of RAM to the Core-i5 laptop for 8GB total now. This obviously helps a lot. Now some notes for 8GB RAM instances. With Fedora 28, the system will still seize up with maybe 2 dozen (or fewer, depending on what's happening (video, etc.)) FF tabs opened/active.

I came back here to note that I'm currently using a Live Debian Stretch (9.5). There are obviously significant differences in the way these variants of Linux manage memory. Why? Because under the same system conditions (Gnome, same s/w programs installed and/or running), I can open WAY more tabs in FF on Debian; open more simultaneous programs, without fear of a sudden system heart-attack. In fact, it is much harder for me to cause the system freeze in Debian, even with approaching 50 tabs opened in FF developer 63...
I understand there are underlying Fedora vs Debian system differences like: systemd vs init, Wayland vs Xorg, Gnome versions (3.28.1 vs 3.22.5) and kernel revisions (4.16.3-301.fc28.x86_64 vs 4.9.110-1 (2018-07-05)), but in all, I find Debian WAYYYY more forgiving, and more manageable, ESPECIALLY in light of this FATAL flaw, AND the known Gnome memory leak bug, which can easily be remediated in Debian by restarting Gnome (via Alt-F2, r) to free that memory back up. (The only way to accomplish this in Fedora is to actually log out of your session because of Wayland limitations.)

Anyway, I just thought it's another data point to add to the mystery. I still have to keep resource monitor opened even in Stretch, just in case, but I only crashed Stretch once over the past 3 months or so, when I was in the 80's (mem % used) and let a video play for 2 hrs without checking up. Normally anyway, that percentage isn't rising above the 70's in my typical "working" environment.

Finally, I'd like to mention to those asking for logs, etc., for this issue: realize that WHEN this issue occurs, it *is* essentially a heart-attack for the system. There is no recourse, and no way to gather logs. EVERYTHING seizes up- usually never to come back. A hard power-cycle is the only recourse, and NO logs which would shed light on the issue are written. EVERYTHING stops- including log writing. This is the reality. I *do* have a few logs from an old (non-live) Jessie 8.7.1 install-- for a few times when the system did revive, after hours-- and there's nothing in there that would shed light on the issue. The few entries in the log that I've researched pointed to no other instances/causes of this same issue.

It would be nice, after 11 or 12 years of this issue, if someone higher up and more knowledgeable in the development "food chain" would simply replicate the issue; it's not really that hard to do so at all. It honestly is a show-stopper. Ciao.
Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/28

------------------------------------------------------------------------
On 2018-12-21T04:49:16+00:00 korbin.freedman wrote:

These kinds of reports are all over the web. Please, someone who knows how memory management works, fix this behavior

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/29

------------------------------------------------------------------------
On 2018-12-25T10:01:23+00:00 iam wrote:

I can confirm this regression. My system does not freeze with 4.9.140 and is overall totally usable on high memory usage and swap (right now 7.5 out of 8 GB RAM is used and 2 GB is swapped), but it becomes unresponsive with 4.19.10 when memory consumption is close to my RAM limit of 8 GB.

My system is: Lenovo Thinkpad X220 laptop, Intel Core i7-2640M CPU (Sandy Bridge), Intel HD 3000 GPU, 8GB RAM, 8GB SWAP, SSD, UEFI mode. Fedora 29, kernels 4.9.140 (self-built) and 4.19.10-300.fc29.x86_64 from the repository. I'm using KDE Plasma 5.

You can easily reproduce this issue with the following steps:
1. Add "mem=2G" to the kernel command line. If you use GRUB, press the "e" key in the bootloader menu and append "mem=2G" to the "linux" line. Your system will be limited to 2 GB RAM.
2. Boot the system (press F10 in the bootloader menu after editing)
3. Run several heavy applications: web browser, word processor, GIMP. Open the Gmail web interface in the browser; it's also heavy.

Expected result: The system (at least the GUI, including the mouse cursor) does not freeze/hang, but properly utilizes swap, as can be seen with the 4.9.140 kernel (before the regression). The system is usable, all applications are still accessible, the music plays without hiccups, you can continue doing your work.

Actual result: The system (at least the GUI, including the mouse cursor) freezes/hangs with the 4.19.10 kernel, with very little chance of recovering by itself. The disk activity LED is constantly lit.
System is unusable.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/30

------------------------------------------------------------------------
On 2018-12-25T17:11:47+00:00 korbin.freedman wrote:

This issue appears to slightly improve in 4.20. Some of the issue is that things like klogin are getting swapped, as well as other parts of the GUI. We need a way to tell the kernel what to swap out first (like Firefox tabs, or open programs). The kernel should also start swapping at, say, 65% memory usage very slowly, and then increase swapping as memory fills, not slam swap at 95% usage. We should also make sure the OOM killer is doing its job, instead of waiting for Alt+SysRq+F. As RAM frees, swap should be unloaded immediately, slowly like macOS does.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/31

------------------------------------------------------------------------
On 2019-02-13T21:48:25+00:00 ultra10e wrote:

I know @SlayerProof32 posted that this was rectified with kernel >4.17.5 in https://bugzilla.redhat.com/show_bug.cgi?id=1577528 , but the bug still exists. Easily reproducible.

Tested on a 3GB desktop (Core 2 Quad), running Live Ubuntu 18.10 off of a flash drive (pendrivelinux.com). Kernel 4.18.0-10. FF 63.0. I set the download folder to a dir on the hard drive so as not to deliberately stress free RAM.

With the system showing about 1.5GB free (System Monitor), trying to d/l this 1.25GB ROM image from Mega ( https://mega.nz/#!KUAyRKjJ!3hALO7dkuyFdE41BTWf1OfHaZmdTA-Kzd8q0HYiMbYs ), the d/l gets to 100% but System Monitor shows RAM at 97% or 98%, the flash drive lights up and stays lit. System frozen, as I've reported previously with my other laptops.

I've mitigated this on the laptops because they now have 8GB of RAM. BUT-- even then, I can STILL crash those systems using Live Debian (or whatever flavor).
It just takes more stressing (more open tabs, bigger d/l's, whatever) to get there, but it does.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/32

------------------------------------------------------------------------
On 2019-03-10T22:40:29+00:00 iam wrote:

How to reproduce (light):
1. Open web browser
2. Run the following command:
   stress --vm 1 --vm-hang 0 --vm-bytes "$(awk '/MemAvailable/ {print $2"000"}' /proc/meminfo)"
3. Navigate to https://www.tumblr.com/explore/trending in web browser

Actual result: The system is very slow, almost unresponsive. HDD LED is constantly lit.

How to reproduce (heavy):
1. Open web browser
2. Run the following command:
   stress --vm 1 --vm-hang 0 --vm-bytes "$(awk '/MemAvailable/ {print $2"000"}' /proc/meminfo)"
3. Navigate to https://mail.google.com/ in web browser, while being logged into Gmail.

Actual result: The system is unresponsive. HDD LED is constantly lit.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/33

------------------------------------------------------------------------
On 2019-03-31T15:03:22+00:00 xjjzyx wrote:

Please note that you may need some background knowledge, and should take a proper backup, before trying the method mentioned below.

I have tested and found that kernel 4.19.32 may not have this problem; the problem in kernel 5.0.5 is obvious.

Under a kernel with this problem, I found a workaround: if the problem occurs, run the following command immediately with the appropriate permissions:

sync && echo 3 > /proc/sys/vm/drop_caches && sync && echo 3 > /proc/sys/vm/drop_caches && sync && echo 3 > /proc/sys/vm/drop_caches

This may reset some of the system's operations, which may affect the next few seconds of I/O, but should alleviate the unresponsive situation.
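
For readers unsure what the awk expression in the stress reproduction command computes, here is a minimal sketch of the same arithmetic in Python. It is not from any commenter; the function names are made up for illustration. One detail worth noting: /proc/meminfo reports MemAvailable in kB, which the kernel counts as 1024-byte units, while appending "000" multiplies by 1000, so the command asks stress for slightly less than the full MemAvailable figure.

```python
def mem_available_kib(meminfo_text: str) -> int:
    """Extract the MemAvailable value (in kB as printed by the kernel) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:    2048000 kB"
            return int(line.split()[1])
    raise KeyError("MemAvailable not found")


def stress_vm_bytes(meminfo_text: str) -> int:
    """Mirror the awk one-liner: append '000' to the kB figure, i.e. multiply by 1000."""
    return mem_available_kib(meminfo_text) * 1000


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        text = f.read()
    print("bytes passed to --vm-bytes:", stress_vm_bytes(text))
    print("MemAvailable in bytes:     ", mem_available_kib(text) * 1024)
```

Comparing the two printed numbers shows how much headroom the reproduction leaves before memory is fully committed.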
Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/34

------------------------------------------------------------------------
On 2019-06-04T07:04:39+00:00 dreamer.tan+kernel wrote:

This problem is still happening as of kernel 5.0.17 (Fedora 30) - shouldn't this be reflected in "Kernel version" of this issue?

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/35

------------------------------------------------------------------------
On 2019-06-04T10:38:33+00:00 iam wrote:

Those who experience the issue, try the following sysctl settings:

vm.swappiness=100
vm.watermark_scale_factor=200

It greatly helps on my PC.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/36

------------------------------------------------------------------------
On 2019-06-29T13:59:55+00:00 lukycrociato wrote:

I am experiencing the same on Ubuntu: when the system starts swapping even small quantities of RAM, it just locks up until I manually trigger the OOM killer via the SysRq keys. This does not happen on FreeBSD, where I can easily swap gigabytes of memory without even a single slowdown

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1833281/comments/56

** Changed in: linux
       Status: Unknown => Confirmed

** Changed in: linux
   Importance: Unknown => Medium

** Bug watch added: Red Hat Bugzilla #1472336
   https://bugzilla.redhat.com/show_bug.cgi?id=1472336

** Bug watch added: Red Hat Bugzilla #1577528
   https://bugzilla.redhat.com/show_bug.cgi?id=1577528

** Bug watch added: Linux Kernel Bug Tracker #199763
   https://bugzilla.kernel.org/show_bug.cgi?id=199763

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1833281

Title:
  System freeze when memory is put on SWAP in Linux >4.10.x

Status in Linux:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I'm reporting this since it's reproducible about 70% of the time.

  Summary:
  In different circumstances, when the system starts to swap out RAM, even small amounts, the system becomes completely unusable and the screen freezes up: no mouse movement, no TTY or SSH access is possible, and only the SysRq keys seem to do something (only reboot via REISUB has worked so far; triggering the OOM killer is useless since the memory/swap is not even full).

  The disk I/O LED is stuck at 100% in ALL the following cases when this happens. So far:
  - This happens even when only ZRAM is enabled, and no swap partition is used.
  - Happens when ZSWAP is used with a swap partition
  - Happens also when a partition without zram or zswap is used
  - Maybe it's AMD specific? I'm not experiencing this on my laptop using the same tests. My laptop is an Intel one, while my desktop is an AMD Ryzen platform.

  Here are the specs:
  CPU: AMD Ryzen 5 1600 no OC
  GPU: AMD RX 580 8GB
  SSD: Crucial MX500 500GB
  MOBO: MSI B350M Grenade
  RAM: 8GB HyperX Kingston 2667 MHz
  Ubuntu version: 18.04 LTS, backports repo enabled
  Kernel version: 4.18.0-18, official Ubuntu repo
  BIOS settings: Default

  Additional info:
  I'm not 100% sure, but I noticed that with the 5.0.0-17 generic kernel the lockups still seem to happen, but they recover eventually. This has happened only a few times so far, but the system is always frozen for at least 30 seconds, unlike my Intel laptop, where the lockups do not occur. The SSD model is the same (I bought two of these), and both machines have the same amount of RAM. On my laptop the lockups do not occur at all; swapping even large quantities of memory, like 1GB or more, does not produce any issues.
  Tests made:
  For testing this behaviour I tried:
  - Compiling the chromium-browser source code (takes up a lot of system RAM)
  - Using the "stress" command with a specific amount of memory to control how much will be swapped. Here I noticed that even small quantities, like a couple of megabytes, will cause the system to freeze about 70% of the time.
    Example: "stress --vm 1 --vm-bytes=7G"

  What should happen:
  I expect system slowdowns when swapping out memory, since I do not have enough RAM, but not a completely unusable environment, which is what I get, unlike when using Windows or my laptop with the same Linux version. The swap partition is in both cases on an SSD.

  Reproducibility: about 70% of the time

  Additional info again:
  I don't think this is due to any hardware failure; my SSD health is fine, as are my CPU and RAM. As I said, swapping in Windows works fine...
  ---
  ProblemType: Bug
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  haru  2076 F.... pulseaudio
   /dev/snd/controlC2:  haru  2076 F.... pulseaudio
   /dev/snd/controlC0:  haru  2076 F.... pulseaudio
  CurrentDesktop: communitheme:ubuntu:GNOME
  DistroRelease: Ubuntu 18.04
  IwConfig:
   enp24s0   no wireless extensions.
   lo        no wireless extensions.
  MachineType: Micro-Star International Co., Ltd.
MS-7A37 Package: linux (not installed) ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.0.0-17-generic root=UUID=75d45574-7169-4653-aea3-9f95087f0806 ro rootflags=subvol=@ quiet splash vt.handoff=1 ProcVersionSignature: Ubuntu 5.0.0-17.18~18.04.1-generic 5.0.8 RelatedPackageVersions: linux-restricted-modules-5.0.0-17-generic N/A linux-backports-modules-5.0.0-17-generic N/A linux-firmware 1.173.6 RfKill: 0: hci0: Bluetooth Soft blocked: no Hard blocked: no Tags: bionic Uname: Linux 5.0.0-17-generic x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: sudo video WifiSyslog: _MarkForUpload: True dmi.bios.date: 01/22/2019 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 1.K0 dmi.board.asset.tag: To be filled by O.E.M. dmi.board.name: B350M MORTAR (MS-7A37) dmi.board.vendor: Micro-Star International Co., Ltd. dmi.board.version: 1.0 dmi.chassis.asset.tag: To be filled by O.E.M. dmi.chassis.type: 4 dmi.chassis.vendor: Micro-Star International Co., Ltd. dmi.chassis.version: 1.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1.K0:bd01/22/2019:svnMicro-StarInternationalCo.,Ltd.:pnMS-7A37:pvr1.0:rvnMicro-StarInternationalCo.,Ltd.:rnB350MMORTAR(MS-7A37):rvr1.0:cvnMicro-StarInternationalCo.,Ltd.:ct4:cvr1.0: dmi.product.family: To be filled by O.E.M. dmi.product.name: MS-7A37 dmi.product.sku: To be filled by O.E.M. dmi.product.version: 1.0 dmi.sys.vendor: Micro-Star International Co., Ltd. To manage notifications about this bug go to: https://bugs.launchpad.net/linux/+bug/1833281/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp