[qubes-users] Re: Incredible HD thrashing on 4.0
On Friday, August 17, 2018 at 2:57:31 AM UTC+2, Marcus Linsner wrote:
> "For example, consider a case where you have zero swap and system is nearly
> running out of RAM. The kernel will take memory from e.g. Firefox (it can do
> this because Firefox is running executable code that has been loaded from
> disk - the code can be loaded from disk again if needed). If Firefox then
> needs to access that RAM again N seconds later, the CPU generates "hard
> fault" which forces Linux to free some RAM (e.g. take some RAM from another
> process), load the missing data from disk and then allow Firefox to continue
> as usual. This is pretty similar to normal swapping and kswapd0 does it."
> - Mikko Rantalainen, Feb 15 at 13:08

Good news: no more disk thrashing with this patch [1] (also attached), and I'm keeping track of how to properly get rid of this disk thrashing in [2].

Bad news: I made the patch myself and I've no idea how sound it is (since I am a noob :D) or what the side effects of using it are. A better patch can likely be made! (But nobody who knows how to do it right has answered/helped yet :D, so for me it's better than nothing.)

I'm not going to post here anymore, to allow the OP to be answered (since my issue seems to be a different one).

[1] https://github.com/constantoverride/qubes-linux-kernel/blob/acd686a5019c7ab6ec10dc457bdee4830e2d741f/patches.addon/le9b.patch
[2] https://stackoverflow.com/q/52067753/10239615

--
You received this message because you are subscribed to the Google Groups "qubes-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qubes-users+unsubscr...@googlegroups.com.
To post to this group, send email to qubes-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/qubes-users/7b9ba803-0e87-4525-9d8e-2f256ffc5122%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 32699b2..7636498 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -208,7 +208,7 @@ enum lru_list {
 
 #define for_each_lru(lru) for (lru = 0; lru < NR_LRU_LISTS; lru++)
 
-#define for_each_evictable_lru(lru) for (lru = 0; lru <= LRU_ACTIVE_FILE; lru++)
+#define for_each_evictable_lru(lru) for (lru = 0; lru <= LRU_INACTIVE_FILE; lru++)
 
 static inline int is_file_lru(enum lru_list lru)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 03822f8..1f3ffb5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2234,7 +2234,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	anon  = lruvec_lru_size(lruvec, LRU_ACTIVE_ANON, MAX_NR_ZONES) +
 		lruvec_lru_size(lruvec, LRU_INACTIVE_ANON, MAX_NR_ZONES);
-	file  = lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_ZONES) +
+	file  = //lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_ZONES) +
 		lruvec_lru_size(lruvec, LRU_INACTIVE_FILE, MAX_NR_ZONES);
 
 	spin_lock_irq(&pgdat->lru_lock);
@@ -2345,7 +2345,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 			 sc->priority == DEF_PRIORITY);
 
 	blk_start_plug(&plug);
-	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
+	while (nr[LRU_INACTIVE_ANON] || //nr[LRU_ACTIVE_FILE] ||
 					nr[LRU_INACTIVE_FILE]) {
 		unsigned long nr_anon, nr_file, percentage;
 		unsigned long nr_scanned;
@@ -2372,7 +2372,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 		 * stop reclaiming one LRU and reduce the amount scanning
 		 * proportional to the original scan target.
 		 */
-		nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
+		nr_file = nr[LRU_INACTIVE_FILE] //+ nr[LRU_ACTIVE_FILE]
+			;
 		nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
 
 		/*
@@ -2391,7 +2392,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 			percentage = nr_anon * 100 / scan_target;
 		} else {
 			unsigned long scan_target = targets[LRU_INACTIVE_FILE] +
-						targets[LRU_ACTIVE_FILE] + 1;
+						//targets[LRU_ACTIVE_FILE] +
+						1;
 			lru = LRU_FILE;
 			percentage = nr_file * 100 / scan_target;
 		}
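In essence, the patch stops the reclaim path from counting (and scanning) the active file LRU, so hot file-backed pages such as mapped executable code no longer look like cheaply reclaimable cache under memory pressure. A rough standalone sketch of the accounting change (plain Python; `file_scan_total` is a made-up illustrative helper, not a kernel function, and the page counts are derived from the meminfo output quoted later in the thread, kB / 4 = 4 KiB pages):

```python
# Sketch of the accounting change le9b.patch makes in get_scan_count():
# with the patch, the active file LRU no longer contributes to the
# "file" total, so pages holding hot executable code stop looking like
# cheaply reclaimable cache. file_scan_total is a made-up name.

def file_scan_total(lru, patched):
    """File-cache pages the reclaim loop considers scannable."""
    total = lru["LRU_INACTIVE_FILE"]
    if not patched:  # the vanilla kernel also counts the active file list
        total += lru["LRU_ACTIVE_FILE"]
    return total

lru = {
    "LRU_ACTIVE_FILE": 122754,    # Active(file):   491016 kB
    "LRU_INACTIVE_FILE": 162982,  # Inactive(file): 651928 kB
}

vanilla = file_scan_total(lru, patched=False)
patched = file_scan_total(lru, patched=True)
print(vanilla, patched)  # prints: 285736 162982
```

With the active list excluded, reclaim pressure shifts onto the inactive file list and anonymous pages, which would explain why the author sees the read-thrashing disappear; a plausible side effect (my inference, not stated in the thread) is that genuinely stale pages on the active list are never reclaimed.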
Re: [qubes-users] Re: Incredible HD thrashing on 4.0
On a NUMA system it could also be swapping pages from an efficient node to a less efficient distant node.
[qubes-users] Re: Incredible HD thrashing on 4.0
On Thursday, August 16, 2018 at 10:06:54 PM UTC+2, Brendan Hoar wrote:
> On Thursday, August 16, 2018 at 3:21:27 PM UTC-4, Marcus Linsner wrote:
> > The good news is that I've realized that the OOM triggering was legit: I had firefox set to use 12 cores at once and 14GiB of RAM was clearly not enough! (8 and no ccache was good though - did compile it twice like so)
> >
> > The bad news is that I still don't know why the disk-read thrashing was happening for me, but I will default to blame the OOM (even though no swap was active, i.e. I swapoff-ed the swap partition earlier) due to previous experience with OOM triggering on bare-metal hardware: I seem to remember the SSD disk activity LED being full-on during an impending OOM, with everything freezing!
>
> Maybe this applies:
> https://askubuntu.com/questions/432809/why-is-kswapd0-running-on-a-computer-with-no-swap
>
> [[if kswapd0 is taking any CPU and you do not have swap, the system is nearly out of RAM and is trying to deal with the situation by (in practise) swapping pages from executables. The correct fix is to reduce workload, add swap or (preferably) install more RAM. Adding swap will improve performance because kernel will have more options about what to swap to disk. Without swap the kernel is practically forced to swap application code.]]
>
> This could be a reason you only see reads hammering the drive, maybe?
>
> Also worth remembering: every read is decrypting block(s), which takes some CPU (even on systems with AES-NI support).
>
> Brendan

Thank you Brendan! The following comment (from the page you linked) best explains the constant disk reading for me:

"For example, consider a case where you have zero swap and system is nearly running out of RAM. The kernel will take memory from e.g. Firefox (it can do this because Firefox is running executable code that has been loaded from disk - the code can be loaded from disk again if needed). If Firefox then needs to access that RAM again N seconds later, the CPU generates "hard fault" which forces Linux to free some RAM (e.g. take some RAM from another process), load the missing data from disk and then allow Firefox to continue as usual. This is pretty similar to normal swapping and kswapd0 does it." - Mikko Rantalainen, Feb 15 at 13:08

$ sysctl vm.swappiness
vm.swappiness = 60

In retrospect, I apologize for hijacking this thread, because it now appears to me that my issue is entirely different from the OP's (even though the subject still applies):

On Friday, August 10, 2018 at 9:02:31 PM UTC+2, Kelly Dean wrote:
> Has anybody else used both Qubes 3.2 and 4.0 on a system with a HD, not SSD? Have you noticed the disk thrashing to be far worse under 4.0? I suspect it might have something to do with the new use of LVM combining snapshots with thin provisioning.
>
> The problem seems to be triggered by individual qubes doing ordinary bursts of disk access, such as loading a program or accessing swap, which would normally take just a few seconds on Qubes 3.2, but dom0 then massively multiplies that I/O on Qubes 4.0, leading to disk thrashing that drags on for minutes at a time, and in some cases, more than an hour.
>
> iotop in dom0 says the thrashing procs are e.g. [21.xvda-0] and [21.xvda-1], reading the disk at rates ranging from 10 to 50 MBps (max throughput of the disk is about 100). At this rate, for how prolonged the thrashing is, it could have read and re-read the entire virtual disk multiple times over, so there's something extremely inefficient going on.
>
> Is there any solution other than installing a SSD? I'd prefer not to have to add hardware to solve a software performance regression.
[qubes-users] Re: Incredible HD thrashing on 4.0
On Thursday, August 16, 2018 at 3:21:27 PM UTC-4, Marcus Linsner wrote:
> The good news is that I've realized that the OOM triggering was legit: I had firefox set to use 12 cores at once and 14GiB of RAM was clearly not enough! (8 and no ccache was good though - did compile it twice like so)
>
> The bad news is that I still don't know why the disk-read thrashing was happening for me, but I will default to blame the OOM (even though no swap was active, i.e. I swapoff-ed the swap partition earlier) due to previous experience with OOM triggering on bare-metal hardware: I seem to remember the SSD disk activity LED being full-on during an impending OOM, with everything freezing!

Maybe this applies:
https://askubuntu.com/questions/432809/why-is-kswapd0-running-on-a-computer-with-no-swap

[[if kswapd0 is taking any CPU and you do not have swap, the system is nearly out of RAM and is trying to deal with the situation by (in practise) swapping pages from executables. The correct fix is to reduce workload, add swap or (preferably) install more RAM. Adding swap will improve performance because kernel will have more options about what to swap to disk. Without swap the kernel is practically forced to swap application code.]]

This could be a reason you only see reads hammering the drive, maybe?

Also worth remembering: every read is decrypting block(s), which takes some CPU (even on systems with AES-NI support).

Brendan
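Brendan's kswapd0 explanation suggests a way to verify the hypothesis from inside the affected qube: if the reads really are hard faults on evicted executable pages, the pgmajfault counter in /proc/vmstat climbs steeply while the thrashing lasts. A minimal sketch (the helper names and the sampling interval are my own illustrative choices, not an established tool):

```python
# Estimate the rate of major page faults (faults that must read from
# disk) by sampling pgmajfault in /proc/vmstat. With no swap, a high
# sustained rate means the kernel is re-reading evicted executable
# pages -- the thrashing pattern described above.
import time

def parse_vmstat(text):
    """Turn /proc/vmstat's 'name value' lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[name] = int(value.strip())
    return stats

def major_fault_rate(read_vmstat, interval=5.0):
    """Major faults per second over one sampling interval."""
    before = parse_vmstat(read_vmstat())["pgmajfault"]
    time.sleep(interval)
    after = parse_vmstat(read_vmstat())["pgmajfault"]
    return (after - before) / interval

# On a live system:
#   rate = major_fault_rate(lambda: open("/proc/vmstat").read())
#   print(f"major faults/sec: {rate:.1f}")
```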
[qubes-users] Re: Incredible HD thrashing on 4.0
On Thursday, August 16, 2018 at 8:03:52 PM UTC+2, Marcus Linsner wrote:
> On Thursday, August 16, 2018 at 7:50:14 PM UTC+2, Marcus Linsner wrote:
> > On Thursday, August 16, 2018 at 7:35:26 PM UTC+2, Marcus Linsner wrote:
> > > $ cat /proc/meminfo
> > > MemTotal:        7454500 kB
> > > MemFree:         5635088 kB
> > > [rest of /proc/meminfo snipped]
> >
> > I resumed the firefox compilation and noticed that the memory jumped back to 14GB again - I was sure it was more than that 7.4GB before:
> >
> > $ cat /proc/meminfo
> > MemTotal:       14003120 kB
> > MemFree:         4602448 kB
> > [rest of /proc/meminfo snipped]
> >
> > Oh man, I'm hitting that disk thrashing again after just a few minutes: 202MiB/sec reading, 0.0 writing.
> >
> > Paused qube, reading stopped. Resumed qube sooner than before and it's still thrashing...
> >
> > It's a fedora 28 template-based VM.
> >
> > I shut down another VM, and I thought dom0 crashed because it froze for like 10 sec before the notification message told me that that VM had stopped.
>
> Ok, I caught kswapd0 at 14% in a 'top' terminal on the offending qube before the disk thrashing began (which froze all terminals too), and then it was the only process at 100% after the disk thrashing stopped! And here's the continuation of the log; btw, the thrashing only stopped after OOM killed the rustc process (which, my guess is, was what triggered kswapd0 to use 100% cpu):
>
> [ 6871.435899] systemd-coredum: 4 output lines suppressed due to ratelimiting
> [ 6871.485869] audit: type=1130 audit(1534438842.909:179): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
> [ 6871.486357] audit: type=1130 audit(1534438842.910:180): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
> [ 7076.504392] kauditd_printk_skb: 5 callbacks suppressed
> [
[qubes-users] Re: Incredible HD thrashing on 4.0
On Thursday, August 16, 2018 at 7:50:14 PM UTC+2, Marcus Linsner wrote:
> On Thursday, August 16, 2018 at 7:35:26 PM UTC+2, Marcus Linsner wrote:
> > $ cat /proc/meminfo
> > MemTotal:        7454500 kB
> > MemFree:         5635088 kB
> > [rest of /proc/meminfo snipped]
>
> I resumed the firefox compilation and noticed that the memory jumped back to 14GB again - I was sure it was more than that 7.4GB before:
>
> $ cat /proc/meminfo
> MemTotal:       14003120 kB
> MemFree:         4602448 kB
> [rest of /proc/meminfo snipped]
>
> Oh man, I'm hitting that disk thrashing again after just a few minutes: 202MiB/sec reading, 0.0 writing.
>
> Paused qube, reading stopped. Resumed qube sooner than before and it's still thrashing...
>
> It's a fedora 28 template-based VM.
>
> I shut down another VM, and I thought dom0 crashed because it froze for like 10 sec before the notification message told me that that VM had stopped.

Ok, I caught kswapd0 at 14% in a 'top' terminal on the offending qube before the disk thrashing began (which froze all terminals too), and then it was the only process at 100% after the disk thrashing stopped! And here's the continuation of the log; btw, the thrashing only stopped after OOM killed the rustc process (which, my guess is, was what triggered kswapd0 to use 100% cpu):

[ 6871.435899] systemd-coredum: 4 output lines suppressed due to ratelimiting
[ 6871.485869] audit: type=1130 audit(1534438842.909:179): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[ 6871.486357] audit: type=1130 audit(1534438842.910:180): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 7076.504392] kauditd_printk_skb: 5 callbacks suppressed
[ 7076.504393] audit: type=1101 audit(1534439047.928:186): pid=5658 uid=1000 auid=1000 ses=1 msg='op=PAM:accounting grantors=pam_unix acct="user" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/2 res=success'
[ 7076.504502] audit: type=1123 audit(1534439047.928:187): pid=5658 uid=1000 auid=1000 ses=1 msg='cwd="/home/user"
[qubes-users] Re: Incredible HD thrashing on 4.0
On Thursday, August 16, 2018 at 7:35:26 PM UTC+2, Marcus Linsner wrote:
> $ cat /proc/meminfo
> MemTotal:        7454500 kB
> MemFree:         5635088 kB
> MemAvailable:    6574676 kB
> Buffers:           53832 kB
> Cached:          1094368 kB
> SwapCached:            0 kB
> Active:           724832 kB
> Inactive:         747696 kB
> Active(anon):     233816 kB
> Inactive(anon):    95768 kB
> Active(file):     491016 kB
> Inactive(file):   651928 kB
> Unevictable:       73568 kB
> Mlocked:           73568 kB
> SwapTotal:             0 kB
> SwapFree:              0 kB
> Dirty:               292 kB
> Writeback:             0 kB
> AnonPages:        398016 kB
> Mapped:            54320 kB
> Shmem:              5256 kB
> Slab:             134680 kB
> SReclaimable:      74124 kB
> SUnreclaim:        60556 kB
> KernelStack:        4800 kB
> PageTables:        10524 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:     3727248 kB
> Committed_AS:    1332236 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> HardwareCorrupted:     0 kB
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> ShmemPmdMapped:        0 kB
> CmaTotal:              0 kB
> CmaFree:               0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:      327644 kB
> DirectMap2M:    14008320 kB
> DirectMap1G:           0 kB

I resumed the firefox compilation and noticed that the memory jumped back to 14GB again - I was sure it was more than that 7.4GB before:

$ cat /proc/meminfo
MemTotal:       14003120 kB
MemFree:         4602448 kB
MemAvailable:    6622252 kB
Buffers:          186220 kB
Cached:          1986192 kB
SwapCached:            0 kB
Active:          7482024 kB
Inactive:        1448656 kB
Active(anon):    6667828 kB
Inactive(anon):    95780 kB
Active(file):     814196 kB
Inactive(file):  1352876 kB
Unevictable:       73568 kB
Mlocked:           73568 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:            306392 kB
Writeback:          4684 kB
AnonPages:       6811888 kB
Mapped:           199164 kB
Shmem:              5340 kB
Slab:             239524 kB
SReclaimable:     177620 kB
SUnreclaim:        61904 kB
KernelStack:        5968 kB
PageTables:        28612 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     7001560 kB
Committed_AS:    8571548 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      327644 kB
DirectMap2M:    14008320 kB
DirectMap1G:           0 kB

Oh man, I'm hitting that disk thrashing again after just a few minutes: 202MiB/sec reading, 0.0 writing.

Paused qube, reading stopped. Resumed qube sooner than before and it's still thrashing...

It's a fedora 28 template-based VM.

I shut down another VM, and I thought dom0 crashed because it froze for like 10 sec before the notification message told me that that VM had stopped.
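As a sanity check, the meminfo snapshots above are internally consistent: Active equals Active(anon) + Active(file), and in the 14 GB snapshot roughly 6.5 GiB of Active(anon) (the compiler's working set, unswappable with swap off) dwarfs the reclaimable file cache. A small sketch verifying this from meminfo-style text (`parse_meminfo` is an illustrative helper, not a standard function):

```python
# Sanity-check a /proc/meminfo snapshot: Active should equal
# Active(anon) + Active(file), and likewise for Inactive. The numbers
# are the ones from the 14 GB snapshot above (values in kB).

def parse_meminfo(text):
    """Map 'Key:   N kB' lines to {Key: N}."""
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            out[key.strip()] = int(fields[0])
    return out

snapshot = """\
MemTotal:       14003120 kB
Active:          7482024 kB
Inactive:        1448656 kB
Active(anon):    6667828 kB
Inactive(anon):    95780 kB
Active(file):     814196 kB
Inactive(file):  1352876 kB
"""

m = parse_meminfo(snapshot)
assert m["Active"] == m["Active(anon)"] + m["Active(file)"]
assert m["Inactive"] == m["Inactive(anon)"] + m["Inactive(file)"]
anon_mib = (m["Active(anon)"] + m["Inactive(anon)"]) / 1024
print(f"anon: {anon_mib:.0f} MiB")  # prints: anon: 6605 MiB
```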
[qubes-users] Re: Incredible HD thrashing on 4.0
On Friday, August 10, 2018 at 9:02:31 PM UTC+2, Kelly Dean wrote:
> Has anybody else used both Qubes 3.2 and 4.0 on a system with a HD, not SSD? Have you noticed the disk thrashing to be far worse under 4.0? I suspect it might have something to do with the new use of LVM combining snapshots with thin provisioning.
>
> The problem seems to be triggered by individual qubes doing ordinary bursts of disk access, such as loading a program or accessing swap, which would normally take just a few seconds on Qubes 3.2, but dom0 then massively multiplies that I/O on Qubes 4.0, leading to disk thrashing that drags on for minutes at a time, and in some cases, more than an hour.
>
> iotop in dom0 says the thrashing procs are e.g. [21.xvda-0] and [21.xvda-1], reading the disk at rates ranging from 10 to 50 MBps (max throughput of the disk is about 100). At this rate, for how prolonged the thrashing is, it could have read and re-read the entire virtual disk multiple times over, so there's something extremely inefficient going on.
>
> Is there any solution other than installing a SSD? I'd prefer not to have to add hardware to solve a software performance regression.

Interestingly, I've just encountered this thrashing, but on an SSD (it's just reading 192MiB/sec constantly), Qubes R4.0 up to date, inside a qube, while compiling firefox: typing in any of 3 of its terminal windows does not even echo anything, and the firefox compilation terminal is frozen; the swap (of 1G) was turned off a while ago (via swapoff). I used Qube Manager to Pause the offending qube and the thrashing stopped. I don't see much in the logs.

Ok, so I resumed the qube; the thrashing resumed for a few seconds, then stopped, and all terminals were alive again (I can type into them).

The log spewed some new things (since the updatedb audit, which was the last entry while Paused). I'm including some long lines from before; note that the log after the unpause starts from "[ 6862.846945] INFO: rcu_sched self-detected stall on CPU". It follows:

[    0.000000] Linux version 4.14.57-1.pvops.qubes.x86_64 (user@build-fedora4) (gcc version 6.4.1 20170727 (Red Hat 6.4.1-1) (GCC)) #1 SMP Mon Jul 23 16:28:54 UTC 2018
[    0.000000] Command line: root=/dev/mapper/dmroot ro nomodeset console=hvc0 rd_NO_PLYMOUTH rd.plymouth.enable=0 plymouth.enable=0 nopat
...
[ 2769.581919] audit: type=1101 audit(1534434741.005:133): pid=10290 uid=1000 auid=1000 ses=1 msg='op=PAM:accounting grantors=pam_unix acct="user" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/3 res=success'
[ 2769.582396] audit: type=1123 audit(1534434741.005:134): pid=10290 uid=1000 auid=1000 ses=1 msg='cwd="/home/user" cmd=737761706F202F6465762F7876646331 terminal=pts/3 res=success'
[ 2769.582525] audit: type=1110 audit(1534434741.006:135): pid=10290 uid=0 auid=1000 ses=1 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/3 res=success'
[ 2769.583384] audit: type=1105 audit(1534434741.007:136): pid=10290 uid=0 auid=1000 ses=1 msg='op=PAM:session_open grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/3 res=success'
[ 2776.388700] audit: type=1106 audit(1534434747.812:137): pid=10290 uid=0 auid=1000 ses=1 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/3 res=success'
[ 2776.388735] audit: type=1104 audit(1534434747.812:138): pid=10290 uid=0 auid=1000 ses=1 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/3 res=success'
[ 4093.008056] audit: type=1116 audit(1534436064.432:139): pid=29167 uid=0 auid=4294967295 ses=4294967295 msg='op=add-group id=982 exe="/usr/sbin/groupadd" hostname=? addr=? terminal=? res=success'
[ 4093.030620] audit: type=1132 audit(1534436064.454:140): pid=29167 uid=0 auid=4294967295 ses=4294967295 msg='op=add-shadow-group id=982 exe="/usr/sbin/groupadd" hostname=? addr=? terminal=? res=success'
[ 4093.304708] audit: type=1130 audit(1534436064.728:141): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=run-rfbdacad57c5f4bc183d36a7c402c9ae7 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 4094.576065] audit: type=1130 audit(1534436065.999:142): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=man-db-cache-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 4094.576138] audit: type=1131 audit(1534436065.999:143): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=man-db-cache-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 4094.577822] audit: type=1131 audit(1534436066.001:144): pid=1 uid=0 auid=4294967295 ses=4294967295
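The bracketed prefixes on those lines are seconds since boot, so the jump from ~2776 s to ~4093 s means the guest logged nothing for about 22 minutes, which lines up with the prolonged freezes described in this thread. A quick sketch for spotting such gaps (the log lines are abbreviated copies of the ones quoted above):

```python
# Extract the "[ seconds.micros ]" uptime stamps that prefix kernel
# log lines and find the longest silent stretch -- long silences line
# up with the periods when the qube was frozen and thrashing.
import re

STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def largest_gap(lines):
    """Return (gap_seconds, start_timestamp) of the longest silence."""
    stamps = [float(m.group(1)) for line in lines
              if (m := STAMP.match(line))]
    return max((b - a, a) for a, b in zip(stamps, stamps[1:]))

log = [
    "[ 2769.581919] audit: type=1101 ...",
    "[ 2776.388735] audit: type=1104 ...",
    "[ 4093.008056] audit: type=1116 ...",
    "[ 4094.576065] audit: type=1130 ...",
]
gap, since = largest_gap(log)
print(f"{gap:.0f} s of silence starting at {since:.0f} s uptime")
# prints: 1317 s of silence starting at 2776 s uptime
```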
[qubes-users] Re: Incredible HD thrashing on 4.0
On Saturday, August 11, 2018 at 3:02:31 AM UTC+8, Kelly Dean wrote:
> Has anybody else used both Qubes 3.2 and 4.0 on a system with a HD, not SSD? Have you noticed the disk thrashing to be far worse under 4.0? I suspect it might have something to do with the new use of LVM combining snapshots with thin provisioning.
>
> The problem seems to be triggered by individual qubes doing ordinary bursts of disk access, such as loading a program or accessing swap, which would normally take just a few seconds on Qubes 3.2, but dom0 then massively multiplies that I/O on Qubes 4.0, leading to disk thrashing that drags on for minutes at a time, and in some cases, more than an hour.
>
> iotop in dom0 says the thrashing procs are e.g. [21.xvda-0] and [21.xvda-1], reading the disk at rates ranging from 10 to 50 MBps (max throughput of the disk is about 100). At this rate, for how prolonged the thrashing is, it could have read and re-read the entire virtual disk multiple times over, so there's something extremely inefficient going on.
>
> Is there any solution other than installing a SSD? I'd prefer not to have to add hardware to solve a software performance regression.

Same here: I hear lots of scratching sounds from the HDD whenever I do anything on the laptop. It extremely worries me that the HDD might die soon because of it D: