More data on 2.4.5 VM issues
This is 2.4.5 with Andrea Arcangeli's aa1 patch, compiled with highmem support. Why is kswapd using so much CPU? If you reboot the machine and run the same user process, kswapd CPU usage is almost 0% and none of the swap is used. This machine was upgraded from 2.2 and we did not have the luxury of re-partitioning it to support the "new" 2.4 swap size requirements. After running for a few days with relatively constant memory usage:

vmstat:
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in  cs   us  sy  id
 2  0  1 136512   5408    504 209744   0   0     0     2    1949    10  26  64

top:
  5:38pm  up 3 days, 19:44,  2 users,  load average: 2.08, 2.13, 2.15
34 processes: 32 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 16.0% user, 56.4% system, 16.2% nice, 26.3% idle
CPU1 states: 11.1% user, 57.0% system, 11.0% nice, 31.3% idle
Mem:  1028804K av, 1023744K used,    5060K free,       0K shrd,     504K buff
Swap:  136512K av,  136512K used,       0K free                  209876K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
28442 root      18  10  898M 812M 36188 R N  56.0 80.9 296:12 gateway.smart.5
28438 root      16  10  898M 811M 35084 S N  43.7 80.7 291:03 gateway.smart.5
    5 root       9   0     0    0     0 SW   37.6  0.0 164:58 kswapd
 2509 root      18   0   492  492   300 R     2.5  0.0   0:00 top
    1 root       9   0   680    0     0 SW    0.0  0.0   0:08 init
    2 root       9   0     0    0     0 SW    0.0  0.0   0:00 keventd
    3 root      19  19     0    0     0 SWN   0.0  0.0   1:11 ksoftirqd_CPU0
    4 root      19  19     0    0     0 SWN   0.0  0.0   1:04 ksoftirqd_CPU1
    6 root       9   0     0    0     0 SW    0.0  0.0   0:00 kreclaimd
    7 root       9   0     0    0     0 SW    0.0  0.0   0:00 bdflush
    8 root       9   0     0    0     0 SW    0.0  0.0   0:07 kupdated
   11 root       9   0     0    0     0 SW    0.0  0.0   0:00 scsi_eh_0
  315 root       9   0  1000    0     0 SW    0.0  0.0   0:00 syslogd

Hope this helps.
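For the record, top already shows all 136512K of swap consumed while roughly 200 MB is still sitting in page cache, which matches the early-2.4 VM's habit of keeping swap slots allocated for swap-cached pages (hence the "swap at least as big as RAM" guideline this box could not be re-partitioned for). A minimal sketch of how the kswapd and swap numbers above can be sampled over time, assuming kswapd is PID 5 as in the top output, HZ=100 on this x86 box, and a /proc/meminfo that exposes SwapTotal:/SwapFree::

#!/usr/bin/env python
# Sketch: sample kswapd CPU usage and swap consumption once a minute.
# Assumptions: kswapd is PID 5 (as in the top output above), HZ=100,
# and /proc/meminfo carries SwapTotal:/SwapFree: lines.
import time

KSWAPD_PID = 5
HZ = 100
INTERVAL = 60

def kswapd_jiffies():
    # fields 14/15 of /proc/<pid>/stat are utime/stime in jiffies
    f = open("/proc/%d/stat" % KSWAPD_PID).read().split()
    return int(f[13]) + int(f[14])

def swap_used_kb():
    total = free = 0
    for line in open("/proc/meminfo"):
        if line.startswith("SwapTotal:"):
            total = int(line.split()[1])
        elif line.startswith("SwapFree:"):
            free = int(line.split()[1])
    return total - free

prev = kswapd_jiffies()
while 1:
    time.sleep(INTERVAL)
    cur = kswapd_jiffies()
    pct = 100.0 * (cur - prev) / (HZ * INTERVAL)  # percent of one CPU
    print("%s  kswapd %.1f%% cpu, swap used %d kB"
          % (time.strftime("%H:%M:%S"), pct, swap_used_kb()))
    prev = cur

Left running in a screen session, the interesting question is whether kswapd's share keeps climbing while swap stays pinned at 100% used.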
2.4.1ac3 vs 2.4.1ac8
Basic machine configuration: SMP Supermicro board, 2 GB of ECC registered RAM, Adaptec AIC-7892, onboard eepro100 NIC.

The machine has been running as a database server with no MySQL crashes for several months and has run fine with kernels 2.2.18 and 2.4.1ac3. We have seen a HUGE improvement in processing power and file access from kernel 2.4.1ac3 to 2.4.1ac8, but on 2.4.1ac8 MySQL crashes every few hours with the following error:

mysqld version: 3.23.32
mysqld got signal 11;
The manual section 'Debugging a MySQL server' tells you how to use a
stack trace and/or the core file to produce a readable backtrace that
may help in finding out why mysqld died.
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong.
Bogus stack limit or frame pointer, aborting backtrace

With 2.4.1ac8, syslog has been spitting out the following errors:

Feb 8 23:12:38 db1 kernel: __alloc_pages: 0-order allocation failed.
Feb 8 23:34:54 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:34:54 db1 kernel: IP: queue_glue: no memory for gluing queue ef1c1de0
Feb 8 23:34:55 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:34:55 db1 kernel: IP: queue_glue: no memory for gluing queue ef1c1ee0
Feb 8 23:34:56 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:34:56 db1 kernel: IP: queue_glue: no memory for gluing queue ef1c1160
Feb 8 23:34:59 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:34:59 db1 kernel: IP: queue_glue: no memory for gluing queue ef1c11a0
Feb 8 23:35:05 db1 kernel: nfs: server toastem not responding, still trying
Feb 8 23:35:05 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:35:05 db1 kernel: IP: queue_glue: no memory for gluing queue c322e520
Feb 8 23:35:06 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:35:06 db1 kernel: IP: queue_glue: no memory for gluing queue ef1c11a0
Feb 8 23:36:04 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:36:04 db1 kernel: IP: queue_glue: no memory for gluing queue c322ea60
Feb 8 23:36:05 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:36:05 db1 kernel: IP: queue_glue: no memory for gluing queue c322ea60
Feb 8 23:36:06 db1 kernel: __alloc_pages: 2-order allocation failed.
Feb 8 23:36:06 db1 kernel: IP: queue_glue: no memory for gluing queue c322ea60
Feb 9 00:00:13 db1 kernel: __alloc_pages: 1-order allocation failed.
Feb 9 00:00:21 db1 last message repeated 269 times
Feb 9 00:15:13 db1 kernel: __alloc_pages: 1-order allocation failed.
Feb 9 00:15:19 db1 last message repeated 114 times
etc.

We would love to stay with kernel 2.4.1ac8 because of the huge speed increase; queries/sec on this machine range from about 300 to 1700. If you need more information please email me. Thanks
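For context, an N-order failure means __alloc_pages could not find 2^N physically contiguous pages, so the 2-order failures above are 16 kB chunks (four contiguous 4 kB pages on x86) and the 1-order ones are 8 kB. As far as I can tell the queue_glue messages are IP fragment reassembly (ip_glue) failing to get a contiguous buffer for a reassembled datagram, and the fragmented UDP traffic to the NFS server "toastem" is a plausible source of allocations that size. A quick sketch for tallying which orders are failing and how often, assuming kern.* messages land in /var/log/messages on this box:

#!/usr/bin/env python
# Sketch: count "__alloc_pages: N-order allocation failed" messages by order.
# Assumption: kernel messages go to /var/log/messages (adjust for syslog.conf).
import re

LOG = "/var/log/messages"
pat = re.compile(r"__alloc_pages: (\d+)-order allocation failed")

counts = {}
for line in open(LOG):
    m = pat.search(line)
    if m:
        order = int(m.group(1))
        counts[order] = counts.get(order, 0) + 1

for order in sorted(counts.keys()):
    # order N means 2**N contiguous pages, i.e. (2**N) * 4 kB on x86
    print("order %d (%3d kB contiguous): %d failures"
          % (order, (2 ** order) * 4, counts[order]))

Note this misses the "last message repeated N times" lines, so treat the counts as a lower bound.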
dst cache overflow
Hello,

Recently we have been experiencing some problems with the network dying temporarily on a machine and then magically coming back to life. This appears to happen more frequently when the machines are loaded down CPU-wise and sustaining over 3 Mbit/sec of network traffic. It is happening on several machines with similar configurations: each machine has about 2000 active TCP connections and CPU usage is typically over 75%. They have AMD 800-950 MHz processors and 3Com 905C cards, running Redhat 6.2 with kernel builds ranging from 2.2.17p3 to 2.2.18p10.

When this happens, the logs below appear in the system logger:

Oct 3 12:14:38 onion kernel: dst cache overflow
Oct 3 12:14:38 onion last message repeated 9 times
Oct 3 12:14:43 onion kernel: NET: 486 messages suppressed.
Oct 3 12:14:43 onion kernel: dst cache overflow
Oct 3 12:14:48 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:14:49 onion kernel: NET: 367 messages suppressed.
Oct 3 12:14:49 onion kernel: dst cache overflow
Oct 3 12:14:51 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:14:53 onion kernel: NET: 192 messages suppressed.
Oct 3 12:14:53 onion kernel: dst cache overflow
Oct 3 12:14:55 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:14:59 onion kernel: NET: 122 messages suppressed.
Oct 3 12:14:59 onion kernel: dst cache overflow
Oct 3 12:15:01 onion kernel: nfs: server toastem not responding, still trying
Oct 3 12:15:01 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:15:03 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:15:04 onion kernel: NET: 52 messages suppressed.
Oct 3 12:15:05 onion kernel: dst cache overflow
Oct 3 12:15:08 onion kernel: RPC: sendmsg returned error 105
Oct 3 12:15:11 onion kernel: nfs: server toastem OK

Oct 3 09:23:58 mint kernel: NET: 271 messages suppressed.
Oct 3 09:23:58 mint kernel: dst cache overflow
Oct 3 09:23:58 mint last message repeated 9 times
Oct 3 09:24:07 mint kernel: NET: 384 messages suppressed.
Oct 3 09:24:07 mint kernel: dst cache overflow
Oct 3 09:24:07 mint kernel: NET: 255 messages suppressed.
Oct 3 09:24:07 mint kernel: dst cache overflow
Oct 3 09:24:12 mint kernel: NET: 149 messages suppressed.
Oct 3 09:24:12 mint kernel: dst cache overflow
Oct 3 09:24:18 mint kernel: NET: 64 messages suppressed.
Oct 3 09:24:18 mint kernel: dst cache overflow
Oct 3 09:24:23 mint kernel: NET: 35 messages suppressed.
Oct 3 09:24:23 mint kernel: dst cache overflow
Oct 3 09:24:27 mint kernel: NET: 23 messages suppressed.
...

Hope this helps.
Thanks
--Michael
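For what it's worth, "dst cache overflow" means the IPv4 routing cache has hit its hard limit (the route/max_size sysctl) faster than garbage collection can prune it; once that happens new destination entries cannot be allocated, which is why RPC sendmsg starts returning error 105 (ENOBUFS) and NFS stalls at the same moment. A small check to run when a machine wedges, assuming the stock 2.2 proc interface (/proc/net/rt_cache plus /proc/sys/net/ipv4/route/gc_thresh and max_size):

#!/usr/bin/env python
# Sketch: compare the current IPv4 routing cache size to its gc/hard limits.
# Assumption: standard 2.2/2.4 proc paths as named below.

def read_int(path):
    return int(open(path).read().split()[0])

# /proc/net/rt_cache prints one header line, then one line per cached route
entries = len(open("/proc/net/rt_cache").readlines()) - 1

gc_thresh = read_int("/proc/sys/net/ipv4/route/gc_thresh")
max_size  = read_int("/proc/sys/net/ipv4/route/max_size")

print("routing cache entries: %d  (gc_thresh %d, max_size %d)"
      % (entries, gc_thresh, max_size))
if entries >= max_size:
    print("cache is at max_size; new routes fail -> dst cache overflow / ENOBUFS")

If the entry count is pinned at max_size during the outages, raising max_size and gc_thresh (by echoing larger values into those proc files) usually buys headroom, though with ~2000 connections per box it may only be masking whatever is churning routes.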