process hangs in do_exit when OOM happens
I couldn't find anything useful on Google, so I'm here for help.

I use memcg to limit the memory use of a process. When the memcg cgroup ran out of memory, the process was OOM-killed; however, it cannot actually finish exiting. Here is some information.

OS version: CentOS 6.2, kernel 2.6.32-220.7.1

/proc/pid/stack:

[] __cond_resched+0x2a/0x40
[] unmap_vmas+0xb49/0xb70
[] exit_mmap+0x7e/0x140
[] mmput+0x58/0x110
[] exit_mm+0x11d/0x160
[] do_exit+0x1ad/0x860
[] do_group_exit+0x41/0xb0
[] get_signal_to_deliver+0x1e8/0x430
[] do_notify_resume+0xf4/0x8b0
[] int_signal+0x12/0x17
[] 0x

/proc/pid/stat:

11337 (CF_user_based) R 1 11314 11314 0 -1 4203524 7753602 0 0 0 622 1806 0 0 -2 0 1 0 324381340 0 0 18446744073709551615 0 0 0 0 0 0 0 0 66784 0 0 0 17 3 1 1 0 0 0

/proc/pid/status:

Name: CF_user_based
State: R (running)
Tgid: 11337
Pid: 11337
PPid: 1
TracerPid: 0
Uid: 32114 32114 32114 32114
Gid: 32114 32114 32114 32114
Utrace: 0
FDSize: 128
Groups: 32114
Threads: 1
SigQ: 2/2325005
SigPnd:
ShdPnd: 4100
SigBlk:
SigIgn:
SigCgt: 0001800104e0
CapInh:
CapPrm:
CapEff:
CapBnd:
Cpus_allowed:
Cpus_allowed_list: 0-31
Mems_allowed: ,0003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 4300
nonvoluntary_ctxt_switches: 77

/var/log/messages:

Oct 17 15:22:19 hpc16 kernel: CF_user_based invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Oct 17 15:22:19 hpc16 kernel: CF_user_based cpuset=/ mems_allowed=0-1
Oct 17 15:22:19 hpc16 kernel: Pid: 3909, comm: CF_user_based Not tainted 2.6.32-2.0.0.1 #4
Oct 17 15:22:19 hpc16 kernel: Call Trace:
Oct 17 15:22:19 hpc16 kernel: [] ? dump_header+0x85/0x1a0
Oct 17 15:22:19 hpc16 kernel: [] ? oom_kill_process+0x25e/0x2a0
Oct 17 15:22:19 hpc16 kernel: [] ? select_bad_process+0xce/0x110
Oct 17 15:22:19 hpc16 kernel: [] ? out_of_memory+0x1a8/0x390
Oct 17 15:22:19 hpc16 kernel: [] ? __alloc_pages_nodemask+0x73a/0x750
Oct 17 15:22:19 hpc16 kernel: [] ? __mem_cgroup_commit_charge+0x45/0x90
Oct 17 15:22:19 hpc16 kernel: [] ? alloc_pages_vma+0x9a/0x190
Oct 17 15:22:19 hpc16 kernel: [] ? handle_pte_fault+0x4cc/0xa90
Oct 17 15:22:19 hpc16 kernel: [] ? alloc_pages_current+0xab/0x110
Oct 17 15:22:19 hpc16 kernel: [] ? invalidate_interrupt5+0xe/0x20
Oct 17 15:22:19 hpc16 kernel: [] ? handle_mm_fault+0x12a/0x1b0
Oct 17 15:22:19 hpc16 kernel: [] ? do_page_fault+0x199/0x550
Oct 17 15:22:19 hpc16 kernel: [] ? call_rwsem_wake+0x18/0x30
Oct 17 15:22:19 hpc16 kernel: [] ? invalidate_interrupt5+0xe/0x20
Oct 17 15:22:19 hpc16 kernel: [] ? page_fault+0x25/0x30
Oct 17 15:22:19 hpc16 kernel: Mem-Info:
Oct 17 15:22:19 hpc16 kernel: Node 0 Normal per-cpu:
Oct 17 15:22:19 hpc16 kernel: CPU 0: hi: 186, btch: 31 usd: 0
(CPUs 1-22: identical lines)
Oct 17 15:22:19 hpc16 kernel: CPU 23: hi: 186, btch: 31 usd: 18
Oct 17 15:22:19 hpc16 kernel: Node 1 DMA per-cpu:
Oct 17 15:22:19 hpc16 kernel: CPU 0: hi: 0, btch: 1 usd: 0
Oct 17 15:22:19 hpc16 kernel: CPU 1: hi: 0, btch: 1 usd: 0
(log truncated here)
Question about page_size of x86_64
I found that my machine has the feature 'page size extension = true' with the 'cpuid' command, but I don't know how to use it with Linux. Can anyone give some help?

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
compile a kernel for kgdb with less "optimized out"
I think those who use kgdb must hate the phrase "optimized out". I have tried many times to build a kernel with less optimization, but failed. Today I found a trick: remove the -O2 and -Os flags at the top level of the kernel source tree and add -O2 back for the arch/x86 directory only. It works! I see far fewer "optimized out" messages now. I built the CentOS 6 kernel, version 2.6.32. I don't know whether it works with other kernel versions, but it is worth a try.
Re: question about IO-sched
thanks very much.

On Wed, 18 Jul 2012 14:51:09 +0800, Corrado Zoccolo wrote:

On Sun, Jul 15, 2012 at 9:08 AM, gaoqiang wrote:

many thanks. but why does the sys_read operation hang in sync_page? There is still plenty of free memory. I mean the actually free memory, excluding the various kinds of caches and buffers.

http://kerneltrap.org/node/4941 explains sync_page:

->sync_page() is an awful misnomer. Usually, when a page I/O operation is requested by calling ->writepage() or ->readpage(), the file system queues an I/O request (e.g., a disk-based file system may do this by calling submit_bio()), but the underlying device driver does not proceed with this I/O immediately, because I/O scheduling is more efficient when there are multiple requests in the queue. Only when something really wants to wait for I/O completion (wait_on_page_{locked,writeback}() are used to wait for read and write completion respectively) is the I/O queue processed. To do this, wait_on_page_bit() calls ->sync_page() (see block_sync_page(), the standard implementation of ->sync_page() for disk-based file systems). So the semantics of ->sync_page() are roughly "kick the underlying storage driver to actually perform all the I/O queued for this page and, maybe, for other pages on this device too".

It is expected that sys_read will wait until the data is available to the process. If you don't want to wait (because you can do other stuff in the meantime, including queuing other I/O operations), you can use aio_read. The kernel will notify your process when the operation completes and the data is available in memory.

Thanks,
Corrado

On Fri, 13 Jul 2012 22:15:31 +0800, Corrado Zoccolo wrote:

Hi,
the catch is that writes are "fire and forget", so they keep accumulating in the I/O scheduler, and there is always plenty of them to schedule (unless you explicitly make sync writes). The reader, instead, waits for the result of each read operation before scheduling a new read, so there is at most one outstanding read, and sometimes nothing.

The deadline scheduler is work-conserving, meaning that it never leaves the disk idle when there is work queued, and most of the time after an operation completes, there is only write work queued, so you see many more writes being sent to the device. Only schedulers that delay writes while waiting for reads (such as Anticipatory in old kernels, and now CFQ) can achieve higher read-to-write ratios.

Cheers,
Corrado

On Thu, Jul 12, 2012 at 11:01 AM, gaoqiang wrote:

Hi all,
I have long known that deadline is read-preferred, but a simple test gives the opposite result. Two processes run at the same time, one reading and one writing; they do nothing but I/O operations:

while(true) { read(); }

the other:

while(true) { write(); }

With the deadline I/O scheduler and an ext4 filesystem, the read rate was below about 3 MB/s and the write rate about 100 MB/s. I have tested both kernel 2.6.18 and kernel 2.6.32, getting the same result. I added some debug information to the kernel and recompiled, and found that it has little to do with the I/O scheduler layer, because the number of read requests reaching deadline was about 5% of the write requests. From /proc/<pid>/stack, the reading process hangs in sync_page most of the time. What is the matter? Can anyone help me?

--
dott. Corrado Zoccolo          mailto:czocc...@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--
The self-confidence of a warrior is not the self-confidence of the average man. The average man seeks certainty in the eyes of the onlooker and calls that self-confidence. The warrior seeks impeccability in his own eyes and calls that humbleness.
Tales of Power - C. Castaneda
Re: question about IO-sched
many thanks. but why does the sys_read operation hang in sync_page? There is still plenty of free memory. I mean the actually free memory, excluding the various kinds of caches and buffers.

On Fri, 13 Jul 2012 22:15:31 +0800, Corrado Zoccolo wrote:

Hi,
the catch is that writes are "fire and forget", so they keep accumulating in the I/O scheduler, and there is always plenty of them to schedule (unless you explicitly make sync writes). The reader, instead, waits for the result of each read operation before scheduling a new read, so there is at most one outstanding read, and sometimes nothing.

The deadline scheduler is work-conserving, meaning that it never leaves the disk idle when there is work queued, and most of the time after an operation completes, there is only write work queued, so you see many more writes being sent to the device. Only schedulers that delay writes while waiting for reads (such as Anticipatory in old kernels, and now CFQ) can achieve higher read-to-write ratios.

Cheers,
Corrado

On Thu, Jul 12, 2012 at 11:01 AM, gaoqiang wrote:

Hi all,
I have long known that deadline is read-preferred, but a simple test gives the opposite result. Two processes run at the same time, one reading and one writing; they do nothing but I/O operations:

while(true) { read(); }

the other:

while(true) { write(); }

With the deadline I/O scheduler and an ext4 filesystem, the read rate was below about 3 MB/s and the write rate about 100 MB/s. I have tested both kernel 2.6.18 and kernel 2.6.32, getting the same result. I added some debug information to the kernel and recompiled, and found that it has little to do with the I/O scheduler layer, because the number of read requests reaching deadline was about 5% of the write requests. From /proc/<pid>/stack, the reading process hangs in sync_page most of the time. What is the matter? Can anyone help me?
question about IO-sched
Hi all,
I have long known that deadline is read-preferred, but a simple test gives the opposite result. Two processes run at the same time, one reading and one writing; they do nothing but I/O operations:

while(true) { read(); }

the other:

while(true) { write(); }

With the deadline I/O scheduler and an ext4 filesystem, the read rate was below about 3 MB/s and the write rate about 100 MB/s. I have tested both kernel 2.6.18 and kernel 2.6.32, getting the same result. I added some debug information to the kernel and recompiled, and found that it has little to do with the I/O scheduler layer, because the number of read requests reaching deadline was about 5% of the write requests. From /proc/<pid>/stack, the reading process hangs in sync_page most of the time. What is the matter? Can anyone help me?