On Wed, Jul 8, 2015 at 7:37 AM, Dave Chinner <da...@fromorbit.com> wrote:
> On Tue, Jul 07, 2015 at 05:29:43PM +0800, Gavin Guo wrote:
>> Hi all,
>>
>> Recently, we observed that there is the error message in
>> Ubuntu-3.13.0-48.80:
>>
>> "XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)"
>>
>> repeatedly shows in the dmesg. Temporarily, our workaround is to tune the
>> parameters, such as, vfs_cache_pressure, min_free_kbytes, and dirty_ratio.
>>
>> And we also found that there are different error messages regarding the
>> hung tasks which happened in xfs_log_commit_cil and xlog_cil_push.
>>
>> The log is available at: http://paste.ubuntu.com/11835007/
>>
>> The following link seems the same problem we suffered:
>>
>> XFS hangs with XFS: possible memory allocation deadlock in kmem_alloc
>> http://oss.sgi.com/archives/xfs/2015-03/msg00172.html
>>
>> I read the mail and found that there might be some modification regarding
>> to move the memory allocation outside the ctx lock. And I also read the
>> latest patch from February of 2015 to see if there is any new change
>> about that. Unfortunately, I didn't find anything regarding the change (may
>> be I'm not familiar with the XFS, so didn't find the commit). If it's
>> possible for someone who is familiar with the code to point out the commits
>> related to the bug if already exist or any status about the plan.
>
> No commits - the approach I thought we might be able to take to
> avoid the problem didn't work out. I have another idea of how we
> might solve the problem, but I haven't ad a chance to prototype it
> yet.
I have been reading the code for a while and still can't figure out how
to fix this. My current understanding is that the buddy system is running
out of memory, so the XFS kmem_alloc(), reached through

  xfs_log_commit_cil
    -> xlog_cil_insert_items
      -> xlog_cil_insert_format_items
        -> kmem_zalloc

fails and gets stuck retrying in its while loop. Two other threads are
running at the same time:

1) xfs_log_commit_cil -> down_read(&cil->xc_ctx_lock);
2) xlog_cil_push -> down_write(&cil->xc_ctx_lock);

So both of these threads are blocked, waiting for the first kmem_zalloc()
to succeed.

Is there a way to decrease the size of the memory request? Or could you
elaborate on the idea you mentioned? I know this is a problem that cannot
be solved in a short time, and I'd like to help if there is any
possibility.

Thanks,
Gavin
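P.S. To make the retry behaviour described above concrete, here is a minimal user-space sketch of a "retry until the allocation succeeds" loop. It is only a model of my reading of the code, not the kernel source: fake_kmalloc(), retrying_alloc() and WARN_EVERY are hypothetical names, and the "warn once per 100 failed attempts" interval is an assumption. The point is that the loop has no exit other than success, so any lock the caller holds (xc_ctx_lock for read, in my understanding) stays held for as long as the allocator keeps failing.

```c
/* User-space model of a retrying allocator. NOT the kernel implementation. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define WARN_EVERY 100 /* assumption: one warning per 100 failed attempts */

/* Hypothetical stand-in for kmalloc(): fails until *failures_left hits 0. */
static void *fake_kmalloc(size_t size, int *failures_left)
{
    if (*failures_left > 0) {
        (*failures_left)--;
        return NULL;
    }
    return malloc(size);
}

/*
 * Retry until the allocation succeeds. The number of warnings printed is
 * stored in *warnings so the behaviour can be checked by a caller.
 */
static void *retrying_alloc(size_t size, int failures, int *warnings)
{
    int retries = 0;
    void *ptr;

    assert(warnings != NULL);
    *warnings = 0;

    for (;;) {
        ptr = fake_kmalloc(size, &failures);
        if (ptr)
            return ptr; /* the only way out of the loop */

        if (++retries % WARN_EVERY == 0) {
            fprintf(stderr,
                    "possible memory allocation deadlock (retries=%d)\n",
                    retries);
            (*warnings)++;
        }
        /* A real implementation would sleep briefly here before retrying. */
    }
}
```

In this model, calling retrying_alloc(64, 250, &w) fails 250 times before succeeding and prints the warning at retries 100 and 200. If the allocator never recovers, the loop never returns, which matches the repeated message in dmesg and the hung tasks waiting on the lock.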