On Fri, 14 Mar 2014 16:46:26 -0400 Joseph Salisbury <joseph.salisb...@canonical.com> wrote:
> Hi Tetsuo,
>
> A kernel bug report was opened against Ubuntu[0]. We performed a kernel
> bisect, and found that reverting the following commit resolved this bug:
>
>
> commit 786235eeba0e1e85e5cbbb9f97d1087ad03dfa21
> Author: Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp>
> Date:   Tue Nov 12 15:06:45 2013 -0800
>
>     kthread: make kthread_create() killable
>
> The regression was introduced as of v3.13-rc1.
>
> The bug indicates an issue with the SAS controller during
> initialization, which prevents the system from booting. Additional
> details are available in the bug report or on request.
>
> I was hoping to get your feedback, since you are the patch author. Do
> you think gathering any additional data will help diagnose this issue,
> or would it be best to submit a revert request?
>
> [0] http://pad.lv/1276705

What process is running here?  Presumably modprobe.

A possible explanation is that modprobe has genuinely received a
SIGKILL.  Can you identify anything in this setup which might send a
SIGKILL to the modprobe process?

kthread_create_on_node() thinks that SIGKILL came from the oom-killer
and it cheerfully returns -ENOMEM, which is incorrect if that signal
came from userspace.  And I don't _think_ we prevent
userspace-originated signals from unblocking
wait_for_completion_killable()?

Root cause time: it's wrong for the oom-killer to use SIGKILL.  In fact
it's basically always wrong to send signals from in-kernel.  Signals
are a userspace IPC mechanism and using them in-kernel a) makes it hard
(or impossible) to distinguish them from userspace-originated signals
and b) permits userspace to produce surprising results in the kernel,
which I suspect is what we're seeing here.
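
To make the ambiguity around kthread_create_on_node() concrete, here is a
minimal userspace model in C. It is not the kernel code: the enum
kill_origin and the functions model_wait_killable(), model_kthread_create()
and model_demo() are all hypothetical names invented for this sketch, and
-EINTR merely stands in for whatever error the real killable wait returns.
The only point it illustrates is that a killable wait reports "a fatal
signal arrived" without saying who sent it, so a caller that maps that to
-ENOMEM gives the same answer for an oom-kill and for a userspace kill.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical: who sent the SIGKILL, if anyone. The waiter never sees this. */
enum kill_origin { ORIGIN_NONE, ORIGIN_OOM_KILLER, ORIGIN_USERSPACE };

/*
 * Model of a killable wait: returns 0 if the wait completed normally,
 * or a negative error if a fatal signal interrupted it. The origin of
 * the signal is deliberately lost in the return value.
 */
int model_wait_killable(enum kill_origin origin)
{
    return origin == ORIGIN_NONE ? 0 : -EINTR;
}

/*
 * Model of the caller described above: any interrupted wait is assumed
 * to be the oom-killer's doing and is reported as -ENOMEM.
 */
int model_kthread_create(enum kill_origin origin)
{
    if (model_wait_killable(origin) != 0)
        return -ENOMEM; /* assumption: "it must have been the oom-killer" */
    return 0;
}

/* Both origins are indistinguishable to whoever checks the result. */
void model_demo(void)
{
    assert(model_kthread_create(ORIGIN_OOM_KILLER) ==
           model_kthread_create(ORIGIN_USERSPACE));
}
```

In this model a caller that sees -ENOMEM cannot tell whether memory was
actually short or whether something in userspace sent a SIGKILL to the
process, which is the case suspected for modprobe here.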