Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 4 Dec 2015 17:57:44 +0100 Petr Mladekwrote: > On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > > > MN10300 has its own implementation for entering and exiting NMI > > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > > below an updated patch that adds printk_nmi_enter() and > > > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > > > to arch/mn10300/Kconfig and avoid the above warning. > > > > Hmm, so what exactly would go wrong if MN10300 (whatever that architecture > > is) would call nmi_enter() and nmi_exit() at the places where it's > > starting and finishing NMI handler? > > > > >From a cursory look, it seems like most (if not all) of the things called > > from nmi_{enter,exit}() would be nops there anyway. > > Good point. Max mentioned in the other main that the NMI handler > should follow the NMI ruler. I do not why it could not work. > In fact, it might improve things, e.g. nmi_enter() blocks > recursive NMIs. > > I think that it will move it into a separate patch, thought. > I've sort of lost the plot on this patchset. I know Daniel had concerns (resolved?). Sergey lost the ability to perform backtraces and has a proposed fix ("printk/nmi: restore printk_func in nmi_panic") but that wasn't fully resolved and I didn't merge anything. I'm not sure what Jan's thinking is on it all. So... I'll retain printk-nmi-generic-solution-for-safe-printk-in-nmi.patch printk-nmi-use-irq-work-only-when-ready.patch printk-nmi-warn-when-some-message-has-been-lost-in-nmi-context.patch printk-nmi-increase-the-size-of-nmi-buffer-and-make-it-configurable.patch in -mm for now. Perhaps I should drop them all and we start again after -rc1?
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 4 Dec 2015 17:57:44 +0100 Petr Mladek wrote: > On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > > > MN10300 has its own implementation for entering and exiting NMI > > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > > below an updated patch that adds printk_nmi_enter() and > > > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > > > to arch/mn10300/Kconfig and avoid the above warning. > > > > Hmm, so what exactly would go wrong if MN10300 (whatever that architecture > > is) would call nmi_enter() and nmi_exit() at the places where it's > > starting and finishing NMI handler? > > > > >From a cursory look, it seems like most (if not all) of the things called > > from nmi_{enter,exit}() would be nops there anyway. > > Good point. Max mentioned in the other main that the NMI handler > should follow the NMI ruler. I do not why it could not work. > In fact, it might improve things, e.g. nmi_enter() blocks > recursive NMIs. > > I think that it will move it into a separate patch, thought. > I've sort of lost the plot on this patchset. I know Daniel had concerns (resolved?). Sergey lost the ability to perform backtraces and has a proposed fix ("printk/nmi: restore printk_func in nmi_panic") but that wasn't fully resolved and I didn't merge anything. I'm not sure what Jan's thinking is on it all. So... I'll retain printk-nmi-generic-solution-for-safe-printk-in-nmi.patch printk-nmi-use-irq-work-only-when-ready.patch printk-nmi-warn-when-some-message-has-been-lost-in-nmi-context.patch printk-nmi-increase-the-size-of-nmi-buffer-and-make-it-configurable.patch in -mm for now. Perhaps I should drop them all and we start again after -rc1?
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Thu 2016-03-17 12:35:27, Andrew Morton wrote: > On Fri, 4 Dec 2015 17:57:44 +0100 Petr Mladekwrote: > > > On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > > > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > > > > > MN10300 has its own implementation for entering and exiting NMI > > > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > > > below an updated patch that adds printk_nmi_enter() and > > > > printk_nmi_exit() to the custom entry points. Then we could add > > > > HAVE_NMI > > > > to arch/mn10300/Kconfig and avoid the above warning. > > > > > > Hmm, so what exactly would go wrong if MN10300 (whatever that > > > architecture > > > is) would call nmi_enter() and nmi_exit() at the places where it's > > > starting and finishing NMI handler? > > > > > > >From a cursory look, it seems like most (if not all) of the things > > > >called > > > from nmi_{enter,exit}() would be nops there anyway. > > > > Good point. Max mentioned in the other main that the NMI handler > > should follow the NMI ruler. I do not why it could not work. > > In fact, it might improve things, e.g. nmi_enter() blocks > > recursive NMIs. > > > > I think that it will move it into a separate patch, thought. > > > > I've sort of lost the plot on this patchset. > > I know Daniel had concerns (resolved?). Sergey lost the ability to > perform backtraces and has a proposed fix ("printk/nmi: restore > printk_func in nmi_panic") but that wasn't fully resolved and I didn't > merge anything. I'm not sure what Jan's thinking is on it all. > > So... I'll retain > > printk-nmi-generic-solution-for-safe-printk-in-nmi.patch > printk-nmi-use-irq-work-only-when-ready.patch > printk-nmi-warn-when-some-message-has-been-lost-in-nmi-context.patch > printk-nmi-increase-the-size-of-nmi-buffer-and-make-it-configurable.patch > > in -mm for now. Perhaps I should drop them all and we start again > after -rc1? Please, drop it for now. I'll send an updated version that will better handle Daniel's concerns after rc1. I thought that it had already been decided. You wanted to remove the patchset in favour of "improvements to the nmi_backtrace code" by Chris Metcalf, see http://thread.gmane.org/gmane.linux.ports.arm.kernel/482845/focus=483002 Best Regards, Petr
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Thu 2016-03-17 12:35:27, Andrew Morton wrote: > On Fri, 4 Dec 2015 17:57:44 +0100 Petr Mladek wrote: > > > On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > > > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > > > > > MN10300 has its own implementation for entering and exiting NMI > > > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > > > below an updated patch that adds printk_nmi_enter() and > > > > printk_nmi_exit() to the custom entry points. Then we could add > > > > HAVE_NMI > > > > to arch/mn10300/Kconfig and avoid the above warning. > > > > > > Hmm, so what exactly would go wrong if MN10300 (whatever that > > > architecture > > > is) would call nmi_enter() and nmi_exit() at the places where it's > > > starting and finishing NMI handler? > > > > > > >From a cursory look, it seems like most (if not all) of the things > > > >called > > > from nmi_{enter,exit}() would be nops there anyway. > > > > Good point. Max mentioned in the other main that the NMI handler > > should follow the NMI ruler. I do not why it could not work. > > In fact, it might improve things, e.g. nmi_enter() blocks > > recursive NMIs. > > > > I think that it will move it into a separate patch, thought. > > > > I've sort of lost the plot on this patchset. > > I know Daniel had concerns (resolved?). Sergey lost the ability to > perform backtraces and has a proposed fix ("printk/nmi: restore > printk_func in nmi_panic") but that wasn't fully resolved and I didn't > merge anything. I'm not sure what Jan's thinking is on it all. > > So... I'll retain > > printk-nmi-generic-solution-for-safe-printk-in-nmi.patch > printk-nmi-use-irq-work-only-when-ready.patch > printk-nmi-warn-when-some-message-has-been-lost-in-nmi-context.patch > printk-nmi-increase-the-size-of-nmi-buffer-and-make-it-configurable.patch > > in -mm for now. Perhaps I should drop them all and we start again > after -rc1? Please, drop it for now. I'll send an updated version that will better handle Daniel's concerns after rc1. I thought that it had already been decided. You wanted to remove the patchset in favour of "improvements to the nmi_backtrace code" by Chris Metcalf, see http://thread.gmane.org/gmane.linux.ports.arm.kernel/482845/focus=483002 Best Regards, Petr
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > MN10300 has its own implementation for entering and exiting NMI > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > below an updated patch that adds printk_nmi_enter() and > > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > > to arch/mn10300/Kconfig and avoid the above warning. > > Hmm, so what exactly would go wrong if MN10300 (whatever that architecture > is) would call nmi_enter() and nmi_exit() at the places where it's > starting and finishing NMI handler? > > >From a cursory look, it seems like most (if not all) of the things called > from nmi_{enter,exit}() would be nops there anyway. Good point. Max mentioned in the other main that the NMI handler should follow the NMI ruler. I do not why it could not work. In fact, it might improve things, e.g. nmi_enter() blocks recursive NMIs. I think that it will move it into a separate patch, thought. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Wed 2015-12-02 13:45:16, Michael Ellerman wrote: > On Fri, 2015-11-27 at 12:09 +0100, Petr Mladek wrote: > > > printk() takes some locks and could not be used a safe way in NMI > > context. > > > > The chance of a deadlock is real especially when printing > > stacks from all CPUs. This particular problem has been addressed > > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > > trace on all CPUs"). > > ... > > > diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c > > new file mode 100644 > > index ..3989e13a0021 > > --- /dev/null > > +++ b/kernel/printk/nmi.c > > @@ -0,0 +1,200 @@ > > ... > > > + > > +struct nmi_seq_buf { > > + atomic_tlen;/* length of written data */ > > + struct irq_work work; /* IRQ work that flushes the buffer */ > > + unsigned char buffer[PAGE_SIZE - sizeof(atomic_t) - > > + sizeof(struct irq_work)]; > > +}; > > +static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); > > > PAGE_SIZE isn't always 4K. > > On typical powerpc systems this will give you 128K, and on some 512K, which is > probably not what we wanted. Good point! > The existing code just did: > > #define NMI_BUF_SIZE 4096 I will change this to 8192. The 4kB were not enough in some cases. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Wed 2015-12-02 13:45:16, Michael Ellerman wrote: > On Fri, 2015-11-27 at 12:09 +0100, Petr Mladek wrote: > > > printk() takes some locks and could not be used a safe way in NMI > > context. > > > > The chance of a deadlock is real especially when printing > > stacks from all CPUs. This particular problem has been addressed > > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > > trace on all CPUs"). > > ... > > > diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c > > new file mode 100644 > > index ..3989e13a0021 > > --- /dev/null > > +++ b/kernel/printk/nmi.c > > @@ -0,0 +1,200 @@ > > ... > > > + > > +struct nmi_seq_buf { > > + atomic_tlen;/* length of written data */ > > + struct irq_work work; /* IRQ work that flushes the buffer */ > > + unsigned char buffer[PAGE_SIZE - sizeof(atomic_t) - > > + sizeof(struct irq_work)]; > > +}; > > +static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); > > > PAGE_SIZE isn't always 4K. > > On typical powerpc systems this will give you 128K, and on some 512K, which is > probably not what we wanted. Good point! > The existing code just did: > > #define NMI_BUF_SIZE 4096 I will change this to 8192. The 4kB were not enough in some cases. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Wed 2015-12-02 00:24:49, Jiri Kosina wrote: > On Fri, 27 Nov 2015, Petr Mladek wrote: > > > MN10300 has its own implementation for entering and exiting NMI > > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > > below an updated patch that adds printk_nmi_enter() and > > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > > to arch/mn10300/Kconfig and avoid the above warning. > > Hmm, so what exactly would go wrong if MN10300 (whatever that architecture > is) would call nmi_enter() and nmi_exit() at the places where it's > starting and finishing NMI handler? > > >From a cursory look, it seems like most (if not all) of the things called > from nmi_{enter,exit}() would be nops there anyway. Good point. Max mentioned in the other main that the NMI handler should follow the NMI ruler. I do not why it could not work. In fact, it might improve things, e.g. nmi_enter() blocks recursive NMIs. I think that it will move it into a separate patch, thought. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 2015-11-27 at 12:09 +0100, Petr Mladek wrote: > printk() takes some locks and could not be used a safe way in NMI > context. > > The chance of a deadlock is real especially when printing > stacks from all CPUs. This particular problem has been addressed > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > trace on all CPUs"). ... > diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c > new file mode 100644 > index ..3989e13a0021 > --- /dev/null > +++ b/kernel/printk/nmi.c > @@ -0,0 +1,200 @@ ... > + > +struct nmi_seq_buf { > + atomic_tlen;/* length of written data */ > + struct irq_work work; /* IRQ work that flushes the buffer */ > + unsigned char buffer[PAGE_SIZE - sizeof(atomic_t) - > +sizeof(struct irq_work)]; > +}; > +static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); PAGE_SIZE isn't always 4K. On typical powerpc systems this will give you 128K, and on some 512K, which is probably not what we wanted. The existing code just did: #define NMI_BUF_SIZE 4096 So I think you should just go back to doing that. cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 27 Nov 2015, Petr Mladek wrote: > MN10300 has its own implementation for entering and exiting NMI > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > below an updated patch that adds printk_nmi_enter() and > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > to arch/mn10300/Kconfig and avoid the above warning. Hmm, so what exactly would go wrong if MN10300 (whatever that architecture is) would call nmi_enter() and nmi_exit() at the places where it's starting and finishing NMI handler? >From a cursory look, it seems like most (if not all) of the things called from nmi_{enter,exit}() would be nops there anyway. Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 27 Nov 2015, Petr Mladek wrote: > MN10300 has its own implementation for entering and exiting NMI > handlers. It does not call nmi_enter() and nmi_exit(). Please, find > below an updated patch that adds printk_nmi_enter() and > printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI > to arch/mn10300/Kconfig and avoid the above warning. Hmm, so what exactly would go wrong if MN10300 (whatever that architecture is) would call nmi_enter() and nmi_exit() at the places where it's starting and finishing NMI handler? >From a cursory look, it seems like most (if not all) of the things called from nmi_{enter,exit}() would be nops there anyway. Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri, 2015-11-27 at 12:09 +0100, Petr Mladek wrote: > printk() takes some locks and could not be used a safe way in NMI > context. > > The chance of a deadlock is real especially when printing > stacks from all CPUs. This particular problem has been addressed > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > trace on all CPUs"). ... > diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c > new file mode 100644 > index ..3989e13a0021 > --- /dev/null > +++ b/kernel/printk/nmi.c > @@ -0,0 +1,200 @@ ... > + > +struct nmi_seq_buf { > + atomic_tlen;/* length of written data */ > + struct irq_work work; /* IRQ work that flushes the buffer */ > + unsigned char buffer[PAGE_SIZE - sizeof(atomic_t) - > +sizeof(struct irq_work)]; > +}; > +static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); PAGE_SIZE isn't always 4K. On typical powerpc systems this will give you 128K, and on some 512K, which is probably not what we wanted. The existing code just did: #define NMI_BUF_SIZE 4096 So I think you should just go back to doing that. cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri 2015-11-27 17:26:16, Max Filippov wrote: Hi Max, > > Another exception is Xtensa architecture that uses just a > > fake NMI. > > It's called fake because it's actually maskable, but sometimes > it is safe to use it as NMI (when there are no other IRQs at the > same priority level and that level equals EXCM level). That > condition is checked in arch/xtensa/include/asm/processor.h > So 'fake' here is to avoid confusion with real NMI that exists > on xtensa (and is not currently used in linux), otherwise code > that runs in fake NMI must follow the NMI rules. > > To make xtensa compatible with your change we can add a > choice whether fake NMI should be used to kconfig. It can > then set HAVE_NMI accordingly. I'll post a patch for xtensa. Thanks a lot for explanation. I'll wait for the destiny of the patch adding CONFIG_XTENSA_FAKE_NMI. It is not easy for me to review. Anyway, we could select HAVE_NMI for Xtensa anytime later if this patchset goes in earlier. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri 2015-11-27 17:26:16, Max Filippov wrote: Hi Max, > > Another exception is Xtensa architecture that uses just a > > fake NMI. > > It's called fake because it's actually maskable, but sometimes > it is safe to use it as NMI (when there are no other IRQs at the > same priority level and that level equals EXCM level). That > condition is checked in arch/xtensa/include/asm/processor.h > So 'fake' here is to avoid confusion with real NMI that exists > on xtensa (and is not currently used in linux), otherwise code > that runs in fake NMI must follow the NMI rules. > > To make xtensa compatible with your change we can add a > choice whether fake NMI should be used to kconfig. It can > then set HAVE_NMI accordingly. I'll post a patch for xtensa. Thanks a lot for explanation. I'll wait for the destiny of the patch adding CONFIG_XTENSA_FAKE_NMI. It is not easy for me to review. Anyway, we could select HAVE_NMI for Xtensa anytime later if this patchset goes in earlier. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri 2015-11-27 19:49:48, kbuild test robot wrote: > Hi Petr, > > [auto build test WARNING on powerpc/next] > [also build test WARNING on v4.4-rc2 next-20151127] > [cannot apply to tip/x86/core] > > url: > https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 > base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next > config: mn10300-asb2364_defconfig (attached as .config) > reproduce: > wget > https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross > -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > make.cross ARCH=mn10300 > > All warnings (new ones prefixed by >>): > > warning: (MN10300) selects HAVE_NMI_WATCHDOG which has unmet direct > dependencies (HAVE_NMI) MN10300 has its own implementation for entering and exiting NMI handlers. It does not call nmi_enter() and nmi_exit(). Please, find below an updated patch that adds printk_nmi_enter() and printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI to arch/mn10300/Kconfig and avoid the above warning. The updated patch also fixes includes in kernel/printk/nmi.c and kernel/printk/printk.h to fix the other build errors found by kbuild test robot. The kbuild test robot is really cool thing! >From 1689f635cc423ff9887c6774ad6b59a1ea885e4b Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Thu, 2 Jul 2015 13:17:17 +0200 Subject: [PATCH 1/5] printk/nmi: Generic solution for safe printk in NMI printk() takes some locks and could not be used a safe way in NMI context. The chance of a deadlock is real especially when printing stacks from all CPUs. This particular problem has been addressed on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack trace on all CPUs"). This patch reuses most of the code and makes it generic. It is useful for all messages and architectures that support NMI. The alternative printk_func is set when entering and is reseted when leaving NMI context. It queues IRQ work to copy the messages into the main ring buffer in a safe context. __printk_nmi_flush() copies all available messages and reset the buffer. Then we could use a simple cmpxchg operations to get synchronized with writers. There is also used a spinlock to get synchronized with other flushers. We do not longer use seq_buf because it depends on external lock. It would be hard to make all supported operations safe for a lockless use. It would be confusing and error prone to make only some operations safe. The code is put into separate printk/nmi.c as suggested by Steven Rostedt. It needs a per-CPU buffer and is compiled only on architectures that call nmi_enter(). This is achieved by the new HAVE_NMI Kconfig flag. One exception is arm where the deferred printing is used for printing backtraces even without NMI. For this purpose, we define NEED_PRINTK_NMI Kconfig flag. The alternative printk_func is explicitly set when IPI_CPU_BACKTRACE is handled. Second exception is MN10300 architecture that has its own implementation of entering and exiting NMI handlers. It even has a separate optimized implementation for NMI watchdog. This patch adds printk_nmi_enter() and printk_nmi_exit() to these custom entry points. Note that we have to define HAVE_NMI here. Otherwise, Kconfig complains about unmet direct dependencies for HAVE_NMI_WATCHDOG. Last exception is Xtensa architecture that uses just a fake NMI. The patch is heavily based on the draft from Peter Zijlstra, see https://lkml.org/lkml/2015/6/10/327 Suggested-by: Peter Zijlstra Suggested-by: Steven Rostedt Signed-off-by: Petr Mladek --- arch/Kconfig | 7 ++ arch/arm/Kconfig | 2 + arch/arm/kernel/smp.c | 2 + arch/avr32/Kconfig | 1 + arch/blackfin/Kconfig | 1 + arch/cris/Kconfig | 1 + arch/mips/Kconfig | 1 + arch/mn10300/Kconfig | 1 + arch/mn10300/kernel/mn10300-watchdog.c | 4 + arch/mn10300/kernel/smp.c | 3 + arch/powerpc/Kconfig | 1 + arch/s390/Kconfig | 1 + arch/sh/Kconfig| 1 + arch/sparc/Kconfig | 1 + arch/tile/Kconfig | 1 + arch/x86/Kconfig | 1 + arch/x86/kernel/apic/hw_nmi.c | 1 - include/linux/hardirq.h| 2 + include/linux/percpu.h | 3 - include/linux/printk.h | 12 +- init/Kconfig | 5 + init/main.c| 1 + kernel/printk/Makefile | 1 + kernel/printk/nmi.c| 202 + kernel/printk/printk.c | 19 +--- kernel/printk/printk.h |
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Peter, On Fri, Nov 27, 2015 at 2:09 PM, Petr Mladek wrote: > printk() takes some locks and could not be used a safe way in NMI > context. > > The chance of a deadlock is real especially when printing > stacks from all CPUs. This particular problem has been addressed > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > trace on all CPUs"). > > This patch reuses most of the code and makes it generic. It is > useful for all messages and architectures that support NMI. > > The alternative printk_func is set when entering and is reseted when > leaving NMI context. It queues IRQ work to copy the messages into > the main ring buffer in a safe context. > > __printk_nmi_flush() copies all available messages and reset > the buffer. Then we could use a simple cmpxchg operations to > get synchronized with writers. There is also used a spinlock > to get synchronized with other flushers. > > We do not longer use seq_buf because it depends on external lock. > It would be hard to make all supported operations safe for > a lockless use. It would be confusing and error prone to > make only some operations safe. > > The code is put into separate printk/nmi.c as suggested by > Steven Rostedt. It needs a per-CPU buffer and is compiled only > on architectures that call nmi_enter(). This is achieved by > the new HAVE_NMI Kconfig flag. > > One exception is arm where the deferred printing is used for > printing backtraces even without NMI. For this purpose, > we define NEED_PRINTK_NMI Kconfig flag. The alternative > printk_func is explicitly set when IPI_CPU_BACKTRACE is > handled. > > Another exception is Xtensa architecture that uses just a > fake NMI. It's called fake because it's actually maskable, but sometimes it is safe to use it as NMI (when there are no other IRQs at the same priority level and that level equals EXCM level). That condition is checked in arch/xtensa/include/asm/processor.h So 'fake' here is to avoid confusion with real NMI that exists on xtensa (and is not currently used in linux), otherwise code that runs in fake NMI must follow the NMI rules. To make xtensa compatible with your change we can add a choice whether fake NMI should be used to kconfig. It can then set HAVE_NMI accordingly. I'll post a patch for xtensa. -- Thanks. -- Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Petr, [auto build test ERROR on: powerpc/next] [also build test ERROR on: v4.4-rc2 next-20151127] [cannot apply to: tip/x86/core] url: https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: i386-randconfig-s1-201547 (attached as .config) reproduce: # save the attached .config to linux build tree make ARCH=i386 All error/warnings (new ones prefixed by >>): In file included from arch/x86/include/asm/percpu.h:551:0, from arch/x86/include/asm/current.h:5, from arch/x86/include/asm/processor.h:15, from arch/x86/include/asm/atomic.h:6, from include/linux/atomic.h:4, from include/linux/llist.h:58, from include/linux/smp.h:14, from kernel/printk/nmi.c:18: kernel/printk/printk.h: In function 'vprintk_func': >> include/asm-generic/percpu.h:111:2: error: implicit declaration of function >> 'preempt_disable' [-Werror=implicit-function-declaration] preempt_disable(); \ ^ >> include/asm-generic/percpu.h:305:31: note: in expansion of macro >> 'this_cpu_generic_read' #define this_cpu_read_8(pcp) this_cpu_generic_read(pcp) ^ >> include/linux/percpu-defs.h:311:23: note: in expansion of macro >> 'this_cpu_read_8' case 8: pscr_ret__ = stem##8(variable); break; \ ^ >> include/linux/percpu-defs.h:494:29: note: in expansion of macro >> '__pcpu_size_call_return' #define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp) ^ >> kernel/printk/printk.h:34:9: note: in expansion of macro 'this_cpu_read' return this_cpu_read(printk_func)(fmt, args); ^ >> include/asm-generic/percpu.h:113:2: error: implicit declaration of function >> 'preempt_enable' [-Werror=implicit-function-declaration] preempt_enable(); \ ^ >> include/asm-generic/percpu.h:305:31: note: in expansion of macro >> 'this_cpu_generic_read' #define this_cpu_read_8(pcp) this_cpu_generic_read(pcp) ^ >> include/linux/percpu-defs.h:311:23: note: in expansion of macro >> 'this_cpu_read_8' case 8: pscr_ret__ = stem##8(variable); break; \ ^ >> include/linux/percpu-defs.h:494:29: note: in expansion of macro >> '__pcpu_size_call_return' #define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp) ^ >> kernel/printk/printk.h:34:9: note: in expansion of macro 'this_cpu_read' return this_cpu_read(printk_func)(fmt, args); ^ kernel/printk/nmi.c: In function '__printk_nmi_flush': >> kernel/printk/nmi.c:106:9: error: unknown type name 'raw_spinlock_t' static raw_spinlock_t read_lock = ^ >> kernel/printk/nmi.c:107:3: error: implicit declaration of function >> '__RAW_SPIN_LOCK_INITIALIZER' [-Werror=implicit-function-declaration] __RAW_SPIN_LOCK_INITIALIZER(read_lock); ^ >> kernel/printk/nmi.c:107:3: error: initializer element is not constant >> kernel/printk/nmi.c:118:2: error: implicit declaration of function >> 'raw_spin_lock' [-Werror=implicit-function-declaration] raw_spin_lock(_lock); ^ >> kernel/printk/nmi.c:163:2: error: implicit declaration of function >> 'raw_spin_unlock' [-Werror=implicit-function-declaration] raw_spin_unlock(_lock); ^ cc1: some warnings being treated as errors vim +/preempt_disable +111 include/asm-generic/percpu.h 9c28278a24 Tejun Heo 2014-06-17 105(__ret); \ 9c28278a24 Tejun Heo 2014-06-17 106 }) 9c28278a24 Tejun Heo 2014-06-17 107 eba117889a Tejun Heo 2014-06-17 108 #define this_cpu_generic_read(pcp) \ eba117889a Tejun Heo 2014-06-17 109 ({ \ eba117889a Tejun Heo 2014-06-17 110typeof(pcp) __ret; \ 9c28278a24 Tejun Heo 2014-06-17 @111preempt_disable(); \ eba117889a Tejun Heo 2014-06-17 112__ret = *this_cpu_ptr(&(pcp)); \ 9c28278a24 Tejun Heo 2014-06-17 @113preempt_enable(); \ eba117889a Tejun Heo 2014-06-17 114__ret; \ 9c28278a24 Tejun Heo 2014-06-17 115 }) 9c28278a24 Tejun Heo 2014-06-17 116 eba117889a Tejun Heo 2014-06-17 117 #define this_cpu_generic_to_op(pcp, val, op) \ 9c28278a24 Tejun Heo 2014-06-17 118 do { \ eba117889a Tejun Heo 2014-06-17 119
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Petr, [auto build test WARNING on powerpc/next] [also build test WARNING on v4.4-rc2 next-20151127] [cannot apply to tip/x86/core] url: https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: mn10300-asb2364_defconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=mn10300 All warnings (new ones prefixed by >>): warning: (MN10300) selects HAVE_NMI_WATCHDOG which has unmet direct dependencies (HAVE_NMI) --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Petr, [auto build test ERROR on: powerpc/next] [also build test ERROR on: v4.4-rc2 next-20151127] [cannot apply to: tip/x86/core] url: https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: i386-randconfig-s1-201547 (attached as .config) reproduce: # save the attached .config to linux build tree make ARCH=i386 All error/warnings (new ones prefixed by >>): In file included from arch/x86/include/asm/percpu.h:551:0, from arch/x86/include/asm/current.h:5, from arch/x86/include/asm/processor.h:15, from arch/x86/include/asm/atomic.h:6, from include/linux/atomic.h:4, from include/linux/llist.h:58, from include/linux/smp.h:14, from kernel/printk/nmi.c:18: kernel/printk/printk.h: In function 'vprintk_func': >> include/asm-generic/percpu.h:111:2: error: implicit declaration of function >> 'preempt_disable' [-Werror=implicit-function-declaration] preempt_disable(); \ ^ >> include/asm-generic/percpu.h:305:31: note: in expansion of macro >> 'this_cpu_generic_read' #define this_cpu_read_8(pcp) this_cpu_generic_read(pcp) ^ >> include/linux/percpu-defs.h:311:23: note: in expansion of macro >> 'this_cpu_read_8' case 8: pscr_ret__ = stem##8(variable); break; \ ^ >> include/linux/percpu-defs.h:494:29: note: in expansion of macro >> '__pcpu_size_call_return' #define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp) ^ >> kernel/printk/printk.h:34:9: note: in expansion of macro 'this_cpu_read' return this_cpu_read(printk_func)(fmt, args); ^ >> include/asm-generic/percpu.h:113:2: error: implicit declaration of function >> 'preempt_enable' [-Werror=implicit-function-declaration] preempt_enable(); \ ^ >> include/asm-generic/percpu.h:305:31: note: in expansion of macro >> 'this_cpu_generic_read' #define this_cpu_read_8(pcp) this_cpu_generic_read(pcp) ^ >> include/linux/percpu-defs.h:311:23: note: in expansion of macro >> 'this_cpu_read_8' case 8: pscr_ret__ = stem##8(variable); break; \ ^ >> include/linux/percpu-defs.h:494:29: note: in expansion of macro >> '__pcpu_size_call_return' #define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp) ^ >> kernel/printk/printk.h:34:9: note: in expansion of macro 'this_cpu_read' return this_cpu_read(printk_func)(fmt, args); ^ kernel/printk/nmi.c: In function '__printk_nmi_flush': >> kernel/printk/nmi.c:106:9: error: unknown type name 'raw_spinlock_t' static raw_spinlock_t read_lock = ^ >> kernel/printk/nmi.c:107:3: error: implicit declaration of function >> '__RAW_SPIN_LOCK_INITIALIZER' [-Werror=implicit-function-declaration] __RAW_SPIN_LOCK_INITIALIZER(read_lock); ^ >> kernel/printk/nmi.c:107:3: error: initializer element is not constant >> kernel/printk/nmi.c:118:2: error: implicit declaration of function >> 'raw_spin_lock' [-Werror=implicit-function-declaration] raw_spin_lock(_lock); ^ >> kernel/printk/nmi.c:163:2: error: implicit declaration of function >> 'raw_spin_unlock' [-Werror=implicit-function-declaration] raw_spin_unlock(_lock); ^ cc1: some warnings being treated as errors vim +/preempt_disable +111 include/asm-generic/percpu.h 9c28278a24 Tejun Heo 2014-06-17 105(__ret); \ 9c28278a24 Tejun Heo 2014-06-17 106 }) 9c28278a24 Tejun Heo 2014-06-17 107 eba117889a Tejun Heo 2014-06-17 108 #define this_cpu_generic_read(pcp) \ eba117889a Tejun Heo 2014-06-17 109 ({ \ eba117889a Tejun Heo 2014-06-17 110typeof(pcp) __ret; \ 9c28278a24 Tejun Heo 2014-06-17 @111preempt_disable(); \ eba117889a Tejun Heo 2014-06-17 112__ret = *this_cpu_ptr(&(pcp)); \ 9c28278a24 Tejun Heo 2014-06-17 @113preempt_enable(); \ eba117889a Tejun Heo 2014-06-17 114__ret; \ 9c28278a24 Tejun Heo 2014-06-17 115 }) 9c28278a24 Tejun Heo 2014-06-17 116 eba117889a Tejun Heo 2014-06-17 117 #define this_cpu_generic_to_op(pcp, val, op) \ 9c28278a24 Tejun Heo 2014-06-17 118 do { \ eba117889a Tejun Heo 2014-06-17 119
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Petr, [auto build test WARNING on powerpc/next] [also build test WARNING on v4.4-rc2 next-20151127] [cannot apply to tip/x86/core] url: https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: mn10300-asb2364_defconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=mn10300 All warnings (new ones prefixed by >>): warning: (MN10300) selects HAVE_NMI_WATCHDOG which has unmet direct dependencies (HAVE_NMI) --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
Hi Peter, On Fri, Nov 27, 2015 at 2:09 PM, Petr Mladekwrote: > printk() takes some locks and could not be used a safe way in NMI > context. > > The chance of a deadlock is real especially when printing > stacks from all CPUs. This particular problem has been addressed > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack > trace on all CPUs"). > > This patch reuses most of the code and makes it generic. It is > useful for all messages and architectures that support NMI. > > The alternative printk_func is set when entering and is reseted when > leaving NMI context. It queues IRQ work to copy the messages into > the main ring buffer in a safe context. > > __printk_nmi_flush() copies all available messages and reset > the buffer. Then we could use a simple cmpxchg operations to > get synchronized with writers. There is also used a spinlock > to get synchronized with other flushers. > > We do not longer use seq_buf because it depends on external lock. > It would be hard to make all supported operations safe for > a lockless use. It would be confusing and error prone to > make only some operations safe. > > The code is put into separate printk/nmi.c as suggested by > Steven Rostedt. It needs a per-CPU buffer and is compiled only > on architectures that call nmi_enter(). This is achieved by > the new HAVE_NMI Kconfig flag. > > One exception is arm where the deferred printing is used for > printing backtraces even without NMI. For this purpose, > we define NEED_PRINTK_NMI Kconfig flag. The alternative > printk_func is explicitly set when IPI_CPU_BACKTRACE is > handled. > > Another exception is Xtensa architecture that uses just a > fake NMI. It's called fake because it's actually maskable, but sometimes it is safe to use it as NMI (when there are no other IRQs at the same priority level and that level equals EXCM level). That condition is checked in arch/xtensa/include/asm/processor.h So 'fake' here is to avoid confusion with real NMI that exists on xtensa (and is not currently used in linux), otherwise code that runs in fake NMI must follow the NMI rules. To make xtensa compatible with your change we can add a choice whether fake NMI should be used to kconfig. It can then set HAVE_NMI accordingly. I'll post a patch for xtensa. -- Thanks. -- Max -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/5] printk/nmi: Generic solution for safe printk in NMI
On Fri 2015-11-27 19:49:48, kbuild test robot wrote: > Hi Petr, > > [auto build test WARNING on powerpc/next] > [also build test WARNING on v4.4-rc2 next-20151127] > [cannot apply to tip/x86/core] > > url: > https://github.com/0day-ci/linux/commits/Petr-Mladek/Cleaning-printk-stuff-in-NMI-context/20151127-191620 > base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next > config: mn10300-asb2364_defconfig (attached as .config) > reproduce: > wget > https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross > -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > make.cross ARCH=mn10300 > > All warnings (new ones prefixed by >>): > > warning: (MN10300) selects HAVE_NMI_WATCHDOG which has unmet direct > dependencies (HAVE_NMI) MN10300 has its own implementation for entering and exiting NMI handlers. It does not call nmi_enter() and nmi_exit(). Please, find below an updated patch that adds printk_nmi_enter() and printk_nmi_exit() to the custom entry points. Then we could add HAVE_NMI to arch/mn10300/Kconfig and avoid the above warning. The updated patch also fixes includes in kernel/printk/nmi.c and kernel/printk/printk.h to fix the other build errors found by kbuild test robot. The kbuild test robot is really cool thing! >From 1689f635cc423ff9887c6774ad6b59a1ea885e4b Mon Sep 17 00:00:00 2001 From: Petr MladekDate: Thu, 2 Jul 2015 13:17:17 +0200 Subject: [PATCH 1/5] printk/nmi: Generic solution for safe printk in NMI printk() takes some locks and could not be used a safe way in NMI context. The chance of a deadlock is real especially when printing stacks from all CPUs. This particular problem has been addressed on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack trace on all CPUs"). This patch reuses most of the code and makes it generic. It is useful for all messages and architectures that support NMI. The alternative printk_func is set when entering and is reseted when leaving NMI context. It queues IRQ work to copy the messages into the main ring buffer in a safe context. __printk_nmi_flush() copies all available messages and reset the buffer. Then we could use a simple cmpxchg operations to get synchronized with writers. There is also used a spinlock to get synchronized with other flushers. We do not longer use seq_buf because it depends on external lock. It would be hard to make all supported operations safe for a lockless use. It would be confusing and error prone to make only some operations safe. The code is put into separate printk/nmi.c as suggested by Steven Rostedt. It needs a per-CPU buffer and is compiled only on architectures that call nmi_enter(). This is achieved by the new HAVE_NMI Kconfig flag. One exception is arm where the deferred printing is used for printing backtraces even without NMI. For this purpose, we define NEED_PRINTK_NMI Kconfig flag. The alternative printk_func is explicitly set when IPI_CPU_BACKTRACE is handled. Second exception is MN10300 architecture that has its own implementation of entering and exiting NMI handlers. It even has a separate optimized implementation for NMI watchdog. This patch adds printk_nmi_enter() and printk_nmi_exit() to these custom entry points. Note that we have to define HAVE_NMI here. Otherwise, Kconfig complains about unmet direct dependencies for HAVE_NMI_WATCHDOG. Last exception is Xtensa architecture that uses just a fake NMI. The patch is heavily based on the draft from Peter Zijlstra, see https://lkml.org/lkml/2015/6/10/327 Suggested-by: Peter Zijlstra Suggested-by: Steven Rostedt Signed-off-by: Petr Mladek --- arch/Kconfig | 7 ++ arch/arm/Kconfig | 2 + arch/arm/kernel/smp.c | 2 + arch/avr32/Kconfig | 1 + arch/blackfin/Kconfig | 1 + arch/cris/Kconfig | 1 + arch/mips/Kconfig | 1 + arch/mn10300/Kconfig | 1 + arch/mn10300/kernel/mn10300-watchdog.c | 4 + arch/mn10300/kernel/smp.c | 3 + arch/powerpc/Kconfig | 1 + arch/s390/Kconfig | 1 + arch/sh/Kconfig| 1 + arch/sparc/Kconfig | 1 + arch/tile/Kconfig | 1 + arch/x86/Kconfig | 1 + arch/x86/kernel/apic/hw_nmi.c | 1 - include/linux/hardirq.h| 2 + include/linux/percpu.h | 3 - include/linux/printk.h | 12 +- init/Kconfig | 5 + init/main.c| 1 + kernel/printk/Makefile | 1 + kernel/printk/nmi.c| 202 +