Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Sat, Feb 20, 2021 at 05:32:30PM +1100, Finn Thain wrote: > Nope. Interrupt priority masking is there to place an upper bound > interrupt latency. That's why this feature is shipping in contemporary > hardware (e.g. ARM GIC). If you care about real time workloads on arm64, > that may interest you. I don't know if it's still true today, but in the past there was a very noticeable difference in timer stability between the 68k macintosh models with the timer interrupt at IPL 1 as compared to the models where the timer interrupt was at IPL 6. The ability to preempt the other interrupt handlers made the difference between a usable clock and one that was pretty unreliable. Brad Boyer f...@allandria.com
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Thu, 18 Feb 2021, Arnd Bergmann wrote: > On Thu, Feb 18, 2021 at 6:30 AM Finn Thain wrote: > > On Wed, 17 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > > > That scenario seems a little contrived to me (drivers for two or > > > > more devices sharing state through their interrupt handlers). Is > > > > it real? I suppose every platform has its quirks. The irq lock in > > > > sonic_interrupt() is only there because of a platform quirk (the > > > > same device can trigger either of two IRQs). Anyway, no-one > > > > expects all drivers to work on all platforms; I don't know why it > > > > bothers you so much when platforms differ. > > > > > > Basically, we wrote drivers with the assumption that this driver > > > will be cross-platform. (Of course there are some drivers which can > > > only work on one platform, for example, if the IP of the device is > > > only used in one platform as an internal component of a specific > > > SoC.) > > > > > > So once a device has two or more interrupts, we need to consider one > > > interrupt might preempt another one on m68k on the same cpu if we > > > also want to support this driver on m68k. this usually doesn't > > > matter on other platforms. > > > > When users show up who desire to run your drivers on their platform, > > you can expect them to bring patches and a MAINTAINERS file entry. > > AFAIK, Linux development has always worked that way. > > This is only part of the picture though. We also also constantly trying > to generalize the internal interfaces, to make sure that platforms work > the same way and that it's possible to write drivers in a portable way > without having to rely on platform maintainers to point out the > differences. > > I think it would make a lot of sense to remove the architecture > differences here by making m68k work the same way as the others and > documenting that as the expected behavior. > If you had some great new feature that was incompatible with priority masking, or incompatible with existing drivers portable enough to support such features, then I would be more amenable to your plan to remove functionality. But there's no real justification here. You say platform maintainers should not have to "point out the differences". But is that not their job? > You are probably right that there are no specific bugs on m68k machines > that rely on the nested hardirqs today, but I think they only get away > with it because > > a) there is no SMP support on m68k, so it likely doesn't run into the >more subtle cases with lock ordering that you could get when you have >hardirq handlers on multiple CPUs in parallel > And that's relevant because SMP support is now mandatory? Is this the logical consequence of your intention to "remove the architecture differences"? > b) there is a very limited number of device drivers that are actually >used on m68k, in particular only M54xx has PCI support, but that in >turn has a different interrupt scheme. > Everyone is afraid of some mysterious bug somewhere, yet no one can point to it. Again, I submit that the bug doesn't exist. That's because there is no material difference in semantics between the irqs_disabled() implementation that says "all interrupts are disabled except for NMI (and some others that some ARM platform cares about)" and the implementation that says "interrupts are disabled except higher priority ones than you may be enabled". If you can point to code that cares about such semantics, I predict you've found either a coding anti-pattern or perhaps some obscure hardware design flaw. Either way, there is no justification for your plan. > Changing the behavior on m68k clearly has its own regression risk, but > it could be done as a configuration option that defaults to the > traditional behavior on machines that have not been verified to be > well-behaved without nested hardirqs, and hidden on machines that do not > need it (any more). > This plan will quantifiably increase interrupt latency. It's not some vague risk that you can hand-wave away. It's unavoidable. > As far as I can tell, the only reason you would actually need nested > hardirqs is when a low-priority interrupt has to perform expensive I/O > processing, but we've had countless other methods to do the same over > the years (at least bottom half, softirq, taskqueue, tasklet, keventd, > workqueue, kthread, threaded interrupt handlers and probably others). > Nope. Interrupt priority masking is there to place an upper bound interrupt latency. That's why this feature is shipping in contemporary hardware (e.g. ARM GIC). If you care about real time workloads on arm64, that may interest you. If you don't care about arm hardware or real time workloads, that's fine too, but here's the rub. Song Bao Hua's plan involves reworking the locking in existing drivers (which may be portable
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Fri, Feb 19, 2021 at 09:10:57AM +0100, Geert Uytterhoeven wrote: > Hi Michael, > > On Thu, Feb 18, 2021 at 11:11 PM Michael Schmitz wrote: > > On 19/02/21 12:19 am, Arnd Bergmann wrote: > > > drivers/net/ethernet/8390/apne.c > > > drivers/net/ethernet/8390/ax88796.c > > > drivers/net/ethernet/8390/hydra.c > > > drivers/net/ethernet/8390/mac8390.c > > > drivers/net/ethernet/8390/ne.c > > > drivers/net/ethernet/8390/zorro8390.c > > [...] > > > Most of these are normal short-lived interrupts that only transfer > > > a few bytes or schedule deferred processing of some sort. > > > Most of the scsi and network drivers process the data in > > > a softirq, so those are generally fine here, but I do see that 8390 > > > (ne2000) ethernet and the drivers/ide drivers do transfer their > > > data in hardirq context. > > > > > > Arnd > > > > 8390 ethernet drivers are widely used on m68k platforms (Amiga and > > Atari). At least on Amiga, the corresponding interrupt is a hardirq so > > I'd advise caution. That said, the 8390 drivers might benefit from some > > refactoring (the way these drivers are compiled does prevent e.g. the > > APNE driver from being used with two different variants of PCMCIA > > interfaces. I had begun some work on making IO primitives runtime > > selected two years ago but that ended up looking too ugly ...). > > > > Can't recall what IPL the 8390 interrupts operate at - Geert? > > #define IRQ_AMIGA_PORTS IRQ_AUTO_2 > > Zorro expansion boards can also use > > #define IRQ_AMIGA_EXTER IRQ_AUTO_6 > > and some boards may have a jumper to select the latter, but all Amiga > Linux drivers use IRQ_AMIGA_PORTS. The mac8390 driver will show up wherever NuBus interrupts are routed, which varies by hardware. It's frequently IRQ_AUTO_2 (due to being routed through the second VIA chip on most models), but it can be in other places on systems that don't use a second VIA chip. On the IIfx, NuBus interrupts get routed through OSS to IRQ_AUTO_3. I don't see any cases that use other interrupt levels, but some of the logic is a little harder to follow. It's not an 8390, but it looks like macmace actually uses two levels. The regular interrupt should show up at IRQ_AUTO_3 and the DMA interrupts at IRQ_AUTO_4. Most Macs only use 4 for serial, but this is an exception. Brad Boyer f...@allandria.com
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
Hi Michael, On Thu, Feb 18, 2021 at 11:11 PM Michael Schmitz wrote: > On 19/02/21 12:19 am, Arnd Bergmann wrote: > > drivers/net/ethernet/8390/apne.c > > drivers/net/ethernet/8390/ax88796.c > > drivers/net/ethernet/8390/hydra.c > > drivers/net/ethernet/8390/mac8390.c > > drivers/net/ethernet/8390/ne.c > > drivers/net/ethernet/8390/zorro8390.c > [...] > > Most of these are normal short-lived interrupts that only transfer > > a few bytes or schedule deferred processing of some sort. > > Most of the scsi and network drivers process the data in > > a softirq, so those are generally fine here, but I do see that 8390 > > (ne2000) ethernet and the drivers/ide drivers do transfer their > > data in hardirq context. > > > > Arnd > > 8390 ethernet drivers are widely used on m68k platforms (Amiga and > Atari). At least on Amiga, the corresponding interrupt is a hardirq so > I'd advise caution. That said, the 8390 drivers might benefit from some > refactoring (the way these drivers are compiled does prevent e.g. the > APNE driver from being used with two different variants of PCMCIA > interfaces. I had begun some work on making IO primitives runtime > selected two years ago but that ended up looking too ugly ...). > > Can't recall what IPL the 8390 interrupts operate at - Geert? #define IRQ_AMIGA_PORTS IRQ_AUTO_2 Zorro expansion boards can also use #define IRQ_AMIGA_EXTER IRQ_AUTO_6 and some boards may have a jumper to select the latter, but all Amiga Linux drivers use IRQ_AMIGA_PORTS. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On 19/02/21 12:19 am, Arnd Bergmann wrote: drivers/net/ethernet/8390/apne.c drivers/net/ethernet/8390/ax88796.c drivers/net/ethernet/8390/hydra.c drivers/net/ethernet/8390/mac8390.c drivers/net/ethernet/8390/ne.c drivers/net/ethernet/8390/zorro8390.c [...] Most of these are normal short-lived interrupts that only transfer a few bytes or schedule deferred processing of some sort. Most of the scsi and network drivers process the data in a softirq, so those are generally fine here, but I do see that 8390 (ne2000) ethernet and the drivers/ide drivers do transfer their data in hardirq context. Arnd 8390 ethernet drivers are widely used on m68k platforms (Amiga and Atari). At least on Amiga, the corresponding interrupt is a hardirq so I'd advise caution. That said, the 8390 drivers might benefit from some refactoring (the way these drivers are compiled does prevent e.g. the APNE driver from being used with two different variants of PCMCIA interfaces. I had begun some work on making IO primitives runtime selected two years ago but that ended up looking too ugly ...). Can't recall what IPL the 8390 interrupts operate at - Geert? Regarding allowing preemption of hardirqs - at least in 2017, that was considered perfectly OK (see Linus' comment on https://lore.kernel.org/patchwork/patch/794954/). I concur with Finn on this one - we did look for potential issues with the way irqs_disabled() is defined on m68k (for me, in the context of the above bug that caused memory corruption on my system), and found none. Cheers, Michael
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
Hi Arnd, On Thu, Feb 18, 2021 at 2:59 PM Arnd Bergmann wrote: > On Thu, Feb 18, 2021 at 1:30 PM Geert Uytterhoeven > wrote: > > The reason drivers/ide is doing that may be related to IDE hard drive > > quirks. The old WD Caviar drives didn't obey disabling the IDE interrupt > > at the drive level. On PC, that worked fine, as IRQs 14 and 15 weren't > > shared with other devices. On systems with shared interrupts, that > > broke badly, and led to an interrupt storm. > > So presumably anyone that has one of those old drives will not be > able to move to drivers/ata then? I see that drivers/ata doesn't do > the transfers in interrupt mode, so it would seem to rely on masking > at the device level. I don't know. This was an issue I debugged on a friend's Amiga in 1995 or so ;-) All bad WD Caviars may have died in the meantime... BTW, it wouldn't be the first time a drive quirk handling wasn't ported from drivers/ide/ to drivers/ata/. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Thu, Feb 18, 2021 at 1:30 PM Geert Uytterhoeven wrote: > > The reason drivers/ide is doing that may be related to IDE hard drive > quirks. The old WD Caviar drives didn't obey disabling the IDE interrupt > at the drive level. On PC, that worked fine, as IRQs 14 and 15 weren't > shared with other devices. On systems with shared interrupts, that > broke badly, and led to an interrupt storm. So presumably anyone that has one of those old drives will not be able to move to drivers/ata then? I see that drivers/ata doesn't do the transfers in interrupt mode, so it would seem to rely on masking at the device level. On the other hand, out of the five m68k specific IDE drivers (gayle, buddha, falcon, mac_ide, q40), only the last two don't seem to have an ata driver equivalent. mac_ide uses the highest priority interrupts (NUBUS_C, NUBUS_F) so it appears to not actually benefit from the nested hardirq but would benefit being converted to a sata driver with processing at softirq time. q40 in turn doesn't appear to share interrupts, though its irq_disable() function probably doesn't do what it should. I could not figure out at what priority this one runs, and if it is expected to get interrupted. Arnd
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
Hi Arnd, On Thu, Feb 18, 2021 at 12:20 PM Arnd Bergmann wrote: > Most of these are normal short-lived interrupts that only transfer > a few bytes or schedule deferred processing of some sort. > Most of the scsi and network drivers process the data in > a softirq, so those are generally fine here, but I do see that 8390 > (ne2000) ethernet and the drivers/ide drivers do transfer their > data in hardirq context. The reason drivers/ide is doing that may be related to IDE hard drive quirks. The old WD Caviar drives didn't obey disabling the IDE interrupt at the drive level. On PC, that worked fine, as IRQs 14 and 15 weren't shared with other devices. On systems with shared interrupts, that broke badly, and led to an interrupt storm. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Thu, Feb 18, 2021 at 6:30 AM Finn Thain wrote: > On Wed, 17 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > That scenario seems a little contrived to me (drivers for two or more > > > devices sharing state through their interrupt handlers). Is it real? I > > > suppose every platform has its quirks. The irq lock in > > > sonic_interrupt() is only there because of a platform quirk (the same > > > device can trigger either of two IRQs). Anyway, no-one expects all > > > drivers to work on all platforms; I don't know why it bothers you so > > > much when platforms differ. > > > > Basically, we wrote drivers with the assumption that this driver will be > > cross-platform. (Of course there are some drivers which can only work on > > one platform, for example, if the IP of the device is only used in one > > platform as an internal component of a specific SoC.) > > > > So once a device has two or more interrupts, we need to consider one > > interrupt might preempt another one on m68k on the same cpu if we also > > want to support this driver on m68k. this usually doesn't matter on > > other platforms. > > When users show up who desire to run your drivers on their platform, you > can expect them to bring patches and a MAINTAINERS file entry. AFAIK, > Linux development has always worked that way. This is only part of the picture though. We also also constantly trying to generalize the internal interfaces, to make sure that platforms work the same way and that it's possible to write drivers in a portable way without having to rely on platform maintainers to point out the differences. I think it would make a lot of sense to remove the architecture differences here by making m68k work the same way as the others and documenting that as the expected behavior. You are probably right that there are no specific bugs on m68k machines that rely on the nested hardirqs today, but I think they only get away with it because a) there is no SMP support on m68k, so it likely doesn't run into the more subtle cases with lock ordering that you could get when you have hardirq handlers on multiple CPUs in parallel b) there is a very limited number of device drivers that are actually used on m68k, in particular only M54xx has PCI support, but that in turn has a different interrupt scheme. Changing the behavior on m68k clearly has its own regression risk, but it could be done as a configuration option that defaults to the traditional behavior on machines that have not been verified to be well-behaved without nested hardirqs, and hidden on machines that do not need it (any more). As far as I can tell, the only reason you would actually need nested hardirqs is when a low-priority interrupt has to perform expensive I/O processing, but we've had countless other methods to do the same over the years (at least bottom half, softirq, taskqueue, tasklet, keventd, workqueue, kthread, threaded interrupt handlers and probably others). It would probably not be too hard to go through all drivers that are normally enabled and see if their interrupt handler should be changed to use deferred processing (threaded irq handler or workqueue) instead of doing expensive I/O in hardirq context and relying on priorities. FWIW, these are the drivers in m68k multi_defconfig that register an interrupt: $ find build/tmp/{drivers,sound} -name \*.o | xargs grep request_.*irq | cut -f 3- -d/ | cut -f 1 -d\ | sed -e "s:o$:c:g" drivers/block/amiflop.c drivers/ide/ide-probe.c drivers/input/joystick/amijoy.c drivers/input/keyboard/amikbd.c drivers/input/keyboard/hilkbd.c drivers/input/mouse/amimouse.c drivers/input/serio/hp_sdc.c drivers/input/serio/q40kbd.c drivers/macintosh/via-cuda.c drivers/macintosh/via-macii.c drivers/macintosh/via-pmu.c drivers/misc/dummy-irq.c drivers/net/ethernet/8390/apne.c drivers/net/ethernet/8390/ax88796.c drivers/net/ethernet/8390/hydra.c drivers/net/ethernet/8390/mac8390.c drivers/net/ethernet/8390/ne.c drivers/net/ethernet/8390/zorro8390.c drivers/net/ethernet/amd/7990.c drivers/net/ethernet/amd/a2065.c drivers/net/ethernet/amd/ariadne.c drivers/net/ethernet/amd/atarilance.c drivers/net/ethernet/amd/sun3lance.c drivers/net/ethernet/apple/macmace.c drivers/net/ethernet/cirrus/mac89x0.c drivers/net/ethernet/i825xx/82596.c drivers/net/ethernet/natsemi/macsonic.c drivers/net/ethernet/smsc/smc91x.c drivers/net/phy/phy.c drivers/parport/parport_amiga.c drivers/parport/parport_atari.c drivers/parport/parport_mfc3.c drivers/parport/parport_pc.c drivers/scsi/a2091.c drivers/scsi/a3000.c drivers/scsi/a4000t.c drivers/scsi/atari_scsi.c drivers/scsi/bvme6000_scsi.c drivers/scsi/gvp11.c drivers/scsi/mac_esp.c drivers/scsi/mac_scsi.c drivers/scsi/mvme147.c drivers/scsi/mvme16x_scsi.c drivers/scsi/sun3x_esp.c drivers/scsi/zorro7xx.c drivers/scsi/zorro_esp.c drivers/tty/amiserial.c drivers/tty/serial/pmac_zilog.c drivers/video/fbdev/amifb.c drivers/video/fbdev/at
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Wed, 17 Feb 2021, Song Bao Hua (Barry Song) wrote: > > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > > > So what is really confusing and a pain to me is that: For years > > > people like me have been writing device drivers with the idea that > > > irq handlers run with interrupts disabled after those commits in > > > genirq. So I don't need to care about if some other IRQs on the same > > > cpu will jump out to access the data the current IRQ handler is > > > accessing. > > > > > > but it turns out the assumption is not true on some platform. So > > > should I start to program devices driver with the new idea > > > interrupts can actually come while irqhandler is running? > > > > > > That's the question which really bothers me. > > > > > > > That scenario seems a little contrived to me (drivers for two or more > > devices sharing state through their interrupt handlers). Is it real? I > > suppose every platform has its quirks. The irq lock in > > sonic_interrupt() is only there because of a platform quirk (the same > > device can trigger either of two IRQs). Anyway, no-one expects all > > drivers to work on all platforms; I don't know why it bothers you so > > much when platforms differ. > > Basically, we wrote drivers with the assumption that this driver will be > cross-platform. (Of course there are some drivers which can only work on > one platform, for example, if the IP of the device is only used in one > platform as an internal component of a specific SoC.) > > So once a device has two or more interrupts, we need to consider one > interrupt might preempt another one on m68k on the same cpu if we also > want to support this driver on m68k. this usually doesn't matter on > other platforms. > When users show up who desire to run your drivers on their platform, you can expect them to bring patches and a MAINTAINERS file entry. AFAIK, Linux development has always worked that way. Besides, not all m68k platforms implement priority masking. So there's no problem with portability to m68k per se. > on the other hand, there are more than 400 irqs_disabled() in kernel, I > am really not sure if they are running with the knowledge that the true > irqs_disabled() actually means some interrupts are off and some others > are still open on m68k. Firstly, use of irqs_disabled() is considered an antipattern by some developers. Please see, https://lore.kernel.org/linux-scsi/X8pfD5XtLoOygdez@lx-t490/ and commit e6b6be53ec91 ("Merge branch 'net-in_interrupt-cleanup-and-fixes'") This means that the differences in semantics between the irqs_disabled() implementations on various architectures are moot. Secondly, the existence of irqs_disabled() call sites does not imply a flaw in your drivers nor in the m68k interrupt scheme. The actual semantic differences are immaterial at many (all?) of these call sites. > Or they are running with the assumption that the true irqs_disabled() > means IRQ is totally quiet? If the latter is true, those drivers might > fail to work on m68k as well. > Yes it's possible, and that was my fear too back in 2017 when I raised the same question with the m68k maintainer. But I never found any code making that assumption. If you know of such a bug, do tell. So far, your fears remain unfounded. > Thanks > Barry >
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
> -Original Message- > From: Finn Thain [mailto:fth...@telegraphics.com.au] > Sent: Sunday, February 14, 2021 6:11 PM > To: Song Bao Hua (Barry Song) > Cc: Arnd Bergmann ; t...@linutronix.de; > gre...@linuxfoundation.org; a...@arndb.de; ge...@linux-m68k.org; > fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; mi...@redhat.com; > linux-m...@lists.linux-m68k.org; linux-kernel@vger.kernel.org > Subject: RE: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > enabled on some platform > > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > So what is really confusing and a pain to me is that: > > For years people like me have been writing device drivers > > with the idea that irq handlers run with interrupts > > disabled after those commits in genirq. So I don't need > > to care about if some other IRQs on the same cpu will > > jump out to access the data the current IRQ handler > > is accessing. > > > > but it turns out the assumption is not true on some platform. > > So should I start to program devices driver with the new idea > > interrupts can actually come while irqhandler is running? > > > > That's the question which really bothers me. > > > > That scenario seems a little contrived to me (drivers for two or more > devices sharing state through their interrupt handlers). Is it real? > I suppose every platform has its quirks. The irq lock in sonic_interrupt() > is only there because of a platform quirk (the same device can trigger > either of two IRQs). Anyway, no-one expects all drivers to work on all > platforms; I don't know why it bothers you so much when platforms differ. Basically, we wrote drivers with the assumption that this driver will be cross-platform. (Of course there are some drivers which can only work on one platform, for example, if the IP of the device is only used in one platform as an internal component of a specific SoC.) So once a device has two or more interrupts, we need to consider one interrupt might preempt another one on m68k on the same cpu if we also want to support this driver on m68k. this usually doesn't matter on other platforms. on the other hand, there are more than 400 irqs_disabled() in kernel, I am really not sure if they are running with the knowledge that the true irqs_disabled() actually means some interrupts are off and some others are still open on m68k. Or they are running with the assumption that the true irqs_disabled() means IRQ is totally quiet? If the latter is true, those drivers might fail to work on m68k as well. Thanks Barry
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Mon, 15 Feb 2021, Andy Shevchenko wrote: > On Sun, Feb 14, 2021 at 7:12 AM Finn Thain wrote: > > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > So what is really confusing and a pain to me is that: > > > For years people like me have been writing device drivers > > > with the idea that irq handlers run with interrupts > > > disabled after those commits in genirq. So I don't need > > > to care about if some other IRQs on the same cpu will > > > jump out to access the data the current IRQ handler > > > is accessing. > > > > > > but it turns out the assumption is not true on some platform. > > > So should I start to program devices driver with the new idea > > > interrupts can actually come while irqhandler is running? > > > > > > That's the question which really bothers me. > > > > > > > That scenario seems a little contrived to me (drivers for two or more > > devices sharing state through their interrupt handlers). Is it real? > > I suppose every platform has its quirks. The irq lock in sonic_interrupt() > > is only there because of a platform quirk (the same device can trigger > > either of two IRQs). Anyway, no-one expects all drivers to work on all > > platforms; I don't know why it bothers you so much when platforms differ. > > Isn't it any IRQ chip driver which may get IRQ on one or more lines > simultaneously? > The question here is can the handler be re-entrant on the same CPU in > that case? > An irq_chip handler used only for interrupts having the same priority level cannot be re-entered, thanks to the priority mask. Where the priority levels are different, as in auto_irq_chip in arch/m68k/kernel/ints.c, handle_simple_irq() can be re-entered but not with the same descriptor (i.e. no shared state). Documentation/core-api/genericirq.rst says, The locking of chip registers is up to the architecture that defines the chip primitives. The per-irq structure is protected via desc->lock, by the generic layer. Since the synchronization of chip registers is left up to the architecture, I think the related code should be portable. Looking in kernel/irq/*.c, I see that desc->lock is sometimes acquired with raw_spin_lock_irq(&desc->lock) and sometimes raw_spin_lock(&desc->lock). I don't know whether there are portability issues lurking here; this code is subtle and largely unfamiliar to me.
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Sun, Feb 14, 2021 at 7:12 AM Finn Thain wrote: > On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > So what is really confusing and a pain to me is that: > > For years people like me have been writing device drivers > > with the idea that irq handlers run with interrupts > > disabled after those commits in genirq. So I don't need > > to care about if some other IRQs on the same cpu will > > jump out to access the data the current IRQ handler > > is accessing. > > > > but it turns out the assumption is not true on some platform. > > So should I start to program devices driver with the new idea > > interrupts can actually come while irqhandler is running? > > > > That's the question which really bothers me. > > > > That scenario seems a little contrived to me (drivers for two or more > devices sharing state through their interrupt handlers). Is it real? > I suppose every platform has its quirks. The irq lock in sonic_interrupt() > is only there because of a platform quirk (the same device can trigger > either of two IRQs). Anyway, no-one expects all drivers to work on all > platforms; I don't know why it bothers you so much when platforms differ. Isn't it any IRQ chip driver which may get IRQ on one or more lines simultaneously? The question here is can the handler be re-entrant on the same CPU in that case? -- With Best Regards, Andy Shevchenko
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Sat, 13 Feb 2021, Song Bao Hua (Barry Song) wrote: > > So what is really confusing and a pain to me is that: > For years people like me have been writing device drivers > with the idea that irq handlers run with interrupts > disabled after those commits in genirq. So I don't need > to care about if some other IRQs on the same cpu will > jump out to access the data the current IRQ handler > is accessing. > > but it turns out the assumption is not true on some platform. > So should I start to program devices driver with the new idea > interrupts can actually come while irqhandler is running? > > That's the question which really bothers me. > That scenario seems a little contrived to me (drivers for two or more devices sharing state through their interrupt handlers). Is it real? I suppose every platform has its quirks. The irq lock in sonic_interrupt() is only there because of a platform quirk (the same device can trigger either of two IRQs). Anyway, no-one expects all drivers to work on all platforms; I don't know why it bothers you so much when platforms differ.
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
> -Original Message- > From: Song Bao Hua (Barry Song) > Sent: Sunday, February 14, 2021 11:13 AM > To: 'Arnd Bergmann' > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > Subject: RE: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > enabled on some platform > > > > > -Original Message- > > From: Arnd Bergmann [mailto:a...@kernel.org] > > Sent: Sunday, February 14, 2021 5:32 AM > > To: Song Bao Hua (Barry Song) > > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > > enabled on some platform > > > > On Sat, Feb 13, 2021 at 12:50 AM Song Bao Hua (Barry Song) > > wrote: > > > > > So I was actually trying to warn this unusual case - interrupts > > > get nested while both in_hardirq() and irqs_disabled() are true. > > > > > > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h > > > index 7c9d6a2d7e90..b8ca27555c76 100644 > > > --- a/include/linux/hardirq.h > > > +++ b/include/linux/hardirq.h > > > @@ -32,6 +32,7 @@ static __always_inline void > > > rcu_irq_enter_check_tick(void) > > > */ > > > #define __irq_enter() \ > > > do {\ > > > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested > > > interrupts\n"); \ > > > preempt_count_add(HARDIRQ_OFFSET); \ > > > > That seems to be a rather heavyweight change in a critical path. > > > > A more useful change might be to implement lockdep support for m68k > > and see if that warns about any actual problems. I'm not sure > > what is actually missing for that, but these are the commits that > > added it for other architectures in the past: > > > > 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup > TRACE_IRQFLAGS_SUPPORT") > > 000591f1ca33 ("csky: Enable LOCKDEP_SUPPORT") > > 78cdfb5cf15e ("openrisc: enable LOCKDEP_SUPPORT and irqflags tracing") > > 8f371c752154 ("xtensa: enable lockdep support") > > bf2d80966890 ("microblaze: Lockdep support") > > > > Yes. M68k lacks lockdep support which might be added. BTW, probably m68k won't run into any problem with lockdep as it has been running for decades. Just like interrupts were widely allowed to preempt irq handlers on all platforms before IRQF_DISABLED was dropped and commit e58aa3d2d0cc (" genirq: Run irq handlers with interrupts disabled"). Rarely we could really run into the stack overflow issue commit e58aa3d2d0cc mentioned at that time. Before those commits we had already made thousands of successful Linux products running irq handlers with interrupts enabled. So what is really confusing and a pain to me is that: For years people like me have been writing device drivers with the idea that irq handlers run with interrupts disabled after those commits in genirq. So I don't need to care about if some other IRQs on the same cpu will jump out to access the data the current IRQ handler is accessing. but it turns out the assumption is not true on some platform. So should I start to program devices driver with the new idea interrupts can actually come while irqhandler is running? That's the question which really bothers me. > > > > And I also think it is better for m68k's arch_irqs_disabled() to > > > return true only when both low and high priority interrupts are > > > disabled rather than try to mute this warn in genirq by a weaker > > > condition: > > > if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pS enabled > > interrupts\n", > > > irq, action->handler)) > > >local_irq_disable(); > > > } > > > > > > This warn is not activated on m68k because its arch_irqs_disabled() return > > > true though its high-priority interrupts are still enabled. > > > > Then it would just end up always warning when a nested hardirq happens, > > right? That seems no different to dropping suppor
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
> -Original Message- > From: Arnd Bergmann [mailto:a...@kernel.org] > Sent: Sunday, February 14, 2021 5:32 AM > To: Song Bao Hua (Barry Song) > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > enabled on some platform > > On Sat, Feb 13, 2021 at 12:50 AM Song Bao Hua (Barry Song) > wrote: > > > So I was actually trying to warn this unusual case - interrupts > > get nested while both in_hardirq() and irqs_disabled() are true. > > > > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h > > index 7c9d6a2d7e90..b8ca27555c76 100644 > > --- a/include/linux/hardirq.h > > +++ b/include/linux/hardirq.h > > @@ -32,6 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) > > */ > > #define __irq_enter() \ > > do {\ > > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested > > interrupts\n"); \ > > preempt_count_add(HARDIRQ_OFFSET); \ > > That seems to be a rather heavyweight change in a critical path. > > A more useful change might be to implement lockdep support for m68k > and see if that warns about any actual problems. I'm not sure > what is actually missing for that, but these are the commits that > added it for other architectures in the past: > > 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") > 000591f1ca33 ("csky: Enable LOCKDEP_SUPPORT") > 78cdfb5cf15e ("openrisc: enable LOCKDEP_SUPPORT and irqflags tracing") > 8f371c752154 ("xtensa: enable lockdep support") > bf2d80966890 ("microblaze: Lockdep support") > Yes. M68k lacks lockdep support which might be added. > > And I also think it is better for m68k's arch_irqs_disabled() to > > return true only when both low and high priority interrupts are > > disabled rather than try to mute this warn in genirq by a weaker > > condition: > > if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pS enabled > interrupts\n", > > irq, action->handler)) > >local_irq_disable(); > > } > > > > This warn is not activated on m68k because its arch_irqs_disabled() return > > true though its high-priority interrupts are still enabled. > > Then it would just end up always warning when a nested hardirq happens, > right? That seems no different to dropping support for nested hardirqs > on m68k altogether, which of course is what you suggested already. This won't end up a warning on other architectures like arm,arm64, x86 etc as interrupts won't come while arch_irqs_disabled() is true in hardIRQ. For example, I_BIT of CPSR of ARM is set: static inline int arch_irqs_disabled_flags(unsigned long flags) { return flags & IRQMASK_I_BIT; } So it would only give a backtrace on platforms whose arch_irqs_disabled() return true while only some interrupts are disabled and some others are still open, thus nested interrupts can come without any explicit code to enable interrupts. This warn seems to give consistent interpretation on what's "Run irq handlers with interrupts disabled" in commit e58aa3d2d0cc (" genirq: Run irq handlers with interrupts disabled") > >Arnd Thanks Barry
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Sat, Feb 13, 2021 at 12:50 AM Song Bao Hua (Barry Song) wrote: > So I was actually trying to warn this unusual case - interrupts > get nested while both in_hardirq() and irqs_disabled() are true. > > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h > index 7c9d6a2d7e90..b8ca27555c76 100644 > --- a/include/linux/hardirq.h > +++ b/include/linux/hardirq.h > @@ -32,6 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) > */ > #define __irq_enter() \ > do {\ > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested > interrupts\n"); \ > preempt_count_add(HARDIRQ_OFFSET); \ That seems to be a rather heavyweight change in a critical path. A more useful change might be to implement lockdep support for m68k and see if that warns about any actual problems. I'm not sure what is actually missing for that, but these are the commits that added it for other architectures in the past: 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") 000591f1ca33 ("csky: Enable LOCKDEP_SUPPORT") 78cdfb5cf15e ("openrisc: enable LOCKDEP_SUPPORT and irqflags tracing") 8f371c752154 ("xtensa: enable lockdep support") bf2d80966890 ("microblaze: Lockdep support") > And I also think it is better for m68k's arch_irqs_disabled() to > return true only when both low and high priority interrupts are > disabled rather than try to mute this warn in genirq by a weaker > condition: > if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pS enabled > interrupts\n", > irq, action->handler)) >local_irq_disable(); > } > > This warn is not activated on m68k because its arch_irqs_disabled() return > true though its high-priority interrupts are still enabled. Then it would just end up always warning when a nested hardirq happens, right? That seems no different to dropping support for nested hardirqs on m68k altogether, which of course is what you suggested already. Arnd
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
> -Original Message- > From: Arnd Bergmann [mailto:a...@kernel.org] > Sent: Saturday, February 13, 2021 12:06 PM > To: Song Bao Hua (Barry Song) > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > enabled on some platform > > On Sat, Feb 13, 2021 at 12:00 AM Song Bao Hua (Barry Song) > wrote: > > > -Original Message- > > > From: Arnd Bergmann [mailto:a...@kernel.org] > > > Sent: Saturday, February 13, 2021 11:34 AM > > > To: Song Bao Hua (Barry Song) > > > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > > > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > > > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > > > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > > > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > > > enabled on some platform > > > > > > On Fri, Feb 12, 2021 at 2:18 AM Song Bao Hua (Barry Song) > > > wrote: > > > > > > > So I am requesting comments on: > > > > 1. are we expecting all interrupts except NMI to be disabled in irq > > > > handler, > > > > or do we actually allow some high-priority interrupts between low and > NMI > > > to > > > > come in some platforms? > > > > > > I tried to come to an answer but this does not seem particularly > > > well-defined. > > > There are a few things I noticed: > > > > > > - going through the local_irq_save()/restore() implementations on all > > > architectures, I did not find any other ones besides m68k that leave > > > high-priority interrupts enabled. I did see that at least alpha and > > > openrisc > > > are designed to support that in hardware, but the code just leaves the > > > interrupts disabled. > > > > The case is a little different. Explicit local_irq_save() does disable all > > high priority interrupts on m68k. The only difference is > > arch_irqs_disabled() > > of m68k will return true while low-priority interrupts are masked and high > > -priority are still open. M68k's hardIRQ also runs in this context with high > > priority interrupts enabled. > > My point was that on most other architectures, local_irq_save()/restore() > always disables/enables all interrupts, while on m68k it restores the > specific level they were on before. On alpha, it does the same as on m68k, > but then the top-level interrupt handler just disables them all before calling > into any other code. That's what I think m68k is better to do. Looks weird that nested interrupts can enter while arch_irqs_disabled() is true on m68k because masking low-priority interrupts with high-interrupts still enabled would be able to make m68k's arch_irqs_disabled() true, which is exactly the environment m68k's irq handler is running. So I was actually trying to warn this unusual case - interrupts get nested while both in_hardirq() and irqs_disabled() are true. diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h index 7c9d6a2d7e90..b8ca27555c76 100644 --- a/include/linux/hardirq.h +++ b/include/linux/hardirq.h @@ -32,6 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) */ #define __irq_enter() \ do {\ + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested interrupts\n"); \ preempt_count_add(HARDIRQ_OFFSET); \ lockdep_hardirq_enter();\ account_hardirq_enter(current); \ @@ -44,6 +45,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) */ #define __irq_enter_raw() \ do {\ + WARN_ONCE(in_hardirq() && irqs_disabled(), " nested interrupts\n"); \ preempt_count_add(HARDIRQ_OFFSET); \ lockdep_hardirq_enter();\ } while (0) And I also think it is better for m68k's arch_irqs_disabled() to return true only when both low and high priority interrupts are disabled rather than try to mute this warn in genirq by a weaker condition: irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc, unsigned int *flags) { ... trace_irq_h
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Sat, Feb 13, 2021 at 12:00 AM Song Bao Hua (Barry Song) wrote: > > -Original Message- > > From: Arnd Bergmann [mailto:a...@kernel.org] > > Sent: Saturday, February 13, 2021 11:34 AM > > To: Song Bao Hua (Barry Song) > > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > > NMI) > > enabled on some platform > > > > On Fri, Feb 12, 2021 at 2:18 AM Song Bao Hua (Barry Song) > > wrote: > > > > > So I am requesting comments on: > > > 1. are we expecting all interrupts except NMI to be disabled in irq > > > handler, > > > or do we actually allow some high-priority interrupts between low and NMI > > to > > > come in some platforms? > > > > I tried to come to an answer but this does not seem particularly > > well-defined. > > There are a few things I noticed: > > > > - going through the local_irq_save()/restore() implementations on all > > architectures, I did not find any other ones besides m68k that leave > > high-priority interrupts enabled. I did see that at least alpha and > > openrisc > > are designed to support that in hardware, but the code just leaves the > > interrupts disabled. > > The case is a little different. Explicit local_irq_save() does disable all > high priority interrupts on m68k. The only difference is arch_irqs_disabled() > of m68k will return true while low-priority interrupts are masked and high > -priority are still open. M68k's hardIRQ also runs in this context with high > priority interrupts enabled. My point was that on most other architectures, local_irq_save()/restore() always disables/enables all interrupts, while on m68k it restores the specific level they were on before. On alpha, it does the same as on m68k, but then the top-level interrupt handler just disables them all before calling into any other code. It's possible that I missed some other implementation doing the same as m68k, as this code is fairly subtle on some architectures. Arnd
RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
> -Original Message- > From: Arnd Bergmann [mailto:a...@kernel.org] > Sent: Saturday, February 13, 2021 11:34 AM > To: Song Bao Hua (Barry Song) > Cc: t...@linutronix.de; gre...@linuxfoundation.org; a...@arndb.de; > ge...@linux-m68k.org; fun...@jurai.org; ph...@gnu.org; cor...@lwn.net; > mi...@redhat.com; linux-m...@lists.linux-m68k.org; > fth...@telegraphics.com.au; linux-kernel@vger.kernel.org > Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not > NMI) > enabled on some platform > > On Fri, Feb 12, 2021 at 2:18 AM Song Bao Hua (Barry Song) > wrote: > > > So I am requesting comments on: > > 1. are we expecting all interrupts except NMI to be disabled in irq handler, > > or do we actually allow some high-priority interrupts between low and NMI > to > > come in some platforms? > > I tried to come to an answer but this does not seem particularly well-defined. > There are a few things I noticed: > > - going through the local_irq_save()/restore() implementations on all > architectures, I did not find any other ones besides m68k that leave > high-priority interrupts enabled. I did see that at least alpha and openrisc > are designed to support that in hardware, but the code just leaves the > interrupts disabled. The case is a little different. Explicit local_irq_save() does disable all high priority interrupts on m68k. The only difference is arch_irqs_disabled() of m68k will return true while low-priority interrupts are masked and high -priority are still open. M68k's hardIRQ also runs in this context with high priority interrupts enabled. > > - The generic code is clearly prepared to handle nested hardirqs, and >the irq_enter()/irq_exit() functions have a counter in preempt_count >for the nesting level, using a 4-bit number for hardirq, plus another >4-bit number for NMI. Yes, I understand nested interrupts are supported by an explicit local_irq_enable_in_hardirq(). Mk68k's case is different, nested interrupts can come with arch_irqs_disabled() is true and while nobody has called local_irq_enable_in_hardirq() in the previous hardIRQ because hardIRQ keeps high-priority interrupts open. > > - There are a couple of (ancient) drivers that enable interrupts in their >interrupt handlers, see the four callers of local_irq_enable_in_hardirq() >(all in the old drivers/ide stack) and arch/ia64/kernel/time.c, which >enables interupts in its timer function (I recently tried removing this >and my patch broke ia64 timers, but I'm not sure if the cause was >the local_irq_enable() or something else). > > - The local_irq_enable_in_hardirq() function itself turns into a nop > when lockdep is enabled, since d7e9629de051 ("[PATCH] lockdep: > add local_irq_enable_in_hardirq() API"). According to the comment > in there, lockdep already enforces the behavior you suggest. Note that > lockdep support is missing on m68k (and also alpha, h8300, ia64, nios2, > and parisc). > > > 2. If either side is true, I think we need to document it somewhere as there > > is always confusion about this. > > > > Personally, I would expect all interrupts to be disabled and I like the way > > of ARM64 to only use high-priority interrupt as pseudo NMI: > > https://lwn.net/Articles/755906/ > > Though Finn argued that this will contribute to lose hardware feature of > > m68k. > > Regardless of what is documented, I would argue that any platform > that relies on this is at the minimum doing something risky because at > the minimum this runs into hard to debug code paths that are not > exercised on any of the common architectures. > > Arnd Thanks Barry
Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform
On Fri, Feb 12, 2021 at 2:18 AM Song Bao Hua (Barry Song) wrote: > So I am requesting comments on: > 1. are we expecting all interrupts except NMI to be disabled in irq handler, > or do we actually allow some high-priority interrupts between low and NMI to > come in some platforms? I tried to come to an answer but this does not seem particularly well-defined. There are a few things I noticed: - going through the local_irq_save()/restore() implementations on all architectures, I did not find any other ones besides m68k that leave high-priority interrupts enabled. I did see that at least alpha and openrisc are designed to support that in hardware, but the code just leaves the interrupts disabled. - The generic code is clearly prepared to handle nested hardirqs, and the irq_enter()/irq_exit() functions have a counter in preempt_count for the nesting level, using a 4-bit number for hardirq, plus another 4-bit number for NMI. - There are a couple of (ancient) drivers that enable interrupts in their interrupt handlers, see the four callers of local_irq_enable_in_hardirq() (all in the old drivers/ide stack) and arch/ia64/kernel/time.c, which enables interupts in its timer function (I recently tried removing this and my patch broke ia64 timers, but I'm not sure if the cause was the local_irq_enable() or something else). - The local_irq_enable_in_hardirq() function itself turns into a nop when lockdep is enabled, since d7e9629de051 ("[PATCH] lockdep: add local_irq_enable_in_hardirq() API"). According to the comment in there, lockdep already enforces the behavior you suggest. Note that lockdep support is missing on m68k (and also alpha, h8300, ia64, nios2, and parisc). > 2. If either side is true, I think we need to document it somewhere as there > is always confusion about this. > > Personally, I would expect all interrupts to be disabled and I like the way > of ARM64 to only use high-priority interrupt as pseudo NMI: > https://lwn.net/Articles/755906/ > Though Finn argued that this will contribute to lose hardware feature of m68k. Regardless of what is documented, I would argue that any platform that relies on this is at the minimum doing something risky because at the minimum this runs into hard to debug code paths that are not exercised on any of the common architectures. Arnd