On 17.03.26 12:01, Sebastian Andrzej Siewior wrote:
> On 2026-03-17 08:49:38 [+0100], Jan Kiszka wrote:
>>>> +void vmbus_isr(void)
>>>> +{
>>>> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
>>>> + vmbus_irqd_wake();
>>>> + } else {
>>>> + lockdep_hardirq_threaded();
>>>
>>> What clears this? This is wrongly placed. This should go to
>>> sysvec_hyperv_callback() instead with its matching canceling part. The
>>> add_interrupt_randomness() should also be there and not here.
>>> sysvec_hyperv_stimer0() managed to do so.
>>
>> First of all, we need to keep all this in generic code to avoid missing
>> arm64.
>
> This kind of belongs to the IRQ core code so I would prefer to see it on
> IRQ entry, not in a random driver.
I have no idea why hv is so special, starting with having its own
vectors. But if you have an idea how to address those needs via core
APIs or to create new ones for it, I guess that is welcome.
>
>> But the question about lockdep_hardirq_threaded() is valid - and that
>> not only for this new code: I tried hard to understand from the code how
>> hardirq_threaded is managed, but I simply couldn't find the spot where
>> it is reset after lockdep_hardirq_threaded() but before returning from
>> the interrupt to the task that now has hardirq_threaded=1. I failed, and
>> so I started a debugger. That confirms for the existing code path
>> (__handle_irq_event_percpu) that we are indeed returning to the
>> interrupted task with hardirq_threaded set. I'm not sure if that is
>> intended that only the next irq_enter_rcu->lockdep_hardirq_enter of the
>> next IRQ over this same task will reset the flag again.
>
> While looking into it again, it assumes that you enter an IRQ and due to
> the implementation if one is threaded, all of them are. So if you switch
> from IRQ handling to TIMER then this does not happen "as-is" but exit
> from one and then entry another at which point it is set to zero again.
Point is that a task that was interrupted by a potentially threaded
interrupt keeps this flag longer that it needs it. And that is
apparently harmless, but fairly confusing.
>
>> With that in mind, the new logic here is no different from the one the
>> kernel used before. If both are not doing what they should, we likely
>> want to add a generic reset of hardirq_threaded to the IRQ exit path(s).
>
> The difference is that you expect that _everyone_ calling this driver
> has everything else threaded. This might not be the case. That is why
> this should be in core knowing what is called if threaded, use in driver
> after explicit killing that flag afterwards since you don't know what
> can follow or add a generic threaded infrastructure here.
This driver is different, unfortunately. I'm not sure if we can / want
to thread everything that the platform interrupt does on x86. So far,
only the last part of it - vmbus handling - is threaded. On arm64, the
irq is exclusive (see vmbus_percpu_isr), thus everything can be and is
threaded.
>
> A different option which I would prefer in the drivere, would be an
> explicit lockdep override for the locking class without using
> lockdep_hardirq_threaded()
Happy to learn how to do that.
>
>>> Different question: What guarantees that there won't be another
>>> interrupt before this one is done? The handshake appears to be
>>> deprecated. The interrupt itself returns ACKing (or not) but the actual
>>> handler is delayed to this thread. Depending on the userland it could
>>> take some time and I don't know how impatient the host is.
>>>
>>
>> Good question. I guess people familiar with the hv interface need to
>> comment on that.
>>
>>>> + __vmbus_isr();
>>> Moving on. This (trying very hard here) even schedules tasklets. Why?
>>> You need to disable BH before doing so. Otherwise it ends in ksoftirqd.
>>> You don't want that.
>>>
>>
>> You are referring to the re-existing logic now, aren't you?
>
> Yes.
>
Then someone else needs to answer this.
>>> Couldn't the whole logic be integrated into the IRQ code? Then we could
>>> have mask/ unmask if supported/ provided and threaded interrupts. Then
>>> sysvec_hyperv_reenlightenment() could use a proper threaded interrupt
>>> instead apic_eoi() + schedule_delayed_work().
>>>
>>
>> Again, you are thinking x86-only. We need a portable solution.
>
> well, ARM could use a threaded interrupt, too.
For a reason we didn't explore in details, per-CPU interrupts aren't
threaded. See older version of this patch
(https://lore.kernel.org/lkml/[email protected]/)
where I thought I only had to fix x86, but arm64 was needing care as well.
Jan
--
Siemens AG, Foundational Technologies
Linux Expert Center