Hi,

this series summarizes the analysis done so far on the machine that is
still affected by timing issues. See [1] for the earlier postings,
which include the hardware description.

I tried to separate the important changes from the necessary
infrastructure preparation, so that we can focus on the relevant parts.

The following kernel versions have been tested:
 - 5.10.61-dovetail  the base version for this series
 - 5.10.70-dovetail  see 5.15-rc2-dovetail below
 - 5.15-rc2-dovetail no problem during system boot, but a similar
                     problem when shutting down

Attached to this cover letter:
 - The output of the modified /proc/interrupts file
 - The output of the new /proc/interrupt_log_times file

Each patch in this series carries additional notes on what I noticed
when looking at the results.

This issue is really hard to track down. When I do any one of the
following, the problem is "gone":

 - Setting CONFIG_HZ to 1000 makes the problem seen during boot go
   away, but the system then freezes during reboot/shutdown. We're
   using a Debian-based kernel configuration, where CONFIG_HZ is set
   to 250 by default.

 - Enabling tracing (even with a minimal filter) makes the problem go
   away.

 - Instrumenting the IRQ pipeline code with printk() makes the problem
   go away. So accounting can be done in the kernel, but the results
   have to be printed from userspace somehow. But...

 - Once the netdev watchdog has triggered, I have only a few seconds
   left before the system becomes unusable. Neither the network nor
   the serial connection is usable after that.

 - Booting server-class hardware takes several minutes...

 - Booting with idle=poll makes the problem go away, see the notes in
   patch 4.

 - Booting with maxcpus=1 makes the problem go away.
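
To illustrate the "print from userspace" idea, here is a minimal
sketch, not part of this series. It assumes the stock /proc/interrupts
layout (a header row of CPU names, then "label: count count ...
description"); the modified file from patch 4 adds further counters
whose layout is not covered here. The names parse_interrupts, delta,
log_snapshots and the log file name are made up for this example.

```python
import os
import time


def parse_interrupts(text):
    """Parse stock /proc/interrupts text into {irq_label: [count per CPU]}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an IRQ row
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the description text
            counts.append(int(field))
        table[label.strip()] = counts
    return table


def delta(before, after):
    """Per-CPU increase of every IRQ present in both snapshots."""
    return {
        irq: [b - a for a, b in zip(before[irq], after[irq])]
        for irq in after
        if irq in before
    }


def log_snapshots(src="/proc/interrupts", dst="irq_snapshots.log",
                  interval=1.0, count=10):
    """Append timestamped raw snapshots to dst, syncing each one to disk."""
    with open(dst, "a") as out:
        for _ in range(count):
            with open(src) as f:
                out.write(f"--- {time.monotonic():.3f}\n{f.read()}")
            out.flush()
            os.fsync(out.fileno())  # keep the data if the system freezes
            time.sleep(interval)
```

Calling log_snapshots() early during boot and feeding consecutive
snapshots through parse_interrupts() and delta() afterwards would show
which per-CPU counters stopped advancing. Whether the polling itself
perturbs the timing enough to hide the problem is an open question.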

[1] https://xenomai.org/pipermail/xenomai/2021-September/046449.html

Best regards,
Florian

Florian Bezdeka (5):
  net/sched/sch_generic: Measure the time between watchdog runs
  kernel/irq/irqdesc: Extend struct irq_desc with a few cpu specific
    members
  kernel/irq/pipeline: Instrument irq_post_stage and pull_next_irq
  kernel/irq/proc: Integrate post/pull/merge counters into
    /proc/interrupts
  kernel/irq/proc: Implement /proc/interrupt_log_times

 fs/proc/interrupts.c      |  8 +++++
 include/linux/interrupt.h |  1 +
 include/linux/irqdesc.h   | 11 ++++++
 kernel/irq/irqdesc.c      | 24 +++++++++++--
 kernel/irq/pipeline.c     | 34 ++++++++++++++++++
 kernel/irq/proc.c         | 73 ++++++++++++++++++++++++++++++++++++---
 net/sched/sch_generic.c   | 21 +++++++++++
 7 files changed, 166 insertions(+), 6 deletions(-)

-- 
2.30.2



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: proc_interrupt_log_times.txt
URL: 
<http://xenomai.org/pipermail/xenomai/attachments/20211102/9ad544fc/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: proc_interrupts.txt
URL: 
<http://xenomai.org/pipermail/xenomai/attachments/20211102/9ad544fc/attachment-0001.txt>
