On Saturday, 19 October 2024, Muni Sekhar <munisekhar...@gmail.com> wrote:
> Dear Linux Kernel Developers,
>
> I am encountering a soft lockup issue in my system related to the
> continuous while loop in the empty_rx_fifo() function. Below is the
> relevant code:
>
> #include <linux/io.h> // For readw()
>
> #define FIFO_STATUS 0x0014
> #define FIFO_MAN_READ 0x0015
> #define RX_FIFO_EMPTY 0x01 // Assuming RX_FIFO_EMPTY is defined as 0x01
>
> static inline uint16_t read16_shifted(void __iomem *addr, u32 offset)
> {
>     // Left shift the offset by 1 and add to the base address
>     void __iomem *target_addr = addr + (offset << 1);
>     // Read the 16-bit value from the calculated address
>     uint16_t value = readw(target_addr);
>     return value;
> }
>
> void empty_rx_fifo(void __iomem *addr)
> {
>     while (!(read16_shifted(addr, FIFO_STATUS) & RX_FIFO_EMPTY)) {
>         // Keep reading from the FIFO until it's empty
>         read16_shifted(addr, FIFO_MAN_READ);
>     }
> }
>
> Explanation:
> Function Name: read16_shifted - The function reads a 16-bit value from
> an offset address with a left shift operation.
> Operation: It shifts the offset left by 1 (offset << 1), adds it to
> the base address, and reads the value from the new address.
> The empty_rx_fifo function is designed to clear out the RX FIFO, but
> I've encountered soft lockup issues. Specifically, the system logs
> repeated soft lockup messages in the kernel log, with a time gap of
> roughly 28 seconds between them (as per the kernel log timestamps).
> Here's an example log:
>
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s!
>
> In all cases, the RIP points to:
> RIP: 0010:read16_shifted+0x11/0x20
>
> Analysis:
> The soft lockup seems to be caused by the continuous while loop in the
> empty_rx_fifo() function. The RX FIFO takes a considerable amount of
> time to empty, sometimes up to 1000 seconds. As a result, from the
> first occurrence of the soft lockup trace, the log repeats
> approximately every 28 seconds for the entire 1000 seconds duration.
> After 1000 seconds, the system resumes normal operation.
>
> Questions:
> 1. How should I best handle this kind of issue? Even if the hardware
> takes time, I would like advice on the best approach to prevent these
> lockups.

I guess you could switch to an interrupt-driven model, or run a thread
that polls the status (that is, checks whether the RX FIFO is empty and
releases the CPU between checks).

> 2. Do soft lockup issues auto-recover like this? Is this something I
> should consider serious, or can it be ignored?

The kernel is telling you that your CPU is stuck in that loop instead of
doing something useful.

> I would appreciate any guidance on how to resolve or mitigate this problem.
>
> --
> Thanks,
> Sekhar
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

--
Regards / Mit besten Grüßen,
Denis
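The polling suggestion above (check RX empty, then release the CPU) could look roughly like the sketch below. It is a userspace simulation, not driver code: struct fake_fifo, drain_rx_fifo_budgeted() and the budget value are hypothetical names made up for illustration, and the register read is faked so the loop can be compiled and tried outside the kernel. The idea is only that each pass reads a bounded number of words and then returns, instead of spinning until the hardware is done.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIFO_STATUS   0x0014
#define FIFO_MAN_READ 0x0015
#define RX_FIFO_EMPTY 0x01

/* Hypothetical stand-in for the device: counts the words still queued. */
struct fake_fifo {
	unsigned int words_left;
};

/*
 * Simulated register access. In the driver this would be the original
 * readw(addr + (offset << 1)).
 */
static uint16_t read16_shifted(struct fake_fifo *f, uint32_t offset)
{
	if (offset == FIFO_STATUS)
		return f->words_left ? 0 : RX_FIFO_EMPTY;
	if (offset == FIFO_MAN_READ && f->words_left)
		f->words_left--;
	return 0;
}

/*
 * Read at most `budget` words, then return so the caller can give up
 * the CPU. Returns true once the FIFO reports empty, false if the
 * budget ran out first.
 */
static bool drain_rx_fifo_budgeted(struct fake_fifo *f, unsigned int budget)
{
	while (budget--) {
		if (read16_shifted(f, FIFO_STATUS) & RX_FIFO_EMPTY)
			return true;
		read16_shifted(f, FIFO_MAN_READ);
	}
	return (read16_shifted(f, FIFO_STATUS) & RX_FIFO_EMPTY) != 0;
}
```

In a real driver the caller would presumably be a kernel thread or a delayed work item that calls cond_resched(), sleeps with usleep_range(), or re-queues itself with schedule_delayed_work() whenever the function returns false. Which of these fits depends on the context the drain runs in, which the original mail does not say.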