On Thu, Sep 24, 2020 at 11:37 PM Thomas Gleixner <t...@linutronix.de> wrote:
>
> On Tue, Sep 15 2020 at 19:56, qianjun kernel wrote:
> >
> > +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
> >
> > +/*
> > + * The pending_next_bit is recorded for the next processing order when
> > + * the loop is broken. This per cpu variable is to solve the following
> > + * scenarios:
> > + * Assume bit 0 and 1 are pending when the processing starts. Now it
> > + * breaks out after bit 0 has been handled and stores back bit 1 as
> > + * pending. Before ksoftirqd runs bit 0 gets raised again. ksoftirqd
> > + * runs and handles bit 0, which takes more than the timeout. As a
> > + * result the bit 0 processing can starve all other softirqs.
> > + *
> > + * so we need the pending_next_bit to record the next process order.
> > + */
> > +DEFINE_PER_CPU(u32, pending_next_bit);
>
> static if at all.
>
> > +
> >  asmlinkage __visible void __softirq_entry __do_softirq(void)
> >  {
> >         u64 start = sched_clock();
> > @@ -261,8 +277,11 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >         unsigned int max_restart = MAX_SOFTIRQ_RESTART;
> >         struct softirq_action *h;
> >         unsigned long pending;
> > +       unsigned long pending_left, pending_again;
> >         unsigned int vec_nr;
> >         bool in_hardirq;
> > +       int next_bit;
> > +       unsigned long flags;
> >
> >         /*
> >          * Mask out PF_MEMALLOC as the current task context is borrowed for the
> > @@ -283,25 +302,66 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >
> >         local_irq_enable();
> >
> > -       for_each_set_bit(vec_nr, &pending, NR_SOFTIRQS) {
> > -               int prev_count;
> > -
> > -               __clear_bit(vec_nr, &pending);
> > -
> > -               h = softirq_vec + vec_nr;
> > -
> > -               prev_count = preempt_count();
> > -
> > -               kstat_incr_softirqs_this_cpu(vec_nr);
> > +       /*
> > +        * pending_left means the bits left unhandled when the loop is
> > +        * broken without finishing the vectors. These bits will be
> > +        * handled first next time. pending_again means the new bits
> > +        * generated in the meantime. These bits should be handled after
> > +        * the pending_left bits have been handled.
> > +        *
> > +        * For example:
> > +        * If the pending bits are 1101010110 and the loop is broken after
> > +        * bit4 is handled, then pending_next_bit will be 5,
> > +        * pending_left is 1101000000 and pending_again is 0000000110.
> > +        */
>
> If you need such a comment to explain the meaning of your variables then
> you did something fundamentally wrong.
>
> > +       next_bit = __this_cpu_read(pending_next_bit);
> > +       pending_left = pending &
> > +               (SOFTIRQ_PENDING_MASK << next_bit);
> > +       pending_again = pending &
> > +               (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
> > +
> > +       while (pending_left || pending_again) {
> > +               if (pending_left) {
> > +                       pending = pending_left;
> > +                       pending_left = 0;
> > +               } else if (pending_again) {
> > +                       pending = pending_again;
> > +                       pending_again = 0;
> > +               } else
> > +                       break;
>
> Aside of lacking brackets, how is that 'else' path ever going to be
> reached?
>
> But TBH that whole patch is a completely unreviewable maze.
>
> This can be done without all this pending, pending_left, pending_again,
> pending_next_bit, next_bit convolution. It's inconsistent anyway:
>
>     __do_softirq()
>
>         pending = 0x25;
>         next = 0;
>
>         for (...)
>             break after bit 0
>
>         ==> pending == 0x24
>         ==> next = 2
>
>     now on the next invocation
>
>         pending = 0x35;
>         next = 2;
>
> So the processing order is 2, 4, 5, 0
>
> and there is nothing you can do about that with that approach.
>
> But the whole point is to ensure that the not yet processed bits are
> processed first.
>
> Find attached an updated series based on the original one from Peter
> with the authorship preserved, intact SOB chains and proper changelogs.
>
> The last one is new and addresses the starvation issue in a readable
> way.
>
> All of this is again completely untested.
>
> Thanks,
>
>         tglx
>
I will fix it and test it. After testing, I will send the patch again. Thanks.