>
> I guess you mean 'consumer' here. The scheduler doesn't fail to migrate
> it: the consumer is actually migrated many times, but on each cpu it
> finds a competing, running ksoftirqd thread.
>
> The general problem is that under significant network load (not
> necessarily UDP flood; similar behavior is observed even with TCP_RR
> tests), with enough rx queues available and enough flows running, no
> single thread/process can use 100% of any cpu, even if the overall
> capacity would allow it.
>

Looks like a general process scheduler issue?

Really, allowing RX processing to be migrated among cpus is
problematic for TCP, as it will increase reordering.
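
To make the reordering point concrete, here is a toy user-space sketch
(nothing kernel-specific, the numbers are made up): two packets of the
same flow land behind per-cpu backlogs of different depths, and the
later packet reaches the socket first.

        #include <stdio.h>

        /* Toy model: each cpu drains its own backlog independently.  If
         * packet #1 of a flow sits behind a deep backlog on cpu0 while
         * packet #2 lands on an idle cpu1, #2 is delivered first. */
        int main(void)
        {
                unsigned int backlog_cpu0 = 100;  /* packets already queued on cpu0 */
                unsigned int backlog_cpu1 = 0;    /* cpu1 backlog is empty */

                unsigned int dequeues_for_pkt1 = backlog_cpu0 + 1;
                unsigned int dequeues_for_pkt2 = backlog_cpu1 + 1;

                if (dequeues_for_pkt2 < dequeues_for_pkt1)
                        printf("packet #2 delivered before packet #1 -> reorder\n");
                return 0;
        }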

RFS, for example, has very specific logic to avoid these problems as
much as possible:

                /*
                 * If the desired CPU (where last recvmsg was done) is
                 * different from current CPU (one in the rx-queue flow
                 * table entry), switch if one of the following holds:
                 *   - Current CPU is unset (>= nr_cpu_ids).
                 *   - Current CPU is offline.
                 *   - The current CPU's queue tail has advanced beyond the
                 *     last packet that was enqueued using this table entry.
                 *     This guarantees that all previous packets for the flow
                 *     have been dequeued, thus preserving in order delivery.
                 */
                if (unlikely(tcpu != next_cpu) &&
                    (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
                     ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
                      rflow->last_qtail)) >= 0)) {
                        tcpu = next_cpu;
                        rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
                }
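
As a rough standalone sketch of that ordering check (the struct and
field names below are simplified stand-ins, not the real softnet_data /
rps_dev_flow layout), the idea is a wraparound-safe "has the old cpu's
backlog head advanced past the last packet we steered there?" test:

        #include <stdbool.h>
        #include <stdio.h>

        /* Simplified stand-in for the per-cpu backlog counter:
         * incremented every time a packet is dequeued on that cpu. */
        struct toy_backlog {
                unsigned int input_queue_head;
        };

        /* Simplified flow entry: remembers the backlog position of the
         * last packet steered to the old cpu. */
        struct toy_flow {
                unsigned int last_qtail;
        };

        /* True when every packet steered to the old cpu has already been
         * dequeued, so moving the flow cannot reorder it.  The signed
         * cast keeps the comparison correct across counter wraparound,
         * as in the kernel snippet above. */
        static bool can_switch_cpu(const struct toy_backlog *old_cpu,
                                   const struct toy_flow *flow)
        {
                return (int)(old_cpu->input_queue_head - flow->last_qtail) >= 0;
        }

        int main(void)
        {
                struct toy_backlog cpu = { .input_queue_head = 1000 };
                struct toy_flow flow = { .last_qtail = 990 };

                printf("switch ok: %d\n", can_switch_cpu(&cpu, &flow)); /* 1 */
                flow.last_qtail = 1005;
                printf("switch ok: %d\n", can_switch_cpu(&cpu, &flow)); /* 0 */
                return 0;
        }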
