Andi Kleen wrote:
Rick Jones <[EMAIL PROTECTED]> writes:

Still, does this look like something worth pursuing?  In a past
life/OS when one was able to eliminate one percentage point of
spinlock contention, two percentage points of improvement ensued.


The stack is really designed to go fast with per-CPU local RX processing
of packets. This normally works because when waking up a task the
scheduler tries to move it to the waking CPU. Since the wakeups happen
on the CPU that processes the incoming packets, the task should usually
end up in the right place.

The trouble is when your NICs are so fast that a single
CPU can't keep up, or when you have programs that process many
different sockets from a single thread.

The fast NIC case will eventually be fixed by adding proper
support for MSI-X and connection hashing. Then the NIC can fan out to multiple interrupts and use multiple CPUs to process the incoming packets.

If that is implemented "well" (for some definition of well) then it might address the many sockets from a thread issue too, but if not...

If it is a simple "hash on the headers" then you still have issues with a process/thread servicing multiple connections - the hashes of the different headers will steer things to different CPUs and you induce the scheduler to flip the process back and forth between them.

The meta question behind all that would seem to be whether the scheduler should be telling us where to perform the network processing, or should the network processing be telling the scheduler what to do? (eg all my old blathering about IPS vs TOPS in HP-UX...)

Then there is the case of a single process having many sockets from different NICs. This will of course be somewhat slower because there will be cross-CPU traffic.

The extreme case I see with the netperf test suggests it will be a pretty big hit. Dragging cachelines from CPU to CPU is evil. Sometimes a necessary evil of course, but still evil.

However, there should not be much socket lock contention, because
a process handling many sockets will hopefully be unlikely to bang
on each of its many sockets at exactly the same time as the stack
receives RX packets. This should also eliminate the spinlock
contention.

Going by that theory, your test sounds somewhat unrealistic to me.
Do you have any evidence you're modelling a real world scenario
here? I somehow doubt it.

Well, yes and no. If I drop the "burst" and instead have N times more netperfs going, I see the same lock contention situation. I wasn't expecting to - I thought that with N different processes on each CPU, the likelihood of contention on any one socket would be low - but it was there just the same.

That is part of what makes me wonder if there is a race between wakeup and release of a lock.


rick
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html