Marko Zec wrote:
> On Wednesday 30 November 2005 16:18, Danial Thom wrote:
>> --- Hiten Pandya <[EMAIL PROTECTED]> wrote:
>>> Marko Zec wrote:
>>>> Should we really be that pessimistic about potential MP performance,
>>>> even with only two NICs? Packet flows are typically bi-directional,
>>>> and if we could have one CPU/core taking care of each direction,
>>>> there should be at least some room for parallelism, especially once
>>>> the parallelized routing tables see the light. Of course, provided
>>>> that each NIC is handled by a separate core and that IPC doesn't
>>>> become the actual bottleneck.
>>>
>>> On a similar note, it is important that we add the *hardware* support
>>> for binding a set of CPUs to particular interrupt lines. I believe
>>> that the API support for CPU-affinitized interrupt threads is already
>>> there, so only the hard work is left of converting the APIC code from
>>> physical to logical access mode.
>>>
>>> I am not sure how the AMD64 platform handles CPU affinity; by that I
>>> mean whether the same infrastructure put in place for i386 would work
>>> with a few modifications here and there. The recent untangling of the
>>> interrupt code should make it simpler for others to dig into adding
>>> interrupt affinity support.
>>
>> This, by itself, is not enough, albeit useful. What you need to do is
>> separate transmit and receive (which use the same interrupts, of
>> course). The only way to increase capacity for a single stream with
>> MP is to separate TX and RX.
>
> Unless you are doing fancy outbound queuing, which typically doesn't
> make much sense at 1 Gbit/s speeds and above, I'd bet that
> significantly more CPU cycles are spent in the RX part than in the TX
> part, which basically only has to enqueue a packet into the device's
> DMA ring and recycle already-transmitted mbufs.
> The other issue with having separate CPUs handling the RX and TX parts
> of the same interface would be the locking mess: you would end up with
> the per-data-structure locking model of FreeBSD 5.0 and later, which
> DragonFly diverted from.
And what about using CPUs for both RX and TX? That is, binding a packet
flow to a single CPU that handles both its RX and TX?

Cheers
--
Alfredo Beaumont. GPG: http://aintel.bi.ehu.es/~jtbbesaa/jtbbesaa.gpg.asc
Elektronika eta Telekomunikazioak Saila (Ingeniaritza Telematikoa)
Euskal Herriko Unibertsitatea, Bilbao (Basque Country). http://www.ehu.es