On Tue, May 17, 2005 at 06:32:38PM -0700, Jeff Carr wrote:
> >>> But IPoIB can't really implement NAPI since it's sending work to
> >>> a shared HCA.
>
> Hmm. I'm not knowledgeable enough to know why; I'll have to take your
> word for it. I'm not sure yet about all the conditions under which the
> HCA can generate interrupts.
Wellll... looks like I'm wrong. Previous email on this thread, from people
who know a lot more about it than I do, suggested it's possible. But I'm
still concerned it's going to affect latency.

> But if I sit back and look at the logic of this argument then it seemed
> like:
>
> Hey, is there a way to not generate so many interrupts?
> That's handled by NAPI.
> OK. That looks interesting.

Right - but that's the generic "this is how Linux deals with this" argument.

> But, we can't do NAPI because we can't just disable interrupts.

Sorry - seems like I'm assuming too much about the capabilities of the HCAs.

> Darn.
> But wait, why can't we just not generate interrupts in the first place then?
>
> Isn't that what the midas touch of netdev->poll() really is? e1000 has:
> quit_polling: netif_rx_complete(netdev);
>               e1000_irq_enable(adapter);

I'm more familiar with the tg3 driver. The disadvantage to NAPI in the tg3
implementation is that it *always* disables interrupts on the card before
calling netif_rx_schedule(). Then it lets the OS decide when to actually
process those packets *already* received, in a safer context. Once tg3
decides it's done all the work, it re-enables the interrupts - just like
e1000 does above. (There's a rough sketch of that pattern at the end of
this mail.) There are some workloads where the PCI bus utilization is
"suboptimal" because the enable/disable of interrupts interferes with the
DMA flows and costs excessive overhead.

> Maybe IB can mimic the concept here by acting intelligently for us?
> Have disable_rx_and_rxnobuff_ints() only disable interrupts for the
> IPoIB ULP? Anyway, my knowledge here still sucks so I'm probably so far
> off base I'm not even on the field. Either way it's fun digging around here.

Based on previous comments, I'm hoping that's the case. But I don't know either.

> > One can. Using SDP, netperf TCP_STREAM measured 650 MB/s using the
> > regular PCI-X card.
>
> Yes, I have the same speed results using perf_main().
>
> The perf_main() test isn't that interesting, I think, though. It really
> just transfers the exact same memory window across two nodes (at least
> as far as I can tell that is what it does).
>
> Anyway, I'm just noticing that this simple dd test from memory doesn't
> go much over 1 GB/sec. So this is an interesting non-IB problem.
>
> [EMAIL PROTECTED]:/# dd if=/dev/shm/test of=/dev/null bs=4K
> 196608+0 records in
> 196608+0 records out
> 805306368 bytes transferred in 0.628504 seconds (1281306571 bytes/sec)

Yeah. Sounds like there is. You should be able to do several GB/s like that.
I suppose it's possibly an issue with the memory controller too. ZX1
interleaves accesses across 4 DIMMs to get the memory bandwidth. Might check
to make sure your box is "optimally" configured too.

[EMAIL PROTECTED] dd if=/dev/shm/test of=/dev/null bs=4K
dd: opening `/dev/shm/test': No such file or directory

Sorry - what do I need to do to create /dev/shm/test? I should probably
"cheat" and use a 16KB block size since that is the native page size on ia64.

thanks,
grant
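P.S. Since I hand-waved about the tg3/e1000 flow above, here's roughly the
shape of it using the 2.6 netdev->poll interface. The mydrv_* names are
made-up placeholders for whatever the hardware-specific mask/unmask and
ring-cleaning routines would be, so treat this as a sketch of the pattern,
not working driver code:

    #include <linux/kernel.h>
    #include <linux/netdevice.h>
    #include <linux/interrupt.h>

    /* Hypothetical hardware-specific helpers -- placeholders only. */
    extern void mydrv_irq_disable(struct net_device *dev);  /* mask card interrupts   */
    extern void mydrv_irq_enable(struct net_device *dev);   /* unmask card interrupts */
    extern int  mydrv_clean_rx(struct net_device *dev, int limit); /* packets cleaned */

    /* Interrupt handler: don't process packets here.  Mask further
     * interrupts from the card and hand the rest of the work to the OS. */
    static irqreturn_t mydrv_interrupt(int irq, void *dev_id, struct pt_regs *regs)
    {
            struct net_device *dev = dev_id;

            if (netif_rx_schedule_prep(dev)) {
                    mydrv_irq_disable(dev);   /* like tg3: no more irqs until poll is done */
                    __netif_rx_schedule(dev); /* queue dev->poll to run in softirq context */
            }
            return IRQ_HANDLED;
    }

    /* dev->poll: runs later, with a work budget, in that "safer" context. */
    static int mydrv_poll(struct net_device *dev, int *budget)
    {
            int limit = min(*budget, dev->quota);
            int done  = mydrv_clean_rx(dev, limit);

            *budget    -= done;
            dev->quota -= done;

            if (done < limit) {
                    /* Ring drained: leave polling mode and re-enable interrupts
                     * (the step that hurts PCI utilization on some workloads). */
                    netif_rx_complete(dev);
                    mydrv_irq_enable(dev);
                    return 0;
            }
            return 1;   /* more work left; the OS will call us again */
    }

At init time the driver just hooks this up with dev->poll = mydrv_poll and
whatever dev->weight it picks (e1000 and tg3 use 64).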
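P.P.S. Thinking about it a bit more - /dev/shm is just tmpfs, so I'm guessing
something along these lines is all it takes to create the test file (196608
4K records, to match the 805306368 bytes in your run) and then read it back
with the 16KB block size:

    dd if=/dev/zero of=/dev/shm/test bs=4K count=196608
    dd if=/dev/shm/test of=/dev/null bs=16K

Sizes are a guess on my part - correct me if that's not how you set it up.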