Vu Pham wrote:
Cameron Harr wrote:
Vu Pham wrote:

Alternatively, is there anything in the SCST layer I should tweak. I'm
still running rev 245 of that code (kinda old, but works with OFED 1.3.1
w/o hacks).

With blockio I get the best performance + stability with scst_threads=1
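[For reference, scst_threads is a load-time module parameter of the SCST core in that generation of the code; a sketch of how it could be set, assuming the modules are not built in (module load order and names should be verified against your SCST revision):

```shell
# Reload the stack with a single SCST I/O thread (requires root).
# Unload in reverse dependency order first:
rmmod ib_srpt scst_vdisk scst

# SCST core with one processing thread, then the handlers on top:
modprobe scst scst_threads=1
modprobe scst_vdisk
modprobe ib_srpt
```
]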

I got best performance with threads=2 or 3, and I've noticed that the srpt_thread is often at 99%, though increasing or decreasing the "thread=?" parameter for ib_srpt doesn't seem to make a difference. A second initiator doesn't seem to help much either; with a single initiator writing to two targets, I can now usually get between 95K and 105K IOPs.

My target server (with DAS) contains 8 2.8 GHz CPU cores and can sustain over 200K IOPs locally, but only around 73K IOPs over SRP.

Is this number from one initiator or multiple?
One initiator. At first I thought it might be a limitation of SRP, so I added a second initiator, but the aggregate performance of the two was about equal to that of a single initiator.

Try again with scst_threads=1. I expect that you can get ~140K IOPs with two initiators.

Unfortunately, I'm nowhere close to that high, though I am significantly higher than before. Two initiators do seem to reduce the context-switching rate, however, which is good.
Looking at /proc/interrupts, I see that the mlx_core (comp) device is pushing about 135K interrupts/s on 1 of 2 CPUs. All CPUs are enabled for that PCI-E slot, but it only ever uses 2 of the CPUs, and only 1 at a time. None of the other CPUs has an interrupt rate of more than about 40-50K/s.
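[If the completion interrupts are landing on only one or two CPUs, it may be worth inspecting the IRQ's affinity mask. A sketch, where N stands for whatever IRQ number the mlx completion vector shows under in /proc/interrupts (the exact device name varies, and irqbalance may rewrite the mask):

```shell
# Find the IRQ line(s) for the HCA's completion vector:
grep -i mlx /proc/interrupts

# Show which CPUs IRQ N may be delivered to (a hex CPU bitmask):
cat /proc/irq/N/smp_affinity

# Example: allow CPUs 0-3 for IRQ N (requires root):
echo f > /proc/irq/N/smp_affinity
```
]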
The number of interrupts can be cut down if there are more completions to be processed by software per interrupt, i.e. please test with multiple QPs between one initiator and your target, and with multiple initiators against your target.
Interrupts are still pretty high (around 160K/s now), but that seems not to be my bottleneck. Context switching seems to be about 2-2.5 switches per IOP, and sometimes less - not perfect, but not horrible either.
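[For what it's worth, the switches-per-IOP figure quoted above can be derived by dividing the cs column of `vmstat 1` by the IOPS reported by the load generator; a trivial sketch with made-up sample numbers:

```shell
# Hypothetical samples: cs/s taken from `vmstat 1`, IOPS from the benchmark tool.
cs_per_sec=230000
iops=100000
awk -v cs="$cs_per_sec" -v iops="$iops" \
    'BEGIN { printf "%.1f context switches per IOP\n", cs / iops }'
# prints "2.3 context switches per IOP"
```
]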

ib_srpt processes completions in its event callback handler. With more QPs there are more completions pending per interrupt, instead of one completion event per interrupt. You can have multiple QPs between an initiator and a target by using different initiator_ext values, i.e.:

echo id_ext=xxx,ioc_guid=yyy,....initiator_ext=1 > /sys/class/infiniband_srp/.../add_target
echo id_ext=xxx,ioc_guid=yyy,....initiator_ext=2 > /sys/class/infiniband_srp/.../add_target
echo id_ext=xxx,ioc_guid=yyy,....initiator_ext=3 > /sys/class/infiniband_srp/.../add_target
This doesn't seem to net much of an improvement, though I understand the reasoning behind it. My hunch is there's another bottleneck to look for now.

Cameron
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
