With the mvapi BTL you can now set OMPI_MCA_btl_mvapi_use_srq=1, which causes mvapi to use a shared receive queue (SRQ). This should scale much better, because receives are posted per interface port rather than per queue pair. Note: older versions of the Mellanox firmware may see a substantial hit to small-message latency, but the latest firmware shows only a small cost, on the order of 2/10 usec.
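
For anyone scripting their runs, here is a minimal sketch of two ways to turn the parameter on: through the environment variable named above, or through the equivalent "--mca" option to mpirun (Open MPI maps an OMPI_MCA_<name> variable to the MCA parameter <name>). The application path "./my_mpi_app", the process count, and the helper function names are placeholders for illustration, not anything from the Open MPI tree. The script only builds and prints the settings; it does not launch anything.

```python
import os


def srq_environment(base_env=None):
    """Return a copy of the environment with the mvapi SRQ parameter set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable name taken from the message above.
    env["OMPI_MCA_btl_mvapi_use_srq"] = "1"
    return env


def mpirun_command(app="./my_mpi_app", nprocs=4):
    """Build an mpirun argument list that sets the same parameter via --mca.

    app and nprocs are hypothetical placeholders; substitute your own.
    """
    return ["mpirun", "--mca", "btl_mvapi_use_srq", "1",
            "-np", str(nprocs), app]


if __name__ == "__main__":
    env = srq_environment()
    print("OMPI_MCA_btl_mvapi_use_srq=" + env["OMPI_MCA_btl_mvapi_use_srq"])
    print(" ".join(mpirun_command()))
```

Either form should be enough on its own; pass the dictionary as the env argument to subprocess.run if you prefer the environment route, or run the printed command line directly.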

- Galen
