On Jun 15, 2012, at 11:26 AM, Daniels, Marcus G wrote:
>> Were there any clues in /var/log/messages or dmesg?
>
> Thanks. I found a suggestion from Nathan Hjelm to add "options mlx4_core
> log_mtts_per_seg=X" (where X is 5 in my case).
> Offline suggestions (which also included that) were als
Hi Marcus
Sounds like you might be running out of IB resources as opposed to main memory
- not much we can suggest there other than trying to set queue sizes, which is
a complicated option. You might look at "ompi_info --param btl openib" and see
if adjusting some of those helps.
Ralph
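
[Editorial note] As background for the log_mtts_per_seg suggestion quoted
above: on mlx4 hardware the amount of memory that can be registered with the
HCA is commonly estimated as (2^log_num_mtt) * (2^log_mtts_per_seg) *
page_size. The sketch below only does that arithmetic. log_num_mtt is a second
mlx4_core module parameter that this thread does not mention, and the value 20
used for it, like the 4096-byte page size, is an illustrative assumption; only
log_mtts_per_seg=5 comes from the thread. Compare the result against the
physical RAM on the node before changing anything.

```python
# Rough estimate of how much memory an mlx4 HCA can register.
# Assumption: max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * page_size.
# The function name and the default page size are hypothetical, for illustration.

def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Return the estimated upper bound on registerable memory, in bytes."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # log_mtts_per_seg=5 is the value reported in the thread;
    # log_num_mtt=20 is an assumed value, not taken from the thread.
    limit = max_registerable_bytes(log_num_mtt=20, log_mtts_per_seg=5)
    print(f"estimated limit: {limit / 2**30:.0f} GiB")
```

If the estimate comes out below the node's physical memory, large jobs may fail
to register buffers even though main memory is not exhausted, which would be
consistent with the symptom described in the reply above.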
On Jun 15, 2012, at 8:02 AM, Jeff Squyres wrote:
> Were there any clues in /var/log/messages or dmesg?
>
Thanks. I found a suggestion from Nathan Hjelm to add "options mlx4_core
log_mtts_per_seg=X" (where X is 5 in my case).
Offline suggestions (which also included that) were to also add "--mc
Were there any clues in /var/log/messages or dmesg?
You might also want to check out this IBM writeup about some Mellanox
parameters:
http://www.ibm.com/developerworks/wikis/display/hpccentral/Using+RDMA+with+pagepool+larger+than+8GB
That IBM server seems to be misbehaving right now (500/intern