On Jun 15, 2012, at 11:26 AM, Daniels, Marcus G wrote:
>> Were there any clues in /var/log/messages or dmesg?
>
> Thanks. I found a suggestion from Nathan Hjelm to add "options mlx4_core
> log_mtts_per_seg=X" (where X is 5 in my case).
> Offline suggestions (which also included that) were
Hi Marcus
Sounds like you might be running out of IB resources rather than main memory.
There's not much we can suggest there other than trying to set queue sizes,
which is a complicated option. You might look at "ompi_info --param btl openib"
and see if adjusting some of those parameters helps.
Ralph
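
For context on the log_mtts_per_seg suggestion quoted above: as I understand
the Open MPI FAQ, the amount of memory an mlx4 adapter can register is roughly
2^log_num_mtt * 2^log_mtts_per_seg * page size. The sketch below just evaluates
that formula. The log_num_mtt value of 20 and the 4 KiB page size are
assumptions for illustration, not values reported in this thread, and the
function name is hypothetical.

```python
# Sketch: estimate registerable memory for an mlx4 HCA.
# Formula (as described in the Open MPI FAQ, to the best of my knowledge):
#   max_reg_mem = 2**log_num_mtt * 2**log_mtts_per_seg * page_size
# log_num_mtt=20 and page_size=4096 are assumed values, not from the thread.

def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Return the estimated upper bound on registerable memory, in bytes."""
    return (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # Compare a smaller setting with the value of 5 mentioned in the thread.
    for seg in (3, 5):
        gib = max_registerable_bytes(20, seg) / 2**30
        print(f"log_num_mtt=20 log_mtts_per_seg={seg}: {gib:.0f} GiB")
```

If the result is smaller than the physical memory on the node, registration
failures like the one reported here become plausible once a job touches enough
memory. The values actually in effect can usually be read from
/sys/module/mlx4_core/parameters/ on the node in question.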
On Jun 15, 2012, at 8:02 AM, Jeff Squyres wrote:
> Were there any clues in /var/log/messages or dmesg?
>
Thanks. I found a suggestion from Nathan Hjelm to add "options mlx4_core
log_mtts_per_seg=X" (where X is 5 in my case).
Offline suggestions (which also included that) were also add
Were there any clues in /var/log/messages or dmesg?
You might also want to check out this IBM writeup about some Mellanox
parameters:
http://www.ibm.com/developerworks/wikis/display/hpccentral/Using+RDMA+with+pagepool+larger+than+8GB
That IBM server seems to be misbehaving right now
Hi,
Is there anything I can do about this? I don't have any locked memory limits.
Thanks,
Marcus
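
A quick way to confirm the locked-memory limit mentioned above is to query it
from inside a process started the same way as the MPI ranks, since the limit
seen in an interactive shell can differ from the one a remote launcher or
resource manager gives the job. This is a minimal sketch using Python's
standard resource module; the helper name memlock_limits is hypothetical.

```python
import resource


def memlock_limits():
    """Return (soft, hard) RLIMIT_MEMLOCK as human-readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def fmt(value):
        # RLIM_INFINITY means no limit is imposed.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value} bytes"

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft={soft} hard={hard}")
```

Running it under mpirun on each node (rather than from a login shell) shows
the limit the job actually inherits.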
Creating ensight file: EnSight6.geo01 elapsed secs= 6.84
--
The OpenFabrics (openib) BTL failed to register memory