I believe there is an issue with the default setting for the number of sockets
on a node. We switched to discovering it in the 1.7 series and later, but the
default value in the 1.6 series was set to zero (I believe it defaults to 1 in
1.4).
Try adding "-mca orte_num_sockets N -mca orte_num_co

Yevgeny, Jeff,
I've tried 26/2 on a node with 2TB RAM - the IB cards are not reachable with
this setup.
26/3 is not yet tested (it's a bit of work for our admins to 'repair' a node if
it is not reachable over the IB interface).
By now we have a couple of nodes with up to 2TB RAM running with 23