[lustre-discuss] Multiple IB Interfaces

2021-03-09 Thread Ms. Megan Larko via lustre-discuss
Greetings Alastair, Bonding is supported on InfiniBand, but I believe that it is only active/passive. I think what you might be looking for, WRT avoiding data travel through the inter-CPU link, is CPU "affinity", AKA CPU "pinning". Cheers, megan (WRT = "with regard to"; AKA = "also known as")
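The affinity suggestion above can be sketched as follows. This is an illustrative sketch, not a command from the thread: the HCA name (mlx5_0), the sysfs layout, and the numactl invocation are assumptions.

```shell
#!/bin/sh
# Sketch: find the NUMA node local to an InfiniBand HCA, then pin a process
# (CPUs and memory) to that node so its traffic avoids the inter-CPU link.

hca_numa_node() {
  # sysfs exposes the NUMA node of each HCA's PCI device;
  # SYSFS is overridable here only to make the function testable.
  cat "${SYSFS:-/sys}/class/infiniband/$1/device/numa_node"
}

pin_cmd() {
  # Build the numactl prefix that binds CPUs and memory to one node.
  echo "numactl --cpunodebind=$1 --membind=$1"
}

# Usage (illustrative):
#   node="$(hca_numa_node mlx5_0)"
#   $(pin_cmd "$node") my_io_daemon
```

For per-thread rather than per-process placement, `taskset -c` with an explicit core list from the same NUMA node is the usual alternative.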

[lustre-discuss] o2ib nid connections timeout until an snmp ping

2021-03-09 Thread Christian Kuntz via lustre-discuss
Hello all, Requisite preamble: this is Debian 10.7 with Lustre 2.13.0 (compiled by yours truly). We've been observing some odd behavior recently with o2ib NIDs. Everyone is connected over the same switch (the cards and switch are all Mellanox), and each machine has a single network card connected in

Re: [lustre-discuss] MDT hanging

2021-03-09 Thread Simon Guilbault via lustre-discuss
Hi, One of the things that the ZFS Pacemaker resource agent does not seem to pick up is the failure when MMP fails due to some problem with the SAS bus. We added this short script, running as a systemd daemon, to do a failover when this happens. The other check in this script uses NHC, mostly to check if
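A minimal sketch of such a watchdog, assuming a ZFS MDT pool named `mdtpool` and Pacemaker's `crm_standby` as the failover step (both are assumptions; this is not the poster's script, which also runs NHC checks):

```shell
#!/bin/sh
# Sketch: detect a ZFS pool suspended by an MMP/multihost write failure
# (e.g. after a SAS bus fault) and push this node into Pacemaker standby
# so the MDT resources fail over elsewhere.

POOL="${POOL:-mdtpool}"   # pool name is an assumption

pool_suspended() {
  # ZFS marks a pool hit by a multihost (MMP) write failure as SUSPENDED
  # in `zpool status` output; this just looks for that state line.
  echo "$1" | grep -q 'state: *SUSPENDED'
}

status="$(zpool status "$POOL" 2>/dev/null)"
if pool_suspended "$status"; then
  crm_standby -v on     # Pacemaker then relocates the resources
fi
```

Run periodically (systemd timer or a loop in a simple service unit); the standby flag must be cleared by an operator once the SAS problem is fixed.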

[lustre-discuss] MDT hanging

2021-03-09 Thread Christopher Mountford via lustre-discuss
Hi, We've had a couple of MDT hangs on 2 of our Lustre filesystems after updating to 2.12.6 (though I'm sure I've seen this exact behaviour on previous versions). The symptoms are a gradually increasing load on the affected MDS, with processes doing I/O on the filesystem blocking indefinitely,

[lustre-discuss] Multiple IB interfaces

2021-03-09 Thread Alastair Basden via lustre-discuss
Hi, We are installing some new Lustre servers with 2 InfiniBand cards, 1 attached to each CPU socket. Storage is NVMe, again with some drives attached to each socket. We want to ensure that data to/from each drive uses the appropriate IB card, and doesn't need to travel through the inter-CPU
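One common approach to NUMA-local traffic with two HCAs is LNet Multi-Rail, declaring one o2ib NI per card and restricting each NI to the CPU partition (CPT) of its socket. A hedged `/etc/lnet.conf` sketch follows; the interface names and the assumption that CPTs 0 and 1 map to sockets 0 and 1 are illustrative, not from the thread:

```yaml
# Illustrative lnet.conf sketch: one o2ib NI per HCA, each bound to the
# CPT of its local socket so LNet prefers the NUMA-local card.
net:
    - net type: o2ib
      local NI(s):
        - interfaces:
              0: ib0          # HCA attached to socket 0
          CPT: "[0]"          # assumes CPT 0 covers socket 0's cores
        - interfaces:
              0: ib1          # HCA attached to socket 1
          CPT: "[1]"
```

`lnetctl import` loads such a file; note that the actual CPT-to-socket mapping depends on the libcfs `cpu_npartitions`/`cpu_pattern` module options and should be verified on the running system.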