[lustre-discuss] Mistake while removing an OST

2023-02-01 Thread BALVERS Martin via lustre-discuss
Hi,

I have a defective OSS with a single OST that I was trying to remove from the 
Lustre filesystem completely (2.15.1). I was following 
https://doc.lustre.org/lustre_manual.xhtml#lustremaint.remove_ost
I had drained the OST and was using the llog_cancel commands to remove its 
configuration records. This is where it went wrong. I first cancelled the 
attach, setup and add_osc indexes from the 'client' log and then needed to do 
the same for the MDT0000 and MDT0001 logs, but I accidentally removed two more 
indexes from the 'client' log. Now I have an incomplete client llog for one 
OST: I am missing the add_uuid and attach records for OST0002.

[root@mds ~]# lctl --device MGS llog_print lustre-client
- { index: 34, event: add_uuid, nid: 192.168.2.3@tcp(0x2c0a80203), node: 
192.168.2.3@tcp }
- { index: 35, event: attach, device: lustre-OST0001-osc, type: osc, UUID: 
lustre-clilov_UUID }
- { index: 36, event: setup, device: lustre-OST0001-osc, UUID: 
lustre-OST0001_UUID, node: 192.168.2.3@tcp }
- { index: 37, event: add_osc, device: lustre-clilov, ost: lustre-OST0001_UUID, 
index: 1, gen: 1 }

- { index: 42, event: setup, device: lustre-OST0002-osc, UUID: 
lustre-OST0002_UUID, node: 192.168.2.4@tcp }
- { index: 43, event: add_osc, device: lustre-clilov, ost: lustre-OST0002_UUID, 
index: 2, gen: 1 }
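
For reference, the cancel step from the manual that I was running looks roughly 
like this (the index values here are illustrative, not the exact ones I 
cancelled; older lctl versions take the index as a positional argument instead 
of --log_idx):

  # list the records and note the indexes belonging to the OST being removed
  lctl --device MGS llog_print lustre-client
  # cancel one record at a time by index
  lctl --device MGS llog_cancel lustre-client --log_idx=38
  # then repeat the same print/cancel steps for each MDT config log
  lctl --device MGS llog_print lustre-MDT0000
  lctl --device MGS llog_cancel lustre-MDT0000 --log_idx=38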

Is there a way to recover from this?

I hope someone can help.

Regards,
Martin Balvers



Re: [lustre-discuss] Mistake while removing an OST

2023-02-01 Thread Andreas Dilger via lustre-discuss
You should just be able to run the "writeconf" process to regenerate the config 
logs. The removed OST will not re-register with the MGS, but all of the other 
servers will, so it should be fine.
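
In rough outline (the device paths below are placeholders, and everything must 
be unmounted first: clients, then OSTs, then the MDT/MGS):

  # on the MDS (which also hosts the MGS here), and on each remaining OSS
  # for each of its OSTs:
  tunefs.lustre --writeconf /dev/mdtdev
  tunefs.lustre --writeconf /dev/ostdev
  # remount in order: MGS/MDT first, then the OSTs, then the clients
  mount -t lustre /dev/mdtdev /mnt/mdt
  mount -t lustre /dev/ostdev /mnt/ost0

The config llogs are regenerated as each target re-registers, and the removed 
OST simply never comes back to register.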

Cheers, Andreas



[lustre-discuss] Binding OST to CPU NUMA

2023-02-01 Thread Jeffrey Wong via lustre-discuss
Hello,

I am relatively new to Lustre. I have a running Lustre setup and would like to 
ensure that there is no cross-NUMA traffic on the server side.

Each Lustre OSS has the following:

  *   Dual-socket AMD CPUs
  *   4 NUMA regions (2 per socket)
  *   2x Mellanox NICs (one per CPU socket)
  *   16x NVMe drives (8 per CPU socket)

The way I currently have it set up:

  *   For each group of 8 NVMe drives on the same socket I created a RAID0 
array; each RAID0 array is an OST.
  *   In the LNET configuration, each NIC is assigned to the corresponding 
CPTs (we are using RDMA over Ethernet), as in the sketch below:
 *   o2ib0(ens3f0np0)[0,1], o2ib1(ens7f0np0)[2,3]
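
In module-option form that mapping looks roughly like this (the file name and 
exact CPT layout are illustrative; cpu_pattern could be used instead of 
cpu_npartitions to pin CPTs to explicit core lists):

  # /etc/modprobe.d/lustre.conf
  # split the node into 4 CPU partitions (CPTs), one per NUMA region
  options libcfs cpu_npartitions=4
  # bind each LNet network/NIC to the CPTs on its local socket
  options lnet networks="o2ib0(ens3f0np0)[0,1],o2ib1(ens7f0np0)[2,3]"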


According to the documentation, messages coming into the OSS are handled by the 
CPT/NUMA cores that match the receiving NIC. However, for access to the OSTs 
themselves I did not find any tuning option to bind a specific OST to CPU 
cores.

I did see an option called "options ksocklnd enable_irq_affinity=0", but it 
appears to apply only to the Ethernet driver and not to RDMA.

Is there a way to limit each OST's read/write threads to a specific set of CPU 
cores?


Thank you,
Jeff




[lustre-discuss] Full List of Required Open Lustre Ports?

2023-02-01 Thread Ellis Wilson via lustre-discuss
Hi folks,

We've seen some weird stuff recently with UFW/iptables dropping packets on our 
OSS and MDS nodes. We are running 2.15.1. Examples:

[   69.472030] [UFW BLOCK] IN=eth0 OUT= MAC= SRC= DST= LEN=52 
TOS=0x00 PREC=0x00 TTL=64 ID=58224 DF PROTO=TCP SPT=1022 DPT=988 WINDOW=510 
RES=0x00 ACK FIN URGP=0

[11777.280724] [UFW BLOCK] IN=eth0 OUT= MAC= SRC= DST= LEN=64 
TOS=0x00 PREC=0x00 TTL=64 ID=44206 DF PROTO=TCP SPT=988 DPT=1023 WINDOW=509 
RES=0x00 ACK URGP=0

Previously, we were only allowing port 988 bidirectionally on BOTH clients and 
servers. This was based on guidance from the Lustre manual. From the above 
messages it appears we may need to expand that range. This thread discusses it:
https://www.mail-archive.com/lustre-discuss@lists.lustre.org/msg17229.html

Based on that thread and some code reading, it appears that without explicit 
configuration of conns_per_peer, the number of extra ports potentially required 
is autotuned based on interface speed (ksocklnd_speed2cpp). E.g., if we have a 
node with a 50Gbps interface, we may need up to 3 ports open to accommodate the 
extra connections. These appear to be selected starting at 1023 and going down 
as far as 512.
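
Concretely, the two knobs involved look roughly like this (the subnet and 
values are illustrative, and neither has been tested here):

  # option 1: set the connection count explicitly instead of relying on the
  # speed-based autotuning (assuming conns_per_peer is exposed as a ksocklnd
  # module parameter on this release)
  options ksocklnd conns_per_peer=1

  # option 2: open the privileged source-port range that ksocklnd can use,
  # e.g. with UFW on the servers
  ufw allow proto tcp from 10.0.0.0/24 to any port 512:1023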

Questions:
1. If we do not open up more than port 988, are there known performance issues 
for machines at or below, say, 50Gbps? It does seem that with these ports 
closed we don't have correctness or visible performance problems, so there must 
be some fallback mechanism at play.
2. Can we just open 1023 to 1021 for a 50GigE machine? Or are there situations 
where binding might fail and the algorithm could potentially attempt to create 
sockets all the way down to 512?
3. Regardless of the answer to #2, do we need to open these ports on all client 
and server nodes, or can we get away with just server nodes?
4. Do these need to be opened just for egress from the node in question, or 
bidirectionally?

Thanks in advance!

Best,

ellis