There's also "lctl --device <dev> activate", which I've used in the past, though I don't know what conditions have to be met for it to work.
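
In case it helps, here is a rough sketch of how the device number for the inactive OST could be picked out on the affected client. The helper name inactive_oscs is made up for illustration, and it assumes "lctl dl" prints index, state, type, name, UUID and refcount in that order, with a state other than UP on the inactive OSC. Please check that against your own output before relying on it.

```shell
# Sketch only: print "<index> <name>" for every OSC device that "lctl dl"
# reports in a state other than UP.
# Assumed column layout: index, state, type, name, UUID, refcount.
inactive_oscs() {
    awk '$3 == "osc" && $2 != "UP" { print $1, $4 }'
}

# On the affected client (as root), roughly:
#   lctl dl | inactive_oscs
#   lctl --device <dev> recover    # trigger a reconnect (Andreas's suggestion)
#   lctl --device <dev> activate   # the command mentioned above
```

Whether recover or activate actually brings the OST back in this routed setup is untested here; the filter only saves reading through the full device list by eye.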

On 8/27/24 07:46, Andreas Dilger via lustre-discuss wrote:
Hi Jan,
There is "lctl --device XXXX recover", which will trigger a reconnect to the named OST device (as listed in the "lctl dl" output), but I'm not sure whether that will help.


Cheers, Andreas

On Aug 22, 2024, at 06:36, Haarst, Jan van via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:



Hi,

The subject line probably doesn't quite cover the issue. What we see is this:

We have a client behind a router (linking tcp to Omnipath) that shows an inactive OST (all on 2.15.5).

Other clients that go through the router do not have this issue.

One client had the same issue, although it showed a different OST as inactive.

After a reboot, all was well again on that machine.

The clients can lctl ping the OSSs.

So although we have a workaround (reboot the client), it would be nice to:

 1. Fix the issue without a reboot
 2. Fix the underlying issue.

It might be unrelated, but we also see another routing issue every now and then:

The router stops routing requests toward a certain OSS, and this can be fixed by deleting the peer_nid of the OSS from the router.

I am probably missing informative logs, but I’m more than happy to try to generate them, if somebody has a pointer to how.

We are a bit stumped right now.

With kind regards,

--

Jan van Haarst

HPC Administrator

For Anunna/HPC questions, please use https://support.wur.nl (with HPC as service)

Available: Monday, Tuesday, Thursday & Friday

Facilities and Services, part of Wageningen University & Research

Information Technology Department

P.O. Box 59, 6700 AB, Wageningen

Building 116, Akkermaalsbos 12, 6700 WB, Wageningen

http://www.wur.nl/nl/Disclaimer.htm

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org