Re: [lustre-discuss] OSS crashes - could be LU-14341

2021-03-02 Thread Peter Jones via lustre-discuss
Yes, 2.12.6 supports CentOS 7.9 - https://wiki.whamcloud.com/display/PUB/Lustre+Support+Matrix
From: lustre-discuss on behalf of Sid Young via lustre-discuss
Reply-To: Sid Young
Date: Tuesday, March 2, 2021 at 6:48 PM
To: lustre-discuss
Subject: [lustre-discuss] OSS crashes - could be LU-1434

[lustre-discuss] OSS node crash/high CPU latency when deleting 100's of empty test files

2021-03-02 Thread Sid Young via lustre-discuss
Thanks Karsten, looks like I found it at the same time you posted... I will have a go at re-imaging with 1160.6.1 (the build updates to 1160.15.2) and re-testing. Do you know if 2.14 will be released for CentOS 7.9?
Sid
Hi Sid, if you are using a CentOS 7.9 kernel newer than 3.10.0-1160.6.1.el7.

[lustre-discuss] OSS crashes - could be LU-14341

2021-03-02 Thread Sid Young via lustre-discuss
G'Day all, Is 2.12.6 supported on CentOS 7.9? After more investigation, I believe this is the issue I am seeing: https://jira.whamcloud.com/browse/LU-14341 If there is a patch release built for 7.9, I am really happy to test it, as it's easy to reproduce and crash the OSSs.
Sid Young

[lustre-discuss] OSS Nodes crashing (and an MDS crash as well)

2021-03-02 Thread Sid Young via lustre-discuss
G'Day all, As I reported in a previous email, my OSS nodes crash soon after initiating a file creation script using "dd" in a loop and then trying to delete all the files at once. At first I thought it was related to the Mellanox 100G cards but after rebuilding everything using just the 10G network

Re: [lustre-discuss] OSS node crash/high CPU latency when deleting 100's of empty test files

2021-03-02 Thread Weiss, Karsten via lustre-discuss
Hi Sid, if you are using a CentOS 7.9 kernel newer than 3.10.0-1160.6.1.el7.x86_64, then check out LU-14341, as these kernel versions cause a timer-related regression: https://jira.whamcloud.com/browse/LU-14341 We learnt this the hard way during the last couple of days and downgraded to kernel-3
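
For anyone wanting to check a node quickly, here is a minimal sketch of the comparison described above: is a given kernel release newer than 3.10.0-1160.6.1.el7.x86_64? The helper names `release_key` and `is_newer_than_last_good` are made up for illustration, and the parsing assumes CentOS-style release strings of the form `3.10.0-1160.X.Y.el7.x86_64`. It only compares version numbers; it does not detect the LU-14341 regression itself.

```python
import re

# Threshold taken from the post above: kernels newer than this one are the
# ones reported as affected on CentOS 7.9.
LAST_GOOD = "3.10.0-1160.6.1.el7.x86_64"


def release_key(release: str) -> tuple:
    """Return the numeric fields of a kernel release string as a tuple of ints.

    Assumes a CentOS-style string; everything from ".el" onwards is ignored.
    """
    numeric = release.split(".el", 1)[0]  # e.g. "3.10.0-1160.6.1"
    return tuple(int(part) for part in re.split(r"[.-]", numeric) if part.isdigit())


def is_newer_than_last_good(release: str) -> bool:
    """True if `release` sorts after LAST_GOOD by its numeric fields."""
    return release_key(release) > release_key(LAST_GOOD)
```

On a live node the running release can be obtained with `platform.release()` from the Python standard library and passed to `is_newer_than_last_good`; for example, the 1160.15.2 build mentioned elsewhere in this thread is flagged as newer than the threshold.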

[lustre-discuss] Stray files after failed lfs_migrate

2021-03-02 Thread Angelos Ching via lustre-discuss
Dear all, I was dealing with some OST migration using lfs_migrate and things went mostly fine, except for a few files that might have been in use during the migration:
# ls
ls: cannot access ibleTHWm: No such file or directory
ls: cannot access ib7rP0qy: No such file or directory
ls: cannot a