I think it was this one:
https://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=878561880d2aba038db95e199f82b186f22daa45
On 07.03.2022 09.05, Hans Henrik Happe via lustre-discuss wrote:
Hi Thomas,
They should work together, but there are other requirements that need
to be fulfilled:
https://wiki.lustre.org/Lustre_2.12.8_Changelog
I guess your servers are CentOS 7.9 as required for 2.12.8.
I had an issue with Rocky 8.5 and the latest kernel with 2.12.8. While
RHEL 8.5 is supported, something changed in the kernels after
4.18.0-348.2.1.el8_5 that caused problems. I found an LU ticket fixing
it post-2.12.8 (can't remember the number), but downgrading to
4.18.0-348.2.1.el8_5 was the quick fix.
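To check whether a client is running a kernel newer than that one, something like the following might do. It is only a rough sketch: `release_key` and `is_newer_than_known_good` are made-up helper names, and it simply compares the numeric fields before `.el`. A newer kernel is not necessarily affected; this only flags candidates for the downgrade.

```python
import re

# Kernel release named in this thread as the last one that worked with 2.12.8.
KNOWN_GOOD = "4.18.0-348.2.1.el8_5"

def release_key(release):
    """Turn '4.18.0-348.2.1.el8_5' into (4, 18, 0, 348, 2, 1).

    Hypothetical helper: keeps only the numeric fields before '.el'.
    """
    head = release.split(".el")[0]
    return tuple(int(n) for n in re.findall(r"\d+", head))

def is_newer_than_known_good(release, known_good=KNOWN_GOOD):
    """True if 'release' sorts after the known-good kernel release."""
    return release_key(release) > release_key(known_good)
```

On a client, `is_newer_than_known_good(platform.release())` (after `import platform`) would apply the check to the running kernel.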
Cheers,
Hans Henrik
On 03.03.2022 08.40, Thomas Roth via lustre-discuss wrote:
Dear all,
this might be just something I forgot or did not read thoroughly, but
shouldn't a 2.12.7 client work with 2.12.8 servers?
The 2.12.8 changelog has the standard disclaimer:
Interoperability Support:
Clients & Servers: Latest 2.10.X and Latest 2.11.X
I have this test cluster that I upgraded recently to 2.12.8 on the
servers.
The first client I attached is a fresh install of rhel 8.5 (Alma).
I installed 'kmod-lustre-client' and 'lustre-client' from
https://downloads.whamcloud.com/public/lustre/lustre-2.12.8/el8.5.2111/
I copied a directory containing ~5000 files - no visible issues.
The next client was also installed with rhel 8.5 (Alma), but now
using 'lustre-client-2.12.7-1' and 'lustre-client-dkms-2.12.7-1' from
https://downloads.whamcloud.com/public/lustre/lustre-2.12.7/el8/client/RPMS/x86_64/
As on my first client, I copied a directory containing ~5000 files.
The copy stalled, and the OSTs exploded in my face:
kernel: LustreError: 23345:0:(events.c:310:request_in_callback()) event type 2, status -103, service ost_io
kernel: LustreError: 40265:0:(pack_generic.c:605:__lustre_unpack_msg()) message length 0 too small for magic/version check
kernel: LustreError: 40265:0:(sec.c:2217:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.20.2.167@o2ib6 x1726208297906176
kernel: LustreError: 23345:0:(events.c:310:request_in_callback()) event type 2, status -103, service ost_io
The latter message is repeated ad infinitum.
The client log blames the network:
Request sent has failed due to network error
Connection to was lost; in progress operations using this service will wait for recovery to complete
LustreError: 181316:0:(events.c:205:client_bulk_callback()) event type 1, status -103, desc 0000000086e248d6
LustreError: 181315:0:(events.c:205:client_bulk_callback()) event type 1, status -5, desc 00000000e569130f
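For reference, the status values in these messages are negated Linux errno codes. A minimal sketch for decoding them follows; the table is hard-coded with the Linux values so it does not depend on the machine it runs on, and `decode_status` is just an illustrative name:

```python
# Linux errno numbers for the two status codes seen in the logs above.
# Hard-coded on purpose: errno numbering differs between operating systems.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    103: ("ECONNABORTED", "Software caused connection abort"),
}

def decode_status(status):
    """Map a negative kernel status such as -103 to its errno name."""
    name, text = LINUX_ERRNO.get(abs(status), ("UNKNOWN", "not in this table"))
    return f"{status}: {name} ({text})"

for status in (-103, -5):
    print(decode_status(status))
```

So both server and client report aborted connections (-103), and the client additionally sees a generic I/O error (-5).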
There is also a client running Debian 9 and Lustre 2.12.6 (compiled
from git) - no trouble at all.
Then I switched those two rhel8.5 clients: reinstalled the OS, gave
the first one the 2.12.7 packages and the second one the 2.12.8
packages, and the error followed: again the client running with
'lustre-client-dkms-2.12.7-1' immediately ran into trouble, causing
the same error messages in the logs.
So this is not a network problem in the sense of broken hardware etc.
What did I miss?
Some important Jira I did not read?
Regards
Thomas
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org