Hi all,

I hit the same issue and pushed a fix for it. See:

https://jira.whamcloud.com/browse/LU-19820
https://review.whamcloud.com/c/fs/lustre-release/+/63536

Aurélien

________________________________
From: lustre-discuss <[email protected]> on behalf of Mark Dixon via lustre-discuss <[email protected]>
Sent: Thursday, January 22, 2026 10:33
To: Christopher J Orr <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [lustre-discuss] DKMS build broken with NVIDIA doca packages

Hi Christopher,

We previously used a similar approach but, with the (very welcome!!) move
to using DKMS on EL, DOCA now supports multiple kernels at the same time
and so maintains a per-kernel ofa source directory - a blanket default is
no longer appropriate.

In fact, on one of my test hosts /usr/src/ofa_kernel/default ended up
becoming a dangling link. Not sure if that was a bug, or if DOCA has given
up on it.

Unless Jon gets there first, I'll get a ticket opened when I get to it.

Best,

Mark

On Wed, 21 Jan 2026, Christopher J Orr wrote:

> This is how I ended up fixing it on Lustre 2.14.0_ddn191 on Rocky 9.7
> with DOCA-OFED.
>
> ------------------------------------------------------------------
> --- lustre-dkms_pre-build.sh.orig 2026-01-06 16:55:25.428285300 -0500
> +++ lustre-dkms_pre-build.sh 2026-01-06 18:00:28.357307490 -0500
> @@ -9,8 +9,9 @@
>
>  case $1 in
>      lustre-client)
> +        [ -f /etc/sysconfig/lustre ] && . /etc/sysconfig/lustre
>          SERVER="--disable-server"
> -        KERNEL_STUFF=""
> +        KERNEL_STUFF="${KERNEL_STUFF:-}"
>          ;;
>
>      lustre-zfs|lustre-all)
> ------------------------------------------------------------------
>
> ...and then, add
> KERNEL_STUFF="--with-o2ib=/usr/src/ofa_kernel/default/"
> ...to /etc/sysconfig/lustre
>
> I hope this helps!
> Thanks,
> Christopher Orr
>
>
> On Wed, 2026-01-21 at 16:16 +0000, Patrick Farrell via lustre-discuss
> wrote:
>>
>> Folks, if you want to create a JIRA ticket, you can ask for an
>> account. We're very happy to get contributions.
>>
>> Regards,
>> Patrick
>>
>>
>> From: lustre-discuss <[email protected]> on
>> behalf of Jon Marshall via lustre-discuss
>> <[email protected]>
>> Sent: Wednesday, January 21, 2026 9:36 AM
>> To: Mark Dixon <[email protected]>
>> Cc: [email protected] <[email protected]>
>> Subject: Re: [lustre-discuss] DKMS build broken with NVIDIA doca
>> packages
>>
>> Hi Mark,
>>
>> Thanks for confirming I'm not on my own - I've not got any further,
>> other than starting to look at creating a dummy RPM package that fits
>> the criteria Lustre is looking for! That or using a very clunky
>> wrapper script around rpm itself to lie to the configure script. I
>> actually have got this second approach working, so there is nothing
>> wrong with building against the doca packages, but it's a bit annoying
>> to automate the build process for our servers like this.
>>
>> I've not got access to create a Jira ticket myself either.
>>
>> Cheers
>> Jon
>>
>>
>> From: Mark Dixon <[email protected]>
>> Sent: Wednesday, January 21, 2026 12:23
>> To: Jon Marshall <[email protected]>
>> Cc: [email protected] <[email protected]>
>> Subject: Re: [lustre-discuss] DKMS build broken with NVIDIA doca
>> packages
>>
>> Hi Jon,
>>
>> As it happens, I've been looking at the same thing. I hadn't spotted
>> LU-18002 (thanks), but unfortunately it isn't enough to accommodate
>> the move to dkms on rhel.
>>
>> I don't know how far you've got since Monday, but there now seems a
>> need for an explicit check of /usr/src/ofa_kernel (as it's no longer
>> owned by a package) and the "find" for rdma_cm.h needs the -L flag to
>> make sense of the new maze of twisty passages.
>>
>> I think that a new jira ticket needs to be opened...
>>
>> Cheers,
>>
>> Mark
>>
>>
>> On Mon, 19 Jan 2026, Jon Marshall via lustre-discuss wrote:
>>
>>> Hi,
>>>
>>> I'm in the process of rebuilding lustre on Rocky 8.10 and have
>>> noticed that NVIDIA have been messing around with their packages
>>> again, now rebranding everything under the doca label. For LTS
>>> purposes we're sticking with 2.15.8 for lustre, and I'm trying to
>>> get this to build with NVIDIA DOCA 3.2.1 LTS.
>>>
>>> The trouble is, it seems they have renamed the package
>>> mlnx-ofa_kernel-devel to mlnx-ofa_kernel-dkms. Looking at the DKMS
>>> configure script, it is searching for:
>>> O2IBPKG="mlnx-ofed-kernel-dkms"
>>> O2IBPKG+="|mlnx-ofed-kernel-modules"
>>> O2IBPKG+="|mlnx-ofa_kernel-devel"
>>> O2IBPKG+="|compat-rdma-devel"
>>> O2IBPKG+="|kernel-ib-devel"
>>> O2IBPKG+="|ofa_kernel-devel"
>>>
>>> And hence it can't find the package (underscore instead of hyphen),
>>> which causes the build to fail.
>>>
>>> Digging around the JIRA, I found this issue
>>> <https://jira.whamcloud.com/browse/LU-18002>, but it looks to only
>>> have been fixed in 2.16, which we've sort of ruled out at this
>>> stage. Looking at the actual patch
>>> <https://review.whamcloud.com/c/fs/lustre-release/+/55625/4/lnet/autoconf/lustre-lnet.m4>,
>>> it seems pretty minor and I was wondering if this could be
>>> backported to 2.15 as well.
>>>
>>> I can work around by building things myself, but I was hoping to be
>>> able to yum install the packages direct from the whamcloud repos,
>>> as this greatly simplifies my rollout.
>>>
>>> Cheers
>>> Jon
>>>
>>>
>>> Jon Marshall
>>> High Performance Computing Specialist
>>> IT and Scientific Computing Team
>>> Cancer Research UK Cambridge Institute
>>> Li Ka Shing Centre | Robinson Way | Cambridge | CB2 0RE
>>> Web <http://www.cruk.cam.ac.uk/>

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
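
For anyone following along, here is a rough sketch of the two symptoms discussed in the thread. The O2IBPKG alternation is copied from the snippet Jon quoted. Everything else is an assumption made for illustration: the helper names o2ib_pkg_known and find_rdma_cm, the whole-name grep match (the real configure script may apply the pattern differently), and the throwaway directory layout with its "6.x" kernel directory. It only shows why mlnx-ofa_kernel-dkms is not picked up by that list, and why a plain find does not descend through a symlinked source directory while find -L does.

```shell
#!/bin/sh
# Sketch only: the alternation below is copied from the configure snippet
# quoted in the thread; how the real script applies it is an assumption.
O2IBPKG="mlnx-ofed-kernel-dkms"
O2IBPKG="$O2IBPKG|mlnx-ofed-kernel-modules"
O2IBPKG="$O2IBPKG|mlnx-ofa_kernel-devel"
O2IBPKG="$O2IBPKG|compat-rdma-devel"
O2IBPKG="$O2IBPKG|kernel-ib-devel"
O2IBPKG="$O2IBPKG|ofa_kernel-devel"

# Hypothetical helper: succeed only if the whole name equals one alternative.
o2ib_pkg_known() {
    printf '%s\n' "$1" | grep -Eqx "($O2IBPKG)"
}

# Hypothetical helper: list rdma_cm.h under a tree. Pass -L as the first
# argument to follow symlinks, as suggested in the thread.
find_rdma_cm() {
    if [ "$1" = "-L" ]; then
        find -L "$2" -name rdma_cm.h
    else
        find "$1" -name rdma_cm.h
    fi
}

# Compare the old package name with the new DKMS one.
for pkg in mlnx-ofa_kernel-devel mlnx-ofa_kernel-dkms; do
    if o2ib_pkg_known "$pkg"; then
        echo "$pkg: in list"
    else
        echo "$pkg: not in list"
    fi
done

# Throwaway layout imitating a source tree that is only reachable through
# a symlink (the directory names here are made up for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/src/6.x/include/rdma" "$demo/ofa_kernel"
: > "$demo/src/6.x/include/rdma/rdma_cm.h"
ln -s "$demo/src/6.x" "$demo/ofa_kernel/default"

echo "without -L: $(find_rdma_cm "$demo/ofa_kernel" | wc -l) match(es)"
echo "with -L:    $(find_rdma_cm -L "$demo/ofa_kernel" | wc -l) match(es)"
rm -rf "$demo"
```

This is not the fix that was pushed to LU-19820; see the review link at the top of the thread for the actual change.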
