Hi Christopher,

We previously used a similar approach but, with the (very welcome!!) move to using DKMS on EL, DOCA now supports multiple kernels at the same time and so maintains a per-kernel ofa source directory - a blanket default is no longer appropriate.

In fact, on one of my test hosts /usr/src/ofa_kernel/default ended up becoming a dangling link. Not sure if that was a bug, or if DOCA has given up on it.
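
For anyone wanting to check their own hosts, a minimal sketch of a dangling-link test follows. The helper name is_dangling_link is made up for illustration and is not part of Lustre or DOCA; only the path comes from this thread.

```shell
#!/bin/sh
# is_dangling_link is a hypothetical helper, not part of Lustre or DOCA.
# It succeeds when the path is a symbolic link whose target is missing.
is_dangling_link() {
    # -L is true for any symlink; -e follows the link and fails when the
    # target does not exist, so together they identify a dangling link.
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Check the path mentioned above; prints nothing if it is fine or absent.
if is_dangling_link /usr/src/ofa_kernel/default; then
    echo "dangling: /usr/src/ofa_kernel/default"
fi
```

A regular directory, or a link that still resolves, makes the helper return non-zero, so it can sit in front of any step that assumes the link is usable.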

Unless Jon gets there first, I'll get a ticket opened when I get to it.

Best,

Mark

On Wed, 21 Jan 2026, Christopher J Orr wrote:

This is how I ended up fixing it on Lustre 2.14.0_ddn191 on Rocky 9.7
with DOCA-OFED.

------------------------------------------------------------------
--- lustre-dkms_pre-build.sh.orig       2026-01-06 16:55:25.428285300 -0500
+++ lustre-dkms_pre-build.sh    2026-01-06 18:00:28.357307490 -0500
@@ -9,8 +9,9 @@

case $1 in
    lustre-client)
+       [ -f /etc/sysconfig/lustre ] && . /etc/sysconfig/lustre
       SERVER="--disable-server"
-       KERNEL_STUFF=""
+       KERNEL_STUFF="${KERNEL_STUFF:-}"
       ;;

    lustre-zfs|lustre-all)
------------------------------------------------------------------

...and then, add
KERNEL_STUFF="--with-o2ib=/usr/src/ofa_kernel/default/"
...to /etc/sysconfig/lustre
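
A guarded variant of that setting might look like the sketch below. It assumes /etc/sysconfig/lustre is sourced as shell, as in the patch above; OFA_DIR and o2ib_option are illustrative names, not existing Lustre variables or functions.

```shell
# Sketch for /etc/sysconfig/lustre. OFA_DIR and o2ib_option are
# illustrative names, not existing Lustre variables or functions.
o2ib_option() {
    # -d follows symlinks, so a missing directory or a dangling link
    # produces no option at all instead of a broken path.
    if [ -d "$1" ]; then
        printf '%s\n' "--with-o2ib=$1"
    fi
}

# Default to the path used above; override OFA_DIR to point elsewhere.
OFA_DIR="${OFA_DIR:-/usr/src/ofa_kernel/default}"
KERNEL_STUFF="$(o2ib_option "$OFA_DIR")"
```

If the directory does not resolve, KERNEL_STUFF ends up empty, which matches what the unpatched script sets.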

I hope this helps!
Thanks,
Christopher Orr


On Wed, 2026-01-21 at 16:16 +0000, Patrick Farrell via lustre-discuss
wrote:

Folks, if you want to create a JIRA ticket, you can ask for an
account.  We're very happy to get contributions.


Regards,
Patrick


From: lustre-discuss <[email protected]> on
behalf of Jon Marshall via lustre-discuss
<[email protected]>
Sent: Wednesday, January 21, 2026 9:36 AM
To: Mark Dixon <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [lustre-discuss] DKMS build broken with NVIDIA doca
packages






Hi Mark,


Thanks for confirming I'm not on my own - I've not got any further,
other than starting to look at creating a dummy RPM package that fits
the criteria Lustre is looking for! That, or using a very clunky
wrapper script around rpm itself to lie to the configure script. I
actually have got this second approach working, so there is nothing
wrong with building against the doca packages, but it's a bit annoying
to automate the build process for our servers like this.


I've not got access to create a Jira ticket myself either.


Cheers
Jon


From: Mark Dixon <[email protected]>
Sent: Wednesday, January 21, 2026 12:23
To: Jon Marshall <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [lustre-discuss] DKMS build broken with NVIDIA doca
packages




Hi Jon,

As it happens, I've been looking at the same thing. I hadn't spotted
LU-18002 (thanks), but unfortunately it isn't enough to accommodate the
move to dkms on rhel.

I don't know how far you've got since Monday, but there now seems a need
for an explicit check of /usr/src/ofa_kernel (as it's no longer owned by
a package) and the "find" for rdma_cm.h needs the -L flag to make sense
of the new maze of twisty passages.
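
To illustrate the two points, here is a rough sketch of a directory check followed by a symlink-following search. The helper find_rdma_cm is hypothetical and this is not the code in the Lustre configure script; only the path, the header name and the -L flag come from this thread.

```shell
#!/bin/sh
# find_rdma_cm is a hypothetical helper; the search root is its argument.
find_rdma_cm() {
    # -L makes find follow symbolic links while descending, so a header
    # reachable only through a linked directory is still reported.
    find -L "$1" -name rdma_cm.h 2>/dev/null
}

# Test for the directory itself rather than asking rpm who owns it.
if [ -d /usr/src/ofa_kernel ]; then
    find_rdma_cm /usr/src/ofa_kernel
fi
```

Without -L, find does not descend into directories that are symbolic links, so a tree made up mostly of links can appear to contain no headers at all.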

I think that a new jira ticket needs to be opened...

Cheers,

Mark


On Mon, 19 Jan 2026, Jon Marshall via lustre-discuss wrote:

Hi,

I'm in the process of rebuilding lustre on Rocky 8.10 and have
noticed that NVIDIA have been messing around with their packages
again, now rebranding everything under the doca label. For LTS
purposes we're sticking with 2.15.8 for lustre, and I'm trying to
get this to build with NVIDIA DOCA 3.2.1 LTS.

The trouble is, it seems they have renamed the package
mlnx-ofa_kernel-devel to mlnx-ofa_kernel-dkms. Looking at the DKMS
configure script, it is searching for:
                        O2IBPKG="mlnx-ofed-kernel-dkms"
                        O2IBPKG+="|mlnx-ofed-kernel-modules"
                        O2IBPKG+="|mlnx-ofa_kernel-devel"
                        O2IBPKG+="|compat-rdma-devel"
                        O2IBPKG+="|kernel-ib-devel"
                        O2IBPKG+="|ofa_kernel-devel"

And hence it can't find the package (underscore instead of hyphen),
which causes the build to fail.
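
The mismatch can be reproduced outside the build with a small sketch. It assumes the list is used as an extended regular expression against installed package names; how the configure script actually applies it is not shown here, and matches_o2ib is a made-up helper.

```shell
#!/bin/sh
# The six names quoted above, joined into one extended regular expression.
O2IBPKG="mlnx-ofed-kernel-dkms|mlnx-ofed-kernel-modules|mlnx-ofa_kernel-devel|compat-rdma-devel|kernel-ib-devel|ofa_kernel-devel"

# matches_o2ib is a hypothetical helper: it succeeds when the package
# name in $1 matches the pattern in $2.
matches_o2ib() {
    printf '%s\n' "$1" | grep -Eq "$2"
}

# Compare the old package name with the new one.
for pkg in mlnx-ofa_kernel-devel mlnx-ofa_kernel-dkms; do
    if matches_o2ib "$pkg" "$O2IBPKG"; then
        echo "$pkg: matched"
    else
        echo "$pkg: not matched"
    fi
done
```

Only the -devel name matches. Appending "|mlnx-ofa_kernel-dkms" to the pattern makes the new name match as well.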

Digging around the JIRA, I found this
<https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fjira.whamcloud.com%2fbrowse%2fLU-18002%3fjql%3dtext%2520~%2520dkms%2520ORDER%2520BY%2520created%2520DESC&c=E,1,jSSRk0tXHMx8RQEMnGYEBCTdjBWE-7d4UZni7OYRCsspax3v09_1sRG4eF9iy77rKx5DppDWrhVsH9ZQ7lk_1OT3Wmb_XeUjWfNuEPbhpR8,&typo=1>
issue, but it looks to have been fixed only in 2.16, which we've sort
of ruled out at this stage. Looking at the actual patch
<https://linkprotect.cudasvc.com/url?a=https%3a%2f%2freview.whamcloud.com%2fc%2ffs%2flustre-release%2f%20%2f55625%2f4%2flnet%2fautoconf%2flustre-lnet.m4&c=E,1,Wi5eGkf0dY16u2VrGeX06tAPDP6YCLAJhfgPURLolu4ssfvLF8XiwPpqpixQifO1NdxtNZ5tpz8FAqP5gd419t_Yvuu_c-NzIAY1JvTjYeVLYQ,,&typo=1>,
it seems pretty minor and I was wondering if this could be backported
to 2.15 as well.

I can work around by building things myself, but I was hoping to be
able to yum install the packages direct from the whamcloud repos,
as this greatly simplifies my rollout.

Cheers
Jon


Jon Marshall

High Performance Computing Specialist



IT and Scientific Computing Team



Cancer Research UK Cambridge Institute

Li Ka Shing Centre | Robinson Way | Cambridge | CB2 0RE



_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
