Hello, I am having trouble installing the Nvidia driver on a compute
node using a custom script.

We are using xCAT 2.13.5 on CentOS 7.3.

We repackaged the Nvidia driver as an RPM, which installs fine when the
node is up.

But when we install it during a node re-image, it fails because two
different kernel versions are involved.

Below are more details. Does anyone have experience with the Nvidia
driver?

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
This RPM is installed during the deployment process, which runs on the
default CentOS 7.3 kernel (3.10.0-514.el7). The kernel is also updated
during the installation process (but *before* the Nvidia driver
installation).

Once the node deployment is finished, it reboots into the latest kernel
(3.10.0-514.26.2.el7), and the Nvidia driver fails to load. If I reboot
into the older kernel, it works.

So I'd like to know if there is an option to install the Nvidia driver
for a kernel other than the running one?

I have this error, if that helps:

Making nvidia.ko silently in
/opt/sgi/Factory-Install/nvidia/NVIDIA-Linux-x86_64-418.40.04/kernel
Module nvidia.ko from kernel 3.10.0-514.el7.x86_64 is not compatible
with kernel 3.10.0-514.26.2.el7.x86_64 in symbols:
acpi_bus_register_driver acpi_bus_get_device acpi_bus_unregister_driver
nvidia.ko:
/lib/modules/3.10.0-514.el7.x86_64/video/nvidia.ko

Curiously enough, if I re-install the same RPM by hand while running the
latest kernel, it works ... So I'm a bit lost here ...
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
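
For anyone trying to reproduce this, here is a minimal sketch of how the
mismatch could be confirmed on a node. It is not part of our deployment
script. The function names module_kernel and check_module are made up
for illustration, and the live check at the end assumes grubby is
available (it is skipped otherwise).

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Print the kernel version that a module path under /lib/modules belongs to.
module_kernel() {
    local path=${1#/lib/modules/}   # drop the leading directory
    printf '%s\n' "${path%%/*}"     # keep only the first remaining component
}

# Report whether a module was installed for the expected kernel.
check_module() {
    local module=$1 expected=$2 built_for
    built_for=$(module_kernel "$module")
    if [ "$built_for" = "$expected" ]; then
        echo "OK: $module matches $expected"
    else
        echo "MISMATCH: $module is for $built_for, expected $expected"
        return 1
    fi
}

# Live check (assumption: grubby is installed; skipped otherwise).
if command -v grubby >/dev/null 2>&1; then
    # grubby prints a path such as /boot/vmlinuz-<version>
    target=$(grubby --default-kernel 2>/dev/null | sed 's|.*/vmlinuz-||')
    echo "running kernel: $(uname -r)"
    echo "default kernel: $target"
    for ko in $(find /lib/modules -name 'nvidia.ko*' 2>/dev/null); do
        check_module "$ko" "$target" || true
    done
fi
```

With the path from the error above, check_module reports a mismatch
against 3.10.0-514.26.2.el7.x86_64, which matches what we see after the
reboot.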

Thanks.
-- 
Nicolas

_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user
