Hi Arnuld

It is most important to keep the Slurm version the same across the board.
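
A quick way to spot a mismatch is to collect the version string from every machine and group the hosts by release. The following is only a minimal sketch: it assumes each host reports something resembling "slurm 23.11.4" (for example from `slurmd -V`), and the host names, the version strings and the helper names `release_of` and `group_by_release` are made up for illustration.

```python
import re
from collections import defaultdict


def release_of(version_output: str) -> str:
    """Extract the release (e.g. '23.11') from text such as 'slurm 23.11.4'."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version found in {version_output!r}")
    return f"{match.group(1)}.{match.group(2)}"


def group_by_release(reported: dict[str, str]) -> dict[str, list[str]]:
    """Map each release to the sorted list of hosts reporting it."""
    groups = defaultdict(list)
    for host, output in sorted(reported.items()):
        groups[release_of(output)].append(host)
    return dict(groups)


# Hypothetical inventory; in practice these strings would be gathered
# from each machine by whatever means you already use (ssh, pdsh, ...).
reported = {
    "controller": "slurm 23.11.4",
    "node01": "slurm 21.08.5",
    "node02": "slurm 23.11.4",
}

groups = group_by_release(reported)
if len(groups) > 1:
    print("Mismatched Slurm releases:", groups)
```

More than one key in the result means at least one machine is on a different release from the rest.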

Since you mention the "deb" package, I am assuming all of your nodes run
Debian-based distributions, which should be close enough to each other.
However, Debian-based distros are not as "binary compatible" with one
another as RHEL-based distros (say, RHEL, Alma, Rocky, CentOS, Oracle,
Fedora, etc.), so even though they all use "deb" packages, it would be
better to avoid sharing one deb across different distros.

If all of your distros ship similar versions of the dependencies (say, at
least at the glibc level) and differ only in how a package is named (e.g.
apache2 vs. httpd), you could potentially run the same Slurm build on the
other distros. In that case, you can work around the naming differences by
listing all of the potential names for each dependency as alternatives
(separated by "|") in the Depends field of DEBIAN/control.
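
To illustrate the Depends idea, here is a small sketch that assembles such a field from a list of candidate names per dependency. The package names and the glibc version constraint are only placeholders, not a verified dependency list for Slurm, and `depends_field` is a hypothetical helper, not part of any Debian tooling.

```python
def depends_field(alternatives: list[list[str]]) -> str:
    """Build a Depends line: names for one dependency are joined with ' | ',
    separate dependencies are joined with ', '."""
    clauses = [" | ".join(names) for names in alternatives if names]
    return "Depends: " + ", ".join(clauses)


# Placeholder names: each inner list holds the names one dependency
# might carry on different distros.
candidates = [
    ["libmunge2", "munge-libs"],
    ["libc6 (>= 2.31)"],
]

print(depends_field(candidates))
```

The package manager accepts any one name from a "|" group, so the same control file can be satisfied on distros that name the package differently.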

Static linking or a conda-like environment may help you more if the
distros differ enough that you would otherwise need a rebuild per distro.
Otherwise, it would probably make more sense to just build Slurm on each
node with only the features that node needs (say, ROCm or NVML support
makes no sense on a node without such devices).

A more complex setup does indeed require more maintenance work. I got
quite tired of it and decided to just use a RHEL-family OS on all
compute nodes and let those who are more familiar with some other distro
start one up with Singularity or Docker by themselves.

Sincerely,

S. Zhang

On Wed, May 22, 2024 at 17:11, Arnuld via slurm-users <slurm-users@lists.schedmd.com> wrote:

> We have several nodes, most of which have different Linux distributions
> (distro for short). Controller has a different distro as well. The only
> common thing between controller and all the nodes is that all of them are
> x86_64.
>
> I can install Slurm using package manager on all the machines but this
> will not work because controller will have a different version of Slurm
> compared to the nodes (21.08 vs 23.11)
>
> If I build from source then I see two solutions:
>  - build a deb package
>  - build a custom package (./configure, make, make install)
>
> Building a debian package on the controller and then distributing the
> binaries on nodes won't work either because that binary will start looking
> for the shared libraries that it was built for and those don't exist on the
> nodes.
>
> So the only solution I have is to build a static binary using a custom
> package. Am I correct or is there another solution here?
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>