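Are you sure the fabric is up when LNet starts at boot? The "Network is down" error at boot, together with the manual import succeeding later, suggests a timing problem. Double-check the order in which your services start and make sure LNet waits for the fabric/network before it starts.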
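One way to express that ordering is a systemd drop-in for lnet.service. This is only a sketch under assumptions: the drop-in file name is made up, the commented-out rdma.service line assumes an InfiniBand fabric brought up by that unit (your fabric service may be named differently), and the thread does not show whether the shipped lnet.service already declares any ordering (`systemctl cat lnet.service` will show what it has).

```ini
# /etc/systemd/system/lnet.service.d/10-wait-for-network.conf
# (hypothetical file name; any *.conf in this directory is read)
[Unit]
# Pull in network-online.target and start lnet.service only after it is reached.
Wants=network-online.target
After=network-online.target
# Assumption: if the fabric is InfiniBand and is brought up by rdma.service,
# order after that unit as well. Adjust to whatever unit starts your fabric.
#After=rdma.service
```

After adding the file, run `systemctl daemon-reload` and reboot to test. Note that network-online.target only delays startup if a wait-online service is enabled for your network stack (for example NetworkManager-wait-online.service when NetworkManager manages the interfaces).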
Thanks,
Keith

> -----Original Message-----
> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of David Rackley
> Sent: Monday, August 13, 2018 2:14 PM
> To: lustre-discuss@lists.lustre.org
> Cc: sciops <sci...@jlab.org>
> Subject: [lustre-discuss] lnet fails to start on reboot
>
> Hello,
>
> I have built and installed lustre client 2.10.4-1 with centos 7.3 (3.10.0-514.el7.x86_64) and on reboot lnet fails with:
>
> root@scissd1801:~] systemctl status lnet.service
> ● lnet.service - lnet management
>    Loaded: loaded (/usr/lib/systemd/system/lnet.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Mon 2018-08-13 16:54:31 EDT; 16min ago
>   Process: 2334 ExecStart=/usr/sbin/lnetctl import /etc/lnet.conf (code=exited, status=254)
>   Process: 2331 ExecStart=/usr/sbin/lnetctl lnet configure (code=exited, status=0/SUCCESS)
>   Process: 2071 ExecStart=/usr/sbin/modprobe lnet (code=exited, status=0/SUCCESS)
>  Main PID: 2334 (code=exited, status=254)
>
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - net:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: -100
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "cannot add network: Network is down"
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: - numa_range:
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: errno: 0
> Aug 13 16:54:31 scissd1801 lnetctl[2334]: descr: "success"
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service: main process exited, code=exited, status=254/n/a
> Aug 13 16:54:31 scissd1801 systemd[1]: Failed to start lnet management.
> Aug 13 16:54:31 scissd1801 systemd[1]: Unit lnet.service entered failed state.
> Aug 13 16:54:31 scissd1801 systemd[1]: lnet.service failed.
>
> The /etc/lnet.conf file exists and when I manually execute /usr/sbin/lnetctl import /etc/lnet.conf it succeeds and lnet works and I can mount lustre as expected.
>
> Any ideas?
>
> David Rackley
> CC Sci Comp Sys Admin
> rack...@jlab.org
> Phone: 757.269.7041
> FAX: 757.269.6248
> TJNAF - Thomas Jefferson National Accelerator Facility
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org