I am having a serious problem that I am finding hard to diagnose.

I am running Ubuntu 14.04.1 LTS Trusty Tahr with LTSP 5.5.1-1ubuntu2.
Both the server and client images are at the latest level and are kept
up to date.

Occasionally, when one client is powered up and negotiates a DHCP
address, all other machines go blank, or show error messages
indicating a SQUASHFS error (unable to connect).  This happens fairly
randomly and, annoyingly, I can't reproduce it.

Looking at /var/log/syslog, I think the following messages relate to
when the problem occurs:

Mar 19 11:02:39 BFoE01 dhcpd: DHCPACK on 192.168.0.26 to 00:1e:4f:4d:72:4a via eth0
Mar 19 11:02:39 BFoE01 nbd_server[2944]: Spawned a child process
Mar 19 11:02:39 BFoE01 nbd_server[28560]: virststyle ipliteral
Mar 19 11:02:39 BFoE01 nbd_server[28560]: connect from 192.168.0.26, assigned file is /opt/ltsp/images/i386.img
Mar 19 11:02:39 BFoE01 nbd_server[28560]: Can't open authorization file /etc/ltsp/nbd-server.allow (No such file or directory).
Mar 19 11:02:39 BFoE01 nbd_server[28560]: Starting to serve
Mar 19 11:02:39 BFoE01 nbd_server[28560]: Size of exported file/device is 503033856
Mar 19 11:02:44 BFoE01 ldminfod[28562]: connect from 192.168.0.26 (192.168.0.26)
Mar 19 11:02:45 BFoE01 nbd_server[28567]: virststyle ipliteral
Mar 19 11:02:45 BFoE01 nbd_server[28567]: connect from 192.168.0.26, assigned file is /opt/ltsp/images/i386.img
Mar 19 11:02:45 BFoE01 nbd_server[28567]: Can't open authorization file /etc/ltsp/nbd-server.allow (No such file or directory).
Mar 19 11:02:45 BFoE01 nbd_server[28567]: Starting to serve
Mar 19 11:02:45 BFoE01 nbd_server[28567]: Size of exported file/device is 503033856
Mar 19 11:02:45 BFoE01 nbd_server[28567]: Disconnect request received.
Mar 19 11:02:45 BFoE01 nbd_server[2944]: Spawned a child process
Mar 19 11:02:45 BFoE01 nbd_server[2944]: Child exited with 0
Mar 19 11:03:06 BFoE01 nbd_server[2944]: Spawned a child process
Mar 19 11:03:06 BFoE01 nbd_server[28569]: Negotiation failed/5a: magic mismatch
Mar 19 11:03:06 BFoE01 nbd_server[28569]: Exiting.
Mar 19 11:03:06 BFoE01 nbd_server[28569]: Modern initial negotiation failed
Mar 19 11:03:06 BFoE01 nbd_server[2944]: Child exited with 1

The following two error messages only seem to occur when the problem
happens:

Mar 19 11:03:06 BFoE01 nbd_server[28569]: Negotiation failed/5a: magic mismatch
Mar 19 11:03:06 BFoE01 nbd_server[28569]: Modern initial negotiation failed
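
In case it helps anyone correlate the failures with client boots, here
is a rough sketch of how the log could be scanned for each "magic
mismatch" line together with any DHCPACK seen shortly before it. The
function names, the 60-second window and the placeholder year are my
own assumptions, not anything LTSP or nbd-server provides, and the
regular expressions are only matched against the line formats shown
above.

```python
import re
from datetime import datetime, timedelta

# Syslog prefix as seen in the excerpt: "Mar 19 11:03:06 BFoE01 prog[pid]: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+([\w-]+)(?:\[\d+\])?:\s+(.*)$"
)
DHCPACK_RE = re.compile(r"DHCPACK on (\S+) to (\S+)")


def parse_time(stamp, year=2000):
    # Syslog timestamps carry no year; 2000 is an arbitrary placeholder
    # (a leap year, so that "Feb 29" would still parse).
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def find_failures(lines, window_seconds=60):
    """Return a list of (failure_time, client_ips) tuples.

    client_ips holds the addresses that received a DHCPACK within
    window_seconds before each nbd_server "magic mismatch" message.
    """
    acks = []      # (time, ip) for every DHCPACK seen so far
    results = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like the excerpt
        when = parse_time(m.group(1))
        prog, msg = m.group(2), m.group(3)
        if prog == "dhcpd":
            ack = DHCPACK_RE.search(msg)
            if ack:
                acks.append((when, ack.group(1)))
        elif prog == "nbd_server" and "magic mismatch" in msg:
            cutoff = when - timedelta(seconds=window_seconds)
            recent = sorted({ip for t, ip in acks if cutoff <= t <= when})
            results.append((when, recent))
    return results
```

It could be run as find_failures(open("/var/log/syslog")) and the
result printed; if the same pattern of a DHCPACK followed by a failed
negotiation shows up every time, that would at least confirm the
correlation I suspect.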

The client that triggered the problem (192.168.0.26 in this case) seems
to renegotiate with the server, and its boot process is successful.
The other machines need to be rebooted, and until they are I get
numerous messages in the log like this:

  Mar 19 11:03:18 BFoE01 ldminfod[28570]: connect from 192.168.0.22 (192.168.0.22)

These keep occurring until 192.168.0.22 is rebooted.

I am thinking of switching from NBD to NFS, but would prefer to fix
this as is.

Charles Barnwell

_____________________________________________________________________
Ltsp-discuss mailing list.   To un-subscribe, or change prefs, goto:
      https://lists.sourceforge.net/lists/listinfo/ltsp-discuss
For additional LTSP help,   try #ltsp channel on irc.freenode.net