** Description changed:

+ [Impact]
+ When TFTP booting with UEFI, the TFTP server would stack trace when terminating the transfer. This would lead to some UEFI boot issues.
+
+ [Test Case]
+ 1. Install MAAS
+ 2. Setup UEFI on machine to PXE boot from MAAS
+ 3. UEFI boot machine, it will fail as tftp crashes.
+
+ 4. With fix, UEFI boot machine, it will succeed as tftp doesn't crash.
+
+ [Regression Potential]
+ Minimal. This has been tested and QA'd and proven to be working as expected.
+
  ubuntu 14.04LTS + MaaS 1.5 on x86_64
  Controller: esxi vm xeon + vmnet3/ixgbe
- Nodes:
+ Nodes:
  supermicro twinblades
  Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
  128GB RAM
  2@ ige
  2@ ixgbe <<< used for PXE booting

  Trying to add physical nodes configured for Trusty Tahr amd64. IPMI powerctl cycles the node, tftp's two boot files, then commissioning goes out to lunch:

  15:12:11.465976 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:25:90:e5:5a:56 (oui Unknown), length 359
  15:12:11.468982 IP 172.30.193.38.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
  15:12:11.475270 IP 172.30.255.101.1294 > 172.30.193.38.tftp: 41 RRQ "bootx64.efi" octet tsize 0 blksize 1468
  15:12:11.535326 IP 172.30.255.101.1295 > 172.30.193.38.tftp: 33 RRQ "bootx64.efi" octet blksize 1468
  15:12:12.024716 IP 172.30.255.101.1296 > 172.30.193.38.tftp: 33 RRQ "/grubx64.efi" octet blksize 512

  These tb's coincide with above traffic and node sitting at the grub prompt indefinitely:

  2014-05-08 15:12:11-0700 [-] Starting protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a098>
  2014-05-08 15:12:11-0700 [RemoteOriginReadSession (UDP)] Unhandled Error
- Traceback (most recent call last):
-   File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
-     return context.call({ILogContext: newCtx}, func, *args, **kw)
-   File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
-     return self.currentContext().callWithContext(ctx, func, *args, **kw)
-   File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
-     return func(*args,**kw)
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
-     why = selectable.doRead()
- --- <exception caught here> ---
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/udp.py", line 234, in doRead
-     self.protocol.datagramReceived(data, addr)
-   File "/usr/lib/python2.7/dist-packages/tftp/bootstrap.py", line 171, in datagramReceived
-     datagram = TFTPDatagramFactory(*split_opcode(datagram))
-   File "/usr/lib/python2.7/dist-packages/tftp/datagram.py", line 394, in __call__
-     return datagram_class.from_wire(payload)
-   File "/usr/lib/python2.7/dist-packages/tftp/datagram.py", line 323, in from_wire
-     raise InvalidErrorcodeError(errorcode)
- tftp.errors.InvalidErrorcodeError: Unknown error code: 8
-
+ Traceback (most recent call last):
+   File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 73, in callWithContext
+     return context.call({ILogContext: newCtx}, func, *args, **kw)
+   File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
+     return self.currentContext().callWithContext(ctx, func, *args, **kw)
+   File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
+     return func(*args,**kw)
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
+     why = selectable.doRead()
+ --- <exception caught here> ---
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/udp.py", line 234, in doRead
+     self.protocol.datagramReceived(data, addr)
+   File "/usr/lib/python2.7/dist-packages/tftp/bootstrap.py", line 171, in datagramReceived
+     datagram = TFTPDatagramFactory(*split_opcode(datagram))
+   File "/usr/lib/python2.7/dist-packages/tftp/datagram.py", line 394, in __call__
+     return datagram_class.from_wire(payload)
+   File "/usr/lib/python2.7/dist-packages/tftp/datagram.py", line 323, in from_wire
+     raise InvalidErrorcodeError(errorcode)
+ tftp.errors.InvalidErrorcodeError: Unknown error code: 8
+
  2014-05-08 15:12:11-0700 [RemoteOriginReadSession (UDP)] Logged OOPS id OOPS-20c0e9854c8b0ef29998d4a27454fc6a: InvalidErrorcodeError: Unknown error code: 8
  2014-05-08 15:12:11-0700 [TFTP (UDP)] Datagram received from ('172.30.255.101', 1295): <RRQDatagram(filename=bootx64.efi, mode=octet, options={'blksize': '1468'})>
  2014-05-08 15:12:11-0700 [TFTP (UDP)] Datagram received from ('172.30.255.101', 1295): <RRQDatagram(filename=bootx64.efi, mode=octet, options={'blksize': '1468'})>
  2014-05-08 15:12:11-0700 [-] RemoteOriginReadSession starting on 43143
  2014-05-08 15:12:11-0700 [-] RemoteOriginReadSession starting on 43143
  2014-05-08 15:12:11-0700 [-] Starting protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469aea8>
  2014-05-08 15:12:11-0700 [-] Starting protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469aea8>
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Final ACK received, transfer successful
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Final ACK received, transfer successful
  2014-05-08 15:12:12-0700 [-] (UDP Port 43143 Closed)
  2014-05-08 15:12:12-0700 [-] (UDP Port 43143 Closed)
  2014-05-08 15:12:12-0700 [-] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469aea8>
  2014-05-08 15:12:12-0700 [-] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469aea8>
  2014-05-08 15:12:12-0700 [TFTP (UDP)] Datagram received from ('172.30.255.101', 1296): <RRQDatagram(filename=/grubx64.efi, mode=octet, options={'blksize': '512'})>
  2014-05-08 15:12:12-0700 [TFTP (UDP)] Datagram received from ('172.30.255.101', 1296): <RRQDatagram(filename=/grubx64.efi, mode=octet, options={'blksize': '512'})>
  2014-05-08 15:12:12-0700 [-] RemoteOriginReadSession starting on 56400
  2014-05-08 15:12:12-0700 [-] RemoteOriginReadSession starting on 56400
  2014-05-08 15:12:12-0700 [-] Starting protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a440>
  2014-05-08 15:12:12-0700 [-] Starting protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a440>
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] (UDP Port 41252 Closed)
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] (UDP Port 41252 Closed)
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a098>
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a098>
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Final ACK received, transfer successful
  2014-05-08 15:12:12-0700 [RemoteOriginReadSession (UDP)] Final ACK received, transfer successful
  2014-05-08 15:12:12-0700 [-] (UDP Port 56400 Closed)
  2014-05-08 15:12:12-0700 [-] (UDP Port 56400 Closed)
  2014-05-08 15:12:12-0700 [-] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a440>
  2014-05-08 15:12:12-0700 [-] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7fbbe469a440>
  2014-05-08 15:12:13-0700 [-] Unhandled Error
- Traceback (most recent call last):
-   File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 392, in startReactor
-     self.config, oldstdout, oldstderr, self.profiler, reactor)
-   File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 313, in runReactorWithLogging
-     reactor.run()
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1192, in run
-     self.mainLoop()
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
-     self.runUntilCurrent()
- --- <exception caught here> ---
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 824, in runUntilCurrent
-     call.func(*call.args, **call.kw)
-   File "/usr/lib/python2.7/dist-packages/tftp/util.py", line 80, in _call_and_schedule
-     self.callable(*self.callable_args, **self.callable_kwargs)
-   File "/usr/lib/python2.7/dist-packages/twisted/internet/udp.py", line 254, in write
-     return self.socket.send(datagram)
- exceptions.AttributeError: 'Port' object has no attribute 'socket'
-
- 2014-05-08 15:12:13-0700 [-] Logged OOPS id OOPS-4ad4c1419556eb88cc72311fd54f737b: AttributeError: 'Port' object has no attribute 'socket'
+ Traceback (most recent call last):
+   File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 392, in startReactor
+     self.config, oldstdout, oldstderr, self.profiler, reactor)
+   File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 313, in runReactorWithLogging
+     reactor.run()
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1192, in run
+     self.mainLoop()
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
+     self.runUntilCurrent()
+ --- <exception caught here> ---
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 824, in runUntilCurrent
+     call.func(*call.args, **call.kw)
+   File "/usr/lib/python2.7/dist-packages/tftp/util.py", line 80, in _call_and_schedule
+     self.callable(*self.callable_args, **self.callable_kwargs)
+   File "/usr/lib/python2.7/dist-packages/twisted/internet/udp.py", line 254, in write
+     return self.socket.send(datagram)
+ exceptions.AttributeError: 'Port' object has no attribute 'socket'
+ 2014-05-08 15:12:13-0700 [-] Logged OOPS id OOPS-
+ 4ad4c1419556eb88cc72311fd54f737b: AttributeError: 'Port' object has no
+ attribute 'socket'

- Nodes and controller are on the same untagged subnet but there is an lldp'd link between the bladeserver's onboard xgb switches and the controller's connected xgb Arista.
+ Nodes and controller are on the same untagged subnet but there is an
+ lldp'd link between the bladeserver's onboard xgb switches and the
+ controller's connected xgb Arista.

  root@pre-maas-ctrl:/var/log/maas/oops/2014-05-08# dpkg -l '*maas*' | cat
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                                Version                       Architecture Description
  +++-===================================-=============================-============-===============================================================================
  ii  maas                                1.5+bzr2252-0ubuntu1          all          MAAS server all-in-one metapackage
  ii  maas-cli                            1.5+bzr2252-0ubuntu1          all          MAAS command line API tool
  ii  maas-cluster-controller             1.5+bzr2252-0ubuntu1          all          MAAS server cluster controller
  ii  maas-common                         1.5+bzr2252-0ubuntu1          all          MAAS server common files
  ii  maas-dhcp                           1.5+bzr2252-0ubuntu1          all          MAAS DHCP server
  ii  maas-dns                            1.5+bzr2252-0ubuntu1          all          MAAS DNS server
  ii  maas-region-controller              1.5+bzr2252-0ubuntu1          all          MAAS server complete region controller
  ii  maas-region-controller-min          1.5+bzr2252-0ubuntu1          all          MAAS Server minimum region controller
  ii  python-django-maas                  1.5+bzr2252-0ubuntu1          all          MAAS server Django web framework
  ii  python-maas-client                  1.5+bzr2252-0ubuntu1          all          MAAS python API client
  ii  python-maas-provisioningserver      1.5+bzr2252-0ubuntu1          all          MAAS server provisioning libraries
-
  Repro: This is a pretty standard initial configuration afaict, following the provided instructions. I notice there are no grub.cfg-* anywhere, only the grub.cfg template. Could that be why none of the nodes are doing anything once they're in the grub shell?

  root@pre-maas-ctrl:~# cat /var/lib/maas/boot-resources/current/grub/grub.cfg
  # MAAS GRUB2 pre-loader configuration file

  # Load based on MAC address first.
  configfile (pxe)/grub/grub.cfg-${net_default_mac}

  # Failed to load based on MAC address.
  # Load amd64 by default, UEFI only supported by 64-bit
  configfile (pxe)/grub/grub.cfg-default-amd64

  root@pre-maas-ctrl:~# ls -l /var/lib/maas/boot-resources/current/grub/
  total 4
  -rw-r--r-- 1 root root 270 May 6 18:23 grub.cfg

  root@pre-maas-ctrl:~# locate grub.cfg
  /boot/grub/grub.cfg
  /usr/share/doc/grub-common/examples/grub.cfg
  /var/lib/maas/boot-resources/snapshot-20140506-172255/grub/grub.cfg
-
- Controller VM is connected to unrouted internal private network and external lab, which is not used by MaaS. Nodes are only connected to the private n/w. Controller is managing tftp, dhcp and dns and ip helper pointed to its private IP.
+ Controller VM is connected to unrouted internal private network and
+ external lab, which is not used by MaaS. Nodes are only connected to
+ the private n/w. Controller is managing tftp, dhcp and dns and ip
+ helper pointed to its private IP.

  Nodes are configured for 'Default Ubuntu Release' Trusty Tahr. Boot images:

   4  trusty   amd64  generic  commissioning  release  May 6, 2014, 6:23 p.m.
   7  trusty   amd64  generic  install        release  May 6, 2014, 6:23 p.m.
   3  trusty   amd64  generic  xinstall       release  May 6, 2014, 6:23 p.m.
   5  trusty   i386   generic  commissioning  release  May 6, 2014, 6:23 p.m.
  12  trusty   i386   generic  install        release  May 6, 2014, 6:23 p.m.
   9  trusty   i386   generic  xinstall       release  May 6, 2014, 6:23 p.m.
   6  precise  amd64  generic  commissioning  release  May 6, 2014, 6:23 p.m.
  11  precise  amd64  generic  install        release  May 6, 2014, 6:23 p.m.
  10  precise  amd64  generic  xinstall       release  May 6, 2014, 6:23 p.m.
   2  precise  i386   generic  commissioning  release  May 6, 2014, 6:23 p.m.
   8  precise  i386   generic  install        release  May 6, 2014, 6:23 p.m.
   1  precise  i386   generic  xinstall       release  May 6, 2014, 6:23 p.m.
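
For reference, the first traceback above is raised while decoding a TFTP ERROR datagram whose error code is 8. Codes 0 to 7 are defined in RFC 1350, and RFC 2347 adds code 8 so that a client can terminate a transfer during option negotiation, which appears to match the "tsize 0" RRQ in the capture. The following is only a minimal sketch of a decoder that accepts code 8. It is not the python-tx-tftp code or the actual fix; the names ERROR_CODES and parse_error_datagram are hypothetical, and the label given to code 8 is my own wording.

```python
import struct

# Codes 0-7 are from RFC 1350; code 8 comes from RFC 2347.
# This table and the function below are illustrative assumptions,
# not the python-tx-tftp implementation.
ERROR_CODES = {
    0: "Not defined",
    1: "File not found",
    2: "Access violation",
    3: "Disk full or allocation exceeded",
    4: "Illegal TFTP operation",
    5: "Unknown transfer ID",
    6: "File already exists",
    7: "No such user",
    8: "Option negotiation failed",
}

OP_ERROR = 5  # TFTP opcode for an ERROR packet


def parse_error_datagram(datagram):
    """Decode a raw TFTP ERROR datagram into (errorcode, message)."""
    if len(datagram) < 4:
        raise ValueError("datagram too short for an ERROR packet")
    # Opcode and error code are both 16-bit big-endian integers.
    opcode, errorcode = struct.unpack("!HH", datagram[:4])
    if opcode != OP_ERROR:
        raise ValueError("not an ERROR packet: opcode %d" % opcode)
    if errorcode not in ERROR_CODES:
        raise ValueError("Unknown error code: %d" % errorcode)
    # The message is a NUL-terminated string; fall back to the table
    # entry when the sender left it empty.
    text = datagram[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
    return errorcode, text or ERROR_CODES[errorcode]
```

With a table that stops at 7, the same packet would take the "Unknown error code" branch, which is the behaviour shown in the traceback.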
-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to python-tx-tftp in Ubuntu.
https://bugs.launchpad.net/bugs/1317705

Title:
  Commissioning x86_64 node never completes, sitting at grub prompt, pserv py tbs