The change is from:
commit 1889ec879d2ba721869217ad2e4f03d47b7fba40
Author: yangsbj <[email protected]>
Date: Thu Nov 1 23:29:01 2018 -0400
support site.httpport in nodeset and mknb
Prior to that change, non-80 ports did not work.
What is unusual is that 80 should be the normal port, and the URL parsing
should be handled by xNBA rather than being UEFI-specific, so I’m uncertain why
:80 would cause a problem in your environment.
Nodes that have not had ‘nodeset’ run against them since your upgrade would not
have the :80.
A reasonable mitigation in the code would be to skip the port designation if it
is the default, though it is still fairly odd that this would do anything
different.
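The mitigation suggested above can be sketched as follows. This is a minimal Python illustration only, not the actual xCAT code (which is written in Perl). The function name boot_url, its parameters, and the DEFAULT_HTTP_PORT constant are hypothetical; the only idea taken from this thread is that the port designation is dropped when it equals the default of 80.

```python
DEFAULT_HTTP_PORT = 80  # standard HTTP port; a URL does not need to spell it out


def boot_url(path, httpport=DEFAULT_HTTP_PORT, host="${next-server}"):
    """Build an http:// boot URL, omitting the port when it is the default.

    Hypothetical helper for illustration only; not an xCAT function.
    """
    port = int(httpport)
    # Only append ":<port>" when it differs from the HTTP default.
    authority = host if port == DEFAULT_HTTP_PORT else f"{host}:{port}"
    return f"http://{authority}/{path.lstrip('/')}"
```

With a helper like this, a site that leaves the HTTP port at 80 would get the same filename entries as before the change, while a site using a non-default port would still get the explicit port.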
From: Carl <[email protected]>
Sent: Thursday, July 18, 2019 4:01 AM
To: xCAT Users Mailing list <[email protected]>
Subject: [External] Re: [xcat-user] Unable to pxe boot node after mainboard
replacement
Hi all,
Further to the above I have managed to isolate the issue.
It looks like when nodeset is run, it is adding :80 to the boot options in the
leases file.
Eg:
host comp078 {
dynamic;
hardware ethernet 00:0a:f7:be:fc:de;
uid 00:0a:f7:be:fc:de;
fixed-address 100.64.1.78;
supersede server.ddns-hostname = "comp078";
supersede host-name = "comp078";
if option user-class-identifier = "xNBA" and option client-architecture
= 00:00 {
supersede server.always-broadcast = 01;
supersede server.filename =
"http://${next-server}:80/tftpboot/xcat/xnba/nodes/comp078";
} elsif option user-class-identifier = "xNBA" and option
client-architecture = 00:09 {
supersede server.filename =
"http://${next-server}:80/tftpboot/xcat/xnba/nodes/comp078.uefi";
} elsif option client-architecture = 00:07 {
supersede server.filename = "xcat/xnba.efi";
} elsif option client-architecture = 00:00 {
supersede server.filename = "xcat/xnba.kpxe";
} else {
supersede server.filename = "";
}
}
If I manually edit the leases file and remove :80 from the two filename entries
above, the node is able to boot fine.
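For reference, the manual edit described above can be sketched as a small text transformation. This is an illustration under assumptions, not an xCAT or dhcpd tool: the function name strip_default_port is hypothetical, it only operates on text passed to it (it does not touch the leases file itself), and the next run of nodeset would presumably write the :80 back.

```python
import re

# Match ":80" only when it directly follows ${next-server} in an http:// URL
# and is followed by "/", so other ports such as :8080 are left untouched.
_DEFAULT_PORT = re.compile(r"(http://\$\{next-server\}):80(?=/)")


def strip_default_port(leases_text):
    """Return leases_text with the explicit :80 removed from boot URLs.

    Hypothetical helper for illustration only.
    """
    return _DEFAULT_PORT.sub(r"\1", leases_text)
```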
Is anyone able to advise on why my environment might be now doing this?
Thanks,
Carl.
On Thu, 18 Jul 2019 at 16:22, Carl <[email protected]> wrote:
Hi Folks,
We recently replaced the mainboard on a Dell R640.
I removed the MAC address from the node definition and let switch-based
discovery take care of discovering the new MAC address and running BMC setup.
Everything went well and the node ended at the xCAT shell.
However, when I try to boot the node (statelite), it fails to find the image,
and if I persist it dies with a horrible UEFI error. The node also has this
problem if I nodeset it to boot to shell.
As other nodes are able to boot statelite fine, I assumed that it was a
hardware error. Dell has replaced the mainboard a second time, but the issue
still persists.
It might be worth mentioning that the last time we had a mainboard
replacement on a comp node was about 9 months ago, and we have updated xCAT a
couple of times since then. Attached is the console log of the UEFI crash and
the PXE boot messages that are seen on a working and a non-working node.
Is anyone able to suggest any tricks to further debug this issue? I'm reluctant
to pin the problem on xCAT, but find it unlikely that I have hit two mainboards
with the same fault.
Thanks,
Carl.
#### These are the PXE boot messages for the node that isn't working ####
[2019-07-10T10:45:47+10:00] ESC[2JESC[01;01HBooting from PXE Device 2:
Integrated NIC 1 Port 3 Partition 1
[2019-07-10T10:45:48+10:00]
[2019-07-10T10:45:48+10:00] >>Start PXE over IPv4.
[2019-07-10T10:45:52+10:00] Station IP address is 100.64.1.78
[2019-07-10T10:45:52+10:00]
[2019-07-10T10:45:52+10:00] Server IP address is 100.64.0.1
[2019-07-10T10:45:52+10:00] NBP filename is xcat/xnba.efi
[2019-07-10T10:45:52+10:00] NBP filesize is 139200 Bytes
[2019-07-10T10:45:52+10:00] Downloading NBP file...
[2019-07-10T10:45:52+10:00]
[2019-07-10T10:45:52+10:00] NBP file downloaded successfully.
[2019-07-10T10:45:52+10:00] xNBA initialising devices...ok
[2019-07-10T10:45:52+10:00]
[2019-07-10T10:45:52+10:00]
[2019-07-10T10:45:52+10:00] xCAT Network Boot Agent
[2019-07-10T10:45:52+10:00] ESC[1mESC[37mESC[40miPXE 1.0.3-131028
(d603e)ESC[0mESC[37mESC[40m -- Open Source Network Boot Firmware --
ESC[0mESC[36mESC[40mhttp://ipxe.orgESC[0mESC[37mESC[40m
[2019-07-10T10:45:52+10:00] Features: HTTP HTTPS iSCSI DNS TFTP EFI
[2019-07-10T10:45:52+10:00] net0: 00:0a:f7:be:b7:d2 using <NULL> on EFI SNP
(open)
[2019-07-10T10:45:52+10:00] [Link:up, TX:0 TXE:0 RX:0 RXE:0]
[2019-07-10T10:45:52+10:00] DHCP (net0 00:0a:f7:be:b7:d2)... ok
[2019-07-10T10:45:52+10:00] net0:
100.64.1.78/255.255.248.0 gw 100.64.0.1
[2019-07-10T10:45:52+10:00] Next server: 100.64.0.1
[2019-07-10T10:45:52+10:00] Filename:
http://100.64.0.1:80/tftpboot/xcat/xnba/nodes/comp078.uefi
[2019-07-10T10:45:52+10:00]
http://100.64.0.1:80/tftpboot/xcat/xnba/nodes/comp078.uefi..................
Connection timed out (http://ipxe.org/4c0a6012)
[2019-07-10T10:46:08+10:00] No more network devices
[2019-07-10T10:46:08+10:00] xNBA initialising devices...ok
[2019-07-10T10:46:08+10:00]
[2019-07-10T10:46:08+10:00]
[2019-07-10T10:46:08+10:00] xCAT Network Boot Agent
[2019-07-10T10:46:08+10:00] ESC[1mESC[37mESC[40miPXE 1.0.3-131028
(d603e)ESC[0mESC[37mESC[40m -- Open Source Network Boot Firmware --
ESC[0mESC[36mESC[40mhttp://ipxe.orgESC[0mESC[37mESC[40m
[2019-07-10T10:46:08+10:00] Features: HTTP HTTPS iSCSI DNS TFTP EFI
[2019-07-10T10:46:08+10:00] net1: 00:0a:f7:be:b7:d2 using <NULL> on EFI SNP
(open)
[2019-07-10T10:46:08+10:00] [Link:up, TX:0 TXE:0 RX:0 RXE:0]
[2019-07-10T10:46:08+10:00] DHCP (net1 00:0a:f7:be:b7:d2)... ok
[2019-07-10T10:46:08+10:00] net1:
100.64.1.78/255.255.248.0 gw 100.64.0.1
[2019-07-10T10:46:08+10:00] Next server: 100.64.0.1
[2019-07-10T10:46:08+10:00] Filename:
http://100.64.0.1:80/tftpboot/xcat/xnba/nodes/comp078.uefi
[2019-07-10T10:46:08+10:00]
http://100.64.0.1:80/tftpboot/xcat/xnba/nodes/comp078.uefi..................
Connection timed out (http://ipxe.org/4c0a6012)
[2019-07-10T10:46:24+10:00] No more network devices
#### As a comparison, this is what we see on a node that boots fine ####
[2019-07-18T11:59:45+10:00] ESC[0mESC[37mESC[40mESC[2JESC[01;01HBooting from
PXE Device 1: Integrated NIC 1 Port 3 Partition 1
[2019-07-18T11:59:46+10:00]
[2019-07-18T11:59:46+10:00] >>Start PXE over IPv4.
[2019-07-18T11:59:50+10:00] Station IP address is 100.64.1.86
[2019-07-18T11:59:50+10:00]
[2019-07-18T11:59:50+10:00] Server IP address is 100.64.0.1
[2019-07-18T11:59:50+10:00] NBP filename is xcat/xnba.efi
[2019-07-18T11:59:50+10:00] NBP filesize is 139200 Bytes
[2019-07-18T11:59:50+10:00] Downloading NBP file...
[2019-07-18T11:59:50+10:00]
[2019-07-18T11:59:50+10:00] NBP file downloaded successfully.
[2019-07-18T11:59:50+10:00] xNBA initialising devices...ok
[2019-07-18T11:59:50+10:00]
[2019-07-18T11:59:50+10:00]
[2019-07-18T11:59:50+10:00] xCAT Network Boot Agent
[2019-07-18T11:59:50+10:00] ESC[1mESC[37mESC[40miPXE 1.0.3-131028
(d603e)ESC[0mESC[37mESC[40m -- Open Source Network Boot Firmware --
ESC[0mESC[36mESC[40mhttp://ipxe.orgESC[0mESC[37mESC[40m
[2019-07-18T11:59:50+10:00] Features: HTTP HTTPS iSCSI DNS TFTP EFI
[2019-07-18T11:59:50+10:00] net0: 00:0a:f7:bd:e6:b8 using <NULL> on EFI SNP
(open)
[2019-07-18T11:59:50+10:00] [Link:up, TX:0 TXE:0 RX:0 RXE:0]
[2019-07-18T11:59:50+10:00] DHCP (net0 00:0a:f7:bd:e6:b8)... ok
[2019-07-18T11:59:50+10:00] net0:
100.64.1.86/255.255.248.0 gw 100.64.0.1
[2019-07-18T11:59:50+10:00] Next server: 100.64.0.1
[2019-07-18T11:59:50+10:00] Filename:
http://100.64.0.1/tftpboot/xcat/xnba/nodes/comp086.uefi
[2019-07-18T11:59:51+10:00]
http://100.64.0.1/tftpboot/xcat/xnba/nodes/comp086.uefi... ok
[2019-07-18T11:59:51+10:00] http://100.64.0.1/tftpboot/xcat/elilo-x64.efi... ok
[2019-07-18T11:59:51+10:00] ELILO v3.14 for EFI/x86_64
[2019-07-18T11:59:51+10:00] Loading kernel
/tftpboot/xcat/osimage/centos75-gpfs5.0.2.0-compute/kernel... done
[2019-07-18T11:59:51+10:00] Loading file
/tftpboot/xcat/osimage/centos75-gpfs5.0.2.0-compute/initrd-stateless.gz...done
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user