On 07/09/2021 20:45, Mark Gurevich wrote:
So this behavior is different from the errors you were seeing prior to
this dhcp.pm change, where dhcpd.conf contained "http://<xCAT
MN>:80/tftpboot/xcat/xnba/nets/<network>.uefi" for
"else if option user-class-identifier = "xNBA" and option
client-architecture = 00:09" ?
Can you show what /tftpboot/xcat/xnba/nets/<network> file contains on
your management node ?
Hello Mark, sorry for the delay and for the long post below.
Here's the answer to your question:
# ls -l /tftpboot/xcat/xnba/nets/
total 24
-rw-r--r-- 1 root root 247 Jul 6 18:42 127.0.0.0_8
-rw-r--r-- 1 root root 241 Jul 6 18:42 127.0.0.0_8.elilo
-rw-r--r-- 1 root root 112 Jul 6 18:42 127.0.0.0_8.uefi
-rw-r--r-- 1 root root 253 Jul 6 18:42 192.168.144.0_20
-rw-r--r-- 1 root root 259 Jul 6 18:42 192.168.144.0_20.elilo
-rw-r--r-- 1 root root 117 Jul 6 18:42 192.168.144.0_20.uefi
# cat /tftpboot/xcat/xnba/nets/192.168.144.0_20.uefi
#!gpxe
chain http://${next-server}:80/tftpboot/xcat/elilo-x64.efi -C /tftpboot/xcat/xnba/nets/192.168.144.0_20.elilo
# cat /tftpboot/xcat/xnba/nets/192.168.144.0_20.elilo
default="xCAT Genesis (192.168.149.100)"
delay=5
image=/tftpboot/xcat/genesis.kernel.x86_64
label="xCAT Genesis (192.168.149.100)"
initrd=/tftpboot/xcat/genesis.fs.x86_64.gz
append="quiet xcatd=192.168.149.100:3001 destiny=discover BOOTIF=%B"
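As a side note, the file names under nets/ look like they encode the network as <network address>_<prefix length>. Here is a minimal Python sketch to check which nets file a given address would map to, assuming that naming convention (inferred from the listing above, not taken from the xCAT source). The helper name nets_file_for is hypothetical.

```python
import ipaddress


def nets_file_for(ip, networks, suffix=".uefi"):
    """Return the assumed nets/ file name for the first network containing ip.

    The "<network address>_<prefix length><suffix>" pattern is an assumption
    based on the directory listing, not on xCAT documentation.
    """
    addr = ipaddress.ip_address(ip)
    for cidr in networks:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            return f"{net.network_address}_{net.prefixlen}{suffix}"
    return None


# The two networks visible in the listing, written in CIDR form.
known = ["127.0.0.0/8", "192.168.144.0/20"]
print(nets_file_for("192.168.149.100", known))
```

With the xcatd address from the elilo file, this confirms that 192.168.149.100 falls inside 192.168.144.0/20, so the .uefi and .elilo files shown are the ones that apply to that network.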
I'd like to add the following new info regarding the xNBA issues.
To sum up what's been going on in this thread, I ended up with 3
different xNBA binaries, one of which I choose to run by cp -p'ing it
to xnba.efi.
I named them like this:
# ls -1 xnba*\.efi*
xnba.efi -> cp -p of one of the 3 below
xnba.efi-2.15 -> the original from the xCAT-2.15 branch
xnba.efi-2.16.2 -> the original from the latest stable xCAT-2.16 branch
xnba.efi-beta -> the patched version (based on xCAT-2.16) you provided me
earlier in this thread
Depending on which one I use, I get a different issue on a different
component/process:
xnba.efi-2.16.2: problem booting some stateless hardware nodes (the
original post opening this thread)
xnba.efi-beta: problem booting genesis (our current discussion)
That's the reason why I have been running xnba.efi-2.15 for some time
now.
However, I recently discovered an issue with xnba.efi-2.15 as well:
stateful VMs (VMware) booting in UEFI mode with PXE first in their UEFI
boot order just won't boot. It seems that this xNBA somehow triggers a
bug which prevents UEFI from falling through to the next target
configured in the UEFI boot order (disk). This happens whether or not
the /tftpboot/xcat/xnba/nodes/host.uefi file (an iPXE script just
issuing 'exit') is present.
Basically, with xNBA-2.15 and such a VM:
- the host fetches xNBA over tftp
- xNBA http GETs the node.uefi script file, successfully or not (it
makes no difference whether the http response is 404 or 200)
- the VM does not manage to boot and ends up in the firmware interface
This issue does not occur with the 2.16 or 2.16 beta versions.
Which leads me to the following additional questions:
1) Initially, those VMs are instantiated from a VMware template which
itself comes from a standard xCAT stateful install (remember I'm talking
CentOS 8.3 here).
Usually such a remote stateful install ends up, as expected, with the
UEFI boot order changed from PXE first to disk first.
However, some of my VMs currently have PXE first, and I cannot figure
out why. Do you think this could be xCAT related, as in the stateful
install process failing, for some reason, to complete or to change the
UEFI boot order?
2) Also, I know that these VMs rebooted on a particular date in August
(because of some issue in the datacenter), but looking at the logs for
that day and those VMs I cannot see any tftp transfer; I only see
DHCPDISCOVER/OFFER.
I always assumed that PXE boot in the xCAT paradigm would imply a tftp
transfer of the xNBA agent. Do you know of a use case where the node
would only issue DHCP without receiving any next-server option? (Note: I
know the tftpd daemon was running at that time.)
3) Finally, on those installs I'm using the confignetwork -s script,
which supersedes the NetworkManager dhcp profile with a higher-priority
static one. Do you think this could generate dhcp logs on the MN
(while the static profile has not auto-connected yet)? Although, judging
from the timestamps, the DISCOVER/OFFER I mentioned in 2) could not come
from this (same second as tftp transaction).
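In case it helps with the check described in 2), here is a minimal Python sketch of how DHCP offers can be cross-checked against tftp read requests per client address. It assumes syslog lines shaped like ISC dhcpd's "DHCPOFFER on <ip> to <mac> via <if>" and tftpd's "RRQ from <ip> filename <file>"; the sample lines, MAC addresses, client IPs and the file name in them are made up for illustration, and the patterns would need adjusting to whatever the MN actually logs.

```python
import re

# Assumed log shapes; adapt these patterns to the real syslog output.
OFFER_RE = re.compile(r"DHCPOFFER on (\S+) to ([0-9A-Fa-f:]{17})")
RRQ_RE = re.compile(r"RRQ from (\S+) filename (\S+)")


def offers_without_tftp(lines):
    """Return {ip: mac} for clients that were offered a lease but never
    issued a tftp read request in the given log lines."""
    offered = {}
    fetched = set()
    for line in lines:
        m = OFFER_RE.search(line)
        if m:
            offered[m.group(1)] = m.group(2).lower()
            continue
        m = RRQ_RE.search(line)
        if m:
            fetched.add(m.group(1))
    return {ip: mac for ip, mac in offered.items() if ip not in fetched}


# Hypothetical sample: the first client fetches a file, the second does not.
sample = [
    "dhcpd: DHCPDISCOVER from 00:50:56:aa:bb:01 via eth0",
    "dhcpd: DHCPOFFER on 192.168.149.11 to 00:50:56:aa:bb:01 via eth0",
    "in.tftpd[1234]: RRQ from 192.168.149.11 filename xcat/xnba.efi",
    "dhcpd: DHCPDISCOVER from 00:50:56:aa:bb:02 via eth0",
    "dhcpd: DHCPOFFER on 192.168.149.12 to 00:50:56:aa:bb:02 via eth0",
]
print(offers_without_tftp(sample))
```

Clients listed in the result are the ones showing the DISCOVER/OFFER-only pattern, which narrows down which VMs and timestamps are worth looking at more closely.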
Thanks for your help and time
--
Thomas HUMMEL