On 07/09/2021 20:45, Mark Gurevich wrote:
So this behavior is different from the errors you were seeing prior to this dhcp.pm change, where dhcpd.conf contained "http://<xCAT MN>:80/tftpboot/xcat/xnba/nets/<network>.uefi" for "else if option user-class-identifier = "xNBA" and option client-architecture = 00:09" ?

Can you show what /tftpboot/xcat/xnba/nets/<network> file contains on your management node ?


Hello Mark, sorry for the delay and the long post below

Here's the answer to your question:

# ls -l /tftpboot/xcat/xnba/nets/
total 24
-rw-r--r-- 1 root root 247 Jul  6 18:42 127.0.0.0_8
-rw-r--r-- 1 root root 241 Jul  6 18:42 127.0.0.0_8.elilo
-rw-r--r-- 1 root root 112 Jul  6 18:42 127.0.0.0_8.uefi
-rw-r--r-- 1 root root 253 Jul  6 18:42 192.168.144.0_20
-rw-r--r-- 1 root root 259 Jul  6 18:42 192.168.144.0_20.elilo
-rw-r--r-- 1 root root 117 Jul  6 18:42 192.168.144.0_20.uefi

# cat /tftpboot/xcat/xnba/nets/192.168.144.0_20.uefi
#!gpxe
chain http://${next-server}:80/tftpboot/xcat/elilo-x64.efi -C /tftpboot/xcat/xnba/nets/192.168.144.0_20.elilo

# cat /tftpboot/xcat/xnba/nets/192.168.144.0_20.elilo
default="xCAT Genesis (192.168.149.100)"
   delay=5
   image=/tftpboot/xcat/genesis.kernel.x86_64
   label="xCAT Genesis (192.168.149.100)"
   initrd=/tftpboot/xcat/genesis.fs.x86_64.gz
   append="quiet xcatd=192.168.149.100:3001 destiny=discover  BOOTIF=%B"

I'd like to add the following new info regarding xNBA issues:
To sum up what's been going on in this thread, I ended up with 3 possible xNBA binaries, and I choose which one to run by copying it (cp -p) to xnba.efi.

I named them like this:

# ls -1 xnba*\.efi*
xnba.efi -> cp -p of one of the 3 below
xnba.efi-2.15 -> the original from the xCAT-2.15 branch
xnba.efi-2.16.2 -> the original from the latest stable xCAT-2.16 branch
xnba.efi-beta -> the patched version (of the xCAT-2.16 original) you provided me earlier in this thread
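
Since xnba.efi is a plain copy and not a symlink, ls alone does not show which variant is currently active. A minimal sketch of one way to check, assuming the variants sit next to xnba.efi and follow the xnba.efi-* naming above (active_xnba is a hypothetical helper name, not an xCAT command):

```shell
# Hypothetical helper: print the name of the xnba.efi-* variant whose
# contents are byte-identical to xnba.efi in directory $1.
active_xnba() {
    dir=$1
    for f in "$dir"/xnba.efi-*; do
        [ -f "$f" ] || continue
        # cmp -s is silent and returns 0 only when both files are identical
        if cmp -s "$dir/xnba.efi" "$f"; then
            basename "$f"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

It would be called with the directory holding the binaries, e.g. active_xnba /tftpboot/xcat (that path is an assumption here).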

Depending on which one I use I have a different issue on a different component/process:

xnba.efi-2.16.2: problem booting some stateless hardware nodes (the original post opening this thread)
xnba.efi-beta: problem booting genesis (our current discussion)

That is why I have been running xnba.efi-2.15 for some time now.

However, I recently discovered an issue with xnba.efi-2.15 as well:
stateful VM's (VMware) booting in UEFI mode with PXE first in their UEFI boot order just won't boot: it seems that this xNBA somehow triggers a bug which prevents UEFI from moving on to the next target configured in the UEFI boot order (disk). This happens whether or not the /tftpboot/xcat/xnba/nodes/host.uefi file (an iPXE script just issuing 'exit') is present.

Basically, with xNBA-2.15 and such a vm:

- the host fetches xNBA over tftp
- xNBA issues an http GET for the node.uefi script file (the outcome is the same whether the http response is 404 or 200)
- the vm fails to boot and ends up in the firmware interface

This issue does not occur with the 2.16 or beta versions.
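
To tell the 404 and 200 cases apart without booting a vm, the same request can be made by hand. A minimal sketch, assuming curl is available and that the script is served under the /tftpboot/xcat/xnba/nodes/ path mentioned above (both function names are hypothetical):

```shell
# Build the URL of a node's xNBA script on the management node.
# $1 = management node address, $2 = node name (both caller-supplied).
node_script_url() {
    echo "http://$1:80/tftpboot/xcat/xnba/nodes/$2.uefi"
}

# Print only the HTTP status code (e.g. 200 or 404) returned for that script.
# -s silences progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints the status code after the transfer.
node_script_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(node_script_url "$1" "$2")"
}
```

For example, node_script_status 192.168.149.100 host would query the management node address shown in the elilo file above for a node named host.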

Which leads me to the following additional questions:

1) Initially, those VM's are instantiated from a VMware template which itself comes from a standard xCAT stateful install (remember I'm talking CentOS 8.3 here). Usually such a remote stateful install ends up, as expected, with the UEFI boot order changed from PXE first to disk first. However, some of my VM's currently have PXE first, and I cannot figure out why: do you think this could be xCAT related, as in the stateful install process failing for some reason to complete or to change the UEFI boot order ?

2) Also, I know that these VM's rebooted on a particular date in August (because of an issue in the datacenter), but looking at the logs for that day and those VM's I cannot see any tftp transfer, only DHCPDISCOVER/OFFER. I always assumed that PXE boot in the xCAT paradigm would imply a tftp transfer of the xNBA agent: do you know of a use case where the node would only issue DHCP requests without receiving any next-server option ? (Note: I know the tftpd daemon was running at that time.)

3) Finally, on those installs I'm using the confignetwork -s script, which supersedes the NetworkManager dhcp profile with a higher priority static one: do you think this could generate dhcp logs on the MN node (while the static profile has not auto-connected yet) ? Although, judging from the timestamps, the DISCOVER/OFFER I mentioned in 2) could not come from this (same second as tftp transaction).
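
For reference, the kind of log check described in 2) can be sketched as below. This is only an illustration under assumptions: it expects ISC dhcpd style messages ("DHCPDISCOVER from <mac>", "DHCPOFFER on <ip> to <mac>") and tftp-hpa verbose messages ("RRQ from <ip> filename <file>") in the same syslog file, and boot_events is a hypothetical helper name.

```shell
# Hypothetical helper: count DHCP and TFTP events for one node in a syslog file.
# $1 = log file, $2 = node MAC address, $3 = node IP address.
# Assumed message formats: ISC dhcpd and tftp-hpa (started with -v).
boot_events() {
    log=$1 mac=$2 ip=$3
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    discover=$(grep -ci "DHCPDISCOVER from $mac" "$log" || true)
    offer=$(grep -ci "DHCPOFFER on .* to $mac" "$log" || true)
    # -F treats the dots in the IP address literally
    rrq=$(grep -cF "RRQ from $ip " "$log" || true)
    echo "discover=$discover offer=$offer tftp_rrq=$rrq"
}
```

A result with non-zero discover/offer counts and tftp_rrq=0 corresponds to the situation described in 2), provided tftpd logging was verbose enough to record read requests at all.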

Thanks for your help and time

--
Thomas HUMMEL


_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
