But by all means, if it can be done with only Genesis, I agree it would be better and simpler. Whatever is causing it to generate the wrong configuration still needs to be fixed, though. In my case it works with both xNBA and Genesis installed because of the nature of PXE chainloading. It probably adds deployment time, but it does work in such a mixed configuration.
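
To see which generated entries point at files that are not there, a rough sketch along these lines might help. It assumes the TFTP root is /tftpboot as in this thread, and it only looks at KERNEL lines and initrd= arguments; the function name missing_boot_files is made up for illustration and is not part of xCAT.

```python
from pathlib import Path


def missing_boot_files(tftp_root):
    """Return (config name, referenced path) pairs whose target file is absent.

    Hypothetical helper, not an xCAT tool. Paths found on KERNEL lines and in
    initrd= arguments are resolved relative to tftp_root.
    """
    root = Path(tftp_root)
    cfg_dir = root / "pxelinux.cfg"
    if not cfg_dir.is_dir():
        return []
    missing = []
    for cfg in sorted(cfg_dir.iterdir()):
        if not cfg.is_file():
            continue
        for line in cfg.read_text(errors="replace").splitlines():
            words = line.split()
            if not words:
                continue
            refs = []
            keyword = words[0].upper()
            if keyword == "KERNEL" and len(words) > 1:
                refs.append(words[1])
            elif keyword == "APPEND":
                for word in words[1:]:
                    if word.startswith("initrd="):
                        # initrd= may carry several comma-separated files
                        refs.extend(word[len("initrd="):].split(","))
            for ref in refs:
                if ref and not (root / ref).is_file():
                    missing.append((cfg.name, ref))
    return missing
```

Calling missing_boot_files("/tftpboot") on the headnode and printing each pair would show, per pxelinux.cfg file, any kernel or initrd it names that does not exist under the TFTP root.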
-Josh

On Tue, Jan 21, 2014 at 3:25 PM, Josh Nielsen <jniel...@hudsonalpha.org> wrote:
> Evidently, though, something in his xCAT setup is creating the files in
> /tftpboot/pxelinux.cfg/ with reference to xnba, just like my
> installation. Where does xCAT grab the configuration for that? Maybe
> it was because I didn't do a completely clean install and did an
> in-place upgrade, but my cluster actually works perfectly with both
> xnba & genesis installed because it uses xnba first to bootstrap and
> then requests the Genesis image. xCAT must support that scenario, else
> I haven't the slightest idea by what miracle my installation is
> running with such a configuration. :-)
>
> -Josh
>
> On Tue, Jan 21, 2014 at 2:58 PM, Russell Jones
> <russell-l...@jonesmail.me> wrote:
>> xNBA is a customized gpxe image that xCAT uses.
>>
>> NBFS is the older maintenance image that was used if you set your
>> node to boot to shell, or booted a runimage script. NBFS is deprecated,
>> and Genesis replaced NBFS as the maintenance image for these tasks.
>>
>> In a standard 2.8 install, there should no longer be any nbk/nbfs RPMs
>> installed - Genesis replaced them.
>>
>> perl-xCAT-2.8.3-snap201311122316.noarch
>> xCAT-2.8.3-snap201311122318.x86_64
>> xCAT-client-2.8.3-snap201311122316.noarch
>> xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
>> elilo-xcat-3.14-4.noarch
>> xCAT-server-2.8.3-snap201311122316.noarch
>> xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
>> ipmitool-xcat-1.8.11-3.x86_64
>> conserver-xcat-8.1.16-10.x86_64
>> xCAT-buildkit-2.8.3-snap201311122318.noarch
>> syslinux-xcat-3.86-2.noarch
>>
>> On 1/21/2014 2:38 PM, Josh Nielsen wrote:
>>> Hi Jonathan,
>>>
>>> Yes, I definitely think that would cause a problem. This is jogging my
>>> memory because I think that when the new Genesis boot loader was
>>> rolled out in the first version of xCAT that supported it, I faced
>>> a similar problem.
>>> I had assumed that only Genesis was needed, but xNBA
>>> is still used as an intermediate image even if it is no longer the
>>> final image. I will check my yum repos as soon as I can - but by some
>>> unfortunate coincidence I just discovered that YUM is not working
>>> since our RHEL license expired three days ago (unbeknownst to me until
>>> 10 minutes ago). Do you have xCAT-genesis-x86_64 and elilo-xCAT? You
>>> may even have to pull xNBA images from an older install(?) and then
>>> run mknb to build the images.
>>>
>>> I remember downloading the tarred files with the RPM manually and
>>> creating a local repo for xCAT. Whenever I get YUM back I'll give you
>>> more specifics if I can.
>>>
>>> -Josh
>>>
>>> On Tue, Jan 21, 2014 at 1:54 PM, Jonathan Mills <jonmi...@renci.org> wrote:
>>>> Josh,
>>>>
>>>> I don't doubt that you're on to something. But if this is the case, it
>>>> means my systems are missing some files, namely:
>>>>
>>>> /tftpboot/xcat/nbk.x86_64
>>>> /tftpboot/xcat/nbfs.x86_64.gz
>>>>
>>>> Can you tell me what RPM installed those files on your system? They
>>>> don't exist on mine, and even a 'yum provides' doesn't find them.
>>>>
>>>> On 01/21/2014 11:51 AM, Josh Nielsen wrote:
>>>>> Hi Jonathan,
>>>>>
>>>>> It is my understanding, from extensive debugging and notes that I have
>>>>> taken about the xCAT netbooting process in the past, that xCAT uses a
>>>>> two-stage image deployment method. It will first come up with a more
>>>>> "generic" boot image (normally xnba or sometimes yaboot). When that
>>>>> image contacts the xCAT headnode (or the node handling DHCP requests),
>>>>> the headnode recognizes the current image on the client that is
>>>>> sending requests to DHCP for further boot instructions, and tells
>>>>> the client to load another image based on the subnet and image type
>>>>> it is currently using.
>>>>> For example, my headnode's /etc/dhcpd.conf file
>>>>> has an entry that looks like this:
>>>>>
>>>>> shared-network eth0 {
>>>>>   subnet 10.20.0.0 netmask 255.255.0.0 {
>>>>>     max-lease-time 43200;
>>>>>     min-lease-time 43200;
>>>>>     default-lease-time 43200;
>>>>>     next-server 10.20.0.1;
>>>>>     option log-servers 10.20.0.1;
>>>>>     option ntp-servers 10.20.0.1;
>>>>>     option domain-name "xxxxxxxxx";
>>>>>     option domain-name-servers 10.20.0.1;
>>>>>     if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
>>>>>       always-broadcast on;
>>>>>       filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16";
>>>>>     } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
>>>>>       filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi";
>>>>>     } else if option client-architecture = 00:00 { #x86
>>>>>       filename "xcat/xnba.kpxe";
>>>>>     } else if option vendor-class-identifier = "Etherboot-5.4" { #x86
>>>>>       filename "xcat/xnba.kpxe";
>>>>>     } else if option client-architecture = 00:07 { #x86_64 uefi
>>>>>       filename "xcat/xnba.efi";
>>>>>     } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
>>>>>       filename "xcat/xnba.efi";
>>>>>     } else if option client-architecture = 00:02 { #ia64
>>>>>       filename "elilo.efi";
>>>>>     } else if substring(filename,0,1) = null { #otherwise, provide yaboot if the client isn't specific
>>>>>       filename "/yaboot";
>>>>>     }
>>>>>     range dynamic-bootp 10.20.200.254 10.20.254.254;
>>>>>   } # 10.20.0.0/255.255.0.0 subnet_end
>>>>>
>>>>> So if it boots with the xNBA image, it is then directed to
>>>>> http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16, which has the
>>>>> genesis boot instructions in it:
>>>>>
>>>>> #!gpxe
>>>>> imgfetch -n kernel
>>>>> http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 quiet
>>>>> xcatd=10.20.0.1:3001
>>>>> BOOTIF=01-${netX/machyp}
>>>>> imgfetch -n nbfs http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.gz
>>>>> imgload kernel
>>>>> imgexec kernel
>>>>>
>>>>> So first it boots with xnba (first stage of boot), it contacts the DHCP
>>>>> server which gives it a "next-server" option of itself (saying to the
>>>>> client: request the next image from me - the headnode - again), and then
>>>>> gives it a boot file with instructions for the next image, then it
>>>>> executes it and finally loads genesis. You will also notice that the
>>>>> very last option (if it matches nothing else) is yaboot, which is
>>>>> another generic image, which will in turn probably request the next
>>>>> image. Try watching your log for the tftp daemon messages to see what is
>>>>> being sent.
>>>>>
>>>>> I wonder if you are having problems at the first-stage DHCP redirect,
>>>>> though. Check your options statements in /etc/dhcpd.conf to see
>>>>> where it is directing xNBA images.
>>>>>
>>>>> Regards,
>>>>> Josh Nielsen
>>>>>
>>>>> On Tue, Jan 21, 2014 at 10:26 AM, Jonathan Mills <jonmi...@renci.org> wrote:
>>>>>
>>>>> Wang,
>>>>>
>>>>> Thank you for your response. I did some digging and here is what I
>>>>> found.
>>>>>
>>>>> cat /tftpboot/xcat/xnba/nets/10.100.0.0_24
>>>>> #!gpxe
>>>>> imgfetch -n kernel
>>>>> http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 quiet
>>>>> xcatd=10.100.0.1:3001
>>>>> BOOTIF=01-${netX/machyp}
>>>>> imgfetch -n nbfs
>>>>> http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.lzma
>>>>> imgload kernel
>>>>> imgexec kernel
>>>>>
>>>>> cat /tftpboot/pxelinux.cfg/0A6400
>>>>> DEFAULT xCAT
>>>>> LABEL xCAT
>>>>> KERNEL xcat/nbk.x86_64
>>>>> APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.100.0.1:3001
>>>>>
>>>>> So, clearly, those things don't match up.
>>>>> That strikes me as an xCAT
>>>>> issue, but never mind. I manually modified
>>>>> /tftpboot/pxelinux.cfg/0A6400 to make it look like:
>>>>>
>>>>> DEFAULT xCAT
>>>>> LABEL xCAT
>>>>> KERNEL xcat/genesis.kernel.x86_64
>>>>> APPEND initrd=xcat/genesis.fs.x86_64.lzma quiet xcatd=10.100.0.1:3001 BOOTIF=eth0
>>>>>
>>>>> (It is safe, in this case, to designate BOOTIF as 'eth0' -- with Cisco
>>>>> UCS hardware, and using vNICs, the first interface will always show up
>>>>> in Linux as eth0 -- at least, that is my experience).
>>>>>
>>>>> After this change, I was indeed able to PXE boot the first node, and I
>>>>> was hopeful that node discovery would then take place. However, this
>>>>> still did not occur. On console, I dug into the running genesis image
>>>>> on the first node, and I found that it had no ethernet interfaces
>>>>> whatsoever, because the genesis kernel has no driver support for Cisco
>>>>> UCS hardware.
>>>>>
>>>>> For example, this is the ethtool output of a Cisco UCS vNIC:
>>>>>
>>>>> [root@ncsu-hn nets]# ethtool -i eth0
>>>>> driver: enic
>>>>> version: 2.1.1.39
>>>>> firmware-version: 2.0(4b)
>>>>> bus-info: 0000:06:00.0
>>>>> supports-statistics: yes
>>>>> supports-test: no
>>>>> supports-eeprom-access: no
>>>>> supports-register-dump: no
>>>>> supports-priv-flags: no
>>>>>
>>>>> You can see it requires the 'enic' kernel module, usually located at:
>>>>> /lib/modules/`uname -r`/kernel/drivers/net/enic/enic.ko
>>>>>
>>>>> This module isn't found within the genesis image, so the node PXE boots,
>>>>> and then can do no more. Node discovery fails.
>>>>>
>>>>> On 01/20/2014 09:19 PM, Xiao Peng Wang wrote:
>>>>> > xCAT is using genesis (an xCAT customized pxe tool) to perform the
>>>>> > discovery process. The configuration for genesis is put in
>>>>> > /tftpboot/xcat/xnba/nets/ for a specific network.
>>>>> > Could you check whether the
>>>>> > xnba configuration file for your deployment network has been
>>>>> > put in /tftpboot/xcat/xnba/nets/?
>>>>> >
>>>>> > The prerequisite for booting genesis is that the node has a
>>>>> > dynamic IP address. Did you configure the dynamic IP range for your
>>>>> > deployment network? Could you take a look at your syslog to see whether
>>>>> > the nodes have sent out DHCP requests and what your DHCP server replied
>>>>> > to them?
>>>>> >
>>>>> > Thanks
>>>>> > Best Regards
>>>>> > ----------------------------------------------------------------------
>>>>> > Wang Xiaopeng (王晓朋)
>>>>> > IBM China System Technology Laboratory
>>>>> > Tel: 86-10-82453455
>>>>> > Email: w...@cn.ibm.com
>>>>> > Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
>>>>> > Haidian District Beijing P.R.China 100193
>>>>> >
>>>>> > From: Jonathan Mills <jonmi...@renci.org>
>>>>> > To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>,
>>>>> > Date: 2014/01/19 06:24
>>>>> > Subject: [xcat-user] Frustrating time with sequential node discovery
>>>>> >
>>>>> > ------------------------------------------------------------------------
>>>>> >
>>>>> > I'm running xCAT 2.8.3 and CentOS 6.4 atop of Cisco UCS-C hardware. I'm
>>>>> > attempting to do a sequential nodediscovery.
>>>>> > I've pre-populated the
>>>>> > nodelist table with the nodenames, so I shouldn't need to do anything
>>>>> > more than
>>>>> >
>>>>> > nodediscoverystart noderange=node[1-15]
>>>>> >
>>>>> > However, none of the nodes ever gets discovered.
>>>>> >
>>>>> > Digging deeper, it seems that none of them ever successfully PXE boot at
>>>>> > all. They should be PXE booting off of the genesis netboot image and
>>>>> > speaking back to the xcatmaster, correct?
>>>>> >
>>>>> > When I run 'mknb x86_64', it populates /tftpboot/pxelinux.cfg with
>>>>> > entries to non-existent netboot images. Watch:
>>>>> >
>>>>> > [root@ncsu-hn ~]# rpm -qf /opt/xcat/sbin/mknb
>>>>> > xCAT-client-2.8.3-snap201311122316.noarch
>>>>> > [root@ncsu-hn ~]# mknb x86_64
>>>>> > Creating genesis.fs.x86_64.lzma in /tftpboot/xcat
>>>>> > [root@ncsu-hn ~]# cd /tftpboot/pxelinux.cfg/
>>>>> > [root@ncsu-hn pxelinux.cfg]# ls
>>>>> > 0A6400 0A6500 0A6600 7F 98300D 98300DE6 98300DE7 C0A86B
>>>>> > [root@ncsu-hn pxelinux.cfg]# cat *
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.100.0.1:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.101.0.1:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.102.0.1:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=127.0.0.1:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.3:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz
>>>>> > quiet xcatd=152.48.13.230:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.231:3001
>>>>> > DEFAULT xCAT
>>>>> > LABEL xCAT
>>>>> > KERNEL xcat/nbk.x86_64
>>>>> > APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=192.168.107.10:3001
>>>>> > [root@ncsu-hn pxelinux.cfg]# cd ../xcat/
>>>>> > [root@ncsu-hn xcat]# ls -la
>>>>> > total 21528
>>>>> > drwxr-xr-x  4 root root     4096 Jan 17 13:06 .
>>>>> > drwxr-xr-x. 7 root root     4096 Jan 18 22:02 ..
>>>>> > -rwxr-xr-x  1 root root   242929 Jan 15  2012 elilo-x64.efi
>>>>> > -rw-r--r--  1 root root 17573621 Jan 18 22:03 genesis.fs.x86_64.lzma
>>>>> > -rwxr-xr-x  1 root root  3986608 Aug  9 06:29 genesis.kernel.x86_64
>>>>> > drwxr-xr-x  3 root root     4096 Jan 17 13:06 osimage
>>>>> > drwxr-xr-x  3 root root     4096 Dec 23 07:42 xnba
>>>>> > -rw-r--r--  1 root root   139200 Oct 28 16:16 xnba.efi
>>>>> > -rw-r--r--  1 root root    74792 Oct 28 16:16 xnba.kpxe
>>>>> >
>>>>> > As you can see, it ought to be netbooting the genesis kernel, but
>>>>> > instead all my pxelinux.cfg/* files are instructing clients to boot the
>>>>> > non-existent "nbk.x86_64" image.
>>>>> >
>>>>> > Your advice is appreciated.
>>>>> >
>>>>> > --
>>>>> > Jonathan Mills
>>>>> > Systems Administrator
>>>>> > Renaissance Computing Institute
>>>>> > UNC-Chapel Hill
>>>>> >
>>>>> > ------------------------------------------------------------------------------
>>>>> > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>>>>> > Learn Why More Businesses Are Choosing CenturyLink Cloud For
>>>>> > Critical Workloads, Development Environments & Everything In Between.
>>>>> > Get a Quote or Start a Free Trial Today.
>>>>> > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>>>>> > _______________________________________________
>>>>> > xCAT-user mailing list
>>>>> > xCAT-user@lists.sourceforge.net
>>>>> > https://lists.sourceforge.net/lists/listinfo/xcat-user