Hi all,
Finally had a chance to take another look at the setup.

After messing around with the BIOS (UEFI/legacy) settings, I decided to look at what actually happens during deployment. I set everything up for PXE boot (easier to troubleshoot) and modified the /tftpboot/pxelinux.cfg/n1 file to NOT redirect the console to tty (I simply deleted 'quiet' and everything after 'cmdline').
The installation went fine until the reboot: the final boot enters a PXE boot loop.
The workaround is fairly simple:
1. cp /usr/share/syslinux/chain.c32 /tftpboot/ # (I have tried using /opt/xcat/share/xcat/netboot/syslinux/chain.c32 but that one failed)
2. sed -i 's/LOCALBOOT 0/KERNEL chain.c32/' /tftpboot/pxelinux.cfg/n1
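
The two steps above can be wrapped in a small shell function so they are easy to repeat. This is only a sketch: the function name apply_chain_workaround and its arguments are made up for illustration, and the default paths are the ones used in this post. It rewrites the file through a temporary copy rather than sed -i, which should give the same result here.

```shell
#!/bin/sh
# Hypothetical helper, not part of xCAT.
# apply_chain_workaround TFTP_ROOT NODE [CHAIN_SRC]
apply_chain_workaround() {
    tftp_root=${1:-/tftpboot}
    node=$2
    chain_src=${3:-/usr/share/syslinux/chain.c32}
    cfg="$tftp_root/pxelinux.cfg/$node"

    [ -n "$node" ] || { echo "usage: apply_chain_workaround TFTP_ROOT NODE [CHAIN_SRC]" >&2; return 2; }
    [ -f "$cfg" ] || { echo "no PXE config at $cfg" >&2; return 1; }
    [ -f "$chain_src" ] || { echo "chain.c32 not found at $chain_src" >&2; return 1; }

    # Step 1: make chain.c32 available from the TFTP root.
    cp "$chain_src" "$tftp_root/" || return 1

    # Step 2: replace the local-boot directive with chain.c32.
    # Write back with cat so the file keeps its owner and mode.
    tmp="$cfg.tmp.$$"
    sed 's/LOCALBOOT 0/KERNEL chain.c32/' "$cfg" > "$tmp" || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

Example call with the paths from this post: apply_chain_workaround /tftpboot n1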

Now deployment works perfectly.
So I ran "nodeset n1 install" again, and again it failed. After repeating the above procedure, everything works.
Does anyone have any idea why it behaves in this manner?
I believe the following changes will resolve this issue for me, as long as I don't update xcat:

# diff /opt/xcat/lib/perl/xCAT_plugin/{pxe.pm.orig,pxe.pm}
166c166
<     print $pcfg "LOCALBOOT 0\n";
---
>     print $pcfg "KERNEL chain.c32\n";

and

# diff /opt/xcat/lib/perl/xCAT_plugin/{anaconda.pm.orig,anaconda.pm}
1132,1155d1131
<             if (defined($sent->{serialport}))
<             {
<                 unless ($sent->{serialspeed})
<                 {
<                     $callback->(
<                         {
<                          error => [
< "serialport defined, but no serialspeed for $node in nodehm table"
<                          ],
<                          errorcode => [1]
<                         }
<                         );
<                     next;
<                 }
< #go cmdline if serial console is requested, the shiny ansi is just impractical
<                 $kcmdline .=
<                     " cmdline console=tty0 console=ttyS"
<                   . $sent->{serialport} . ","
<                   . $sent->{serialspeed};
<                 if ($sent->{serialflow} =~ /(hard|cts|ctsrts)/)
<                 {
<                     $kcmdline .= "n8r";
<                 }
<             }
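
For reference, the string that the removed block appends to the kernel command line can be reproduced with a few lines of shell. This is a sketch of the concatenation only, not xCAT code; the function name build_serial_kcmdline is hypothetical, and the example values 0, 115200 and hard are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring the string building in the removed Perl block.
# build_serial_kcmdline SERIALPORT SERIALSPEED [SERIALFLOW]
build_serial_kcmdline() {
    port=$1
    speed=$2
    flow=${3:-}

    # The Perl code reports an error when serialspeed is missing.
    if [ -z "$speed" ]; then
        echo "serialport defined, but no serialspeed" >&2
        return 1
    fi

    out=" cmdline console=tty0 console=ttyS${port},${speed}"

    # The Perl regex /(hard|cts|ctsrts)/ is unanchored, so match substrings.
    case $flow in
        *hard*|*cts*) out="${out}n8r" ;;
    esac

    printf '%s\n' "$out"
}
```

build_serial_kcmdline 0 115200 hard prints " cmdline console=tty0 console=ttyS0,115200n8r". With the values from the lsdef output quoted further down (serialport=115200, serialspeed=hard, no serialflow), the same concatenation gives console=ttyS115200,hard, which is the console argument that appears in that node's kcmdline.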

I realize that I could just remove the serial definitions from nodehm, but then I would lose SOL, no? Is there a better approach? Have I found the correct files to modify for the PXE configuration?

Any help would be appreciated,
Thanks in advance

On 24/06/2012 11:38, Sten Wolf wrote:
Hi all,
I'm having a very strange issue: I can't get any node to deploy an OS, either with an install profile (diskful installation) or with shell.

The setup - A single management node and several x3550 M3 compute nodes.
All nodes are accessible via IPMI/BMC (asu is also working correctly), wcons is working great, MAC addresses are defined correctly, and DNS and DHCP work correctly.
On issuing
nodeset n1 install
OR
nodeset n1 shell

and then
rpower n1 boot

the correct tftp files are deployed as far as I can tell, but then the deployment just hangs forever.

I have tried setting the noderes primarynic and installnic to blank, mac, and eth0, and I have tried both pxe and xnba. All produce the same result:
for pxe - nothing is done (initial tftp transfer OK, no further files transferred)
for xnba - from /var/log/httpd/access_log:
10.5.4.1 - - [24/Jun/2012:11:20:29 +0300] "GET /tftpboot/xcat/xnba/nodes/n1 HTTP/1.1" 200 419 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:20:29 +0300] "GET /tftpboot/xcat/centos6.2/x86_64/vmlinuz HTTP/1.1" 200 3938288 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:20:30 +0300] "GET /tftpboot/xcat/centos6.2/x86_64/initrd.img HTTP/1.1" 200 30754349 "-" "iPXE/1.0.3-7"

and as viewed in wcons - the 3 files transferred OK.
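
A quick way to pull the same information out of the access log is a one-line awk filter. This is a sketch that assumes the Apache combined log format shown above; the function name list_fetched is made up, and the default log path is the one from this post.

```shell
#!/bin/sh
# Hypothetical helper: list GET requests made by one client.
# list_fetched CLIENT_IP [LOGFILE]
list_fetched() {
    ip=$1
    log=${2:-/var/log/httpd/access_log}

    # Whitespace-separated fields in the combined log format:
    #   $1 client address, $6 request method (with a leading quote),
    #   $7 request path, $9 HTTP status, $10 response size in bytes.
    awk -v ip="$ip" '$1 == ip && $6 == "\"GET" { print $9, $10, $7 }' "$log"
}
```

list_fetched 10.5.4.1 prints one line per request with status, size and path, which makes it easy to see whether anything is requested after the kernel and initrd.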


# tabdump noderes
#node,servicenode,netboot,tftpserver,tftpdir,nfsserver,monserver,nfsdir,installnic,primarynic,discoverynics,cmdinterface,xcatmaster,current_osimage,next_osimage,nimserver,routenames,comments,disable
"compute",,"xnba","10.5.5.254",,"10.5.5.254",,,"eth0","eth0",,,"10.5.5.254",,,,,,

No matter what I tried, the deployment just hangs forever after the initial files are transferred.

The same happens with nodeset n1 shell ; rpower n1 boot:
10.5.4.1 - - [24/Jun/2012:11:28:24 +0300] "GET /tftpboot/xcat/xnba/nodes/n1 HTTP/1.1" 200 308 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:28:24 +0300] "GET /tftpboot/xcat/genesis.kernel.x86_64 HTTP/1.1" 200 3942032 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:28:25 +0300] "GET /tftpboot/xcat/genesis.fs.x86_64.lzma HTTP/1.1" 200 14479256 "-" "iPXE/1.0.3-7"

n1 def:
# lsdef --osimage n1
Object name: n1
    arch=x86_64
    bmc=ipmi1
    bmcpassword=PASSW0RD
    bmcusername=USERID
    conserver=0
    currchain=boot
    currstate=shell
    groups=ipmi,compute,all
    hostnames=n1.aero
    initrd=xcat/genesis.fs.x86_64.lzma
    installnic=eth0
    interface=eth0
    ip=10.5.4.1
    kcmdline=quiet console=tty0 console=ttyS115200,hard xcatd=10.5.5.254:3001 destiny=shell
    kernel=xcat/genesis.kernel.x86_64
    mac=34:40:B5:9D:17:88
    mgt=ipmi
    netboot=xnba
    nfsserver=10.5.5.254
    os=centos6.2
    otherinterfaces=ipmi1:10.5.5.1,ib1:10.7.7.1
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    primarynic=eth0
    profile=compute
    provmethod=install
    serialport=115200
    serialspeed=hard
    status=booting
    statustime=06-24-2012 11:25:14
    tftpserver=10.5.5.254
    xcatmaster=10.5.5.254
    template=/opt/xcat/share/xcat/install/centos/compute.centos6.tmpl
    imagetype=linux
    otherpkgdir=/install/post/otherpkgs/centos6.2/x86_64
    pkglist=/opt/xcat/share/xcat/install/centos/compute.centos6.pkglist
    pkgdir=/install/centos6.2/x86_64

I can provide any further info required
Any help would be greatly appreciated,
Thanks in advance


_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user