Hi all,
Finally had a chance to take another look at the setup.
After messing around with the BIOS (UEFI/legacy) settings, I finally decided
to try and look at what actually happens during deployment.
I've set everything for pxe boot (easier to troubleshoot) and modified
the /tftpboot/pxelinux.cfg/n1 file to NOT redirect console to tty
(simply deleted 'quiet' and everything after 'cmdline').
Installation went perfectly OK until the reboot.
The final boot then enters a PXE boot loop.
The solution is fairly simple:
1. cp /usr/share/syslinux/chain.c32 /tftpboot/ # (I have tried using
/opt/xcat/share/xcat/netboot/syslinux/chain.c32 but that one failed)
2. sed -i 's/LOCALBOOT 0/KERNEL chain.c32/' /tftpboot/pxelinux.cfg/n1
Now deployment works perfectly.
So I "nodeset n1 install" again, and again no go. Repeat the above
procedure and everything works.
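Since the procedure has to be repeated after every reinstall, the two steps can be wrapped in a small shell function. This is only a sketch: the name apply_chain_workaround and the TFTP_DIR / SYSLINUX_DIR override variables are made up for illustration and are not part of xCAT. It assumes GNU sed and that the node's config already contains the LOCALBOOT 0 line.

```shell
#!/bin/sh
# Reapply the chain.c32 workaround to one node's pxelinux config.
# TFTP_DIR and SYSLINUX_DIR are hypothetical overrides (not xCAT settings);
# they default to the paths used in the manual steps.
apply_chain_workaround() {
    node="$1"
    tftp_dir="${TFTP_DIR:-/tftpboot}"
    syslinux_dir="${SYSLINUX_DIR:-/usr/share/syslinux}"
    cfg="$tftp_dir/pxelinux.cfg/$node"

    # Refuse to continue if the node has no pxelinux config yet.
    [ -f "$cfg" ] || { echo "no config for $node: $cfg" >&2; return 1; }

    # Step 1: make chain.c32 available from the TFTP root.
    cp "$syslinux_dir/chain.c32" "$tftp_dir/" || return 1

    # Step 2: replace the local-boot directive with a chainload.
    sed -i 's/LOCALBOOT 0/KERNEL chain.c32/' "$cfg"
}
```

Usage would be "apply_chain_workaround n1", run once the config for n1 contains LOCALBOOT 0. If that line is absent, the sed step leaves the file unchanged.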
Does anyone have any idea why it behaves in this manner?
I believe the following changes will resolve this issue for me, as long
as I don't update xcat:
# diff /opt/xcat/lib/perl/xCAT_plugin/{pxe.pm.orig,pxe.pm}
166c166
< print $pcfg "LOCALBOOT 0\n";
---
> print $pcfg "KERNEL chain.c32\n";
and
# diff /opt/xcat/lib/perl/xCAT_plugin/{anaconda.pm.orig,anaconda.pm}
1132,1155d1131
< if (defined($sent->{serialport}))
< {
< unless ($sent->{serialspeed})
< {
< $callback->(
< {
< error => [
< "serialport defined, but no serialspeed for $node in nodehm table"
< ],
< errorcode => [1]
< }
< );
< next;
< }
< #go cmdline if serial console is requested, the shiny ansi is just impractical
< $kcmdline .=
< " cmdline console=tty0 console=ttyS"
< . $sent->{serialport} . ","
< . $sent->{serialspeed};
< if ($sent->{serialflow} =~ /(hard|cts|ctsrts)/)
< {
< $kcmdline .= "n8r";
< }
< }
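To make it easier to see what the removed block contributes to the kernel command line, here is a rough shell rendering of the same logic. It is a sketch, not xCAT code: the function name serial_kcmdline is made up, and an empty serialport argument stands in for Perl's "not defined" check, which is an approximation.

```shell
#!/bin/sh
# Approximate shell mirror of the removed Perl block: build the serial
# console part of the kernel command line from nodehm-style values.
# Arguments: serialport serialspeed [serialflow]
# serial_kcmdline is a hypothetical name used only for this illustration.
serial_kcmdline() {
    port="$1"; speed="$2"; flow="${3:-}"

    # No serial port configured: nothing is appended.
    [ -n "$port" ] || return 0

    # Same error condition as the Perl code: port set, speed missing.
    if [ -z "$speed" ]; then
        echo "serialport defined, but no serialspeed" >&2
        return 1
    fi

    args=" cmdline console=tty0 console=ttyS${port},${speed}"

    # Perl: $sent->{serialflow} =~ /(hard|cts|ctsrts)/ (unanchored match)
    case "$flow" in
        *hard*|*cts*) args="${args}n8r" ;;
    esac

    printf '%s\n' "$args"
}

serial_kcmdline 0 115200 hard   # " cmdline console=tty0 console=ttyS0,115200n8r"
serial_kcmdline 115200 hard     # " cmdline console=tty0 console=ttyS115200,hard"
```

The second call uses the values that the lsdef output for n1 in the quoted message reports (serialport=115200, serialspeed=hard), which yields console=ttyS115200,hard rather than a ttyS0-style device.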
I realize that I can just remove the serial definitions from nodehm, but
then I would lose SOL, no?
Is there a better approach? Have I found the correct files to modify in
order to change the generated pxe files?
any help would be appreciated,
Thanks in advance
On 24/06/2012 11:38, Sten Wolf wrote:
Hi all,
I'm having a very strange issue - can't get any node to deploy an OS -
either an install profile (diskful installation) or shell.
The setup - A single management node and several x3550 M3 compute nodes.
All nodes accessible via ipmi/bmc (asu is also working correctly) -
wcons is working great, MAC addresses defined correctly, dns and dhcp
work correctly.
on issuing
nodeset n1 install
OR
nodeset n1 shell
and then
rpower n1 boot
the correct tftp files are deployed as far as I can tell, but then the
deployment just hangs forever.
I have tried setting the noderes primary and install nics to blank, mac and
eth0, and I have tried using either pxe or xnba - all produce the same
result:
for pxe - nothing is done (initial tftp transfer ok, no further files
transferred)
for xnba - from /var/log/httpd/access_log:
10.5.4.1 - - [24/Jun/2012:11:20:29 +0300] "GET /tftpboot/xcat/xnba/nodes/n1 HTTP/1.1" 200 419 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:20:29 +0300] "GET /tftpboot/xcat/centos6.2/x86_64/vmlinuz HTTP/1.1" 200 3938288 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:20:30 +0300] "GET /tftpboot/xcat/centos6.2/x86_64/initrd.img HTTP/1.1" 200 30754349 "-" "iPXE/1.0.3-7"
and as viewed in wcons - the 3 files transferred ok.
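For anyone wanting to repeat this check, the per-node fetches can be pulled out of the access log with a short awk filter. This is a sketch under two assumptions: the function name fetched_by is made up, and the log is in Apache's "combined" format with each entry on a single line.

```shell
#!/bin/sh
# Print "status bytes path" for every request made by one client IP.
# Usage: fetched_by <client-ip> < access_log
# fetched_by is a hypothetical helper; field positions assume the Apache
# "combined" log format ($1 = client, $7 = path, $9 = status, $10 = bytes).
fetched_by() {
    awk -v ip="$1" '$1 == ip { printf "%s %s %s\n", $9, $10, $7 }'
}
```

For example, "fetched_by 10.5.4.1 < /var/log/httpd/access_log" prints one line per request made by n1.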
# tabdump noderes
#node,servicenode,netboot,tftpserver,tftpdir,nfsserver,monserver,nfsdir,installnic,primarynic,discoverynics,cmdinterface,xcatmaster,current_osimage,next_osimage,nimserver,routenames,comments,disable
"compute",,"xnba","10.5.5.254",,"10.5.5.254",,,"eth0","eth0",,,"10.5.5.254",,,,,,
No matter what I tried, the deployment just hangs forever after the
initial files are transferred.
The same happens with nodeset n1 shell ; rpower n1 boot:
10.5.4.1 - - [24/Jun/2012:11:28:24 +0300] "GET /tftpboot/xcat/xnba/nodes/n1 HTTP/1.1" 200 308 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:28:24 +0300] "GET /tftpboot/xcat/genesis.kernel.x86_64 HTTP/1.1" 200 3942032 "-" "iPXE/1.0.3-7"
10.5.4.1 - - [24/Jun/2012:11:28:25 +0300] "GET /tftpboot/xcat/genesis.fs.x86_64.lzma HTTP/1.1" 200 14479256 "-" "iPXE/1.0.3-7"
n1 def:
# lsdef --osimage n1
Object name: n1
arch=x86_64
bmc=ipmi1
bmcpassword=PASSW0RD
bmcusername=USERID
conserver=0
currchain=boot
currstate=shell
groups=ipmi,compute,all
hostnames=n1.aero
initrd=xcat/genesis.fs.x86_64.lzma
installnic=eth0
interface=eth0
ip=10.5.4.1
kcmdline=quiet console=tty0 console=ttyS115200,hard xcatd=10.5.5.254:3001 destiny=shell
kernel=xcat/genesis.kernel.x86_64
mac=34:40:B5:9D:17:88
mgt=ipmi
netboot=xnba
nfsserver=10.5.5.254
os=centos6.2
otherinterfaces=ipmi1:10.5.5.1,ib1:10.7.7.1
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
primarynic=eth0
profile=compute
provmethod=install
serialport=115200
serialspeed=hard
status=booting
statustime=06-24-2012 11:25:14
tftpserver=10.5.5.254
xcatmaster=10.5.5.254
template=/opt/xcat/share/xcat/install/centos/compute.centos6.tmpl
imagetype=linux
otherpkgdir=/install/post/otherpkgs/centos6.2/x86_64
pkglist=/opt/xcat/share/xcat/install/centos/compute.centos6.pkglist
pkgdir=/install/centos6.2/x86_64
I can provide any further info required
Any help would be greatly appreciated,
Thanks in advance
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user