It looks like genesis never gets to start the chain task; it might be hanging
somewhere.
Victor, could someone in Pok take a look at this issue?
Using IBM Verse, sent from my iPhone.
On February 3, 2016, at 12:52:16, "John Westlund" <[email protected]> wrote:
I get into the same state regardless of whether I’m bringing the node up
with auto-discovery or I’ve manually defined it.
Here are the processes of a node that’s been up a few minutes:
[xCAT Genesis running on (none) /]# ps -elf
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY
TIME CMD
4 S root 1 0 1 80 0 - 2869 wait 18:59 ?
00:00:08 /bin/sh /init
1 S root 2 0 0 80 0 - 0 kthrea 18:59 ?
00:00:00 [kthreadd]
…
1 S root 446 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [kthrotld/31]
1 S root 456 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [kpsmoused]
1 S root 457 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [usbhid_resumer]
1 S root 458 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [deferwq]
5 S root 481 1 0 76 -4 - 2720 poll_s 18:59 ?
00:00:01 udevd --daemon
5 S root 563 481 0 78 -2 - 2695 poll_s 18:59 ?
00:00:00 udevd --daemon
1 S root 567 1 0 80 0 - 2869 wait 18:59 ?
00:00:00 /bin/sh /init
4 S root 569 567 0 80 0 - 5499 pause 18:59 ?
00:00:00 screen -ln
5 S root 570 569 0 80 0 - 5499 poll_s 18:59 ?
00:00:00 SCREEN -ln
4 S root 571 570 0 80 0 - 2835 n_tty_ 18:59 pts/0
00:00:00 /bin/sh
1 S root 576 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [mlx4]
1 S root 640 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_0]
1 S root 641 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_1]
1 S root 642 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_2]
1 S root 643 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_3]
1 S root 644 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_4]
1 S root 645 2 0 80 0 - 0 scsi_e 18:59 ?
00:00:00 [scsi_eh_5]
1 S root 707 2 0 99 19 - 0 ipmi_t 18:59 ?
00:00:00 [kipmi0]
4 S root 855 1 0 80 0 - 5499 pause 18:59 ?
00:00:00 screen -L -ln doxcat
5 S root 856 855 0 80 0 - 5500 poll_s 18:59 ?
00:00:00 SCREEN -L -ln doxcat
4 S root 857 856 0 80 0 - 2309 wait 18:59 pts/1
00:00:00 /bin/sh /bin/doxcat
1 S root 860 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [kondemand/0]
…
1 S root 891 2 0 80 0 - 0 worker 18:59 ?
00:00:00 [kondemand/31]
5 S rpc 923 1 0 80 0 - 4744 poll_s 18:59 ?
00:00:00 rpcbind
5 S root 925 1 0 80 0 - 5837 poll_s 18:59 ?
00:00:00 rpc.statd
5 S root 930 1 0 80 0 - 16672 poll_s 18:59 ?
00:00:00 /usr/sbin/sshd
5 S root 953 1 0 80 0 - 3396 poll_s 18:59 ?
00:00:00 lldpad -d
4 S root 982 857 0 80 0 - 2280 poll_s 18:59 pts/1
00:00:00 dhclient -6 -pf /var/run/dhclient6.eth0.pid eth0 -lf
/var/lib/dhclient/dhclient6.
1 S root 994 857 0 80 0 - 2309 wait 18:59 pts/1
00:00:00 /bin/sh /bin/doxcat
1 S root 995 857 0 80 0 - 2309 wait 18:59 pts/1
00:00:00 /bin/sh /bin/doxcat
1 S root 1759 1 0 80 0 - 2280 poll_s 19:00 ?
00:00:00 dhclient -cf /etc/dhclient.conf -pf /var/run/dhclient.eth0.pid eth0
5 S root 1773 1 0 80 0 - 6627 poll_s 19:00 ?
00:00:00 ntpd -g -x
5 S root 1787 481 0 78 -2 - 2719 poll_s 19:00 ?
00:00:00 udevd --daemon
1 S root 1807 2 0 80 0 - 0 kaudit 19:00 ?
00:00:00 [kauditd]
5 S root 1834 1 0 80 0 - 31077 poll_s 19:00 ?
00:00:00 /sbin/rsyslogd -c4
4 S root 2896 930 0 80 0 - 17830 - 19:06 ?
00:00:00 sshd: root@pts/2
4 S root 2924 2896 0 80 0 - 2835 wait 19:06 pts/2
00:00:00 -bash
0 S root 2959 994 0 80 0 - 1018 hrtime 19:07 pts/1
00:00:00 sleep 5
0 S root 2960 995 0 80 0 - 1018 hrtime 19:07 pts/1
00:00:00 sleep 5
0 S root 2961 857 0 80 0 - 1018 hrtime 19:07 pts/1
00:00:00 sleep 1
4 R root 2962 2924 2 80 0 - 3344 - 19:07 pts/2
00:00:00 ps -elf
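To see what doxcat is doing in a listing like the one above, it can help to rebuild the process tree below its PID (857 here). The following is a minimal sketch in Python, assuming the `ps -elf` output has been captured with each process on a single line (the listing above is wrapped by the mail client). The function names `parse_ps_elf` and `descendants` are illustrative and not part of xCAT, and `SAMPLE` is only a subset of the rows above, re-joined onto single lines.

```python
from collections import defaultdict


def parse_ps_elf(text):
    """Parse unwrapped `ps -elf` output into {pid: (ppid, wchan, cmd)}."""
    procs = {}
    for line in text.strip().splitlines():
        # The first 14 columns are fixed; everything after them is CMD.
        fields = line.split(None, 14)
        # Skip the header row and anything that is not a full process row.
        if len(fields) < 15 or not fields[3].isdigit():
            continue
        pid, ppid = int(fields[3]), int(fields[4])
        procs[pid] = (ppid, fields[10], fields[14])
    return procs


def descendants(procs, root_pid):
    """Return (depth, pid, wchan, cmd) for root_pid and everything below it."""
    children = defaultdict(list)
    for pid, (ppid, _, _) in procs.items():
        children[ppid].append(pid)
    out = []
    stack = [(root_pid, 0)]
    while stack:
        pid, depth = stack.pop()
        _, wchan, cmd = procs[pid]
        out.append((depth, pid, wchan, cmd))
        # Push in reverse so that children come out in ascending PID order.
        for child in sorted(children[pid], reverse=True):
            stack.append((child, depth + 1))
    return out


# Subset of the listing above, one process per line.
SAMPLE = """\
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 857 856 0 80 0 - 2309 wait 18:59 pts/1 00:00:00 /bin/sh /bin/doxcat
1 S root 994 857 0 80 0 - 2309 wait 18:59 pts/1 00:00:00 /bin/sh /bin/doxcat
1 S root 995 857 0 80 0 - 2309 wait 18:59 pts/1 00:00:00 /bin/sh /bin/doxcat
0 S root 2959 994 0 80 0 - 1018 hrtime 19:07 pts/1 00:00:00 sleep 5
0 S root 2960 995 0 80 0 - 1018 hrtime 19:07 pts/1 00:00:00 sleep 5
0 S root 2961 857 0 80 0 - 1018 hrtime 19:07 pts/1 00:00:00 sleep 1
"""

if __name__ == "__main__":
    for depth, pid, wchan, cmd in descendants(parse_ps_elf(SAMPLE), 857):
        print("  " * depth + f"{pid} [{wchan}] {cmd}")
```

Printing the WCHAN column next to each command makes it easy to tell a shell blocked in `wait` on a child from a process sitting in a `sleep`.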
From: Xiao Peng Wang [mailto:[email protected]]
Sent: Tuesday, February 2, 2016 2:17 AM
To: [email protected]
Cc: [email protected]
Subject: Re: [xcat-user] Failure booting genesis kernel
It's possible that genesis is waiting for tasks to run rather than being
dead. Could you show the output of 'ps -elf' in genesis so we can see which
processes are running?
How did you get your node into genesis? Is it a new node that entered
genesis for discovery, or did you run 'nodeset' to force the node into
genesis to run a certain task?
Thanks
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: [email protected]
28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing, P.R.China 100193
----- Original message -----
From: "Westlund, John A" <[email protected]>
To: xCAT Users Mailing list <[email protected]>, Er Tao Zhao/China/IBM@IBMCN
Cc:
Subject: Re: [xcat-user] Failure booting genesis kernel
Date: Tue, Feb 2, 2016 3:04 PM
I can ping and get into the node:
[xCAT Genesis running on (none) /]# ls
bin debian emergency init initqueue-finished
initqueue-timeout lib64 netroot pre-pivot pre-udev root
screenlog.0 sysroot usr
cmdline dev etc initqueue initqueue-settled lib
mount pre-mount pre-trigger proc sbin
sys tmp var
This is what is running:
# lsxcatd -a
Version 2.11 (git commit 9ea36ca6163392bf9ab684830217f017193815be,
built Mon Nov 30 05:43:11 EST 2015)
This is a Management Node
dbengine=SQLite
John
From: Xiao Peng Wang [mailto:[email protected]]
Sent: Monday, February 1, 2016 10:23 PM
To: [email protected]; Er Tao Zhao
Cc: [email protected]
Subject: Re: [xcat-user] Failure booting genesis kernel
You mentioned that genesis hit a dead end. Can you ping the compute node
or log in to it? Please also run 'lsxcatd -a' to show the xCAT version.
Thanks
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: [email protected]
28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing, P.R.China 100193
----- Original message -----
From: "Westlund, John A" <[email protected]>
To: "[email protected]" <[email protected]>
Cc:
Subject: [xcat-user] Failure booting genesis kernel
Date: Tue, Feb 2, 2016 12:07 PM
I’m trying to bring up a new system, but have run into a dead end.
I get no error message other than a blinking question mark in a diamond (some
un-assigned UTF character):
CLIENT MAC ADDR: 84 8F 69 FD 4F 28 GUID: 44454C4C 4300 1042 8054
B6C04F355631
CLIENT IP: 192.168.91.9 MASK: 255.255.240.0 DHCP IP: 10.10.1.167
GATEWAY IP: 192.168.92.54
PXE->EB:�P: 192.168.92.54
PXE->EB: !PXE at 98D2:0070, entry point at 98D2:0106
UNDI code segment 98D2:5210, data segment 9297:63B0
(586-632kB)
UNDI device is PCI 02:00.0, type DIX+802.3
546kB free base memory after PXE unload
xNBA initialising devices...ok
xCAT Network Boot Agent
iPXE 1.0.3-131028 (d603e) -- Open Source Network Boot Firmware
--http://ipxe.or
g
Features: HTTP HTTPS iSCSI DNS TFTP bzImage ELF PXE PXEXT
net0: 84:8f:69:fd:4f:28 using undionly on UNDI-PCI02:00.0 (open)
[Link:up, TX:0 TXE:0 RX:0 RXE:0]
DHCP (net0 84:8f:69:fd:4f:28)... ok
net0: 192.168.91.9/255.255.240.0 gw 192.168.92.54
Next server: 192.168.92.53
Filename:
http://192.168.92.53/tftpboot/xcat/xnba/nets/192.168.80.0_20
http://192.168.92.53/tftpboot/xcat/xnba/nets/192.168.80.0_20... ok
http://192.168.92.53/tftpboot/xcat/genesis.kernel.x86_64... ok
http://192.168.92.53/tftpboot/xcat/genesis.fs.x86_64.lzma... 74%
I’m assuming the genesis.fs finishes loading even though it read
“74%,” and a CSI code bounces the cursor up the screen before failing.
Where should I be looking to debug this?
Thanks,
John
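One way to test the assumption that genesis.fs is served in full is to fetch it from another machine on the same network and compare the byte count with the length the server declares. The following is a minimal sketch in Python, assuming the HTTP server on 192.168.92.53 sends a Content-Length header. The URL is the one shown in the boot log above; the helper names `fetch_size` and `describe` are hypothetical.

```python
import urllib.request


def fetch_size(url, chunk=1 << 20, timeout=30):
    """Download url and return (declared_length, bytes_received).

    declared_length is None when the server sends no Content-Length.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        declared = resp.headers.get("Content-Length")
        received = 0
        while True:
            block = resp.read(chunk)
            if not block:
                break
            received += len(block)
    return (int(declared) if declared is not None else None), received


def describe(declared, received):
    """Turn the two counts into a one-line verdict."""
    if declared is None:
        return f"received {received} bytes (server sent no Content-Length)"
    if received == declared:
        return f"complete: {received} bytes"
    return f"short read: {received} of {declared} bytes"


if __name__ == "__main__":
    # URL taken from the boot log above.
    url = "http://192.168.92.53/tftpboot/xcat/genesis.fs.x86_64.lzma"
    print(describe(*fetch_size(url)))
```

A truncated transfer would show up either as a short count or as an exception from the HTTP library. A "complete" result suggests the 74% on the console is just where the screen output stopped being updated, not where the download stopped.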
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user