I've run into this problem before, and it was caused by two different
issues:
* DNS resolution problem
* A secondary NIC on the node was getting an IP address, and taking
over communication back to the xcat daemon.
Make sure that neither of those is your issue.
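Both checks can be scripted on the node. The following is only a rough sketch, not anything shipped with xCAT: the function names are made up for illustration, and port 3002 is simply the port that updateflag.awk is given in this thread. The first function resolves a name forwards and backwards; the second asks the kernel which local address it would use to reach the master, which should be the address of the NIC you expect rather than a secondary one.

```python
import socket
import sys


def check_name_resolution(hostname):
    """Resolve hostname to addresses, then resolve each address back to a name."""
    result = {"hostname": hostname, "addresses": [], "reverse": {}}
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = str(exc)
        return result
    for info in infos:
        addr = info[4][0]
        if addr not in result["addresses"]:
            result["addresses"].append(addr)
    for addr in result["addresses"]:
        try:
            result["reverse"][addr] = socket.gethostbyaddr(addr)[0]
        except OSError:
            # No reverse record (or the lookup failed) for this address.
            result["reverse"][addr] = None
    return result


def local_address_toward(host, port=3002):
    """Return the local IPv4 address the kernel would use to reach host.

    Connecting a UDP socket sends no packets; it only selects a route.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((host, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_node.py <master name>
    for name in sys.argv[1:]:
        print(check_name_resolution(name))
        print("source address toward", name, "=", local_address_toward(name))
```

If the source address printed belongs to the secondary NIC, that matches the second cause listed above.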
On 5/28/2013 5:55 AM, Qamar Nazir wrote:
Hi List,
I couldn't find a solution for this issue.
I resolved it with the following steps:
- Killed the 'xcatd: install monitor' process manually.
- Restarted xcatd on the master node.
Before that, when I ran the command '/usr/bin/awk -f
/xcatpost/updateflag.awk <master node IP> 3002' manually,
it never returned to the prompt. Once I did the above steps,
it returned to the prompt within a second.
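A hanging awk command is awkward to test with, so a probe with a timeout can help. This is a minimal sketch that assumes the install monitor listens on TCP port 3002 (the port passed to updateflag.awk above); the function name is hypothetical. A successful connect only shows that something accepts connections on that port. It does not prove that the install monitor is actually processing requests.

```python
import socket
import sys


def port_answers(host, port=3002, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Usage (hypothetical script name): python probe.py <master node IP>
    if len(sys.argv) > 1:
        print(port_answers(sys.argv[1]))
```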
Best Regards,
Qamar Nazir
HPC Software Engineer
On 12/09/2011 03:27 AM, Jing CDL Sun wrote:
OK, it seems this is not a
name resolution issue. Then maybe you need to follow
Xiaopeng's suggestion for more debugging.
Or, another straightforward debugging method is to start
xcatd in the foreground; it will show some messages about the
communication between the MN and the CN. I have debugged this way before, for
example:
service xcatd stop
/opt/xcat/sbin/xcatd -f
Best Regards,
-----------------------------
Sun Jing
IBM China Software Development Laboratory
Tel: (86-10) 82453625 E-mail: [email protected]
Address: Building 28, ZhongGuanCun Software Park,
No.8, Dong Bei Wang West Road, Haidian District
Beijing 100193, PRC
Hi Jing,
Yes, the node can resolve cmgmt1, and cmgmt1 can resolve the
node. I've checked the resolv.conf on the node's console while
it is hung, and I was also able to ping cmgmt1.
That's why this is so strange (and has me pulling my hair
out!). If I set the node to install a different profile it
installs and runs the postscripts fine, no hang, and reboots
without an issue. It's just this specific workstation profile
that it is having trouble with, but I don't see anything in
the postscripts of this profile that should cause this
behavior.
2011/12/8 Jing CDL Sun <[email protected]>
Hi Dave,
Another thing I can think of: have you checked whether cmgmt1 can be resolved on your compute
node? Basically you need to set site.nameservers=<mn's
ip> and site.domain=<your domain name>. Then, after
makedhcp, the nameserver/domain values will be set in your DHCP
server configuration, so after the compute node is installed,
its DHCP client will write /etc/resolv.conf from the values the
DHCP server hands out, so that the compute node knows the MN is its name
server and that the search path is your domain.
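To confirm the result on the node, the contents of /etc/resolv.conf can be compared with what was set in the site table. This is a small sketch only; the function names and the values passed on the command line are placeholders for your own MN IP and domain.

```python
import sys


def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        keyword, values = fields[0], fields[1:]
        if keyword == "nameserver" and values:
            nameservers.append(values[0])
        elif keyword in ("search", "domain"):
            search = values
    return nameservers, search


def resolv_conf_matches(text, mn_ip, domain):
    """True if mn_ip is a listed nameserver and domain is in the search list."""
    nameservers, search = parse_resolv_conf(text)
    return mn_ip in nameservers and domain in search


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_resolv.py <mn ip> <domain>
    if len(sys.argv) == 3:
        with open("/etc/resolv.conf") as handle:
            print(resolv_conf_matches(handle.read(), sys.argv[1], sys.argv[2]))
```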
Best Regards,
-----------------------------
Sun Jing
IBM China Software Development Laboratory
Tel: (86-10) 82453625 E-mail: [email protected]
Address: Building 28, ZhongGuanCun Software Park,
No.8, Dong Bei Wang West Road, Haidian District
Beijing 100193, PRC
Hi Xiao,
Yes this is a diskfull installation. I will take a look at
the syslog tomorrow when I am back in the office to see if I
see anything that could be helpful that I may have missed.
Given that the node can install fine when being set to one
profile, but not another, what other things outside of DNS
and iptables could prevent the management node from
receiving (or replying to) the flag my node is apparently
failing to send?
2011/12/8 Xiao Peng Wang <[email protected]>
updateflag.awk is used to send a request to xcatd to
indicate that installation/netboot has finished.
'updateflag.awk MN 3002' is called for a
diskfull installation, and 'updateflag.awk $MASTER 3002 "installstatus booted"' is for a diskless
boot. So your case was a diskfull installation, right?
For debugging, you need to check whether the
'xcatd: install monitor' process has been started on the MN; it is used
to handle the request from updateflag.awk.
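One way to script that check on the MN is to scan the process list for that title. This is a sketch that assumes a Linux `ps` accepting `-eo pid,args`; the helper names are made up for illustration.

```python
import subprocess


def find_processes(ps_output, title="xcatd: install monitor"):
    """Return (pid, command line) pairs whose command line contains title."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and anything without a numeric PID.
        if len(parts) == 2 and parts[0].isdigit() and title in parts[1]:
            matches.append((int(parts[0]), parts[1]))
    return matches


def install_monitor_processes():
    """Run ps and return any processes titled 'xcatd: install monitor'."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_processes(result.stdout)
```

Calling install_monitor_processes() on the MN should return at least one entry. An empty list suggests the install monitor is not running.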
Also, you can try to get some hints from syslog: 1. check whether the
'nodeset next' command was called; 2. search for messages
from the node with the tag 'xcat'.
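The syslog search can be narrowed with a simple filter. This is only a sketch: the default log path /var/log/messages is an assumption about where syslog writes on the MN, and the function name is hypothetical. It keeps lines that mention both 'xcat' and the node name.

```python
import sys


def xcat_messages(lines, node):
    """Yield syslog lines that mention both 'xcat' and the node name."""
    node = node.lower()
    for line in lines:
        lowered = line.lower()
        if "xcat" in lowered and node in lowered:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Usage (hypothetical script name): python xcatlog.py <node> [log file]
    if len(sys.argv) >= 2:
        log_path = sys.argv[2] if len(sys.argv) > 2 else "/var/log/messages"
        with open(log_path, errors="replace") as log:
            for hit in xcat_messages(log, sys.argv[1]):
                print(hit)
```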
You could also try debugging do_installm_service in
xcatd. See the code that handles 'ready', 'next' ...
Thanks
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: [email protected]
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang
West Road, Haidian District Beijing P.R.China 100193
From: Dave Barry <[email protected]>
To: xCAT Users Mailing list <[email protected]>
Date: 2011-12-09 07:35
Subject: [xcat-user] Installing node hanging at updateflag.awk
Hi *,
Can't seem to figure this issue out. I have a node that is
running its postscripts properly (as far as I can tell), but
then hangs at updateflag.awk. The specific line that it seems
to hang at, as shown in ps xf, is:
/bin/awk -f updateflag.awk cmgmt1 3002
That's all that is in the process line; there is no actual
command after the 3002. Even more puzzling, in
/tmp/mypostscript.post the following line does not exist at
the very end, while it does on other nodes that installed
properly:
updateflag.awk $MASTER 3002 "installstatus booted"
I can resolve both the node and its master forwards and
backwards. This node also installs just fine when I give it a
different profile, so there is either something in the OS it
is installing (CentOS 5.4) or one of my postscripts in this
profile that is causing the issue, but I don't know how to
continue troubleshooting this problem when the issue does not
appear to be DNS related. Usually problems like this are
caused by DNS.
What would cause mypostscript.post to not have the
installstatus line at the bottom? Does this line get written
to that file after a certain "something" happens? Any thoughts
on logs or something I can look at that would cause this
behavior?
Thanks!
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user