Were there any unusual entries in the httpd access_log or error_log? You might use tcpdump to check whether the network transfer completes normally.
 
You could also try using 'wget' to download the image file several times in parallel, to verify whether the problem is in your network or in httpd.
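
If scripting it is easier than launching many wget processes by hand, a rough Python sketch of the same parallel-download check might look like the one below. The function names (fetch_once, fetch_parallel) are made up for illustration and are not part of xCAT. The default of 12 workers just mirrors the number of simultaneous boots mentioned in the report, and the URL must be replaced with the real location of the image on your management node, which I am not assuming here.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_once(url, timeout=30.0):
    """Download url once and return the number of bytes received."""
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        while True:
            chunk = resp.read(1 << 20)  # read in 1 MiB pieces
            if not chunk:
                break
            total += len(chunk)
    return total


def fetch_parallel(url, workers=12, fetch=fetch_once):
    """Start `workers` simultaneous downloads of url.

    Returns one dict per download with the byte count, the elapsed
    time, and the error text if that download failed.
    """
    def attempt(i):
        start = time.monotonic()
        try:
            size, error = fetch(url), None
        except Exception as exc:  # timeouts, connection resets, HTTP errors
            size, error = None, repr(exc)
        return {"id": i, "bytes": size, "error": error,
                "seconds": time.monotonic() - start}

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(attempt, range(workers)))


def summarize(results):
    """Return (number of successful downloads, list of failed results)."""
    failed = [r for r in results if r["error"] is not None]
    return len(results) - len(failed), failed
```

You would call fetch_parallel("http://<your-xcat-master>/<path-to>/initrd-stateless.gz") and pass the result to summarize(). If some downloads fail or stall while others finish, that points at httpd or the network rather than at the node firmware. If all of them succeed every time, the problem is more likely on the client side.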

Thanks
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: w...@cn.ibm.com
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193
 
 
----- Original message -----
From: "Lopilato, John" <john.lopil...@lmco.com>
To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] Stateless node randomly fails to get initrd-stateless.gz
Date: Tue, Apr 5, 2016 10:45 PM
 

I’m seeing an intermittent problem where xCat randomly fails to boot a stateless node: the node hangs after the DHCP info has been passed to it, because it fails to download initrd-stateless.gz.  I’m using xCat 2.10, but I’ve seen this issue with previous versions (2.9.x).  Our nodes (HP BL460c Gen9) use UEFI, so I’m set to use xnba as our boot method rather than legacy PXE.  As a result, httpd serves almost all of the initial files beyond the first one (xnba.efi).

 

I’ve tried increasing the number of concurrent connections httpd allows via its config options (setting MaxClients 1000 & ServerLimit 1000), but nothing has helped so far. It also happens with even small numbers of concurrent boots (as few as 12 simultaneous boots).  When a node gets into this failed state I can simply power cycle it and have it come up fine, so I don’t think it’s a configuration issue with xCat.  This issue occurs randomly and doesn’t follow any pattern that I can tell based on nodes or switches.

 

I also don’t think it’s an issue with throughput, since the xCat master is using two 10Gb NICs in a bonded configuration to serve out images.

 

You can see where it hangs in the attached image.  I’d appreciate any help on tracking down what’s going on here.

 

Thanks,

 

John Lopilato

 

------------------------------------------------------------------------------
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user

