Hi Kevin,
 
    I just checked the code: this error code is returned by the BMC.
```
my %rmcp_codes = (    #human friendly translations of rmcp+ code numbers
    1 => "Insufficient resources to create new session (wait for existing sessions to timeout)",
    2 => "Invalid Session ID",      #this shouldn't occur...
    3 => "Invalid payload type",    #shouldn't occur..
    4 => "Invalid authentication algorithm", #if this happens, we need to enhance our mechanism for detecting supported auth algorithms
    5 => "Invalid integrity algorithm",          #same as above
    6 => "No matching authentication payload",
    7 => "No matching integrity payload",
    8 => "Inactive Session ID", #this suggests the session was timed out while trying to negotiate, shouldn't happen
    9   => "Invalid role",
    0xa => "Unauthorised role or privilege level requested",
    # ... remaining entries omitted from this excerpt
);
```
 
    Normally it means the BMC is too busy to serve the request at that moment, or has exhausted its sessions. When xCAT gets this error, it does not retry; it simply fails.
 
    As David pointed out, adding a delay and retrying works around this most of the time, but I have no idea why it does not work for you.
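
A minimal sketch of the delay-and-retry workaround, assuming the script is written in Python and that a failed attempt shows up either as a non-zero exit code or as the "Insufficient resources" text in the command output (neither is confirmed by the xCAT code above). `run_with_retry` is a hypothetical helper, not part of xCAT, and the attempt count and delay are illustrative values only.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=5, delay=10.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd, pausing and retrying while it keeps failing.

    attempts and delay are illustrative defaults, not xCAT settings.
    Returns the result of the last attempt.
    """
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        output = (result.stdout or "") + (result.stderr or "")
        # Assumption: failure means a non-zero exit code or the
        # "Insufficient resources" message quoted in this thread.
        failed = result.returncode != 0 or "Insufficient resources" in output
        if not failed:
            return result
        if attempt < attempts:
            sleep(delay)  # give the BMC time to free its sessions
    return result
```

For example, `run_with_retry(["rpower", "node01", "reset"])` would try the reset up to five times, ten seconds apart (`node01` is a placeholder node name).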
 
    BTW: Did you try using the `ipmitool` or `ipmitool-xcat` CLI directly, in place of `rpower`, in your script?
 
 
Bin Xu
HPC Software Development
Software Defined Infrastructure, IBM Systems
Phone: 86-010-82454067
 
 
----- Original message -----
From: "Kevin Keane (USD)" <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] [External] Re: Understanding the BMC networking
Date: Mon, May 28, 2018 5:55 AM
 

Thank you! Actually, I had already added a 10-second delay before the rpower, so it’s not merely a delay issue.

 

Thanks for pointing me to automatically running tasks with the chain table. I’ve experimented with that, but didn’t quite get it to work (and haven’t yet investigated enough to even ask intelligent questions about it).

 


 

From: David Rajendra
Sent: Sunday, May 27, 2018 2:24 PM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] [External] Re: Understanding the BMC networking

 

Hello Kevin,

 

Normally I split the discovery part from the installation part when I set up clusters.

This is all a matter of personal choice.


My guess is you might need a slight pause between commands that talk to the BMC device.

Maybe try a “sleep 2” between the rsetboot and rpower commands?

Particularly with rpower, the BMC can take a little while to respond to successive commands if they are run in quick succession.

You can sometimes see this if you run “rpower <node> on” then run “rpower <node> stat” immediately after.
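
A minimal sketch of that pause, assuming the script is written in Python and calls the xCAT commands through `subprocess`. `boot_steps` and `run_steps` are hypothetical helper names, and the 2-second pause is simply the value suggested above.

```python
import subprocess
import time


def boot_steps(node):
    """The two BMC commands from this thread, in the order they are run."""
    return [["rsetboot", node, "net"], ["rpower", node, "reset"]]


def run_steps(steps, pause=2.0, run=subprocess.run, sleep=time.sleep):
    """Run commands in order with a pause between them.

    Stops at the first non-zero exit code and returns the
    (command, returncode) pairs that were actually run.
    """
    done = []
    for index, cmd in enumerate(steps):
        if index > 0:
            sleep(pause)  # give the BMC time before the next command
        returncode = run(cmd).returncode
        done.append((cmd, returncode))
        if returncode != 0:
            break
    return done
```

Calling `run_steps(boot_steps("node01"))` would run `rsetboot node01 net`, wait two seconds, then run `rpower node01 reset` (`node01` is a placeholder node name).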

 

You can of course configure the xCAT chain table to automatically install your nodes once they are discovered if that is something you want to do:

https://xcat-docs.readthedocs.io/en/stable/advanced/chain/run_tasks_during_discovery.html

 

There could be other vendor-specific reasons for the insufficient resources message, but see how you get on with that.

 

Regards,

 

David

 

From: Kevin Keane <kke...@sandiego.edu>
Sent: Friday, May 25, 2018 1:23 AM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: [External] Re: [xcat-user] Understanding the BMC networking

 

Thanks to your help, discovery is now working well, thank you! I had been pulled off this project for a while.

Now for the next issue. Discovery works beautifully on the command line, but a few of the commands fail when I try to script the discovery.

I perform the following steps:

- pre-create my nodes

- run bmcdiscover

- makehosts, makedns, makedhcp.
- rsetboot <nodename> net

- rpower <nodename> reset

- wait until discovery is complete

- nodeset <nodename> osimage=<...>

And my node boots!
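
The "wait until discovery is complete" step in the list above is the one part with no single command behind it. A minimal sketch of how a script might wait, assuming it is written in Python: `wait_until` is a hypothetical helper, and `check` stands for whatever test of "discovery complete" fits your site, which this message does not specify. The timeout and interval are illustrative values.

```python
import time


def wait_until(check, timeout=600.0, interval=10.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout seconds pass.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Once `wait_until` returns True, the script would go on to the nodeset step; on False it should stop rather than continue blindly.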

Now my problem is that I want to do this in a script. The commands rsetboot and rpower work fine when I run them manually on a console, but don't work when I run them via a script - the only difference I can think of is that the script does not have a tty. The actual error message I am getting is "Error: ERROR: Insufficient resources to create new session (wait for existing sessions to timeout)" Note: I am reasonably sure the error message is misleading. I can't think of any "resource" that would really be "insufficient" and there are no other "sessions" either.

 

How can I go about troubleshooting this?

 


_______________________________________________________________________
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 |
619.260.6859


 

On Mon, Mar 19, 2018 at 7:29 PM, Yuan Y Bai <by...@cn.ibm.com> wrote:

Hi Kevin,

 

The xCAT discovery docs use 2 subnets in the recommended multi-network cluster setup, to keep the discovery example simple.

 

1. In the site table, set master to the xCAT MN's IP in the provision network.

 

2. In the networks table, there are 2 cases:

 

Case 1: there are 2 subnets, one for hardware control and one for provisioning:

 

Hardware control network: 50.0.0.0/255.255.0.0. You can assign 50.0.100.1-50.0.100.200 as the dynamicrange in the networks table; the DHCP server uses this dynamicrange during discovery (see 'tabdump -d networks').

Provision network: 10.0.0.0/255.0.0.0

 

You can add 2 networks entries in networks table, like:
"50_0_0_0-255_255_0_0","50.0.0.0","255.255.0.0",,,,,,,,"50.0.100.1-50.0.100.200",,,,,,,,,
"10_0_0_0-255_0_0_0","10.0.0.0","255.0.0.0",,,,"<xcatmaster>",,,,,,,,,,,,,

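A quick way to sanity-check an entry like the hardware control row above is to confirm that the dynamicrange really lies inside the net/mask it is attached to. This is a minimal Python sketch using only the standard library; `dynamicrange_in_network` is a hypothetical helper, and the column positions are an assumption read off the quoted row rather than taken from the xCAT schema.

```python
import csv
import ipaddress

# Row copied from the example above, written with plain double quotes.
ROW = ('"50_0_0_0-255_255_0_0","50.0.0.0","255.255.0.0",,,,,,,,'
       '"50.0.100.1-50.0.100.200",,,,,,,,,')

# Assumption: positions read off the quoted row
# (1 = net, 2 = mask, 10 = dynamicrange).
NET, MASK, DYNAMICRANGE = 1, 2, 10


def dynamicrange_in_network(row_text):
    """Return True if both ends of dynamicrange lie inside net/mask."""
    fields = next(csv.reader([row_text]))
    network = ipaddress.ip_network(f"{fields[NET]}/{fields[MASK]}")
    start, end = (ipaddress.ip_address(part)
                  for part in fields[DYNAMICRANGE].split("-"))
    return start <= end and start in network and end in network


print(dynamicrange_in_network(ROW))  # True
```
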
 

Case 2: there is only one network, 50.0.0.0/16, for both hardware control and provisioning. Add the dynamicrange to the networks table for the discovery process; once the BMC has been discovered and configured with a static IP, remove the dynamic range from the networks table so that it cannot affect the provisioning process.

 

Before and during discovery process, networks table:
"50_0_0_0-255_255_0_0","50.0.0.0","255.255.0.0",,,,"<xcatmaster>",,,,"50.0.100.1-50.0.100.200",,,,,,,,,

 

After discovery is finished, you should remove dynamic range from networks table:
"50_0_0_0-255_255_0_0","50.0.0.0","255.255.0.0",,,,"<xcatmaster>",,,,,,,,,,,,,

 

3. Once you understand how to configure the site table and the networks table, re-read the discovery doc, define the node as it describes, and try the discovery process. Have fun with it.

 

Best Regards
--------------------------------------------------
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM
环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193

 

 

----- Original message -----
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] Understanding the BMC networking
Date: Tue, Mar 20, 2018 4:55 AM
 

Hi,
 

I'm trying to understand the recommended way of setting up a cluster with multiple networks.
 

Specifically, I note the recommendation that the BMC should have a static IP address *in a different subnet* from the node's main IP address, and also in a different subnet from the DHCP-assigned BMC address during discovery. Does that imply that the same physical network should have two, or even three, different subnets running on it?

http://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/manage_clusters/ppc64le/discovery/mtms/discovery_using_defined.html
 

In this example, there are actually three subnets involved.

 

The DHCP server apparently serves the subnet 50.0.100.0/24

The BMC will be configured with a static IP on 50.0.101.0/24

And the node's IP will be on the 10.0.100.0/24 subnet.
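
To make the relationship between these three ranges concrete, here is a minimal Python sketch using the standard `ipaddress` module. `overlapping_pairs` and `share_supernet` are hypothetical helper names, and the /16 prefix used in the usage note is only an illustrative choice, not something the linked doc states.

```python
import ipaddress
from itertools import combinations

# The three ranges from the example above.
SUBNETS = {
    "DHCP range during discovery": ipaddress.ip_network("50.0.100.0/24"),
    "static BMC addresses": ipaddress.ip_network("50.0.101.0/24"),
    "node addresses": ipaddress.ip_network("10.0.100.0/24"),
}


def overlapping_pairs(subnets):
    """Return the name pairs whose networks overlap."""
    return [(name_a, name_b)
            for (name_a, net_a), (name_b, net_b)
            in combinations(subnets.items(), 2)
            if net_a.overlaps(net_b)]


def share_supernet(first, second, prefix):
    """True if both networks fall inside the same /prefix block."""
    return (first.supernet(new_prefix=prefix)
            == second.supernet(new_prefix=prefix))
```

`overlapping_pairs(SUBNETS)` returns an empty list, so as /24s the three ranges are distinct. `share_supernet` with a prefix of 16 is true for the two 50.0.x ranges and false for either of them paired with the 10.0.100.0/24 range, so the two BMC ranges could also be covered by one wider network.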

 

Assuming that the BMC does not have a dedicated NIC, but is shared with the first NIC, how would I configure the various tables in xCAT?

 

It looks like I would need to touch the network table, the site table, and the node object for the management node?
 

Thanks!

 

--

_______________________________________________________________________
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user

 


