Hi Andreas,

You can use the NodeAddr parameter of the node configuration to specify the hostname of the IB interface, and leave the NodeName values as before:
NodeName=abc-[001-004] NodeAddr=abc-[001-004]-ib

I'm not sure whether abc-[001-004]-ib is accepted syntax. If it is not, you can at least use the one-node-per-line configuration, and sinfo and other commands should group the node names as before.

Regards,
Carles Fenoy

On Wed, Apr 17, 2013 at 2:49 PM, Loong, Andreas <[email protected]> wrote:
>
> Hello,
>
> We have a somewhat different host name schema it seems. Originally we
> had configured slurm.conf like this (trimmed down);
>
> NodeName=abc-[001-004]
> PartitionName=part1 Nodes=abc-[001-004]
>
> However, we noticed some problems and wanted to put in the dns names
> which corresponded to the IB interfaces instead.
> Now
> NodeName=abc-[001-004]-ib
> PartitionName=part1 Nodes=abc-[001-004]-ib
>
> This seems to completely break the slurmctld (it died when doing
> scontrol reconfig), and in the message file we got
>
> slurmctld[11738]: fatal: Unable to create NodeAddr list from
> abc-[001-004]-ib
>
> Creating one line per host, and then a comma-separated list for the
> Nodes-section for PartitionName worked fine, ie;
> NodeName=abc-001-ib ...
> NodeName=abc-002-ib ..
> PartitionName=part1 Nodes=abc-001-ib,abc-002-ib,abc-003-ib,abc-004-ib
> ...
>
> However, sinfo and other commands doesn't group nodes anymore in their
> output.
>
> Should this work, or do we need to fix our host configuration?
>
> Wbr
> Andreas
>
>
> --------------------------------------------------------------------------
> Confidentiality Notice: This message is private and may contain
> confidential and proprietary information. If you have received this message
> in error, please notify us and remove it from your system and note that you
> must not copy, distribute or take any action in reliance on it. Any
> unauthorized use or disclosure of the contents of this message is not
> permitted and may be unlawful.
>

--
Carles Fenoy
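
For reference, the one-node-per-line fallback mentioned in the reply could look roughly like the sketch below. It is untested and assumes that each abc-NNN-ib name from the original message resolves to the node's IB interface; any other per-node options from the real slurm.conf would still need to be added to each line.

    # Sketch only: NodeName keeps the original host name,
    # NodeAddr points at the (assumed resolvable) IB hostname.
    NodeName=abc-001 NodeAddr=abc-001-ib
    NodeName=abc-002 NodeAddr=abc-002-ib
    NodeName=abc-003 NodeAddr=abc-003-ib
    NodeName=abc-004 NodeAddr=abc-004-ib
    # The partition can keep the range expression from the original config,
    # because the node names themselves are unchanged.
    PartitionName=part1 Nodes=abc-[001-004]

Because the NodeName values still share the abc-NNN pattern, the partition line does not need the comma-separated list, and the grouped output from sinfo should be preserved.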
