To the best of my (incomplete) knowledge, SLURM has not been used with XCPU in many years, probably since before the select/cons_res plugin was written. Some effort would be required to ensure that select/linear still works with XCPU now, and a porting effort would be needed to make select/cons_res work with it.
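For reference, the workaround the error message points at is a one-line change in slurm.conf; a minimal sketch (only the SelectType line is implied by the thread, the rest is illustrative context):

```
# slurm.conf fragment: slurmd refuses to start with cons_res
# when XCPU support is detected, so fall back to whole-node
# allocation with the linear selection plugin.
SelectType=select/linear
```

After changing SelectType, slurmctld and the slurmd daemons need to be restarted for the new selection plugin to take effect.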
Quoting amjad syed <[email protected]>:

> I have configured SLURM in frontend configuration mode and chosen
> cons_res as my resource selection algorithm. My front-end node runs
> xcpufs (the XCPU agent), which is required for VERTEX. Hence, when I
> try to start slurmd on the front-end node, I get the following error:
>
>   slurmd: error: select/cons_res is incompatible with XCPU use
>   slurmd: fatal: Use SelectType=select/linear
>
> So my question is: why can I not use cons_res (allocating individual
> resources) when xcpufs is running on a compute/front-end node?
>
> On Wed, Sep 5, 2012 at 11:29 AM, Alejandro Lucero Palau <
> [email protected]> wrote:
>
>> Hi Amjad,
>>
>> As Moe commented, SLURM has a frontend configuration mode that can
>> help you. However, there is code related to resources that uses local
>> functions along with data created during initialization. For example,
>> you cannot use the affinity plugin with frontend mode, because the
>> code mixes real hardware information from the node running slurmd
>> with "virtual" node information received from the slurmctld. And I
>> suspect this dependency is not limited to the affinity plugin.
>>
>> We are using frontend mode to split a NUMA machine into virtual
>> nodes, and I have been working on the affinity plugin problem lately.
>> I have a patch that solves this issue, but it requires more testing.
>>
>> On 09/05/2012 03:35 AM, amjad syed wrote:
>>
>> Hello,
>>
>> We are working on the concept of a "super node" which transparently
>> connects heterogeneous lightweight compute nodes to a storage and
>> services subsystem. The lightweight compute nodes will be used
>> exclusively for computation, and no service daemons will run on them.
>> An open-source implementation of this product is hosted on GitHub:
>> https://github.com/HPCLinks/Open-Vertex
>>
>> In SLURM terms, the lightweight compute nodes will not run slurmd
>> daemons. The management node daemon (slurmctld) will communicate only
>> with the "super node" daemon (slurmd). This slurmd daemon should be
>> able to get dynamic resource information from the lightweight compute
>> nodes attached to the "super node" and pass that information to the
>> management node. We expect at most 10 lightweight compute nodes
>> attached to one "super node".
>>
>> Can slurmd running on a compute node manage remote resources (such as
>> memory)?
>>
>> What is the best way forward to integrate VERTEX with SLURM?
>>
>> Sincerely,
>> Amjad
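For anyone following along: the frontend configuration mode discussed above is enabled with a FrontendName entry in slurm.conf. A hedged sketch of what the "super node" layout might look like (all node names, CPU counts, and memory sizes here are placeholders for illustration, not taken from the thread):

```
# slurm.conf sketch (hypothetical names): one front-end node runs
# slurmd on behalf of the lightweight compute nodes, which run no
# daemons themselves.
FrontendName=supernode

# Lightweight compute nodes, known to slurmctld but without their
# own slurmd daemons.
NodeName=lw[01-10] CPUs=4 RealMemory=2048 State=UNKNOWN

PartitionName=vertex Nodes=lw[01-10] Default=YES State=UP
```

As noted above, some plugins (e.g. affinity) assume slurmd runs on the hardware it reports on, so a frontend setup like this may not work with every plugin combination.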
