To the best of my (incomplete) knowledge, SLURM has not been used with XCPU in many years, probably since before the select/cons_res plugin was written. Some effort would be required to ensure that select/linear still works with XCPU now, and a porting effort would be needed to make select/cons_res work with it.
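For reference, the workaround the error message points at is a one-line change in slurm.conf; a minimal sketch (only the SelectType line is implied by the thread, the rest is illustrative context):

```
# slurm.conf fragment: slurmd refuses to start with cons_res
# when XCPU support is detected, so fall back to whole-node
# allocation with the linear selection plugin.
SelectType=select/linear
```

After changing SelectType, slurmctld and the slurmd daemons need to be restarted for the new selection plugin to take effect.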
Quoting amjad syed <[email protected]>:

> I have configured SLURM in frontend configuration mode and chosen
> cons_res as my resource selection algorithm. My front-end node runs
> xcpufs (the XCPU agent), which is required for VERTEX. Hence, when I
> try to start slurmd on the front-end node, I get the following error:
>
>   slurmd: error: select/cons_res is incompatible with XCPU use
>   slurmd: fatal: Use SelectType=select/linear
>
> So my question is: why can I not use cons_res (allocating individual
> resources) when xcpufs is running on a compute/front-end node?
>
> On Wed, Sep 5, 2012 at 11:29 AM, Alejandro Lucero Palau <
> [email protected]> wrote:
>
>> Hi Amjad,
>>
>> As Moe commented, SLURM has a frontend configuration mode that can
>> help you. However, there is code related to resources that uses local
>> functions along with data created during initialization. For example,
>> you cannot use the affinity plugin with frontend mode, because the
>> code mixes real hardware information from the node running slurmd
>> with "virtual" node information received from the slurmctld. And I
>> suspect this dependency is not limited to the affinity plugin.
>>
>> We are using frontend mode to split a NUMA machine into virtual
>> nodes, and I have been working on the affinity plugin problem lately.
>> I have a patch that solves this issue, but it requires more testing.
>>
>> On 09/05/2012 03:35 AM, amjad syed wrote:
>>
>> Hello,
>>
>> We are working on the concept of a "super node" which transparently
>> connects heterogeneous lightweight compute nodes to a storage and
>> services subsystem. The lightweight compute nodes will be used
>> exclusively for computation, and no service daemons will run on them.
>> An open-source implementation of this product is hosted on GitHub:
>> https://github.com/HPCLinks/Open-Vertex
>>
>> In SLURM terms, the lightweight compute nodes will not run slurmd
>> daemons. The management node daemon (slurmctld) will communicate only
>> with the "super node" daemon (slurmd). This slurmd daemon should be
>> able to get dynamic resource information from the lightweight compute
>> nodes attached to the "super node" and pass that information to the
>> management node. We expect at most 10 lightweight compute nodes
>> attached to one "super node".
>>
>> Can slurmd running on a compute node manage remote resources (such as
>> memory)?
>>
>> What is the best way forward to integrate VERTEX with SLURM?
>>
>> Sincerely,
>> Amjad
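For anyone following along: the frontend configuration mode discussed above is enabled with a FrontendName entry in slurm.conf. A hedged sketch of what the "super node" layout might look like (all node names, CPU counts, and memory sizes here are placeholders for illustration, not taken from the thread):

```
# slurm.conf sketch (hypothetical names): one front-end node runs
# slurmd on behalf of the lightweight compute nodes, which run no
# daemons themselves.
FrontendName=supernode

# Lightweight compute nodes, known to slurmctld but without their
# own slurmd daemons.
NodeName=lw[01-10] CPUs=4 RealMemory=2048 State=UNKNOWN

PartitionName=vertex Nodes=lw[01-10] Default=YES State=UP
```

As noted above, some plugins (e.g. affinity) assume slurmd runs on the hardware it reports on, so a frontend setup like this may not work with every plugin combination.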
