Dear all,
I still have the problem mentioned below. Has any of you experienced similar
problems with disk-related GRES? Is there a trivial point I have missed so far?

Thanks in advance.

greetings

Jan




On Jun 15, 2015, at 10:06 AM,  wrote:

> Hi Aaron,
> thanks for the quick response. You are right, I'd like to provide some 
> scratch space by means of a filesystem. So I guess your 'recipe' should 
> perfectly work. I'm currently playing around with a test configuration and 
> adjusted the gres.conf accordingly:
> 
> 
>               cat gres.conf
>               Name=disk Type=fast Count=48G
>               Name=disk Type=data Count=147G
> 
>               cat nodenames.conf
>               NodeName=compute-0-0 Gres=disk:fast:48G,disk:data:147G 
> NodeAddr=192.168.255.253 CPUs=4 Weight=20484100 Feature=rack-0,4CPUs
> 
> 
> Unfortunately, I'm already stuck when trying to restart slurmd: it doesn't
> come up and complains in the log file:
> 
>               fatal: Gres disk has invalid count value 51539607552
> 
> (slurmctld comes up without any troubles)
> 
> As both slurmd and slurmctld come up properly when I change the Count
> field to Count=1G (up to 3G), I figure this is a problem with the 32-bit
> nature of the count field. However, I thought that this issue would be
> circumvented by the suffixes K, M and G.
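
[The fatal error makes the arithmetic visible: the G suffix is expanded before the count is stored, so Count=48G becomes 48 × 1024³ = 51,539,607,552, far above the unsigned 32-bit maximum of 4,294,967,295; anything up to 3G ≈ 3.2 billion still fits, which matches the observation above. One possible workaround, sketched here as an assumption rather than a verified fix for 14.11.6, is to track the resource in whole-gigabyte units so the stored count stays small:

```
# gres.conf -- counting in GB units (1 unit = 1 GB); names from this thread
Name=disk Type=fast Count=48
Name=disk Type=data Count=147

# nodenames.conf -- counts adjusted to match
NodeName=compute-0-0 Gres=disk:fast:48,disk:data:147 NodeAddr=192.168.255.253 CPUs=4 Weight=20484100 Feature=rack-0,4CPUs
```

A job would then request, say, 10 GB of fast scratch with --gres=disk:fast:10.]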
> 
> 
> 
> What am I missing?
> 
> 
> Thanks.
> 
> greetings
> 
> 
> Jan
> 
> 
> 
> 
> On Jun 12, 2015, at 2:44 PM, Aaron Knister wrote:
> 
>> 
>> Hi Jan,
>> 
>> Are you looking to make raw block devices accessible to jobs, or a file 
>> system?
>> 
>> The term "running on" can mean different things -- it could be where the 
>> application binary lives, or where input and/or output files live, or maybe 
>> something else entirely. I'll assume you're looking to provide scratch space 
>> on the node by means of a filesystem. 
>> 
>> If you'd like to hand out filesystem access, and let's say each disk is 
>> mounted at /local_disk/sata and /local_disk/sas, respectively, you could 
>> define the GRES as:
>> 
>> Name=local_disk Type=sata Count=3800G
>> Name=local_disk Type=sas Count=580G
>> 
>> (You'll probably want to adjust the value of Count depending on what size 
>> the drives format out to). 
>> 
>> You could then write some prolog magic to actually allocate that space on 
>> the nodes (if you're sharing nodes between jobs) via quotas (or maybe 
>> something fancier if you have, say, ZFS or btrfs) and create a 
>> job-specific directory under the mount point. In addition, you could set an 
>> environment variable via the prolog that points to the path of the storage 
>> so users can reference it in their jobs regardless of disk type; a single 
>> SLURM_LOCAL_DISK variable might do the job. The last piece is an epilog 
>> to delete the job-specific directory and unset any quotas, along with a cron 
>> job that periodically checks that the directories and quotas have been 
>> cleaned up on each node in case there's an issue with the SLURM epilog 
>> (e.g. a node reboots during the job).
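
[A minimal sketch of such a prolog, assuming a TaskProlog-style script (which injects variables into the task environment by printing "export" lines to stdout); the mount point, the job-directory naming and the SLURM_LOCAL_DISK variable are illustrative choices from this thread, not an official interface:

```shell
#!/bin/sh
# TaskProlog sketch: create a per-job scratch directory and publish its path.
# SLURM_JOB_ID is set by slurmd in production; default it here for a dry run.
: "${SLURM_JOB_ID:=12345}"
DISK_ROOT=${DISK_ROOT:-/tmp/local_disk/sata}   # stand-in for /local_disk/sata
JOBDIR="$DISK_ROOT/job_$SLURM_JOB_ID"
mkdir -p "$JOBDIR"
# A TaskProlog adds variables to the task environment via export lines on stdout:
echo "export SLURM_LOCAL_DISK=$JOBDIR"
```

The matching epilog would remove "$JOBDIR" and clear any quota; the cron check mentioned above could simply sweep the mount points for job directories whose job is no longer running.]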
>> 
>> I hope that helps and isn't overwhelming. If you have questions about any of 
>> the parts I'm happy to explain more. 
>> 
>> Best,
>> Aaron
>> 
>> 
>> Sent from my iPhone
>> 
>>> On Jun 12, 2015, at 8:18 AM, Jan Schulze <[email protected]> wrote:
>>> 
>>> 
>>> Dear all,
>>> 
>>> this is slurm 14.11.6 on a ROCKS 6.2 cluster. 
>>> 
>>> We're currently planning to build a cluster out of computing nodes, each 
>>> having one SAS (600 GB) and one SATA (4 TB) hard drive. Is there a way to 
>>> configure the nodes such that the user can specify on which kind of 
>>> disk the job is supposed to run? So, in the gres.conf file, something like 
>>> 
>>>  Name=storage Type=SATA File=/dev/sda1 Count=4000G
>>>  Name=fast Type=SAS File=/dev/sdb1 Count=600G
>>> 
>>> ?
>>> 
>>> 
>>> Thanks in advance.
>>> 
>>> 
>>> greetings
>>> 
>>> Jan Schulze
> 
