Hi Jan,

Apologies for the delay. It looks like SLURM versions older than 15.08 only
support 32-bit GRES count values, as you noticed (I took a peek at the code).
Perhaps a workaround would be to do away with the suffixes and append "_gb"
to the GRES name (e.g. disk_gb), counting in whole gigabytes. Once you have
support for 64-bit counters you could change this later and use a submission
filter to provide backwards compatibility.
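Concretely, the renamed GRES might look something like this (just a sketch
carrying over the sizes from your test config; adjust names to taste):

```
# gres.conf -- count in whole GB, so values stay far below 2^32
Name=disk_gb Type=fast Count=48
Name=disk_gb Type=data Count=147

# nodenames.conf
NodeName=compute-0-0 Gres=disk_gb:fast:48,disk_gb:data:147
```

Jobs would then request, say, --gres=disk_gb:fast:10 for 10 GB of fast
scratch, and the counter never gets anywhere near the 32-bit limit.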
Hope that helps!

-Aaron

Sent from my iPhone

> On Jun 23, 2015, at 5:08 AM, Jan Schulze <[email protected]> wrote:
>
> Dear all,
> I still have the problem mentioned here. Did any of you experience
> similar problems with disk-related GRES? Is there a trivial point which I
> missed so far?
>
> Thanks in advance.
>
> greetings
>
> Jan
>
>> On Jun 15, 2015, at 10:06 AM, wrote:
>>
>> Hi Aaron,
>> thanks for the quick response. You are right, I'd like to provide some
>> scratch space by means of a filesystem. So I guess your 'recipe' should
>> work perfectly. I'm currently playing around with a test configuration
>> and adjusted the gres.conf accordingly:
>>
>> cat gres.conf
>> Name=disk Type=fast Count=48G
>> Name=disk Type=data Count=147G
>>
>> cat nodenames.conf
>> NodeName=compute-0-0 Gres=disk:fast:48G,disk:data:147G
>> NodeAddr=192.168.255.253 CPUs=4 Weight=20484100 Feature=rack-0,4CPUs
>>
>> Unfortunately I'm stuck already when trying to restart the slurmd; it
>> doesn't come up and complains in the log file:
>>
>> fatal: Gres disk has invalid count value 51539607552
>>
>> (slurmctld comes up without any trouble.)
>>
>> As both slurmd and slurmctld come up properly when I change the Count
>> field to Count=1G (up to 3G), I figured that it is a problem of the
>> 32-bit nature of the count field. However, I thought that this issue
>> would be circumvented by the suffixes K, M and G.
>>
>> What am I missing?
>>
>> Thanks.
>>
>> greetings
>>
>> Jan
>>
>>> On Jun 12, 2015, at 2:44 PM, Aaron Knister wrote:
>>>
>>> Hi Jan,
>>>
>>> Are you looking to make raw block devices accessible to jobs, or a
>>> filesystem?
>>>
>>> The term "running on" can mean different things -- it could be where
>>> the application binary lives, or where input and/or output files live,
>>> or maybe some other things too. I'll assume you're looking to provide
>>> scratch space on the node by means of a filesystem.
>>> If you'd like to hand out filesystem access, let's say each disk is
>>> mounted at /local_disk/sata and /local_disk/sas, respectively; you
>>> could define the GRES as:
>>>
>>> Name=local_disk Type=sata Count=3800G
>>> Name=local_disk Type=sas Count=580G
>>>
>>> (You'll probably want to adjust the value of Count depending on what
>>> size the drives format out to.)
>>>
>>> You could then write some prolog magic to actually allocate that space
>>> on the nodes (if you're sharing nodes between jobs) via quotas (or
>>> maybe something fancier if you have, say, ZFS or btrfs) and create a
>>> job-specific directory under the mount point. In addition, you could
>>> set an environment variable via the prolog that points to the path of
>>> the storage, so users can reference it in their jobs regardless of
>>> disk type. A single SLURM_LOCAL_DISK variable might do the job. The
>>> last piece is an epilog script to delete the job-specific directory
>>> and unset any quotas, along with a cron job that periodically checks
>>> that the directories and quotas have been cleaned up on each node in
>>> case there's an issue with the SLURM epilog (e.g. a node reboots
>>> during the job).
>>>
>>> I hope that helps and isn't overwhelming. If you have questions about
>>> any of the parts I'm happy to explain more.
>>>
>>> Best,
>>> Aaron
>>>
>>> Sent from my iPhone
>>>
>>>> On Jun 12, 2015, at 8:18 AM, Jan Schulze <[email protected]>
>>>> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> this is slurm 14.11.6 on a ROCKS 6.2 cluster.
>>>>
>>>> We're currently planning to build a cluster out of compute nodes,
>>>> each having one SAS (600 GB) and one SATA (4 TB) hard drive. Is there
>>>> a way to configure the nodes such that the user can specify on which
>>>> kind of disk the job is supposed to run? So, in the gres.conf file,
>>>> something like
>>>>
>>>> Name=storage Type=SATA File=/dev/sda1 Count=4000G
>>>> Name=fast Type=SAS File=/dev/sdb1 Count=600G
>>>>
>>>> ?
>>>> Thanks in advance.
>>>>
>>>> greetings
>>>>
>>>> Jan Schulze
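P.S. For anyone wanting to sanity-check the arithmetic behind that fatal
error, the suffix expansion is straightforward (quick sketch, nothing
SLURM-specific about the math):

```python
# SLURM expands the "G" suffix as 1024**3, so Count=48G becomes:
count_48g = 48 * 1024**3
print(count_48g)               # 51539607552 -- the value in Jan's fatal error

# That overflows an unsigned 32-bit counter:
uint32_max = 2**32 - 1         # 4294967295
print(count_48g > uint32_max)  # True

# ...while Count=3G still fits, matching Jan's observation that up to 3G works:
print(3 * 1024**3 <= uint32_max)  # True
print(4 * 1024**3 <= uint32_max)  # False -- 4G is already too big
```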
