Thanks for accepting this patch.
Jerome
On 07/12/2010 03:47 PM, Ralph Castain wrote:
Thanks for the explanation, Ken! I'll take care of the patch.
On Jul 12, 2010, at 6:40 AM, Matney Sr, Kenneth D. wrote:
Hi Ralph,
I think that it would be overstating the case to say that I am re-assum
Thanks for the explanation, Ken! I'll take care of the patch.
On Jul 12, 2010, at 6:40 AM, Matney Sr, Kenneth D. wrote:
> Hi Ralph,
>
> I think that it would be overstating the case to say that I am re-assuming
> those duties. Rather, I am trying to fill the gap in a minimal sense while
> we l
Hi Ralph,
I think that it would be overstating the case to say that I am re-assuming
those duties. Rather, I am trying to fill the gap in a minimal sense while
we locate a replacement for Rainer. I expect to help our replacement
get up to speed on portals and ALPS; but, I have too many other dut
Sounds good then.
I only got into this thread because (a) the reference to slurm, and (b) with
Rainer's departure, I wasn't sure if someone else was going to pick up the alps
support. Since you are re-assuming those latter duties (yes?), and since this
actually has nothing to do with slurm itsel
I would prefer the first patch, though, so that we get rid of the scripts and
of another env variable, but I'll let you choose.
Jerome
On 07/09/2010 06:27 PM, Jerome Soumagne wrote:
Hi Ken,
That's interesting, setting the OMPI_ALPS_RESID in the modules so that
it executes the ras-alps-command.sh
Hi Ken,
That's interesting, setting the OMPI_ALPS_RESID in the modules so that
it executes the ras-alps-command.sh is a good idea. In this case another
way would be to add an extra line in this script with the
BASIL_RESERVATION_ID as you did for the BATCH_PARTITION_ID.
I have another possible
Ralph,
His patch only modifies the ALPS RAS mca. It makes the environment
variable BASIL_RESERVATION_ID a synonym for OMPI_ALPS_RESID, which is
convenient for the version of SLURM that they are proposing, but
it does not introduce any side effects.
--
Ken Matney, Sr.
Oak Ridge Nat
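
The "synonym" behaviour described in this message can be sketched roughly as follows. This is only an illustration in Python, not the actual patch (the ALPS RAS component is C code in Open MPI). The function name find_alps_reservation_id and the lookup order (OMPI_ALPS_RESID first, then BASIL_RESERVATION_ID) are assumptions made for the example; only the two variable names come from the thread.

```python
import os

# Candidate variable names, in the order this sketch checks them.
# The order is an assumption; the thread only says the two are synonyms.
RESERVATION_VARS = ("OMPI_ALPS_RESID", "BASIL_RESERVATION_ID")


def find_alps_reservation_id(env=None):
    """Return (variable_name, value) for the first reservation variable
    that is set and non-empty, or None if neither is present."""
    if env is None:
        env = os.environ
    for name in RESERVATION_VARS:
        value = env.get(name)
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = find_alps_reservation_id()
    if found is None:
        print("no ALPS reservation ID found in the environment")
    else:
        print("reservation ID taken from %s: %s" % found)
```

With a lookup like this, a job started under salloc (which sets BASIL_RESERVATION_ID) and a job started through the modules environment (which sets OMPI_ALPS_RESID) would both yield a reservation ID without an extra wrapper script.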
Actually, this patch doesn't have anything to do with slurm according to the
documentation in the links. It has to do with Cray's batch allocator system,
which slurm is just interfacing to. So what you are really saying is that you
want the alps ras to run if we either detect the presence of alp
Another link which may be worth mentioning:
https://computing.llnl.gov/linux/slurm/cray.html
It says at the top of the page: *NOTE: As of January 2009, the SLURM
interface to Cray systems is incomplete.*
But what we have now on our system is something which is reasonably
stable and a good part
Hi Jerome,
I am in part responsible for the current incarnation of the ALPS support in
OMPI. We use the modules environment to set OMPI_ALPS_RESID to the ALPS
reservation ID, the pertinent parts of which are:
set ridpath ${basedir}/share/openmpi
set
It's not invented, it's a SLURM standard name. Sorry for not having said
that, my first e-mail was really too short.
http://manpages.ubuntu.com/manpages/lucid/man1/sbatch.1.html
http://slurm-llnl.sourcearchive.com/documentation/2.1.1/basil__interface_8c-source.html
...
google could have been you
Appreciate your explanation, but it doesn't align with your patch. Your patch
doesn't do anything because it patches the slurm ras module, but the system is
selecting the alps ras module - so your patch never runs.
What am I missing?
On Jul 9, 2010, at 8:08 AM, Jerome Soumagne wrote:
> Ok I ma
To clarify: what I'm trying to understand is what the heck a
"BASIL_RESERVATION_ID" is - it isn't a standard slurm thing, nor can I find it
defined in alps, so it appears to just be a local name you invented. True?
If so, I would rather see some standard name instead of something local to one
o
My bad - I see that you actually do patch the alps ras. Is BASIL_RESERVATION_ID
something included in alps, or is this just a name you invented?
On Jul 9, 2010, at 8:08 AM, Jerome Soumagne wrote:
> OK, I may not have explained this very clearly. In our case we only use SLURM for
> the resource manag
OK, I may not have explained this very clearly. In our case we only use SLURM
for the resource manager.
The difference here is that the SLURM version that we use has support
for ALPS. Therefore when we run our job using the mpirun command, since
we have the alps environment loaded, it's the ALPS RAS w
Afraid I'm now even more confused. You use SLURM to do the allocation, and then
use ALPS to launch the job?
I'm just trying to understand because I'm the person who generally maintains
this code area. We have two frameworks involved here:
1. RAS - determines what nodes were allocated to us. The
Well, we actually use a patched version of SLURM, 2.2.0-pre8. The plan is
to submit the modifications made internally at CSCS for the next
SLURM release in November. We implement ALPS support based on the basic
architecture of SLURM.
SLURM is only used to do the ALPS resource allocation. We th
Forgive my confusion, but could you please clarify something? You are using
ALPS as the resource manager doing the allocation, and then using SLURM as the
launcher (instead of ALPS)?
That's a combination we've never seen or heard about. I suspect our module
selection logic would be confused by
Hi,
We've recently installed OpenMPI on one of our Cray XT5 machines, here
at CSCS. This machine uses SLURM for launching jobs.
Doing an salloc defines this environment variable:
BASIL_RESERVATION_ID
The reservation ID on Cray systems running ALPS/BASIL only.
Since