Woah man! That's impressive!
I had to make a couple of adjustments to the script: first checking
whether $JOB_ID is set (to immediately exclude qsub jobs), and adding an
if to the PID check, because non-root qlogins need one additional step
up the process tree to find the shepherd PID.
For the record, this is my final script:
#cat /etc/profile.d/qlogin_timelimit_message.sh
#!/bin/bash
if [ ! -n "$JOB_ID" ]; then
GO="false";
MYPARENT=`ps -p $$ -o ppid --no-header`
# echo $MYPARENT
MYPARENT=`ps -p $MYPARENT -o ppid --no-header`
# echo $MYPARENT
MYSTARTUP=`ps -p $MYPARENT -o command --no-header`
# echo $MYSTARTUP
if [ "${MYSTARTUP:0:13}" = "sge_shepherd-" ]; then
GO="true";
else
MYPARENT=`ps -p $MYPARENT -o ppid --no-header`
# echo $MYPARENT
MYSTARTUP=`ps -p $MYPARENT -o command --no-header`
# echo $MYSTARTUP
if [ "${MYSTARTUP:0:13}" = "sge_shepherd-" ]; then
GO="true";
fi
fi
# if [ "${MYSTARTUP:0:13}" = "sge_shepherd-" ]; then
if [ "$GO" = "true" ]; then
# echo "Running inside SGE";
MYJOBID=${MYSTARTUP:13}
MYJOBID=${MYJOBID% -bg}
# echo "Job $MYJOBID"
if [ -n "$MYJOBID" ]; then
. /opt/gridengine/default/common/settings.sh
TIMELIMIT=`qstat -j $MYJOBID | sed -n -e "/^context/s/^context: *//p" | tr "," "\n" | sed -n -e "s/^QLOGIN_TIMELIMIT=//p"`
# echo $TIMELIMIT
if [ -n "$TIMELIMIT" ]; then
echo -e "\n\n"
echo -e "\t\x1b\x5b1;31;49m#################################################\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\t\t* W A R N I N G *\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#################################################\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m\t\t\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
##print ("\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m | | | | | \x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m The qlogin job you submitted did not request\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m any time duration.\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m\t\t\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m This qlogin session has been assigned a\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m duration of\x1b\x5b1;32;49m ${TIMELIMIT}\x1b\x5b0;39;49m. After this time expires,\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
#echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m duration of \x1b\x5b1;32;49m2 hours\x1b\x5b0;39;49m. After this time expires,\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m the qlogin session will close.\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m\t\t\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m If you want to submit a qlogin session with\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m longer duration, please add to your resource\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m request a time petition by adding the\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m following parameter to your qlogin command:\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m \t-l h_rt=hh:mm:ss\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m as in\t\t\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m \t-l h_rt=02:30:00 (2 hours 30 minutes)\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m\t\t\t\t\t\t\x1b\x5b1;31;49m#\x1b\x5b0;39;49m"
echo -e "\t\x1b\x5b1;31;49m#################################################\x1b\x5b0;39;49m"
echo -e "\n"
fi
fi
fi
fi
And this is the message it produces (in full-blown color):
$ qlogin -l h_vmem=500M
Your job 4550191 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 4550191 has been successfully scheduled.
Establishing /opt/gridengine/bin/rocks-qlogin.sh session to host compute-1-2.local ...
Warning: Permanently added '[compute-1-2.local]:53921' (RSA) to the list of known hosts.
Last login: Wed Jun 18 18:55:04 2014 from floquet.local
Rocks Compute Node
Rocks 6.0 (Mamba)
Profile built 15:58 25-Sep-2013
Kickstarted 16:05 25-Sep-2013
#################################################
#               * W A R N I N G *               #
#################################################
#                                               #
# The qlogin job you submitted did not request  #
# any time duration.                            #
#                                               #
# This qlogin session has been assigned a       #
# duration of 2 hours. After this time expires, #
# the qlogin session will close.                #
#                                               #
# If you want to submit a qlogin session with   #
# longer duration, please add to your resource  #
# request a time petition by adding the         #
# following parameter to your qlogin command:   #
#       -l h_rt=hh:mm:ss                        #
# as in                                         #
#       -l h_rt=02:30:00 (2 hours 30 minutes)   #
#                                               #
#################################################
[theredia@compute-1-2 ~]$
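The two hard-coded parent lookups in the script above could also be written as a bounded loop, so that root and non-root qlogins (and builtin/ssh/rsh startup methods) go through the same code path. This is only a sketch, not the script that is actually deployed: the helper names parent_of, command_of and find_shepherd_jobid are made up for illustration, and the default limit of 5 levels is an assumption.

```shell
#!/bin/bash
# Hypothetical helpers: thin wrappers around ps so they are easy to replace.
parent_of()  { ps -p "$1" -o ppid= 2>/dev/null | tr -d ' '; }
command_of() { ps -p "$1" -o args= 2>/dev/null; }

# Walk up the process tree from PID $1 (at most $2 levels, default 5, an
# assumed bound) and print the job id if an ancestor's command line starts
# with "sge_shepherd-". Returns non-zero if no shepherd is found.
find_shepherd_jobid() {
    local pid=$1 max=${2:-5} cmd jobid i=0
    while [ "$i" -lt "$max" ]; do
        pid=$(parent_of "$pid")
        # Stop when the lookup fails or we reach init.
        if [ -z "$pid" ] || [ "$pid" -le 1 ]; then
            return 1
        fi
        cmd=$(command_of "$pid")
        case "$cmd" in
            sge_shepherd-*)
                jobid=${cmd#sge_shepherd-}
                # Drop the trailing " -bg" flag, as the original script does.
                echo "${jobid% -bg}"
                return 0
                ;;
        esac
        i=$((i + 1))
    done
    return 1
}
```

In the profile script this would be used as MYJOBID=$(find_shepherd_jobid $$), with the rest of the script continuing only if MYJOBID is non-empty.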
Thank you very much,
Txema
On 18/06/14 17:37, Reuti wrote:
On 18.06.2014 at 17:01, Txema Heredia wrote:
On 17/06/14 17:23, Reuti wrote:
On 17.06.2014 at 15:46, Txema Heredia wrote:
Basically, the JSV checks the CLIENT parameter. If it is equal to "qlogin",
it checks whether any h_rt was requested and, if not, sets a limit of 2h 10min while
showing a colorful warning message so users know that there is a time limit.
It also sets all.q as the hard queue if none (or *) was requested, and sets
the core binding policy.
This works wonders when I run it as a client jsv by using "qlogin -jsv
/opt/gridengine/default/common/jsv.pl". The message appears, the limit is set, and
the job runs fine.
But, once I set this as a server JSV (by qconf -mconf global), the time limit
no longer applies.
As far as I've been able to find, the following behaviours differ between running
it as a client JSV and as a server JSV:
- The 'CLIENT' parameter changes from 'qlogin' to 'qmaster'. This skips every
"if" in my JSV, so it stops checking for time limits. Can I trust this? Why is
'qmaster' appearing? Now both qsubs and qlogins show the same value. How can I
distinguish them?
I found the same:
http://gridengine.org/pipermail/users/2012-September/004808.html
According to William's post, you can check for QRSH_PORT.
Thank you as usual, Reuti.
That QRSH_PORT env variable makes it possible to differentiate between qlogin and qsub
commands. But I am still having some problems.
- The jsv_show_params() command shows nothing, neither on stdout nor in
/opt/gridengine/default/spool/qmaster/messages. This makes debugging really
cumbersome.
It works for me, in both Bash and Perl.
06/17/2014 17:13:30|worker|pc15370|I|got param: A='sge'
06/17/2014 17:13:30|worker|pc15370|I|got param: GROUP='users'
06/17/2014 17:13:30|worker|pc15370|I|got param: N='test.sh'
06/17/2014 17:13:30|worker|pc15370|I|got param: CMDNAME='test.sh'
06/17/2014 17:13:30|worker|pc15370|I|got param: CMDARGS='0'
06/17/2014 17:13:30|worker|pc15370|I|got param: JOB_ID='11553'
06/17/2014 17:13:30|worker|pc15370|I|got param: M='reuti@pc15370'
06/17/2014 17:13:30|worker|pc15370|I|got param: CLIENT='qmaster'
06/17/2014 17:13:30|worker|pc15370|I|got param: VERSION='1.0'
06/17/2014 17:13:30|worker|pc15370|I|got param: USER='reuti'
06/17/2014 17:13:30|worker|pc15370|I|got param: CONTEXT='server'
My bad. I had my loglevel set to log_warning instead of log_info. Now I can see
these messages.
- No message can be sent to the user, be it info, warning or error. The
user won't know whether I have set a time limit on their session.
Yep, a message is only displayed for "jsv_reject_wait", even though one can also be
specified for "jsv_correct" and "jsv_accept".
-- Reuti
Trying to overcome this, I thought of making my JSV add an environment variable
"IS_QLOGIN=true" whenever it detects the QRSH_PORT. Then, a prolog script in
the execution host would check that environment variable and print, if needed, the
timelimit message to the user.
BUT, the prolog script cannot print anything to the standard output (because
it is run before the actual session begins?).
Yep, its stdout is not connected to the terminal yet.
So, I thought about modifying the .bashrc file (or any other of the several scripts under /etc/profile.d/),
and make it read that "IS_QLOGIN=true" environment variable. But, again, the fates are against me,
and qlogin commands cannot use the "-v" parameter. Even if I use the jsv_add_env() command in the
JSV script, that environment variable is passed to the prolog script, but is nowhere to be found once the
"real" qlogin session starts.
I could also ignore both the JSV and the prolog scripts, go directly to the
.bashrc script and check there for the presence of (or lack thereof) variables
like JOB_ID or JOB_NAME. That would allow me to distinguish between qlogin and
qsub sessions (qlogins and ssh environments are identical). But then, again, I
won't have access to anything able to tell the script if there is a timelimit
set.
The only solution to all this mess that comes to my mind would be to make
/etc/motd writable by all users, have the prolog script modify it with the
timelimit message, and then have some sort of contraption in /etc/profile.d/
that resets the motd back to its previous non-qlogin version.
Does anyone have a better (or less-prone-to-failure) idea?
In the JSV you can also add some job context and fill it with a proper message.
This could then be output:
$ qrsh -ac "MESSAGE=Time limit of 12 hrs set."
Running inside SGE
Job 11562
Time limit of 12 hrs set.
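On the JSV side, setting such a context entry might look roughly like the sketch below. It is not the JSV used in this thread: the default values, the QLOGIN_TIMELIMIT key and the fallback path /opt/gridengine are assumptions, and the QRSH_PORT check simply follows William's post referenced above. The jsv_* functions are the ones provided by jsv_include.sh in the Grid Engine distribution.

```shell
#!/bin/bash
# Hypothetical values, not taken from the JSV discussed in this thread.
DEFAULT_H_RT="02:10:00"   # h_rt applied when an interactive job requests none
DEFAULT_LABEL="2h"        # text stored in the job context for the login profile

jsv_on_start() {
    # Ask for the job environment so QRSH_PORT is visible in jsv_on_verify.
    jsv_send_env
}

jsv_on_verify() {
    # Per the thread, interactive submissions carry QRSH_PORT in their
    # environment even when CLIENT is reported as 'qmaster'.
    if [ "$(jsv_is_env QRSH_PORT)" = "true" ] &&
       [ "$(jsv_sub_is_param l_hard h_rt)" != "true" ]; then
        jsv_sub_add_param l_hard h_rt "$DEFAULT_H_RT"
        # Record the limit in the job context so a profile script can read it
        # back later with qstat -j.
        jsv_sub_add_param ac QLOGIN_TIMELIMIT "$DEFAULT_LABEL"
        jsv_correct "Default time limit applied"
    else
        jsv_accept "Job is accepted"
    fi
}

# Only enter the JSV protocol loop when the SGE helper library is available.
# The /opt/gridengine fallback is an assumption; adjust to the local SGE_ROOT.
JSV_INCLUDE="${SGE_ROOT:-/opt/gridengine}/util/resources/jsv/jsv_include.sh"
if [ -r "$JSV_INCLUDE" ]; then
    . "$JSV_INCLUDE"
    jsv_main
fi
```

The context value is kept free of spaces and commas here because the profile snippet splits the context line on commas.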
The necessary profile for bash (depending on builtin/ssh/rsh the parent must be
looked up more than once):
MYPARENT=`ps -p $$ -o ppid --no-header`
#MYPARENT=`ps -p $MYPARENT -o ppid --no-header`
#MYPARENT=`ps -p $MYPARENT -o ppid --no-header`
MYSTARTUP=`ps -p $MYPARENT -o command --no-header`
if [ "${MYSTARTUP:0:13}" = "sge_shepherd-" ]; then
echo "Running inside SGE"
MYJOBID=${MYSTARTUP:13}
MYJOBID=${MYJOBID% -bg}
echo "Job $MYJOBID"
if [ -n "$MYJOBID" ]; then
. /usr/sge/default/common/settings.sh
qstat -j $MYJOBID | sed -n -e "/^context/s/^context: *//p" | tr "," "\n" | sed -n -e "s/^MESSAGE=//p"
fi
fi
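The qstat/sed/tr pipeline above can be wrapped in a small helper so that the same code serves any context key (MESSAGE here, QLOGIN_TIMELIMIT in the final script). This is a minimal sketch: the function name context_value is made up, and it assumes the "context:" line of qstat -j holds comma-separated KEY=VALUE pairs, as the pipeline above already does. Values that themselves contain commas would be truncated.

```shell
#!/bin/bash
# Read `qstat -j <jobid>` output on stdin and print the value stored under
# the job-context key given as $1 (hypothetical helper, see the lead-in).
context_value() {
    sed -n -e 's/^context: *//p' |   # keep only the context line, minus its label
        tr ',' '\n' |                # one KEY=VALUE pair per line
        sed -n -e "s/^$1=//p"        # print the value of the requested key
}
```

Usage would be along the lines of qstat -j "$MYJOBID" | context_value MESSAGE.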
HTH - Reuti