Does the target user have any gram log files in $HOME?
Charles
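If it helps to check this quickly, here is a small sketch that lists job manager logs in a home directory, newest first. It assumes the GRAM2 job manager writes files named gram_job_mgr_<pid>.log there; that name pattern is not given in this thread, so adjust it if your installation differs. list_gram_logs is just an illustrative helper name.

```shell
#!/bin/sh
# List per-job GRAM2 job manager logs in a home directory, newest first.
# Assumption: the logs are named gram_job_mgr_<pid>.log (not stated in this thread).
list_gram_logs() {
    home_dir=${1:-$HOME}
    # ls exits non-zero when the pattern matches nothing; report that plainly.
    ls -t "$home_dir"/gram_job_mgr_*.log 2>/dev/null \
        || echo "no gram_job_mgr_*.log files in $home_dir"
}

# Run as the local user the job is mapped to (yoichi in this thread).
list_gram_logs "$HOME"
```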
On Oct 6, 2008, at 10:13 PM, Yoichi Takayama wrote:
$ globus-job-run grid2.ramscommunity.org/jobmanager-fork /bin/hostname
GRAM Job submission failed because data transfer to the server
failed (error code 10)
$ globus-job-run grid2.ramscommunity.org/jobmanager-condor /bin/hostname
GRAM Job submission failed because data transfer to the server
failed (error code 10)
Authentication still succeeds, though:
$ globusrun -a -r grid2.ramscommunity.org/jobmanager-condor
GRAM Authentication test successful
$ cat globus-gatekeeper.log
...
...
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 6: globus-gatekeeper pid=17159 starting at Tue Oct 7 14:04:30 2008
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 6: Got connection 137.111.246.176 at Tue Oct 7 14:04:30 2008
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 5: Authenticated globus user: /O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 5: Requested service: jobmanager-condor
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 5: Authorized as local user: yoichi
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 5: Authorized as local uid: 500
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 5: and local gid: 500
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 0: executing /usr/local/globus/libexec/globus-job-manager
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=9
TIME: Tue Oct 7 14:04:30 2008
PID: 17159 -- Notice: 0: Child 17160 started
Warning: Ignoring unknown argument -seg
(I will remove -seg later; the warning seems to be harmless.)
The log does not tell me exactly what is wrong.
$ cat globus-condor.log
<c>
<a n="MyType"><s>SubmitEvent</s></a>
<a n="EventTypeNumber"><i>0</i></a>
<a n="MyType"><s>SubmitEvent</s></a>
<a n="EventTime"><s>2008-10-07T14:04:31</s></a>
<a n="Cluster"><i>29</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="SubmitHost"><s><137.111.246.176:9670></s></a>
</c>
<c>
<a n="MyType"><s>ShadowExceptionEvent</s></a>
<a n="EventTypeNumber"><i>7</i></a>
<a n="MyType"><s>ShadowExceptionEvent</s></a>
<a n="EventTime"><s>2008-10-07T14:04:45</s></a>
<a n="Cluster"><i>29</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="Message"><s>Error from starter on grid4.ramscommunity.org: Failed to open '/home/yoichi/.globus/job/grid2.ramscommunity.org/17160.1223348670/stdout' as standard output: No such file or directory (errno 2)</s></a>
<a n="SentBytes"><r>0.000000000000000E+00</r></a>
<a n="ReceivedBytes"><r>0.000000000000000E+00</r></a>
</c>
<c>
<a n="MyType"><s>JobHeldEvent</s></a>
<a n="EventTypeNumber"><i>12</i></a>
<a n="MyType"><s>JobHeldEvent</s></a>
<a n="EventTime"><s>2008-10-07T14:04:45</s></a>
<a n="Cluster"><i>29</i></a>
<a n="Proc"><i>0</i></a>
<a n="Subproc"><i>0</i></a>
<a n="HoldReason"><s>Error from starter on grid4.ramscommunity.org: Failed to open '/home/yoichi/.globus/job/grid2.ramscommunity.org/17160.1223348670/stdout' as standard output: No such file or directory (errno 2)</s></a>
<a n="HoldReasonCode"><i>7</i></a>
<a n="HoldReasonSubCode"><i>7</i></a>
</c>
Either I have to create the stdout file myself, or the jobmanager is
having difficulty creating it?
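The hold reason comes from the starter on grid4, while the path it names was set up by the job manager on grid2. One thing worth checking is whether that directory is visible and writable on the execute node. This is only a sketch: it assumes the pool is meant to share /home between the two hosts, which the thread does not say, and check_job_dir is an illustrative name.

```shell
#!/bin/sh
# Report whether a job directory exists and is writable by the current user.
check_job_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Run on the execute node (grid4 in the log above) as the mapped user.
check_job_dir "$HOME/.globus/job" || true
```

If the directory is missing on grid4 but present on grid2, the two hosts are not seeing the same home directory.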
--------------------------------------------------------------------------
Yoichi Takayama, PhD
Senior Research Fellow
RAMP Project
MELCOE (Macquarie E-Learning Centre of Excellence)
MACQUARIE UNIVERSITY
Phone: +61 (0)2 9850 9073
Fax: +61 (0)2 9850 6527
www.mq.edu.au
www.melcoe.mq.edu.au/projects/RAMP/
--------------------------------------------------------------------------
MACQUARIE UNIVERSITY: CRICOS Provider No 00002J
This message is intended for the addressee named and may contain
confidential information. If you are not the intended recipient,
please delete it and notify the sender. Views expressed in this
message are those of the individual sender, and are not necessarily
the views of Macquarie E-Learning Centre Of Excellence (MELCOE) or
Macquarie University.
On 07/10/2008, at 11:55 AM, Charles Bacon wrote:
Using xinetd to start the gatekeeper is fine; nothing in that page
tells you to put "-xinetd" in the globus-gatekeeper.conf. Valid
settings are either -inetd or nothing. You want -inetd.
Charles
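The point above is that -xinetd is not a flag the gatekeeper understands, whichever super-server launches it. A minimal sketch of the change, assuming the conf file holds one option per line as in the listing quoted further down in this thread; fix_inetd_flag is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a stray "-xinetd" line to "-inetd".
# Assumes one option per line; a backup copy is kept as <file>.bak.
fix_inetd_flag() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Only a line consisting of exactly -xinetd (plus trailing blanks) is changed.
    sed 's/^-xinetd[[:space:]]*$/-inetd/' "$conf.bak" > "$conf"
}

# Example, using the path from this thread:
# fix_inetd_flag /usr/local/globus/etc/globus-gatekeeper.conf
```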
On Oct 6, 2008, at 7:40 PM, Yoichi Takayama wrote:
I thought that it was your own manual!
http://www.globus.org/toolkit/docs/4.2/4.2.0/execution/gram2/admin/gram2-admin-configuring.html#gram2-admin-starting
This page presents both inetd and xinetd as options, and I copied the
/etc/xinetd.d/globus-gatekeeper example from it, i.e.:
2. Configure Inetd and Xinetd
While running globus-personal-gatekeeper as a user is a good test,
you will want to configure your machine to run globus-gatekeeper
as root, so that other people will be able to use your gatekeeper.
If you just run the personal gatekeeper, you won't have authority
to su to other user accounts. To setup a full gatekeeper, you will
need to make the following modifications as root:
In /etc/services, add the service name "gsigatekeeper" to port 2119.
gsigatekeeper 2119/tcp # Globus Gatekeeper
Depending on whether your host is running inetd or xinetd, you
will need to modify its configuration. If the directory
/etc/xinetd.d/ exists, then your host is likely running xinetd. If
the directory doesn't exist, your host is likely running inetd.
Follow the appropriate instructions below according to what your
host is running.
etc. etc.
Xinetd
For xinetd, add a file called "globus-gatekeeper" to the
/etc/xinetd.d/ directory that has the following contents. Be sure
to replace GLOBUS_LOCATION below with the actual value of
$GLOBUS_LOCATION in your environment.
service gsigatekeeper
{
    socket_type = stream
    protocol = tcp
    wait = no
    user = root
    env = LD_LIBRARY_PATH=GLOBUS_LOCATION/lib
    server = GLOBUS_LOCATION/sbin/globus-gatekeeper
    server_args = -conf GLOBUS_LOCATION/etc/globus-gatekeeper.conf
    disable = no
}
In general, I am running xinetd rather than inetd, e.g. for GridFTP and MyProxy:
# ls -l /etc/xinetd.d
total 168
-rw-r--r-- 1 root root 333 Oct 5 00:48 globus-gatekeeper
-rw-r--r-- 1 root root 495 Sep 30 21:19 gridftp
-rw-r--r-- 1 root root 326 Sep 9 2004 gssftp
-rw-r--r-- 1 root root 310 Sep 9 2004 klogin
...
-rw-r--r-- 1 root root 279 Sep 24 08:56 myproxy
...
Using xinetd does not seem to be wrong, but should I also install
inetd?
Thanks,
Yoichi
On 07/10/2008, at 6:45 AM, Charles Bacon wrote:
-xinetd does not appear to be a legal option; which document
instructed you to use it? I believe it should just be "-inetd".
Charles
On Oct 6, 2008, at 10:45 AM, Yoichi Takayama wrote:
Trying the real gatekeeper 2119(tcp):
$ globus-job-run "grid2.ramscommunity.org:2119:/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama" /bin/date
GRAM Job submission failed because the connection to the server
failed (check host and port) (error code 12)
Trying the real gatekeeper 2119(tcp) with telnet:
$ telnet -l '/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama' grid2.ramscommunity.org 2119
Trying 137.111.246.176...
Connected to grid2.ramscommunity.org (137.111.246.176).
Escape character is '^]'.
Unknown argument -xinetd
Usage: globus-gatekeeper {-conf parmfile [-test]} | {[-d[ebug] [-inetd | -f] [-p[ort] port] [-home path] [-l[ogfile] logfile] [-e path] [-grid_services file] [-globusid globusid] [-gridmap file] [-globuspwd file] [-x509_cert_dir path] [-x509_cert_file file] [-x509_user_cert file] [-x509_user_key file] [-x509_user_proxy file] [-k] [-globuskmap file] [-test]}
Connection closed by foreign host.
Yoichi
On 07/10/2008, at 1:49 AM, Charles Bacon wrote:
If you have a real gatekeeper on 2119, you can submit to that
as a test also, and get a log in the normal location.
globus-personal-gatekeeper also has logs. See the -help for
the -list and -directory options to find the temporary
directory used.
Charles
On Oct 6, 2008, at 9:16 AM, Yoichi Takayama wrote:
Hi
Thanks for the reply, but this is the test (personal gatekeeper),
and as I said it does not leave any entry in the real log,
$GLOBUS_LOCATION/var/globus-gatekeeper.log. There is no new entry
around the time the error occurred. (I will check for earlier
errors, perhaps from start-up time.)
---------------------------------------------------------
$ myproxy-logon -s grid2 (or grid-proxy-init)
$ globus-personal-gatekeeper -start
GRAM contact: grid2.ramscommunity.org:37335:/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama
$ globus-job-run "grid2.ramscommunity.org:37335:/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama" /bin/date
GRAM Job submission failed because data transfer to the server
failed (error code 10)
(just trying single quotes around the user DN, in case it matters)
$ globus-job-run grid2.ramscommunity.org:37335:'/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama' /bin/date
GRAM Job submission failed because the connection to the
server failed (check host and port) (error code 12)
$ globus-personal-gatekeeper -killall
killing gatekeeper: "grid2.ramscommunity.org:37335:/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama"
---------------------------------------------------------
The entry in /etc/grid-security/grid-mapfile seems to be correct.
---------------------------------------------------------
# cat /etc/grid-security/grid-mapfile
"/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama" yoichi
---------------------------------------------------------
Also, $GLOBUS_LOCATION/etc/globus-gatekeeper.conf seems OK
---------------------------------------------------------
# cat $GLOBUS_LOCATION/etc/globus-gatekeeper.conf
-x509_cert_dir /etc/grid-security/certificates
-x509_user_cert /etc/grid-security/hostcert.pem
-x509_user_key /etc/grid-security/hostkey.pem
-gridmap /etc/grid-security/grid-mapfile
-home /usr/local/globus
-e libexec
-logfile var/globus-gatekeeper.log
-port 2119
-grid_services etc/grid-services
-xinetd
-seg
---------------------------------------------------------
xinetd is set up for the gatekeeper.
---------------------------------------------------------
# cat /etc/xinetd.d/globus-gatekeeper
service gsigatekeeper
{
    socket_type = stream
    protocol = tcp
    wait = no
    user = root
    env = LD_LIBRARY_PATH=/usr/local/globus/lib
    server = /usr/local/globus/sbin/globus-gatekeeper
    server_args = -conf /usr/local/globus/etc/globus-gatekeeper.conf
    disable = no
}
---------------------------------------------------------
Port 2119 is in /etc/services and it is listening.
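For anyone repeating the /etc/services part of this check, a small sketch follows. It only looks for the gsigatekeeper entry quoted from the admin guide earlier in this thread; it does not test whether anything is listening. has_gatekeeper_service is an illustrative name.

```shell
#!/bin/sh
# Succeeds if the given services file maps gsigatekeeper to 2119/tcp.
has_gatekeeper_service() {
    services_file=${1:-/etc/services}
    grep -Eq '^gsigatekeeper[[:space:]]+2119/tcp' "$services_file" 2>/dev/null
}

if has_gatekeeper_service /etc/services; then
    echo "gsigatekeeper 2119/tcp is registered"
else
    echo "gsigatekeeper 2119/tcp is missing from /etc/services"
fi
```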
Thanks,
Yoichi
On 07/10/2008, at 12:33 AM, Charles Bacon wrote:
Googling for that error string returns a copy of the old GT2
GRAM error FAQ:
http://drupal.star.bnl.gov/STAR/?q=node/424#transfer
Try following the advice in that entry.
Charles
On Oct 6, 2008, at 7:49 AM, Yoichi Takayama wrote:
Hi Charles,
I am trying to install Pegasus with Globus 4.2.0 and Condor
7.0.1.
Apparently Pegasus submits jobs via port 2119, which I think is
the gatekeeper (GRAM2).
Since GT 4.2.0 contains GRAM2, I have configured the
gatekeeper and jobmanager following your instructions:
GT 4.2.0 GRAM2: Admin Guide:
http://www.globus.org/toolkit/docs/4.2/4.2.0/execution/gram2/admin/index.html
Although it uses grid-proxy-init, I think that myproxy
should also work. (I have also installed certs etc. for
grid-proxy-init as the instructions told me.)
However, the test described in the instructions
(http://www.globus.org/toolkit/docs/4.2/4.2.0/execution/gram2/admin/gram2-admin-testing.html)
fails with:
------------------------------------------------------------------------------
$ globus-job-run "grid2.ramscommunity.org:42762:/O=Grid/OU=GlobusTest/OU=simpleCA-grid2.ramscommunity.org/OU=ramscommunity.org/CN=Yoichi Takayama" /bin/date
Mon Oct 6 23:00:16 EST 2008
GRAM Job submission failed because data transfer to the
server failed (error code 10)
------------------------------------------------------------------------------
Since the normal gatekeeper log file (var/globus-gatekeeper.log)
does not seem to record globus-personal-gatekeeper activity, I
cannot tell more than this.
My steps are described at:
http://wiki.ramp.org.au/display/vmware/4.9+Globus+-+Node+2+-+GRAM2+(gsigatekeeper%2C+jobmanager)
Can you think of some possible causes for this?
Your help would be greatly appreciated.
Regards,
Yoichi