From the last message you sent yesterday, it looked to me like you
weren't setting GLOBUS_TCP_PORT_RANGE in your client's environment.
That's a requirement too.
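As a rough sketch of what I mean, something like the following in the
client's shell before you run globus-job-run. The 40000,41000 range is
only a placeholder I made up for illustration; substitute a range that
the client machine's firewall actually lets through.

```shell
# Placeholder range (an assumption): replace with the ports open on the client.
export GLOBUS_TCP_PORT_RANGE=40000,41000
# Confirm the variable is visible to commands started from this shell.
echo "GLOBUS_TCP_PORT_RANGE=$GLOBUS_TCP_PORT_RANGE"
```

The variable has to be exported in the same shell session that runs the
submit command, otherwise the client won't see it.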
No, the host/ in the DN is irrelevant.
You got back grid2 because you submitted to fork, which always runs on
the same machine as the gatekeeper.
Charles
On Oct 22, 2008, at 1:54 AM, Yoichi Takayama wrote:
Hi Charles,
Thanks for the help you have been giving.
I don't think it has anything to do with GLOBUS_TCP_PORT_RANGE, now
that it has been fixed in /etc/xinetd.d/globus-gatekeeper.
It seems that something else is stopping the jobmanager (?) from
writing back the results.
I found something weird.
------------------------------------------------------------------------------
$ globus-job-run grid2.ramscommunity.org/jobmanager-fork /bin/hostname
grid2.ramscommunity.org
GRAM Job submission failed because data transfer to the server
failed (error code 10)
------------------------------------------------------------------------------
As you can see, it returned the results but also reported the error.
Also, this is not the same as the condor_submit results. Those return
grid1 or grid4 (they are the Execute nodes, so their host names come
back; grid2 is not an Execute node).
But this happened only the first time.
From the 2nd time onwards, this and any other command (e.g.
/bin/date) all return the same error code 10, and no results are
printed out.
Does it mean the normal STDOUT is blocked after it has been used once?
If I use jobmanager-condor, although it reports the error, it goes
ahead and submits the job, keeps polling the progress, and the job
returns with success, only to fail to write out the results again.
Why do you think this happens? How can I get rid of it? Is it some
lock problem? The gram log seems to show that an NFS sync has been
attempted. Is it an NFS problem? Do I need to remove sync,no_wdelay,
for example? It does not seem to be a permission problem.
/home *.ramscommunity.org(rw,insecure,sync,no_wdelay,no_subtree_check,nohide,mp,no_root_squash)
If it is some kind of bug, should I upgrade to GT4.2.1 just in case?
Can I install it from the binary on CentOS 5.2, or do I have to build
it from source?
Also, I noticed that the MyProxy perl script generates the CA certs
with CN=host/grid2..... on the MyProxy server, but, following the
QuickStart, the other host certs are issued with CN=grid1.... or
CN=grid4... (i.e. without the host/). Does this matter?
Thanks,
Yoichi