Hi Charles,

I have added GLOBUS_TCP_PORT_RANGE to the gatekeeper configuration, but still no success.

The GRAM log changed, but I am not sure exactly what the error is.

The job uses the shared $HOME, so could the output and error files still be open on the submitter side while the executing node is trying to overwrite them?

Thanks for your help.

Yoichi



On grid2: (Globus service)

# cat /etc/xinetd.d/globus-gatekeeper
service gsigatekeeper
{
   socket_type  = stream
   protocol     = tcp
   wait         = no
   user         = root
   env          = LD_LIBRARY_PATH=/usr/local/globus/lib
   env         += GLOBUS_TCP_PORT_RANGE=40000,41000
   server       = /usr/local/globus/sbin/globus-gatekeeper
   server_args  = -conf /usr/local/globus/etc/globus-gatekeeper.conf
   disable      = no
}

# service xinetd restart
Stopping xinetd:                                           [  OK  ]
Starting xinetd:                                           [  OK  ]


On grid1: (Globus client)

# su - yoichi

$ globus-job-run grid2.ramscommunity.org/jobmanager-condor /bin/hostname
GRAM Job submission failed because data transfer to the server failed (error code 10)

$ globus-job-status https://grid1.ramscommunity.org:36296/
DONE

$ globus-job-get-output https://grid1.ramscommunity.org:36296/
GRAM Job submission failed because the connection to the server failed (check host and port) (error code 12)



$ cat gram_job_mgr_17566.log
10/22 02:32:24 JM: TARGET_GLOBUS_LOCATION = /usr/local/globus
10/22 02:32:24 JM: Security context imported
10/22 02:32:24 JM: Adding new callback contact (url=https://grid1.ramscommunity.org:36296/ , mask=1048575)
10/22 02:32:24 JM: Added successfully
10/22 02:32:24 Pre-parsed RSL string: &("rsl_substitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) ) ("stderr" = $("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" = $("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
10/22 02:32:24
<<<<<Job Request RSL
&("rsl_substitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) )("stderr" = $("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" = $("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
>>>>>Job Request RSL
10/22 02:32:24
<<<<<Job Request RSL (canonical)
&("rslsubstitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) )("stderr" = $("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" = $("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
>>>>>Job Request RSL (canonical)
10/22 02:32:24 JM: Evaluating RSL Value
10/22 02:32:24 JM: Evaluated RSL Value to GLOBUSRUN_GASS_URL
10/22 02:32:24 JM: Evaluating RSL Value
10/22 02:32:24 JM: Evaluated RSL Value to https://grid1.ramscommunity.org:50351
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_MAKE_SCRATCHDIR
10/22 02:32:24
<<<<<Job RSL
&("environment" = ("HOME" "/home/yoichi" ) ("LOGNAME" "yoichi" ) ) ("rslsubstitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) )("stderr" = $("GLOBUSRUN_GASS_URL") # "/dev/stderr" )("stdout" = $("GLOBUSRUN_GASS_URL") # "/dev/stdout" )("executable" = "/bin/hostname" )
>>>>>Job RSL
10/22 02:32:24
<<<<<Job RSL (post-eval)
&("environment" = ("HOME" "/home/yoichi" ) ("LOGNAME" "yoichi" ) ) ("rslsubstitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) )("stderr" = "https://grid1.ramscommunity.org:50351/dev/stderr" ) ("stdout" = "https://grid1.ramscommunity.org:50351/dev/stdout" ) ("executable" = "/bin/hostname" )
>>>>>Job RSL (post-eval)
Adding default RSL of proxy_timeout = 60
Adding default RSL of dry_run = no
Adding default RSL of gram_my_job = collective
Adding default RSL of job_type = multiple
Adding default RSL of count = 1
Adding default RSL of stdin = /dev/null
Adding default RSL of directory = $(HOME)
10/22 02:32:24
<<<<<Job RSL (post-validation)
&("directory" = $("HOME") )("stdin" = "/dev/null" )("count" = "1" ) ("job_type" = "multiple" )("gram_my_job" = "collective" )("dry_run" = "no" )("proxy_timeout" = "60" )("environment" = ("HOME" "/home/yoichi" ) ("LOGNAME" "yoichi" ) )("rslsubstitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) ) ("stderr" = "https://grid1.ramscommunity.org:50351/dev/stderr" ) ("stdout" = "https://grid1.ramscommunity.org:50351/dev/stdout" ) ("executable" = "/bin/hostname" )
>>>>>Job RSL (post-validation)
10/22 02:32:24
<<<<<Job RSL (post-validation-eval)
&("directory" = "/home/yoichi" )("stdin" = "/dev/null" )("count" = "1" )("job_type" = "multiple" )("gram_my_job" = "collective" ) ("dry_run" = "no" )("proxy_timeout" = "60" )("environment" = ("HOME" "/home/yoichi" ) ("LOGNAME" "yoichi" ) )("rslsubstitution" = ("GLOBUSRUN_GASS_URL" "https://grid1.ramscommunity.org:50351" ) ) ("stderr" = "https://grid1.ramscommunity.org:50351/dev/stderr" ) ("stdout" = "https://grid1.ramscommunity.org:50351/dev/stdout" ) ("executable" = "/bin/hostname" )
>>>>>Job RSL (post-validation-eval)
10/22 02:32:24 JMI: Getting RSL output value
10/22 02:32:24 JMI: Processing output positions
10/22 02:32:24 JMI: Getting RSL output value
10/22 02:32:24 JMI: Processing output positions
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_REMOTE_IO_FILE_CREATE
10/22 02:32:24 JM: Opening output destinations
10/22 02:32:24 JM: stdout goes to /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144/stdout
10/22 02:32:24 JM: stderr goes to /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144/stderr
10/22 02:32:24 JM: Opening https://grid1.ramscommunity.org:50351/dev/stdout
10/22 02:32:24 JM: Opened GASS handle 1.
10/22 02:32:24 JM: exiting globus_l_gram_job_manager_output_destination_open()
10/22 02:32:24 JM: Opening https://grid1.ramscommunity.org:50351/dev/stderr
10/22 02:32:24 JM: Opened GASS handle 2.
10/22 02:32:24 JM: exiting globus_l_gram_job_manager_output_destination_open()
10/22 02:32:24 stdout or stderr is being used, starting to poll
10/22 02:32:24 JM: Finished opening output destinations
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_CLOSE_OUTPUT
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_PRE_FILE_CLEAN_UP
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_FILE_CLEAN_UP
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_SCRATCH_CLEAN_UP
10/22 02:32:24 JMI: testing job manager scripts for type condor exist and permissions are ok.
10/22 02:32:24 JMI: completed script validation: job manager type is condor.
10/22 02:32:24 JMI: cmd = cache_cleanup
Wed Oct 22 02:32:24 2008 JM_SCRIPT: New Perl JobManager created.
Wed Oct 22 02:32:24 2008 JM_SCRIPT: Using jm supplied job dir: /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144
Wed Oct 22 02:32:24 2008 JM_SCRIPT: Using jm supplied job dir: /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144
Wed Oct 22 02:32:24 2008 JM_SCRIPT: cache_cleanup(enter)
Wed Oct 22 02:32:24 2008 JM_SCRIPT: Cleaning files in job dir /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144
Wed Oct 22 02:32:24 2008 JM_SCRIPT: Removed 3 files from /home/yoichi/.globus/job/grid2.ramscommunity.org/17566.1224603144
Wed Oct 22 02:32:24 2008 JM_SCRIPT: cache_cleanup(exit)
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_CACHE_CLEAN_UP
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_RESPONSE
10/22 02:32:24 JM: before sending to client: rc=0 (Success)
10/22 02:32:24 Job Manager State Machine (exiting): GLOBUS_GRAM_JOB_MANAGER_STATE_FAILED_DONE
10/22 02:32:24 JM: in globus_gram_job_manager_reporting_file_remove()
10/22 02:32:24 Job Manager State Machine (entering): GLOBUS_GRAM_JOB_MANAGER_STATE_FAILED_DONE
10/22 02:32:24 JM: in globus_gram_job_manager_reporting_file_remove()
10/22 02:32:24 JM: exiting globus_gram_job_manager.
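
A side check that can be made from the output above is whether the ports that appear in the grid1 URLs (36296 for the callback contact, 50351 for the GASS URL) fall inside the 40000,41000 range configured on grid2. The following is only a minimal shell sketch: the helper name port_in_range is hypothetical (it is not a Globus tool), and it assumes nothing beyond the "min,max" form used for GLOBUS_TCP_PORT_RANGE in the xinetd file. Whether this is related to the failure is not established here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): succeed if PORT lies within a
# "min,max" range string such as the GLOBUS_TCP_PORT_RANGE value above.
port_in_range() {
    port=$1
    range=$2
    min=${range%,*}   # text before the comma
    max=${range#*,}   # text after the comma
    [ "$port" -ge "$min" ] && [ "$port" -le "$max" ]
}

# Ports taken from the grid1 URLs in the commands and log above.
for port in 36296 50351; do
    if port_in_range "$port" "40000,41000"; then
        echo "$port is inside 40000,41000"
    else
        echo "$port is outside 40000,41000"
    fi
done
```

Both URLs name grid1 as the host, so this check describes the client side only; it says nothing about whether the gatekeeper on grid2 picked up the new setting.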

--------------------------------------------------------------------------
Yoichi Takayama, PhD
Senior Research Fellow
RAMP Project
MELCOE (Macquarie E-Learning Centre of Excellence)
MACQUARIE UNIVERSITY

Phone: +61 (0)2 9850 9073
Fax: +61 (0)2 9850 6527
www.mq.edu.au
www.melcoe.mq.edu.au/projects/RAMP/
--------------------------------------------------------------------------

