Hi Esteban,

OK, that makes more sense. Your original email mentioned "globusrun", not "globusrun-ws"; globusrun-ws is the correct one. Also, while the 'sge.pm' file is used for job submission in GT4/WS, its "poll()" function is not used. In WS, the scheduler event generator process reads SGE's reporting file to pick up all changes to job state information. You should see that process running in the ps output as:

path-to-globus/globus/libexec/globus-scheduler-event-generator -s sge -t 1213124041

where '-t xxxxxxxxxx' is the timestamp of the last time it stopped. There's one for 'fork' as well. So based on your symptoms, it seems to me that either the SGE job-id isn't being correctly registered in the jobmanager or the scheduler-event-generator is having problems processing the reporting file. I've not built directly from the LeSC distribution before. I've been reviewing it recently and may try to replicate your problem this week.

A couple of things you might try:

- Make sure that the SGE data is getting into the reporting file, e.g. do a tail on the reporting file while you do normal SGE qsubs and watch the job appear and cycle through its stages.

- Run the globus-scheduler-event-generator by hand (cut & paste from your ps result as above) but with a debugging flag set in your shell environment: export SEG_SGE_DEBUG=15. For this to work you probably need to have built the 'dbg' libs. This will show the output of the scheduler-event-generator as it scans the reporting file.
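The two checks above might look roughly like the sketch below. It rests on assumptions: the SGE_ROOT and SGE_CELL defaults are guessed from the sge_qstat path quoted further down in this thread, the reporting file is assumed to sit in the usual <cell>/common directory, and job_records is a hypothetical helper written for this example, not something shipped with SGE or Globus.

```shell
# Sketch only: adjust the paths for your site.
# Assumption: defaults below are inferred from the sge_qstat path in this thread.
SGE_ROOT="${SGE_ROOT:-/usr/local/sge/pro}"
SGE_CELL="${SGE_CELL:-default}"
REPORTING_FILE="${REPORTING_FILE:-$SGE_ROOT/$SGE_CELL/common/reporting}"

# Hypothetical helper: print every reporting-file record in which the given
# job id appears as a whole colon-separated field. This is a coarse filter
# (any field equal to the id matches), meant only for eyeballing the file.
# Usage: job_records <job-id> [reporting-file]
job_records() {
    awk -F: -v id="$1" '{ for (i = 1; i <= NF; i++) if ($i == id) { print; break } }' "${2:-$REPORTING_FILE}"
}

# Check 1: in one terminal, watch the file while you qsub from another:
#   tail -f "$REPORTING_FILE"
# and once the job is done, list what was recorded for it:
#   job_records <sge-job-id>
#
# Check 2: run the event generator by hand with debugging switched on,
# reusing the exact command line from your own ps output:
#   export SEG_SGE_DEBUG=15
#   path-to-globus/globus/libexec/globus-scheduler-event-generator -s sge -t 1213124041
```

If job_records prints nothing for a job that SGE says has finished, the problem is probably on the reporting side; if the records are there, the event generator's debug output is the next place to look.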

Thanks, Jeff


Esteban Freire wrote:
Hi Jeff,

Thanks for your reply. I think I have misunderstood something. Yes, I'm interested in deploying GT4 + SGE, and I have in fact compiled and installed Globus-4.0.7. I mentioned "turning on SGE's reporting file", the globusrun-ws command, and the poll function (in the sge.pm file) because I'm following the Globus + SGE integration guide at http://www.lesc.ic.ac.uk/projects/SGE-GT4.html. Should I use a different command than globusrun-ws to submit the jobs? Or should I follow a different Globus + SGE integration tutorial?

I would appreciate your help because I'm a bit lost about this right now.

Thanks,
Esteban

Jeff Porter wrote:

Hi Esteban,

I am a bit confused. You mention turning on SGE's reporting file which indeed is needed for GT4 (WS GRAM) but then you discuss running with "globusrun" and looking at the "poll()" function in the sge.pm file. Both of those are in the GT2 (pre-WS) framework. From our other discussions you were interested in deploying GT4. Is that what you're trying to do? e.g. run with globusrun-ws?

Thanks, Jeff

Esteban Freire wrote:
Hello all,

I'm trying to integrate SGE (GE 6.1u3) with Globus (globus-4.0.7), but I still have the same old problem that I had in previous attempts. I'm using the Globus + SGE integration provided by LeSC, http://www.lesc.ic.ac.uk/projects/SGE-GT4.html

I can submit jobs through Fork correctly, and I can submit jobs with *qsub* correctly too. I have also enabled *reporting_params reporting=true* and made the reporting file accessible to globus.

I attach to this e-mail the outputs that I consider most important. I send the job to SGE with the *globusrun* command; the job starts running under SGE correctly and finishes successfully (according to SGE). The *.stdout and *.stderr files are generated correctly in the user's home directory, and the *.stdout file contains the correct output for the job. For some reason, though, the job manager doesn't see that the job has finished, and it stays at *Current job state: Unsubmitted* without ever finishing until I press [ctrl + c].

I have been looking at */usr/local/globus-4.0.7/lib/perl/Globus/GRAM/JobManager/sge.pm*, in particular at the *sub poll* function, which checks whether the job has finished using the qstat -j command. Debugging shows that the qstat is never run: the script executes the qsub correctly and gets the job ID, but at some step that I can't identify it stops and never executes the poll function.

In addition, we have configured 'sge_qstat' so that it isn't necessary to run qstat -u '*' to see all running/queued jobs, so the difference from previous versions of SGE is minimal.

[EMAIL PROTECTED] ~]$ cat /usr/local/sge/pro/default/common/sge_qstat
-u *

I would appreciate any help, and comments are welcome.


Thanks in advance,
Esteban




