Hi,
I ran into a strange problem recently.
After I submitted the same job several times in a row as the same user,
I found that most of the jobs finished successfully, but the rest
stayed in the "stageIn"/"Active" state for hours, far longer than
the job's execution time. I cannot figure out what happened, because
container.log contains no error information.
I hope someone can tell me how this happened and how to handle it.
Thanks
Here is my job file:
=============================================
<job>
<factoryEndpoint xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
<wsa:Address>https://serverIP:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
<wsa:ReferenceProperties>
<gram:ResourceID>Fork</gram:ResourceID>
</wsa:ReferenceProperties>
</factoryEndpoint>
<executable>/home/job/seqret/seqret</executable>
<directory>/home/job/seqret/</directory>
<argument>fasta::job136.stgin</argument>
<argument>phylip::job136.stgout</argument>
<stdout>job136.stdout</stdout>
<stderr>job136.stderr</stderr>
<count>3000</count>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://ServerIP:2811/tmp/workload/Outfiles/outseq.stgin</sourceUrl>
<destinationUrl>file:///home/job/seqret/job136.stgin</destinationUrl>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///home/job/seqret/job136.stdout</sourceUrl>
<destinationUrl>gsiftp://ServerIP:2811/tmp/workload/Infiles/job136.stdout</destinationUrl>
</transfer>
<transfer>
<sourceUrl>file:///home/job/seqret/job136.stderr</sourceUrl>
<destinationUrl>gsiftp://ServerIP:2811/tmp/workload/Infiles/job136.stderr</destinationUrl>
</transfer>
<transfer>
<sourceUrl>file:///home/job/seqret/job136.stgout</sourceUrl>
<destinationUrl>gsiftp://ServerIP:2811/tmp/workload/Infiles/job136.answer</destinationUrl>
</transfer>
</fileStageOut>
</job>