Are you using mgiza? The problem may be the file name of the executable.
The name has gone back and forth, but the EMS only calls 'mgiza'. Run
   grep "not found"  steps/1/TRAINING_run-giza.1.STDERR
to see exactly what the error is.
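
A rough sketch of that check, assuming a POSIX shell and that the step logs sit under steps/1 as in the command above. The helper names check_exe and find_not_found are made up for illustration and are not part of EMS.

```shell
# Hypothetical helper: report whether an executable name resolves on the
# current PATH, and where. Returns non-zero when it does not resolve.
check_exe() {
    name="$1"
    if path=$(command -v "$name" 2>/dev/null); then
        echo "found: $name -> $path"
    else
        echo "not found: $name"
        return 1
    fi
}

# Hypothetical helper: list the *.STDERR files in a steps directory that
# contain the string "not found".
find_not_found() {
    dir="$1"
    grep -l "not found" "$dir"/*.STDERR 2>/dev/null
}
```

For example, `check_exe mgiza` shows whether the name 'mgiza' resolves in the current environment, and `find_not_found steps/1` lists the step logs that contain the message. If the cluster nodes use a different environment from your local shell (an assumption, not something confirmed in this thread), running check_exe inside a submitted job would show whether the name resolves there too.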

On 20/09/2012 18:52, Lane Schwartz wrote:
> The relevant digest files (steps/1/TRAINING_run-giza.1.STDERR.digest
> and steps/1/TRAINING_run-giza-inverse.1.STDERR.digest) each contain
> one line:
>
> not found
>
> The STDERR files for run-giza and run-giza-inverse when EMS crashes
> while running via SGE are (modulo time-stamp messages) identical to
> the respective STDERR files created for those steps when it
> successfully executes when run locally (without the -cluster flag).
>
> I did a grep in the ems scripts directory for the message "not found"
> - it appears in experiment.meta under the run-giza and
> run-giza-inverse steps, but I don't know enough about EMS to know why
> that error is being triggered.
>
> Any ideas for what else I should look for?
>
> Thanks,
> Lane
>
>
> On Thu, Sep 20, 2012 at 2:09 AM, Barry Haddow
> <[email protected]> wrote:
>> Hi Lane
>>
>> If ems failed on a given step, then there should be a message in the digest
>> file for that step. What exactly does ems report?
>>
>> Cheers - Barry
>>
>>
>>
>> Sent from my ZX81
>>
>>
>> ----- Reply message -----
>> From: "Lane Schwartz" <[email protected]>
>> Date: Wed, Sep 19, 2012 20:18
>> Subject: [Moses-support] EMS, mgiza, and SGE
>> To: <[email protected]>
>>
>> I'm trying to get up to speed using EMS. I have a small dataset (IWSLT
>> 2008) that I am using to train, tune, and test using EMS.
>>
>> I am able to reliably run EMS on my data on a single machine.
>>
>> My config file specifies jobs=10 and qsub-settings="-l
>> hostname=*machinesA*|*machinesB*|*machinesC*" where the hostname
>> patterns match machine names in my grid.
>>
>> When I run experiment.perl with the -cluster flag, the experiment
>> runs, but it consistently dies while running run-giza and
>> run-giza-inverse. Strangely, when I look in the steps directory and
>> the training directory, it appears that mgiza has run successfully in
>> both directions. I don't see any error messages. Does anyone have any
>> idea what might be going on here?
>>
>> I am using the exact same config file, and it runs successfully when I
>> launch experiment.perl without the -cluster flag. When I use the
>> -cluster flag, everything runs successfully until it gets to the giza
>> steps, which it appears to run, and then EMS dies.
>>
>> Thanks,
>> Lane Schwartz
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
