Are you using mgiza? It may be the file name of the executable. The name has changed back and forth over time, but EMS only ever calls 'mgiza'.

Run

    grep "not found" steps/1/TRAINING_run-giza.1.STDERR

to see exactly what the error is.
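The same check can be run over both GIZA steps at once. This is only a rough sketch: the function name show_not_found and its directory argument are made up for illustration and are not part of EMS. It assumes the logs follow the TRAINING_run-giza*.STDERR naming seen in this thread.

```shell
#!/bin/sh
# Sketch: print every "not found" line from the STDERR logs of the
# run-giza and run-giza-inverse steps. show_not_found is a hypothetical
# helper name, not something EMS provides.
show_not_found() {
    steps_dir="$1"
    # The pattern ends in .STDERR, so the .STDERR.digest files are skipped.
    for f in "$steps_dir"/TRAINING_run-giza*.STDERR; do
        [ -f "$f" ] || continue
        # Passing /dev/null as a second file makes grep prefix each match
        # with the file name; -n adds the line number.
        grep -n "not found" "$f" /dev/null || true
    done
    return 0
}
```

Calling it as show_not_found steps/1 from the experiment directory should list the offending lines, if any, with the file and line number in front of each.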

On 20/09/2012 18:52, Lane Schwartz wrote:
> The relevant digest files (steps/1/TRAINING_run-giza.1.STDERR.digest
> and steps/1/TRAINING_run-giza-inverse.1.STDERR.digest) each contain
> one line:
>
> not found
>
> The STDERR files for run-giza and run-giza-inverse when EMS crashes
> while running via SGE are (modulo time-stamp messages) identical to
> the respective STDERR files created for those steps when it
> successfully executes when run locally (without the -cluster flag).
>
> I did a grep in the ems scripts directory for the message "not found"
> - it appears in experiment.meta under the run-giza and
> run-giza-inverse steps, but I don't know enough about EMS to know why
> that error is being triggered.
>
> Any ideas for what else I should look for?
>
> Thanks,
> Lane
>
>
> On Thu, Sep 20, 2012 at 2:09 AM, Barry Haddow
> <[email protected]> wrote:
>> Hi Lane
>>
>> If ems failed on a given step, then there should be a message in the digest
>> file for that step. What exactly does ems report?
>>
>> Cheers - Barry
>>
>>
>>
>> Sent from my ZX81
>>
>>
>> ----- Reply message -----
>> From: "Lane Schwartz" <[email protected]>
>> Date: Wed, Sep 19, 2012 20:18
>> Subject: [Moses-support] EMS, mgiza, and SGE
>> To: <[email protected]>
>>
>> I'm trying to get up to speed using EMS. I have a small dataset (IWSLT
>> 2008) that I am using to train, tune, and test using EMS.
>>
>> I am able to reliably run EMS on my data on a single machine.
>>
>> My config file specifies jobs=10 and qsub-settings="-l
>> hostname=*machinesA*|*machinesB*|*machinesC*" where the hostname
>> patterns match machine names in my grid.
>>
>> When I run experiment.perl with the -cluster flag, the experiment
>> runs, but it consistently dies while running run-giza and
>> run-giza-inverse. Strangely, when I look in the steps directory and
>> the training directory, it appears that mgiza has run successfully in
>> both directions. I don't see any error messages. Does anyone have any
>> idea what might be going on here?
>>
>> I am using the exact same config file, and it runs successfully when I
>> launch experiment.perl without the -cluster flag. When I use the
>> -cluster flag, everything runs successfully until it gets to the giza
>> steps, which it appears to run, and then EMS dies.
>>
>> Thanks,
>> Lane Schwartz
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>>
>>
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>
>

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
