Re: Pipeline Mystery
Yes, MERT must be dying. Can you post the contents of the tune/ directory, and the tail of mert.log?

matt
(from my phone)

> On 27 Oct 2016, at 00:49, John Hewitt wrote:
>
> It seems like MERT isn't writing its final config file (which is typical
> of MERT, in my experience). I recall giving up and using kbmira. This final
> config file is the one used in test, so I can see why skipping to test ends
> up failing pretty quick.
>
> To answer your question, though, I haven't tried. Not in my bandwidth right
> now.
>
> -John
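The two diagnostics requested in this reply (a listing of the tune/ directory and the end of mert.log) could be gathered with a small Python sketch like the one below. The helper names `list_dir` and `tail` are hypothetical, and the assumption that mert.log sits inside tune/ under the run directory is mine, not something the thread confirms; adjust the paths to the actual experiment.

```python
from collections import deque
from pathlib import Path


def list_dir(directory):
    """Return (name, size_in_bytes) for each entry in `directory`, sorted by name."""
    entries = []
    for path in sorted(Path(directory).iterdir()):
        # Directories are reported with size 0; only regular files are measured.
        size = path.stat().st_size if path.is_file() else 0
        entries.append((path.name, size))
    return entries


def tail(log_path, n=20):
    """Return the last `n` lines of a text file, keeping only `n` lines in memory."""
    with open(log_path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    # Assumed layout: run from the experiment directory, with mert.log under tune/.
    tune_dir = Path("tune")
    if tune_dir.is_dir():
        for name, size in list_dir(tune_dir):
            print(f"{size:>12}  {name}")
        log = tune_dir / "mert.log"
        if log.is_file():
            print("\n".join(tail(log)))
```

A zero-byte or missing joshua.config.final in the listing, together with the last lines of the log, is usually enough to tell whether the tuner stopped early.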
Re: Pipeline Mystery
Hi John,
Thanks for your response. Replies inline...

On Wed, Oct 26, 2016 at 9:49 PM, <dev-digest-h...@joshua.incubator.apache.org> wrote:

> From: John Hewitt <john...@seas.upenn.edu>
> To: dev@joshua.incubator.apache.org
> Cc:
> Date: Thu, 27 Oct 2016 00:49:34 -0400
> Subject: Re: Pipeline Mystery
>
> It seems like MERT isn't writing its final config file (which is typical
> of MERT, in my experience). I recall giving up and using kbmira. This final
> config file is the one used in test, so I can see why skipping to test ends
> up failing pretty quick.

From my understanding, in order to use --tuner kbmira, I need to download, configure and run Moses. Is this correct? I would REALLY prefer not to do this if at all possible.
In the meantime, I'm going to try another fresh pipeline run and see where I get. Sometimes starting afresh has led to surprising and delightful results :)

> To answer your question, though, I haven't tried. Not in my bandwidth right
> now.

No problem. In all honesty, an entire pipeline execution on a small parallel dataset would be a killer smoke test for any contributions coming into Joshua. Language pack creation is so important, and having confidence in the overall process is something I really look forward to building over the next while.

Thanks
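The smoke-test idea above could be sketched roughly as follows: run a command, then check that the files it is supposed to produce exist and are non-empty, instead of trusting the return code alone. This is only an illustration; `smoke_test` is a hypothetical helper and not part of Joshua, and which files a healthy run must produce is an assumption to be filled in by whoever wires it up.

```python
import subprocess
from pathlib import Path


def smoke_test(command, expected_files, cwd="."):
    """Run `command` in `cwd` and report which expected output files are missing.

    Returns (return_code, missing), where `missing` lists the expected paths
    (relative to `cwd`) that do not exist or are empty after the run.
    """
    result = subprocess.run(command, cwd=cwd)
    missing = []
    for rel in expected_files:
        path = Path(cwd) / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return result.returncode, missing
```

For the failure discussed in this thread, `command` would be the pipeline invocation on a small parallel dataset, and `expected_files` would include tune/joshua.config.final, since that is the file the test stage could not open.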
Re: Pipeline Mystery
It seems like MERT isn't writing its final config file (which is typical of MERT, in my experience). I recall giving up and using kbmira. This final config file is the one used in test, so I can see why skipping to test ends up failing pretty quick.

To answer your question, though, I haven't tried. Not in my bandwidth right now.

-John

On Thu, Oct 27, 2016 at 12:44 AM, lewis john mcgibbney wrote:

> Hi Folks,
> So I've been plodding away again and feel I am very close to generating my
> first language pack, however I've arrived at the following fankle!!!
> If I run a pipeline from start to finish it fails at the 'test-bundle-1'
> phase as below stating "[Errno 2] No such file or directory:
> '/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final'"
>
> However, if I run the pipeline with the --first-step test flag, then I get
> the following where the 'test-bundle-1' phase executes and completes
> flawlessly however the pipeline then goes on to die at the 'test-decode-1'
> phase!!!
Pipeline Mystery
Hi Folks,
So I've been plodding away again and feel I am very close to generating my first language pack, however I've arrived at the following fankle!!!
If I run a pipeline from start to finish it fails at the 'test-bundle-1' phase as below stating "[Errno 2] No such file or directory: '/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final'"

lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments/exp3 $ /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test --source en --target ru --readme "Experiment 3 Run 1 of ru --> en model training" --aligner berkeley --hadoop-mem 10g --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir
[train-copy-and-filter] cached, skipping...
[train-tokenize-en] cached, skipping...
[train-tokenize-ru] cached, skipping...
[train-trim] cached, skipping...
[train-lowercase-en] cached, skipping...
[train-lowercase-ru] cached, skipping...
[train-vocab-en] cached, skipping...
[train-vocab-ru] cached, skipping...
[tune-copy-and-filter] cached, skipping...
[tune-tokenize-en] cached, skipping...
[tune-tokenize-ru] cached, skipping...
[tune-lowercase-en] cached, skipping...
[tune-lowercase-ru] cached, skipping...
[tune-vocab-en] cached, skipping...
[tune-vocab-ru] cached, skipping...
[test-copy-and-filter] cached, skipping...
[test-tokenize-en] cached, skipping...
[test-tokenize-ru] cached, skipping...
[test-lowercase-en] cached, skipping...
[test-lowercase-ru] cached, skipping...
[test-vocab-en] cached, skipping...
[test-vocab-ru] cached, skipping...
[lm-sort-uniq] cached, skipping...
[kenlm] cached, skipping...
[compile-kenlm] cached, skipping...
[glue-tune] cached, skipping...
[tune-bundle] cached, skipping...
[mert-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config [CHANGED]
  dep=tune/model/grammar.gz.packed/slice_0.source
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner mert --decoder /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command --decoder-config /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config --decoder-output-file /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest --decoder-log-file /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log --iterations 10 --metric 'BLEU 4 closest'
  took 27 seconds (27s)
[test-bundle-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final [NOT FOUND]
  dep=grammar.gz
  dep=/usr/local/joshua_resources/russian_experiments/exp3/test/1/model/joshua.config
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final /usr/local/joshua_resources/russian_experiments/exp3/test/1/model --copy-config-options '-top-n 300 -pop-limit 5000 -output-format "%i ||| %s ||| %f ||| %c" -mark-oovs false' --pack-tm grammar.gz --tm /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
  JOB FAILED (return code 2)
ERROR:root:ERROR: argument config: can't open '/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final': [Errno 2] No such file or directory: '/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final'

However, if I run the pipeline with the --first-step test flag, then I get the following, where the 'test-bundle-1' phase executes and completes flawlessly; however, the pipeline then goes on to die at the 'test-decode-1' phase!!!

lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments/exp3 $ /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test --source en --target ru --readme "Experiment 3 Run 1 of ru --> en model training" --aligner berkeley --hadoop-mem 10g --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir --first-step test --grammar /usr/local/joshua_resources/russian_experiments/exp3/grammar.gz --joshua-mem 10g
[train-copy-and-filter] cached, skipping...
[train-tokenize-en] cached, skipping...
[train-tokenize-ru]
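For long pipeline logs like the one quoted in this message, a short Python sketch can pull out which rebuilt steps had dependencies flagged [NOT FOUND]. It assumes only the log shape visible above (a `[step-name] rebuilding...` marker followed by `dep=` entries and a `cmd=` entry); the function name `missing_deps` and the regular expressions are illustrative and not part of Joshua.

```python
import re

# A step marker such as "[mert-1] rebuilding..." or "[kenlm] cached, skipping...".
STEP = re.compile(r"\[([a-z0-9-]+)\]\s+(cached, skipping|rebuilding)\.\.\.")
# A dependency entry, optionally followed by a status flag.
DEP = re.compile(r"dep=(\S+)(?:\s+\[(CHANGED|NOT FOUND)\])?")


def missing_deps(log_text):
    """Map each rebuilt step to the list of its dependencies flagged [NOT FOUND]."""
    result = {}
    steps = list(STEP.finditer(log_text))
    for i, step in enumerate(steps):
        if step.group(2) != "rebuilding":
            continue
        # The step's section runs until the next step marker (or end of log).
        end = steps[i + 1].start() if i + 1 < len(steps) else len(log_text)
        section = log_text[step.end():end]
        # Ignore everything from cmd= onwards so command arguments are not scanned.
        deps_part = section.split("cmd=", 1)[0]
        result[step.group(1)] = [
            dep.group(1) for dep in DEP.finditer(deps_part) if dep.group(2) == "NOT FOUND"
        ]
    return result


if __name__ == "__main__":
    import sys

    # Usage: python missing_deps.py < pipeline_output.txt
    for step_name, deps in missing_deps(sys.stdin.read()).items():
        for dep in deps:
            print(f"{step_name}: missing {dep}")
```

Run against the first log above, this kind of scan would point at joshua.config.final under both mert-1 and test-bundle-1, which matches the reading that the tuner step is the place to look.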