Re: Pipeline Mystery

2016-10-27 Thread Matt Post
Yes, MERT must be dying. Can you post the contents of the tune/ directory and 
the tail of mert.log?
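
One way to gather both pieces of information in a single step is sketched below. It is only an illustration: it assumes mert.log sits inside the tune/ directory (the thread does not say where it is written), and the helper names tail and summarize_tune_dir are made up for this example rather than being part of Joshua.

```python
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file, or [] if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open(errors="replace") as fh:
        return fh.read().splitlines()[-n:]


def summarize_tune_dir(tune_dir, log_name="mert.log", n=20):
    """Collect a directory listing, the final-config check, and the log tail.

    Assumption: the tuner log is named mert.log and lives in tune_dir.
    """
    tune = Path(tune_dir)
    listing = sorted(p.name for p in tune.iterdir()) if tune.is_dir() else []
    return {
        "files": listing,
        # The test-bundle step fails when this file was never written.
        "has_final_config": (tune / "joshua.config.final").is_file(),
        "log_tail": tail(tune / log_name, n),
    }


if __name__ == "__main__":
    report = summarize_tune_dir("tune")
    print("\n".join(report["files"]))
    print("joshua.config.final present:", report["has_final_config"])
    print("\n".join(report["log_tail"]))
```

Run from the experiment directory (exp3 in the message below), this prints the listing, whether joshua.config.final exists, and the last lines of the log.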

matt (from my phone)

> On Oct 27, 2016, at 00:49, John Hewitt wrote:
> 
> It seems like MERT isn't writing its final config file (which is typical
> of MERT, in my experience). I recall giving up and using kbmira. This final
> config file is the one used in test, so I can see why skipping to test ends
> up failing pretty quick.
> 
> To answer your question, though, I haven't tried. Not in my bandwidth right
> now.
> 
> -John
> 
> On Thu, Oct 27, 2016 at 12:44 AM, lewis john mcgibbney 
> wrote:
> 
>> Hi Folks,
>> So I've been plodding away again and feel I am very close to generating my
>> first language pack, however I've arrived at the following fankle!!!
>> If I run a pipeline from start to finish it fails at the 'test-bundle-1'
>> phase as below stating " [Errno 2] No such file or directory:
>> '/usr/local/joshua_resources/russian_experiments/exp3/tune/
>> joshua.config.final'"
>> 
>> lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments/exp3 $
>> /usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero
>> --corpus
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en
>> --tune
>> /usr/local/joshua_resources/russian_experiments/data/
>> commoncrawl.ru-en.tune
>> --test
>> /usr/local/joshua_resources/russian_experiments/data/
>> commoncrawl.ru-en.test
>> --source en --target ru --readme "Experiment 3 Run 1 of ru --> en model
>> training" --aligner berkeley --hadoop-mem 10g --tmp
>> /usr/local/hadoop-2.5.2/hadoop_tmp_dir
>> [train-copy-and-filter] cached, skipping...
>> [train-tokenize-en] cached, skipping...
>> [train-tokenize-ru] cached, skipping...
>> [train-trim] cached, skipping...
>> [train-lowercase-en] cached, skipping...
>> [train-lowercase-ru] cached, skipping...
>> [train-vocab-en] cached, skipping...
>> [train-vocab-ru] cached, skipping...
>> [tune-copy-and-filter] cached, skipping...
>> [tune-tokenize-en] cached, skipping...
>> [tune-tokenize-ru] cached, skipping...
>> [tune-lowercase-en] cached, skipping...
>> [tune-lowercase-ru] cached, skipping...
>> [tune-vocab-en] cached, skipping...
>> [tune-vocab-ru] cached, skipping...
>> [test-copy-and-filter] cached, skipping...
>> [test-tokenize-en] cached, skipping...
>> [test-tokenize-ru] cached, skipping...
>> [test-lowercase-en] cached, skipping...
>> [test-lowercase-ru] cached, skipping...
>> [test-vocab-en] cached, skipping...
>> [test-vocab-ru] cached, skipping...
>> [lm-sort-uniq] cached, skipping...
>> [kenlm] cached, skipping...
>> [compile-kenlm] cached, skipping...
>> [glue-tune] cached, skipping...
>> [tune-bundle] cached, skipping...
>> [mert-1] rebuilding...
>> 
>> dep=/usr/local/joshua_resources/russian_experiments/
>> exp3/data/tune/corpus.en
>> 
>> dep=/usr/local/joshua_resources/russian_experiments/
>> exp3/tune/joshua.config
>> [CHANGED]
>>  dep=tune/model/grammar.gz.packed/slice_0.source
>> 
>> dep=/usr/local/joshua_resources/russian_experiments/
>> exp3/tune/joshua.config.final
>> [NOT FOUND]
>>  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py
>> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
>> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru
>> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune
>> --tuner
>> mert --decoder
>> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command
>> --decoder-config
>> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
>> --decoder-output-file
>> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest
>> --decoder-log-file
>> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log
>> --iterations 10 --metric 'BLEU 4 closest'
>>  took 27 seconds (27s)
>> [test-bundle-1] rebuilding...
>> 
>> dep=/usr/local/joshua_resources/russian_experiments/
>> exp3/tune/joshua.config.final
>> [NOT FOUND]
>>  dep=grammar.gz
>> 
>> dep=/usr/local/joshua_resources/russian_experiments/
>> exp3/test/1/model/joshua.config
>>  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force
>> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir
>> /usr/local/joshua_resources/russian_experiments/exp3/tune/
>> joshua.config.final
>> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model
>> --copy-config-options '-top-n 300 -pop-limit 5000 -output-format "%i ||| %s
>> ||| %f ||| %c" -mark-oovs false' --pack-tm grammar.gz --tm
>> /usr/local/joshua_resources/russian_experiments/exp3/data/
>> tune/grammar.glue
>>  JOB FAILED (return code 2)
>> ERROR:root:ERROR: argument config: can't open
>> '/usr/local/joshua_resources/russian_experiments/exp3/tune/
>> joshua.config.final':
>> [Errno 2] No such file or directory:
>> '/usr/local/joshua_resources/russian_experiments/exp3/tune/
>> joshua.config.final'
>> 
>> However, if I run the pipeline with the 

Re: [jira] [Closed] (JOSHUA-100) Add Shen et al. (2008) dependency LM

2016-10-27 Thread Matt Post
Lewis — why are you marking these as fixed? This is better classified as 
dropped or no longer needed.



> On Oct 26, 2016, at 3:28 AM, Lewis John McGibbney (JIRA)  
> wrote:
> 
> 
> [ 
> https://issues.apache.org/jira/browse/JOSHUA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>  ]
> 
> Lewis John McGibbney closed JOSHUA-100.
> ---
> Resolution: Fixed
> 
>> Add Shen et al. (2008) dependency LM
>> 
>> 
>> Key: JOSHUA-100
>> URL: https://issues.apache.org/jira/browse/JOSHUA-100
>> Project: Joshua
>> Issue Type: New Feature
>> Reporter: Matt Post
>> Assignee: Matt Post
>> Fix For: 6.1
>> 
>> 
> 
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)



Re: [jira] [Created] (JOSHUA-320) --joshua-mem pipeline parameter is not populated to mert processes

2016-10-27 Thread Matt Post
Hi Lewis,

You are confusing two things.

MERT calls Joshua, and passes it however much memory you set with --joshua-mem. 
It does this by writing (see pipeline.pl line 1550) 
$tunedir/decoder_command, which is what Z-MERT calls to run Joshua.

Z-MERT is itself a Java program that also gets 4g. There is no option to change 
this and I don't think there needs to be, although if you disagree, it wouldn't 
hurt to add it.
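
To make the two separate budgets concrete, here is a rough sketch. It assumes the decoder and Z-MERT are resident at the same time during tuning, so their heaps add up; the helper names parse_mem and tuning_memory are invented for this illustration and do not exist in the pipeline.

```python
import re

# Binary multipliers for the single-letter suffixes used in sizes like "4g".
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_mem(spec):
    """Convert a size such as '4g' or '512m' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmgt])", spec.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory size: {spec!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def tuning_memory(joshua_mem, tuner_mem="4g"):
    """Estimate memory held during tuning.

    joshua_mem: what --joshua-mem hands to the decoder via decoder_command.
    tuner_mem:  the fixed 4g that Z-MERT itself gets (not configurable).
    Assumption: both processes are alive together, so the heaps are summed.
    """
    decoder = parse_mem(joshua_mem)
    tuner = parse_mem(tuner_mem)
    return {"decoder": decoder, "tuner": tuner, "total": decoder + tuner}


if __name__ == "__main__":
    budget = tuning_memory("8g")
    print({name: size // 1024**3 for name, size in budget.items()})
```

Raising --joshua-mem only changes the decoder figure; the Z-MERT figure stays at 4g unless a new option is added.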


> On Oct 27, 2016, at 3:21 PM, Lewis John McGibbney (JIRA)  
> wrote:
> 
> Lewis John McGibbney created JOSHUA-320:
> ---
> 
> Summary: --joshua-mem pipeline parameter is not populated to mert processes
> Key: JOSHUA-320
> URL: https://issues.apache.org/jira/browse/JOSHUA-320
> Project: Joshua
> Issue Type: Bug
> Components: mert, pipeline
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 6.2
> 
> 
> As we've discussed on the Joshua mailing list at 
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01765.html
> it is not realistic to reserve only 4g for several tasks which are executed 
> as part of a typical pipeline run.
> In particular, MERT runs with 4g which is not enough. We should increase this 
> to something like 8g or more.
> 
> 
> 



[jira] [Created] (JOSHUA-320) --joshua-mem pipeline parameter is not populated to mert processes

2016-10-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-320:
---

Summary: --joshua-mem pipeline parameter is not populated to mert processes
Key: JOSHUA-320
URL: https://issues.apache.org/jira/browse/JOSHUA-320
Project: Joshua
Issue Type: Bug
Components: mert, pipeline
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 6.2


As we've discussed on the Joshua mailing list at 
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01765.html
it is not realistic to reserve only 4g for several tasks which are executed as 
part of a typical pipeline run.
In particular, MERT runs with 4g which is not enough. We should increase this 
to something like 8g or more.


