[jira] [Commented] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-10-30 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620936#comment-15620936
 ] 

Matt Post commented on JOSHUA-318:
--

Ah, I see what happens here. To be compatible with Moses, we query the decoder 
for its list of dense features. But that query loads all the models, so it can 
need a lot of memory. This will be rewritten in version 7, since dense features 
no longer exist there. I don't think I'll fix it before the release.

> scripts/training/run_tuner.py should enable configurable memory usage when 
> invoking joshua-decoder
> ---
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
> Issue Type: Improvement
> Components: tuner
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following
> {code}
> [mert-1] rebuilding...
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
> in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
> opts.decoder_config, opts.decoder_output_file, opts)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
> in run_zmert
> opts.metric, opts.iterations or 10)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
> in setup_configs
> for feature,weight in get_features(config):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
> (JOSHUA, config_file), shell=True)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
> check_output
> **kwargs).stdout
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command 
> '/usr/local/incubator-joshua/bin/joshua-decoder -c 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because, by default, the joshua-decoder script runs with 4g of memory. 
> The run_tuner.py script should be flexible enough to reuse the memory 
> allocation provided when the pipeline was initially invoked. This value should 
> then be passed to the joshua-decoder script.
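
A minimal sketch of what the request could look like in the get_features call
shown in the traceback, not the project's actual fix. It assumes the
joshua-decoder wrapper script accepts a "-m <size>" option for the JVM heap
(check the script before relying on that); the function name and the memory
parameter are hypothetical.

```python
import os


def show_weights_command(joshua_home, config_file, memory=None):
    """Build the decoder command that lists feature weights.

    memory is a hypothetical option such as "8g". When it is omitted,
    no memory flag is added and the wrapper script's default applies.
    """
    cmd = [os.path.join(joshua_home, "bin", "joshua-decoder")]
    if memory:
        # Assumption: the wrapper takes "-m <size>" to set the heap size.
        cmd += ["-m", memory]
    # These flags are the ones visible in the failing command above.
    cmd += ["-c", config_file, "-show-weights", "-v", "0"]
    return cmd
```

The list form can be handed to subprocess.check_output(cmd) directly, without
shell=True, so paths containing spaces need no extra quoting.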



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-10-30 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620904#comment-15620904
 ] 

Matt Post commented on JOSHUA-317:
--

I don't have any trouble running this. What version of Python are you using?

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
> Issue Type: Bug
> Components: tuner
> Affects Versions: 6.0.5
> Environment: Python 3.5
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}
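
For context on the error above: the backtick form `x` is Python 2 shorthand for
repr(x) and was removed in Python 3, which matches the Python 3.5 environment
listed on the issue. A minimal sketch of a spelling that parses on both
versions follows; the function name is hypothetical, and only the 'ITERATIONS'
key is taken from the traceback.

```python
def iteration_params(iterations):
    """Return the template parameter using repr() instead of backticks.

    repr() is what the Python 2 backtick syntax expanded to, so the
    resulting value is the same string on Python 2 and Python 3.
    """
    return {'ITERATIONS': repr(iterations)}


params = iteration_params(10)
```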


