[
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620936#comment-15620936
]
Matt Post commented on JOSHUA-318:
--
Ah, I see what happens here. To be compatible with Moses, we query the decoder
for its list of dense features. But that loads all the models and so on, so you
might need lots of memory. This will be rewritten in Joshua 7, since we no
longer have dense features there. I don't think I'll fix it before the release.
> scripts/training/run_tuner.py should enable configurable memory usage when
> invoking joshua-decoder
> ---
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
> Issue Type: Improvement
> Components: tuner
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following error:
> {code}
> [mert-1] rebuilding...
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> [CHANGED]
> dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
>
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
> [NOT FOUND]
> cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner
> mert --decoder
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command
> --decoder-config
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> --decoder-output-file
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest
> --decoder-log-file
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log
> --iterations 10 --metric 'BLEU 4 closest'
> JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
> at
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
> at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
> at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
> at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553,
> in <module>
> main(sys.argv)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536,
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder,
> opts.decoder_config, opts.decoder_output_file, opts)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417,
> in run_zmert
> opts.metric, opts.iterations or 10)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399,
> in setup_configs
> for feature,weight in get_features(config):
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351,
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" %
> (JOSHUA, config_file), shell=True)
> File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in
> check_output
> **kwargs).stdout
> File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command
> '/usr/local/incubator-joshua/bin/joshua-decoder -c
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because, by default, the joshua-decoder script runs with 4g of memory.
> The run_tuner.py script should be flexible enough to reuse the memory
> allocation provided when the pipeline was initially invoked. This value should
> then be passed to the joshua-decoder script.
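
A minimal sketch of what the request above could look like, not the actual
fix: the helper name build_show_weights_command and its memory parameter are
hypothetical, and passing the value as a leading "-m" argument assumes the
joshua-decoder wrapper script accepts that flag. The remaining arguments are
the ones shown in the traceback.

```python
import os


def build_show_weights_command(joshua_home, config_file, memory=None):
    """Build the argument list used to ask the decoder for its feature weights.

    `memory` (for example "8g") is a hypothetical parameter. Forwarding it as
    a leading `-m` argument is an assumption about the joshua-decoder wrapper.
    When `memory` is None, the wrapper's own default is left in effect.
    """
    cmd = [os.path.join(joshua_home, "bin", "joshua-decoder")]
    if memory:
        # Assumed flag: put the memory setting before the decoder options.
        cmd += ["-m", memory]
    # Same options that get_features passes in the traceback above.
    cmd += ["-c", config_file, "-show-weights", "-v", "0"]
    return cmd
```

The resulting list could be handed to subprocess.check_output(cmd) without
shell=True, which also avoids quoting problems with paths that contain spaces.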