[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'
[ https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated JOSHUA-316:
    Summary: run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'  (was: run_bundler.py returning JOB FAILED (return code 1))

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'
> -
>
>                 Key: JOSHUA-316
>                 URL: https://issues.apache.org/jira/browse/JOSHUA-316
>             Project: Joshua
>          Issue Type: Bug
>          Components: bundler
>    Affects Versions: 6.0.5
>            Reporter: Lewis John McGibbney
>            Priority: Critical
>             Fix For: 6.2
>
>
> {code}
> [glue-tune] rebuilding...
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source [CHANGED]
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue [NOT FOUND]
> cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
> took 1 seconds (1s)
> [tune-bundle] rebuilding...
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config [CHANGED]
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source [CHANGED]
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh [NOT FOUND]
> cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config /usr/local/joshua_resources/russian_experiments/exp2/tune/model --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm" -tm0/type hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
> JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm" -tm0/type hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, in main
>     operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, in collect_operations
>     opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, in filter_through_copy_config_script
>     result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in communicate
>     stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in _communicate
>     input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, in <module>
>     main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, in main
>     error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1)
Lewis John McGibbney created JOSHUA-316:
---

             Summary: run_bundler.py returning JOB FAILED (return code 1)
                 Key: JOSHUA-316
                 URL: https://issues.apache.org/jira/browse/JOSHUA-316
             Project: Joshua
          Issue Type: Bug
          Components: bundler
    Affects Versions: 6.0.5
            Reporter: Lewis John McGibbney
            Priority: Critical
             Fix For: 6.2

{code}
[glue-tune] rebuilding...
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source [CHANGED]
dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue [NOT FOUND]
cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
took 1 seconds (1s)
[tune-bundle] rebuilding...
dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config [CHANGED]
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source [CHANGED]
dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh [NOT FOUND]
cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config /usr/local/joshua_resources/russian_experiments/exp2/tune/model --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm" -tm0/type hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
JOB FAILED (return code 1)
* Running the copy-config.pl script with the command: /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm" -tm0/type hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, in main
    operations = collect_operations(opts)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, in collect_operations
    opts.copy_config_options
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, in filter_through_copy_config_script
    result, err = p.communicate(config_text)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in _communicate
    input_view = memoryview(self._input)
TypeError: memoryview: a bytes-like object is required, not 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, in <module>
    main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, in main
    error_quit(e.message)
AttributeError: 'TypeError' object has no attribute 'message'
* WARNING: no key 'outputformat' found in config file (appending to end)
* WARNING: no key 'search' found in config file (appending to end)
* WARNING: no key 'topn' found in config file (appending to end)
* WARNING: no key 'markoovs' found in config file (appending to end)
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
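
Both tracebacks in the log above look like Python 2 idioms running under Python 3.5: `p.communicate(config_text)` is handed a `str` where a pipe opened in binary mode expects bytes, and `e.message` no longer exists on Python 3 exceptions. The sketch below shows one way such code could be made to work on Python 3. It is an illustration only, not the actual run_bundler.py source: the names `filter_through` and `describe_error`, the UTF-8 encoding, and the stand-in echo command are all assumptions.

```python
import subprocess
import sys


def filter_through(cmd, config_text, encoding="utf-8"):
    """Pipe text through cmd and return its stdout as text.

    Hypothetical stand-in for the failing helper; the encoding is an assumption.
    """
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Pipes are binary by default on Python 3, so encode before communicate().
    out, err = p.communicate(config_text.encode(encoding))
    if p.returncode != 0:
        raise RuntimeError(err.decode(encoding, errors="replace"))
    return out.decode(encoding)


def describe_error(exc):
    # str(exc) works on both Python 2 and 3; exc.message is gone in Python 3.
    return str(exc)


if __name__ == "__main__":
    # Stand-in command that echoes stdin, used here instead of copy-config.pl.
    echo = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read())"]
    print(filter_through(echo, "top-n = 300"))
```

An alternative with the same effect would be to pass `universal_newlines=True` to `Popen` so that the pipes operate in text mode; explicit encoding is used here only because it makes the bytes/str boundary visible.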
Re: Thrax Error in WordLexicalProbabilityCalculator - Word id 2146928632 out of range 0 1727042
This is strange. I haven't looked into this again but don't have any insights. Thanks for the followup.

> On Oct 21, 2016, at 3:35 PM, lewis john mcgibbney wrote:
>
> Hi Folks,
> Follow up.
> It seems that when I clean the .cachepipe as well as all of the existing
> alignments, etc from the previous run and re-run the entire pipeline then
> this issue disappears.
> I have no real explanation for why this happened. All I can say is that it
> is of course best to run experiments in different directories when you make
> a tweak to a pipeline.
> Lewis
>
> On Thu, Oct 20, 2016 at 12:20 AM, lewis john mcgibbney
> wrote:
>
>> Hi dev@,
>>
>> I am facing some issues with Thrax using the Joshua master branch.
>> I invoke Joshua as follows
>>
>> /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero
>> --corpus
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en
>> --tune
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune
>> --test
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test
>> --source en --target ru --readme "Experiment 1 Run 1 of ru --> en model
>> training" --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir
>> --first-step thrax --no-prepare --alignment alignments/training.align
>> --hadoop-mem 10g
>>
>> I make the first step thrax as I have previously computed my alignment as
>> indicated by the arguments.
>> My Thrax log is available at
>> https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0.
>> In the log you will see an exception as follows
>>
>> 16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
>> java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
>> Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> I see no other issues until the end of the Thrax log where I see
>>
>> class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob FAILED
>> class edu.jhu.thrax.hadoop.jobs.OutputJob PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.ExtractionJob SUCCESS
>> class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.VocabularyJob SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob FAILED
>>
>> This issue has previously been reported by Matt over on
>> https://github.com/joshua-decoder/thrax/issues/10
>>
>> Debugging right now folks.
>> Lewis
>>
>> --
>> http://home.apache.org/~lewismc/
>> @hectorMcSpector
>> http://www.linkedin.com/in/lmcgibbney
>>
>
>
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
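
The workaround reported in the follow-up above (clearing `.cachepipe` and the existing alignments before re-running the whole pipeline) could be scripted roughly as follows. This is a hedged sketch and not part of Joshua: the helper name `clean_stale_state` is hypothetical, `.cachepipe` is taken from the message, and `alignments` is only inferred from the `--alignment alignments/training.align` argument. The operation is destructive, so the list of names should be checked against the actual run directory first.

```python
import shutil
from pathlib import Path


def clean_stale_state(run_dir, stale_names=(".cachepipe", "alignments")):
    """Delete leftover state from a previous run; return the paths removed.

    stale_names is an assumption about what a run directory contains.
    """
    removed = []
    for name in stale_names:
        target = Path(run_dir) / name
        if target.is_symlink() or target.is_file():
            # Remove the link or file itself, never what a link points to.
            target.unlink()
        elif target.is_dir():
            shutil.rmtree(target)
        else:
            continue  # nothing by that name in this run directory
        removed.append(target)
    return removed
```

Calling `clean_stale_state(".")` from the run directory would then be followed by a full pipeline run without `--first-step thrax`, since the alignments have to be recomputed. Starting each tweaked experiment in a fresh run directory, as the follow-up suggests, avoids the need for this cleanup altogether.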