[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-10-21 Thread Lewis John McGibbney (JIRA)

[ https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated JOSHUA-316:

Summary: run_bundler.py returning JOB FAILED (return code 1) TypeError: 
memoryview: a bytes-like object is required, not 'str'  (was: run_bundler.py 
returning JOB FAILED (return code 1))

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
>                 Key: JOSHUA-316
>                 URL: https://issues.apache.org/jira/browse/JOSHUA-316
>             Project: Joshua
>          Issue Type: Bug
>          Components: bundler
>    Affects Versions: 6.0.5
>            Reporter: Lewis John McGibbney
>            Priority: Critical
>             Fix For: 6.2
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, in main
>     operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, in collect_operations
>     opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, in filter_through_copy_config_script
>     result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in communicate
>     stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in _communicate
>     input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, in <module>
>     main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, in main
>     error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}
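
The first traceback shows `p.communicate(config_text)` receiving a `str` under Python 3.5, where a pipe opened in the default binary mode expects bytes. Below is a minimal sketch of one possible way to handle that, assuming the config text is UTF-8. The helper name `filter_text_through` and the stand-in child command are hypothetical and are not taken from run_bundler.py; the real script pipes through copy-config.pl.

```python
import subprocess
import sys


def filter_text_through(cmd, text, encoding="utf-8"):
    """Pipe `text` through `cmd`; return (stdout, stderr) decoded to str.

    Hypothetical helper for illustration only.
    """
    p = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # On Python 3 a binary pipe needs bytes, so encode before sending
    # and decode what comes back.
    out, err = p.communicate(text.encode(encoding))
    return out.decode(encoding), err.decode(encoding)


# Stand-in child process (an assumption, used instead of copy-config.pl):
# it upper-cases whatever arrives on stdin.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
result, err = filter_text_through(child, "top-n = 300\n")
```

An alternative would be to open the pipe in text mode with `universal_newlines=True`, in which case `communicate()` accepts and returns `str` directly.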



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1)

2016-10-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-316:
---

             Summary: run_bundler.py returning JOB FAILED (return code 1)
                 Key: JOSHUA-316
                 URL: https://issues.apache.org/jira/browse/JOSHUA-316
             Project: Joshua
          Issue Type: Bug
          Components: bundler
    Affects Versions: 6.0.5
            Reporter: Lewis John McGibbney
            Priority: Critical
             Fix For: 6.2


{code}
[glue-tune] rebuilding...
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue 
[NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  took 1 seconds (1s)
[tune-bundle] rebuilding...
  dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
[CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
--symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
/usr/local/joshua_resources/russian_experiments/exp2/tune/model 
--copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
-mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
"StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  JOB FAILED (return code 1)
* Running the copy-config.pl script with the command: 
/usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
"%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 
1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
-feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, in main
    operations = collect_operations(opts)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, in collect_operations
    opts.copy_config_options
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, in filter_through_copy_config_script
    result, err = p.communicate(config_text)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in _communicate
    input_view = memoryview(self._input)
TypeError: memoryview: a bytes-like object is required, not 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, in <module>
    main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, in main
    error_quit(e.message)
AttributeError: 'TypeError' object has no attribute 'message'
* WARNING: no key 'outputformat' found in config file (appending to end)
* WARNING: no key 'search' found in config file (appending to end)
* WARNING: no key 'topn' found in config file (appending to end)
* WARNING: no key 'markoovs' found in config file (appending to end)
{code}
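
The second traceback comes from the error handler itself: Python 3 exceptions have no `.message` attribute, so `error_quit(e.message)` raises an `AttributeError` and hides the original `TypeError`. Below is a small sketch of a handler that avoids this, assuming the goal is only to print the exception text. `describe_error` is a hypothetical name and is not part of run_bundler.py.

```python
def describe_error(exc):
    """Return a printable description of an exception.

    Formatting the exception object itself works on Python 2 and 3;
    the `.message` attribute was removed in Python 3.
    """
    return "{}: {}".format(type(exc).__name__, exc)


try:
    # Reproduces the kind of failure in the log: memoryview() rejects str.
    memoryview("not bytes")
except TypeError as e:
    message = describe_error(e)
```

Passing `str(e)` to the existing `error_quit` call would presumably have the same effect without adding a helper.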





Re: Thrax Error in WordLexicalProbabilityCalculator - Word id 2146928632 out of range 0 1727042

2016-10-21 Thread Matt Post
This is strange. I haven't looked into this again but don't have any insights. 
Thanks for the followup.


> On Oct 21, 2016, at 3:35 PM, lewis john mcgibbney wrote:
> 
> Hi Folks,
> Follow up.
> It seems that when I clean the .cachepipe directory as well as all of the
> existing alignments, etc. from the previous run and re-run the entire
> pipeline, this issue disappears.
> I have no real explanation for why this happened. All I can say is that it
> is of course best to run experiments in different directories when you make
> a tweak to a pipeline.
> Lewis
> 
> On Thu, Oct 20, 2016 at 12:20 AM, lewis john mcgibbney wrote:
> 
>> Hi dev@,
>> 
>> I'm facing some issues with Thrax using the Joshua master branch.
>> I invoke Joshua as follows:
>> 
>> /usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero
>> --corpus 
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en
>> --tune 
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune
>> --test 
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test
>> --source en --target ru --readme "Experiment 1 Run 1 of ru --> en model
>> training" --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir
>> --first-step thrax --no-prepare --alignment alignments/training.align
>> --hadoop-mem 10g
>> 
>> I set the first step to thrax because I have already computed my alignment,
>> as the arguments indicate.
>> My Thrax log is available at
>> https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0
>> In the log you will see an exception as follows:
>> 
>> 16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
>> java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
>> Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> 
>> I see no other issues until the end of the Thrax log where I see
>> 
>> class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob    FAILED
>> class edu.jhu.thrax.hadoop.jobs.OutputJob    PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob    PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.ExtractionJob    SUCCESS
>> class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.VocabularyJob    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob    FAILED
>> 
>> This issue was previously reported by Matt at
>> https://github.com/joshua-decoder/thrax/issues/10
>> 
>> Debugging right now folks.
>> Lewis
>> 
>> --
>> http://home.apache.org/~lewismc/
>> @hectorMcSpector
>> http://www.linkedin.com/in/lmcgibbney
>> 
> 
> 
> 
> -- 
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
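
The workaround described in the follow-up (clearing `.cachepipe` and the previous run's alignments before re-running the pipeline) could be scripted roughly as below. This is only a sketch under assumptions: `clean_rundir` is a hypothetical helper, not part of Joshua, and the `alignments` directory name is taken from the `--alignment alignments/training.align` argument in the thread. It defaults to a dry run so nothing is deleted until the list of matches has been checked.

```python
import shutil
import tempfile
from pathlib import Path


def clean_rundir(rundir, extra=(), dry_run=True):
    """Find <rundir>/.cachepipe plus any `extra` relative paths.

    Returns the paths that exist; deletes them only when dry_run is False.
    Hypothetical helper for illustration only.
    """
    root = Path(rundir)
    candidates = [root / ".cachepipe"] + [root / name for name in extra]
    matched = [p for p in candidates if p.exists()]
    if not dry_run:
        for path in matched:
            if path.is_dir() and not path.is_symlink():
                shutil.rmtree(path)   # real directory: remove recursively
            else:
                path.unlink()         # plain file or symlink
    return matched


# Demonstration on a throwaway directory, not a real experiment.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".cachepipe").mkdir()
    (Path(tmp) / "alignments").mkdir()
    preview = clean_rundir(tmp, extra=["alignments"])        # dry run
    still_there = (Path(tmp) / ".cachepipe").exists()
    clean_rundir(tmp, extra=["alignments"], dry_run=False)   # actually delete
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Running each experiment in its own directory, as suggested in the thread, avoids the need for this kind of cleanup altogether.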