Hi dev@,

I'm facing some issues with Thrax using the Joshua master branch.
I invoke Joshua as follows:

/usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero \
  --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en \
  --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune \
  --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test \
  --source en --target ru \
  --readme "Experiment 1 Run 1 of ru --> en model training" \
  --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir \
  --first-step thrax --no-prepare --alignment alignments/training.align \
  --hadoop-mem 10g

I set the first step to thrax because I have already computed my alignment,
as indicated by the arguments.
My Thrax log is available at
https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0. In the log you
will see the following exception:

16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I see no other issues until the end of the Thrax log, where I see:

class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob    FAILED
class edu.jhu.thrax.hadoop.jobs.OutputJob    PREREQ_FAILED
class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob    PREREQ_FAILED
class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature    SUCCESS
class edu.jhu.thrax.hadoop.jobs.ExtractionJob    SUCCESS
class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature    SUCCESS
class edu.jhu.thrax.hadoop.jobs.VocabularyJob    SUCCESS
class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob    FAILED

This issue has previously been reported by Matt over on
https://github.com/joshua-decoder/thrax/issues/10

Debugging right now, folks.
Lewis

-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
