[jira] [Comment Edited] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

Lewis John McGibbney (JIRA) Tue, 23 Aug 2016 21:43:35 -0700

    [ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434164#comment-15434164
 ]


Lewis John McGibbney edited comment on JOSHUA-304 at 8/24/16 4:42 AM:
----------------------------------------------------------------------

To get past the exceptions reported in the issue description, I had to change 
the template to the following. N.B. the changed values for the forwardModels, 
reverseModels, mode and iters keys: each now holds a single value.
{code}
## word-align.conf
## ----------------------
## This is an example training script for the Berkeley
## word aligner.  In this configuration it uses two HMM
## alignment models trained jointly and then decoded 
## using the competitive thresholding heuristic.

##########################################
# Training: Defines the training regimen 
##########################################

forwardModels   HMM
reverseModels   HMM
mode                    JOINT
iters                   5

###############################################
# Execution: Controls output and program flow 
###############################################

execDir alignments/0
create
saveParams              false
numThreads              1
msPerLine               10000
alignTraining

#################
# Language/Data 
#################

foreignSuffix   es.0
englishSuffix   en.0

# Choose the training sources, which can either be directories or files that list files/directories
trainSources /usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/data/train/splits/corpus
sentences        MAX
testSources /dev/null
overwriteExecDir true

#################
# 1-best output 
#################

competitiveThresholding

{code}
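
The following is a minimal sketch, not part of Joshua or the Berkeley aligner, of 
how a generated word-align.conf could be checked before the aligner is launched. 
The enum choices are copied from the aligner's own error messages in this issue. 
The function name check_word_align_conf, the whitespace-based parsing, and the 
rule that iters must be a single non-negative integer are my assumptions for 
illustration.

```python
import re

# Valid choices as printed by the aligner's "Invalid enum" messages in this issue.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
MODE_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def check_word_align_conf(text):
    """Return (line_number, message) pairs for suspicious training keys.

    Hypothetical helper: it assumes each option line is a key followed by
    whitespace and then its value, and that lines starting with '#' are comments.
    """
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key in ("forwardModels", "reverseModels"):
            if value not in MODEL_CHOICES:
                problems.append((lineno, f"{key}: invalid value {value!r}"))
        elif key == "mode":
            if value not in MODE_CHOICES:
                problems.append((lineno, f"{key}: invalid value {value!r}"))
        elif key == "iters":
            # Assumption: a single non-negative integer, e.g. "5" but not "5 5".
            if not re.fullmatch(r"[0-9]+", value):
                problems.append((lineno, f"{key}: not a single integer: {value!r}"))
    return problems


if __name__ == "__main__":
    # Values taken from the error messages quoted in this issue.
    sample = "forwardModels MODEL1 HMM\nmode JOINT JOINT\niters 5 5\n"
    for lineno, message in check_word_align_conf(sample):
        print(f"line {lineno}: {message}")
```

A check like this only covers the four keys that failed for me; the other keys 
in the template are passed through untouched.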


> word-align.conf alignment template file not compatible with berkeley aligner
> ----------------------------------------------------------------------------
>
>                 Key: JOSHUA-304
>                 URL: https://issues.apache.org/jira/browse/JOSHUA-304
>             Project: Joshua
>          Issue Type: Bug
>          Components: alignment, berkeley, templates
>    Affects Versions: 6.0.5
>            Reporter: Lewis John McGibbney
>            Priority: Blocker
>             Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: "5 5"
>       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Integer.parseInt(Integer.java:580)
>       at java.lang.Integer.parseInt(Integer.java:615)
>       at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>       at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>       at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>       at edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>       at edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>       at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
>       at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}



