OK, so it turns out that passing '--aligner berkeley' to the pipeline.pl invocation does not currently work on the master branch. My log simply prints
Error: Unable to access jarfile /usr/local/incubator-joshua/lib/berkeleyaligner.jar

I'll get this sorted out and submit a PR to try and fix.
Thanks

On Wed, Jul 20, 2016 at 7:07 AM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Kellen and Matt,
>
> On Tue, Jul 19, 2016 at 8:20 PM, <dev-digest-h...@joshua.incubator.apache.org> wrote:
>
>> From: Matt Post <p...@cs.jhu.edu>
>> To: dev@joshua.incubator.apache.org
>> Cc:
>> Date: Sun, 17 Jul 2016 23:30:33 -0400
>> Subject: Re: Issue Building LM on master branch
>> Lewis — This is a good-sized dataset, and on a single desktop machine, I
>> expect it would take at least a day to go all the way through alignment,
>> model-building, and tuning.
>>
>
> OK thanks for the estimate.
>
>>
>> fast_align is a good idea, though it isn't integrated into the pipeline
>> (shouldn't be too hard, and is on the list). You could also just try
>> "--aligner berkeley" and see if that works.
>>
>
> I'll do exactly that. Starting with berkeley first and then moving on to
> fast_align. I'll update here with any progress.
>
>>
>> Do you see anything in the GIZA error logs (RUNDIR/alignment/0/...)?
>> Sometimes GIZA doesn't compile correctly, and this could be an error where
>> it doesn't find GIZA++ or one of the support binaries (mkcls, snt2cooc.out).
>>
>
> AFAICT I don't see any errors prior to the bottom dozen or so lines. I've
> put the log below and would greatly appreciate if you could have a look
> through it and provide some feedback.
> http://home.apache.org/~lewismc/giza.log
> I'll update this thread on the berkeley alignment outcome before shooting
> to use the fast_align.
> Thanks both again.
> Lewis
>

--
*Lewis*