Re: [Moses-support] [Moses-developers] Generation models with Mmsapt

Michael Denkowski Fri, 04 Sep 2015 22:01:48 -0700

I added non-binarized versions of all the model files including reordering
to the tarball:
https://drive.google.com/file/d/0B6trVKD0-obBdEV0dFg5RkN4Yjg/view?usp=sharing


Best,
Michael

On Fri, Sep 4, 2015 at 7:10 AM, Hieu Hoang <hieuho...@gmail.com> wrote:

> OK. I'm still getting a segfault in lexical reordering during loading. If you
> can provide the reordering text file so I can binarize it myself, I can debug
> it. It's not a priority, though; I can park the issue for another time.
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> On 4 September 2015 at 03:49, Michael Denkowski <
> michael.j.denkow...@gmail.com> wrote:
>
>> Hi Hieu,
>>
>> Yes, I have everything working together with the caveat about order in
>> the moses.ini file (https://github.com/moses-smt/mosesdecoder/pull/124).
>> The mmsapt files might be dependent on Boost version so I also included the
>> aligned bitext I used to build the model and rebuilt the tarball:
>> https://drive.google.com/file/d/0B6trVKD0-obBRHFjMGxRZTJvV1U/view?usp=sharing.
>> I made it a pull request instead of just merging it into master so you guys
>> could look over the changes since technically the default behavior could
>> change if the moses.ini file lists phrase tables before other features.
>>
>> Best,
>> Michael
>>
>> On Thu, Sep 3, 2015 at 8:44 PM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>
>>> I saw your check-ins. Is it working for you now?
>>>
>>> Your test data doesn't seem to run for me, the lexical reordering file
>>> seems to be corrupt.
>>>
>>> (if you wanna share test data, can you do it via dropbox/google drive
>>> rather than the Moses github)
>>>
>>>
>>> On 03/09/2015 05:13, Michael Denkowski wrote:
>>>
>>> Sounds good.  I added a small test model to my branch:
>>> https://github.com/moses-smt/mosesdecoder/raw/mjdenkowski/mmsapt-factor-test.tar.gz.
>>> This translates a sample of fr-en news with a Mmsapt, surface LM, and
>>> 400-class LM.
>>>
>>> --Michael
>>>
>>> On Wed, Sep 2, 2015 at 2:56 AM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>>
>>>> It should work. The function
>>>>   EvaluateInIsolation()
>>>> in the LM exists for optimisation reasons. E.g. if the target phrase is
>>>> 'a b c d' and the LM is a trigram model, the trigrams 'a b c' and 'b c d'
>>>> can be precalculated in EvaluateInIsolation().
>>>>
>>>> Implementing a pt for factors requires setting up some variables, which
>>>> may not have happened yet in mmsapt. If you can send me a small example
>>>> model, I'll see what I can do.
>>>>
>>>>
>>>> On 01/09/2015 02:11, Ulrich Germann wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> I have no experience with factored models, so I'm speculating here to
>>>> some degree. The reason the phrase table calls EvaluateInIsolation is
>>>> that all "isolated" phrase scores are considered when pruning. In my
>>>> opinion pruning should not happen within the phrase tables (for exactly the
>>>> reason that it does not allow feature functions to be agnostic about other
>>>> feature functions) but by whatever object calls all the phrase tables and
>>>> does the generation. However, for software legacy reasons, that's the way
>>>> it is right now, and I'm not likely to address this issue any time soon
>>>> myself. The most reasonable fix for this in my opinion is to move pruning
>>>> to where it belongs: after all the factor generation.
>>>>
>>>> Hieu is probably still the person with the best understanding of how
>>>> factored phrase table entry generation works, so maybe he can chime in on
>>>> this ...
>>>>
>>>> Cheers - Uli
>>>>
>>>>
>>>> On Mon, Aug 31, 2015 at 11:29 PM, Michael Denkowski <
>>>> michael.j.denkow...@gmail.com> wrote:
>>>>
>>>>> Hi Ulrich,
>>>>>
>>>>> I was looking into using a class-based LM with your dynamic phrase
>>>>> table via generation models.  I translate factor 0 to 0 with the Mmsapt,
>>>>> then generate target factor 1 (word class) with a GM.  The class-based LM
>>>>> operates on factor 1.
>>>>>
>>>>> I'm hitting a segfault on what appears to be an order-of-operations
>>>>> issue with the PT and LM.  In mmsapt.cpp:578, Mmsapt::mkTPhrase makes a
>>>>> call to tp->EvaluateInIsolation.  This calls all of the models, including
>>>>> the LMs.  The class LM tries to score factor 1, which doesn't exist yet
>>>>> (since generation happens after translation), and it dies.  By nature,
>>>>> other phrase tables don't have this issue since they can just pull up
>>>>> pre-computed scores.
>>>>>
>>>>> Is scoring with all of the models here a strategic choice to get
>>>>> better performance or would it be sufficient to just score with the PT
>>>>> features?  Thanks!
>>>>>
>>>>> --Michael
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ulrich Germann
>>>> Senior Researcher
>>>> School of Informatics
>>>> University of Edinburgh
>>>>
>>>>
>>>> _______________________________________________
>>>> Moses-developers mailing list
>>>> Moses-developers@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-developers
>>>>
>>>>
>>>> --
>>>> Hieu Hoang
>>>> Researcher
>>>> New York University, Abu Dhabi
>>>> http://www.hoang.co.uk/hieu
>>>>
>>>>
>>>
>>> --
>>> Hieu Hoang
>>> Researcher
>>> New York University, Abu Dhabi
>>> http://www.hoang.co.uk/hieu
>>>
>>>
>>
>
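
Two points from the thread can be illustrated together in a small standalone sketch: the n-grams that lie wholly inside a target phrase can be collected without any decoding context (the 'a b c' / 'b c d' trigram example), and an LM that reads factor 1 has nothing to score while the phrase carries only factor 0. This is a toy Python model, not Moses code. The names `Word`, `full_ngrams` and `isolated_ngrams` are hypothetical, and returning None for a missing factor is an assumption made for illustration; it is not a description of what the pull request mentioned above changes.

```python
from typing import Dict, List, Optional, Tuple

# Toy target word: maps a factor index to its string, e.g. {0: "house", 1: "C17"}.
# Hypothetical representation, chosen only for this sketch.
Word = Dict[int, str]


def full_ngrams(tokens: List[str], order: int) -> List[Tuple[str, ...]]:
    """Return every n-gram of length `order` that lies wholly inside the phrase."""
    return [tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)]


def isolated_ngrams(
    phrase: List[Word], factor: int, order: int
) -> Optional[List[Tuple[str, ...]]]:
    """Collect the n-grams an LM over `factor` could score in isolation.

    Returns None if any word lacks the factor, i.e. when the factor has
    not been generated yet (assumed behaviour for this sketch).
    """
    if any(factor not in word for word in phrase):
        return None
    return full_ngrams([word[factor] for word in phrase], order)


if __name__ == "__main__":
    # Phrase as produced by a 0-to-0 translation step: surface factor only.
    surface_only = [{0: w} for w in "a b c d".split()]
    print(isolated_ngrams(surface_only, factor=0, order=3))
    # [('a', 'b', 'c'), ('b', 'c', 'd')]
    print(isolated_ngrams(surface_only, factor=1, order=3))
    # None
```

A phrase shorter than the LM order yields an empty list, since no complete n-gram fits inside it; such words can only be scored once the surrounding context is known.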

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
