[Moses-support] hypergraph decoding with the decoder

2017-08-24 Thread Angli Liu
Hi all,

I'm wondering how to decode a hypergraph using the

-search-algorithm 5

feature of the Moses decoder. What format should the hypergraph be written
in? (Is it the same format that https://github.com/kpu/lazy requires?) And
what language model format does it support?
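
For context, I assume the invocation would look roughly like this, with
placeholder paths:

~/mosesdecoder/bin/moses -f moses.ini -search-algorithm 5 < input > output

but it's unclear to me what the input file should contain in the
hypergraph case.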

Thanks,
Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] lattice mbr output empty translation result

2017-03-29 Thread Angli Liu
Hi, I was using lattice MBR to decode the source sentences; the model was
tuned with MERT. While other decoding methods such as maximum-probability
decoding and consensus decoding produce output without a problem, lattice
MBR decoding via the -lmbr flag makes the decoder write an empty file,
whatever size, scale, and pruning factor I set.

In its simplest form, the command that causes this problem is essentially
equivalent to the following:

moses \
    -f moses.ini \
    -output-unknowns file1 \
    -n-best-list file2 50 \
    -output-search-graph file3 \
    -lmbr \
    [-lmbr-p 0.8 -lmbr-r 0.8 -mbr-scale 5 -lmbr-pruning-factor 50] \
    < in_file \
    > out_file

1. The parameters in brackets are optional; either way, nothing was output
by the decoder.
2. In short, the problem is that out_file turned out to be empty.
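
For reference, since per point 1 the optional parameters make no
difference, I would expect the minimal form to behave the same way:

moses -f moses.ini -lmbr < in_file > out_file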

What could the problem be? Thanks in advance for your input!

-Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] nscores in phrase table binarization

2017-03-27 Thread Angli Liu
Hi Moses community,

What is the -nscores parameter of
mosesdecoder/bin/processPhraseTableMin used for?

(In the baseline system at http://www.statmt.org/moses/?n=Moses.Baseline,
this parameter was set to 4.)
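
For context, the invocation in that walkthrough is roughly the following
(paths as in the baseline instructions):

~/mosesdecoder/bin/processPhraseTableMin \
    -in train/model/phrase-table.gz \
    -out binarised-model/phrase-table \
    -nscores 4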

Thanks!
Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] How to print out intermediate confusion networks / lattices?

2017-03-24 Thread Angli Liu
I see! Does this mean that the default decoding algorithm and MBR/consensus
decoding all amount to reranking the n-best list extracted from the search
graph (in different ways)? If so, does it make sense at all to develop a
search method that directly extracts the best path from the search graph,
i.e., the lattice?
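
(To check my understanding, I picture the MBR step as something like the
Python sketch below: the posterior comes from the scaled model scores on
the n-best list, and the gain is a smoothed sentence-level BLEU. This is
only my mental model of the decision rule, not Moses' actual code, so
please correct me if it's wrong.)

import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def smoothed_bleu(hyp, ref, max_n=4):
    # Sentence-level BLEU with add-one smoothing of the n-gram precisions.
    if not hyp:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        match = sum(min(c, r[g]) for g, c in h.items())
        log_prec += math.log((match + 1.0) / (max(sum(h.values()), 1) + 1.0))
    bp = min(1.0, math.exp(1.0 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(log_prec / max_n)

def mbr_select(nbest, scores, scale=1.0):
    # nbest: tokenized hypotheses; scores: their model scores (log domain).
    # Posterior over the n-best list = scaled softmax of the model scores.
    m = max(scores)
    weights = [math.exp(scale * (s - m)) for s in scores]
    total = sum(weights)
    posterior = [w / total for w in weights]
    # Choose the hypothesis with the highest expected gain under the posterior.
    def expected_gain(hyp):
        return sum(p * smoothed_bleu(hyp, ref)
                   for ref, p in zip(nbest, posterior))
    return max(nbest, key=expected_gain)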

Thanks,
Angli

On Fri, Mar 24, 2017 at 8:54 AM, Philipp Koehn <p...@jhu.edu> wrote:

> Hi,
>
> the search graph does not include the MBR objective, since that is
> computed afterwards, on top of the n-best list extracted from the search
> graph.
>
> You can mix cube pruning and MBR together. As mentioned above, the
> "decision rule" (MBR vs. max-prob) is applied after search is finished.
>
> -phi
>
> On Fri, Mar 24, 2017 at 11:50 AM, Angli Liu <ang...@cs.washington.edu>
> wrote:
>
>> Thanks!
>>
>> Furthermore, does "output-search-graph" output the search graph only when
>> the default objective (posterior probability) is used, or also when minimum
>> Bayes risk decoding / consensus decoding is used (smoothed BLEU)?
>>
>> Also, is cube pruning applicable to minimum Bayes risk decoding or
>> consensus decoding? Namely, should I turn on -search-algorithm 1 when -lmbr
>> or -con is on?
>>
>> Thanks,
>> Angli
>>
>> On Fri, Mar 24, 2017 at 8:00 AM, Philipp Koehn <p...@jhu.edu> wrote:
>>
>>> Hi,
>>>
>>> the option to output the search graph is called "output-search-graph"
>>>
>>> See http://www.statmt.org/moses/?n=Advanced.Search for details.
>>>
>>> The source code is in $MOSES/moses-cmd and $MOSES/moses
>>>
>>> -phi
>>>
>>>
>>>
>>> On Thu, Mar 23, 2017 at 6:30 PM, Angli Liu <ang...@cs.washington.edu>
>>> wrote:
>>>
>>>> Hi Moses community,
>>>>
>>>> In decoding, is it possible to have Moses output a confusion network
>>>> (CN) or a word lattice (WL) instead of the decoded text for each sentence?
>>>> I'm aware that one parameter of the decoder is "-inputtype"; the question
>>>> is which parameter of the decoder determines the output type (CN, WL, or
>>>> plain text)?
>>>>
>>>> Also, where exactly can I find the decoder code (responsible for what
>>>> the binary "moses" does) inside https://github.com/moses-smt/mosesdecoder?
>>>>
>>>> Thanks,
>>>> Angli
>>>>
>>>> ___
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>>
>>>
>>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] How to print out intermediate confusion networks / lattices?

2017-03-24 Thread Angli Liu
Thanks!

Furthermore, does "output-search-graph" output the search graph only when
the default objective (posterior probability) is used, or also when minimum
Bayes risk decoding / consensus decoding is used (smoothed BLEU)?

Also, is cube pruning applicable to minimum Bayes risk decoding or
consensus decoding? Namely, should I turn on -search-algorithm 1 when -lmbr
or -con is on?

Thanks,
Angli

On Fri, Mar 24, 2017 at 8:00 AM, Philipp Koehn <p...@jhu.edu> wrote:

> Hi,
>
> the option to output the search graph is called "output-search-graph"
>
> See http://www.statmt.org/moses/?n=Advanced.Search for details.
>
> The source code is in $MOSES/moses-cmd and $MOSES/moses
>
> -phi
>
>
>
> On Thu, Mar 23, 2017 at 6:30 PM, Angli Liu <ang...@cs.washington.edu>
> wrote:
>
>> Hi Moses community,
>>
>> In decoding, is it possible to have Moses output a confusion network (CN)
>> or a word lattice (WL) instead of the decoded text for each sentence? I'm
>> aware that one parameter of the decoder is "-inputtype"; the question is
>> which parameter of the decoder determines the output type (CN, WL, or
>> plain text)?
>>
>> Also, where exactly can I find the decoder code (responsible for what the
>> binary "moses" does) inside https://github.com/moses-smt/mosesdecoder?
>>
>> Thanks,
>> Angli
>>
>> ___
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] How to print out intermediate confusion networks / lattices?

2017-03-23 Thread Angli Liu
Hi Moses community,

In decoding, is it possible to have Moses output a confusion network (CN)
or a word lattice (WL) instead of the decoded text for each sentence? I'm
aware that one parameter of the decoder is "-inputtype"; the question is
which parameter of the decoder determines the output type (CN, WL, or
plain text)?

Also, where exactly can I find the decoder code (responsible for what the
binary "moses" does) inside https://github.com/moses-smt/mosesdecoder?

Thanks,
Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] BLEU score decoding word lattice

2017-02-01 Thread Angli Liu
Hi all,

Is there a way to do lattice decoding with BLEU in Moses? I.e., given a
word lattice, find the path with the highest BLEU score? If so, what
function should I call, and in what format should I feed the lattice in?
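
For concreteness, I assume the lattice would be given in Moses' PLF format,
along the lines of the word-lattice example in the documentation, where
each edge is a (word, probability, distance-to-end-node) triple:

((('einen', 1.0, 1),),
 (('wettbewerbsbedingten', 0.5, 2),
  ('wettbewerbs', 0.25, 1),
  ('wettbewerb', 0.25, 1),),
 (('bedingten', 1.0, 1),),
 (('preissturz', 0.5, 2),
  ('preis', 0.5, 1),),
 (('sturz', 1.0, 1),),)

Is that also the right format for this use case?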

Thanks!
Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Tuning for factored phrase based systems

2016-12-06 Thread Angli Liu
Thank you!

On Tue, Dec 6, 2016 at 12:55 AM Sašo Kuntaric <saso.kunta...@gmail.com>
wrote:

> Hi Angli,
>
> Here is an excerpt of Hieu's answers on this topic, from when I was doing
> research on factored models; it might be of some help:
>
> On 30/06/2016 21:44, Sašo Kuntaric wrote:
>
> Hi all,
>
> I would like to ask one more question. When you say that my reference only
> has the surface form, are you talking about the "tuning corpus", which in
> the case of my command
>
> ~/mosesdecoder/scripts/training/mert-moses.pl \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/tuning_corpus.tagged.clean.en \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/tuning_corpus.tagged.clean.sl \
>     ~/mosesdecoder/bin/moses \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/test/model/moses.ini \
>     --mertdir ~/mosesdecoder/bin/ --decoder-flags="-threads all"
>
> are tuning_corpus.tagged.clean.en and tuning_corpus.tagged.clean.sl? Can
> tuning be done with files that only contain surface forms?
>
> It's usual that the reference tuning data does not have factors, even if
> there are factors in the phrase table. After all, as long as the output
> surface form is correct, you don't care whether the other factors are wrong.
>
> Will the results be compatible with tuning done with a factored tuning
> corpus?
>
> yes
>
> Best regards,
>
> Sašo
>
> 2016-12-04 1:37 GMT+01:00 Hieu Hoang <hieuho...@gmail.com>:
>
>
>
> Hieu
> Sent while bumping into things
>
> On 1 Dec 2016 07:01, "Angli Liu" <ang...@cs.washington.edu> wrote:
>
> Hi, what's the major difference between the tuning process for a factored
> phrase-based system (i.e., surface+POS data) and a simple baseline
> phrase-based system?
>
>
> Nothing; the tuning just optimises weights for feature functions.
>
> If you decompose your translation so that it has multiple phrase tables
> and generation models, then they are just extra feature functions with
> weights to be tuned.
>
> Do I need to organize the dev set the same way as the training set (i.e.,
> surface|pos)?
>
> Yes
>
> Is there a tutorial on the Moses website on this topic?
>
> Maybe this
> http://www.statmt.org/moses/?n=FactoredTraining.FactoredTraining
>
>
> Thanks!
>
> -Angli
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> --
> lp,
>
> Sašo
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Tuning for factored phrase based systems

2016-11-30 Thread Angli Liu
Hi, what's the major difference between the tuning process for a factored
phrase-based system (i.e., surface+POS data) and a simple baseline
phrase-based system? Do I need to organize the dev set the same way as the
training set (i.e., surface|pos, as in the example below)? Is there a
tutorial on the Moses website on this topic? Thanks!
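
For reference, by "the same way" I mean each dev-set token carrying its
factors, e.g. (assuming factor 0 = surface and factor 1 = POS):

the|DT house|NN is|VBZ small|JJ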

-Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] BLEU score on dev set doesn't match what's reported in moses.ini

2016-10-26 Thread Angli Liu
Hi - I trained a phrase-based system from a low-resource language to
English and got *13.6633* as the BLEU score. However, when I decoded the
same dev set and computed BLEU against the English side of that dev set, I
only got *3.69*. I then did a manual grid search over the parameter space
in moses.ini (the one generated at the end of tuning/development) and got a
BLEU of *3.77* at best. Both recasing and tokenization were applied to the
dev-set output I computed BLEU on (see the scoring sketch below).
I'm wondering what could cause the BLEU score recorded in moses.ini, which
was derived from this dev set, to differ from the one I computed on the
same dev set.
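
For reference, my scoring step was roughly the standard multi-bleu script
over the recased, tokenized output (file names here are placeholders):

~/mosesdecoder/scripts/generic/multi-bleu.perl dev.ref.en < dev.output.recased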

Thanks.

- Angli
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support