Re: [Moses-support] Syntax-based Constrained Decoding

2016-11-16 Thread Hieu Hoang
Good to know that the constrained decoding works. And yes, the 
reachability of the training data is only theoretical; it is guaranteed 
only in the absence of pruning such as cube pruning, beam thresholds, etc.




Re: [Moses-support] Syntax-based Constrained Decoding

2016-11-15 Thread Shuoyang Ding
Hi Hieu,

I’d made changes 1, 2, and 4 before emailing you, and the coverage didn’t change 
much. It turns out the bottleneck is beam-threshold: the default value was 
1e-5, which is a pretty tough limit for constrained decoding.
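
For anyone trying the same thing, the change amounts to this in the moses.ini 
(or, if I remember correctly, an equivalent -beam-threshold 0 override on the 
moses command line):

    [beam-threshold]
    0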

After setting that to 0, I played around a little with the cube-pruning pop limit. 
Coverage is around 25% to 40% depending on the limit you use, but higher 
coverage comes with longer decoding time, which is what one would expect.
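
In case it's useful, the sweep was along these lines (a sketch rather than my 
exact commands; file names are placeholders, and it assumes moses.ini 
parameters such as cube-pruning-pop-limit can be overridden on the command line):

    # sweep the pop limit to trade coverage against decoding time;
    # beam-threshold is already set to 0 in moses.ini
    for limit in 100 1000 10000; do
        moses -f moses.ini -cube-pruning-pop-limit $limit \
            < train.src > constrained.$limit.out 2> constrained.$limit.log
    done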

Still, for string-to-tree constrained decoding the easiest approach may be to 
decode with phrase tables built per sentence, since decoding with these models 
is generally slower. Even then, the default beam-threshold needs to be 
overridden for it to work properly.
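
Something like the following loop is what I have in mind (just a sketch with 
placeholder paths; it filters the full model down to each sentence rather than 
re-extracting rules per sentence, and it assumes the standard 
scripts/training/filter-model-given-input.pl script and its --Hierarchical 
option work for the syntax rule table):

    split -l 1 -d train.src sent.
    for f in sent.*; do
        # write a filtered rule table and moses.ini for this one sentence
        filter-model-given-input.pl filtered.$f moses.ini $f --Hierarchical
        moses -f filtered.$f/moses.ini -beam-threshold 0 < $f > out.$f
    done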

Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding 



Re: [Moses-support] Syntax-based Constrained Decoding

2016-10-28 Thread Hieu Hoang
Good point. The decoder is set up to translate quickly, so there are a few 
pruning parameters which throw out low-scoring rules or hypotheses.

These are some of the pruning parameters you'll need to change (there may be more):
1. [feature]
   PhraseDictionaryWHATEVER table-limit=0
2. [cube-pruning-pop-limit]
   100
3. [beam-threshold]
   0
4. [stack]
   100
Make the changes one at a time in case any of them make decoding too slow, even 
with constrained decoding.
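
If it's easier, I believe the decoder lets you override moses.ini parameters on 
the command line, so for quick experiments something like this may work (a 
sketch with placeholder file names; keep table-limit=0 in the [feature] line 
as above):

    moses -f moses.ini -cube-pruning-pop-limit 100 -beam-threshold 0 -stack 100 \
        < input.txt > output.txt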

It may be that you have to run the decoder with phrase tables that are 
trained on only one sentence at a time.

I'll be interested to know how you get on, so let me know how it goes.



[Moses-support] Syntax-based Constrained Decoding

2016-10-26 Thread Shuoyang Ding
Hi All,

I’m trying to do syntax-based constrained decoding on the same data from which 
I extracted my rules, and I’m getting very low coverage (~12%). I’m using GHKM 
rule extraction, which in theory should be able to reconstruct the target 
translation even with only minimal rules.
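
For context, the constraint is supplied to the decoder roughly like this (a 
sketch rather than my exact configuration; it assumes the ConstrainedDecoding 
feature function and its path parameter, with one reference per input line in 
ref.txt):

    [feature]
    ConstrainedDecoding path=ref.txt

    [weight]
    ConstrainedDecoding0= 1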

Judging from the search graph output, the decoder seems to prune out rules with 
very low scores, even when such a rule is the only one that can reconstruct the 
original reference.

I’m curious whether there is a way to disable pruning in the current 
constrained decoding implementation, or at least whether it would be feasible 
to do so.

Thanks!

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support