Hi Philip,
Thank you. I won't fiddle with the stack size for my Hierarchical models
then!
Yours,
Per Tunedal
On Tue, Jun 18, 2013, at 22:13, Philip Williams wrote:
> Hi Per,
>
> the note about stack size having little impact on performance should be
> true for syntax-based as well, and 200 is probably more than enough for
> most syntax-based systems. Increasing the stack size comes at the cost
> of additional memory consumption, and more so for syntax-based because
> the number of stacks is greater than for phrase-based: O(n^2) instead
> of O(n), where n is the source sentence length (though -max-chart-span
> will often reduce the number in practice).
>
> Phil
>
> On 18 Jun 2013, at 20:39, Per Tunedal <[email protected]> wrote:
>
> > Hi Philip,
> > thanks for your thorough answer. Just what I needed!
> >
> > I've only got one more question: why such a small stack size as 200
> > by default? Why not 2000, as suggested on the Advanced Features page
> > for decoding with cube pruning for phrase models:
> >
> > "To get faster performance than the default Moses setting at roughly
> > the same performance, use the parameter settings:
> >
> > -search-algorithm 1 -cube-pruning-pop-limit 2000 -s 2000
> >
> > This uses cube pruning (-search-algorithm) that adds 2000 hypotheses
> > to each stack (-cube-pruning-pop-limit 2000) and also increases the
> > stack size to 2000 (-s 2000). Note that with cube pruning, the size
> > of the stack has little impact on performance, so it should be set
> > rather high. The speed/quality trade-off is mostly regulated by the
> > cube pruning pop limit, i.e. the number of hypotheses added to each
> > stack."
> >
> > BTW, I've tried phrase model decoding as above, but with
> > -cube-pruning-pop-limit 1000 to match my chart models. The
> > translation quality (BLEU score) improved slightly compared to the
> > standard decoding, and the translation was faster.
> >
> > Yours,
> > Per Tunedal
> >
> > On Tue, Jun 18, 2013, at 21:17, Philip Williams wrote:
> >> Hi Per,
> >>
> >> the -stack option controls histogram pruning in moses_chart.
> >> It's always on, and the default stack size is 200 (the same as for
> >> phrase-based).
> >>
> >> If your model is hierarchical or tree-to-string (i.e. no target
> >> syntax) then there will be up to two stacks per chart cell: one for
> >> X hypotheses and one for S hypotheses. If your model has target
> >> syntax then there can potentially be many stacks per cell (one for
> >> each distinct LHS label for the hypotheses that cover the span).
> >>
> >> The pop limit is applied on a per-cell basis and controls the number
> >> of hypotheses popped from the cube pruning queue (assuming you're
> >> using cube pruning, of course). Once a hypothesis is popped, it's
> >> added to one of the chart cell's stacks, chosen according to the
> >> hypothesis' label. If the stack already contains one or more
> >> hypotheses with identical LM state values then the hypotheses are
> >> merged into a single stack entry (hypothesis recombination).
> >> Finally, the stack is pruned so that it doesn't exceed the maximum
> >> number of entries.
> >>
> >> The default pruning values are probably fine for most models. If
> >> you're using both source and target syntax then translation quality
> >> is likely to be substantially worse than phrase-based due to
> >> syntactic divergence. Solutions for this problem have been proposed
> >> in the literature (e.g. David Chiang's "Learning to Translate with
> >> Source and Target Syntax" in ACL 2010) but haven't made it into
> >> Moses yet. Results for string-to-tree or tree-to-string should be
> >> closer to phrase-based.
> >>
> >> Hope that helps...
> >>
> >> Phil
> >>
> >>
> >> On 18 Jun 2013, at 17:49, Kenneth Heafield <[email protected]> wrote:
> >>
> >>> There are a few kinds of pruning in syntactic decoding:
> >>>
> >>> -ttable-limit sets the max number of target side rules per source
> >>> rule.
> >>>
> >>> -cube-pruning-pop-limit controls the beam size aka pop limit even
> >>> if the algorithm isn't cube pruning. Default 1000.
> >>>
> >>> I think the parser can also throw out entire source side matches.
> >>> This is disabled by default; ask Phil Williams about it.
> >>>
> >>> Kenneth
> >>>
> >>> On 06/18/13 12:36, Per Tunedal wrote:
> >>>> Hi,
> >>>> Is histogram pruning always on? What's the default stack size? Why
> >>>> isn't it listed among the settings on the Syntax Tutorial page?
> >>>>
> >>>> How is the search and pruning actually done for hierarchical
> >>>> models? When is the histogram pruning done?
> >>>>
> >>>> I've tested with the settings:
> >>>> moses_chart -f trainTreeSyntaxInSourceAndTarget/model/moses.ini <
> >>>> fr-sv.test.fr > fr-sv.test.fr.TreeSyntaxInSourceAndTarget.translated
> >>>> 2> stderrTreeSyntaxInSourceAndTarget.txt
> >>>>
> >>>> I suppose I got the default cube-pruning-pop-limit 1000, but what
> >>>> about the histogram pruning? I've read that the stack size isn't
> >>>> much of a problem when using cube pruning (Advanced Features
> >>>> page).
> >>>>
> >>>> BTW, the results are very disappointing.
> >>>>
> >>>> Yours,
> >>>> Per Tunedal
> >>>>
> >>>> PS I've read a lot of books and papers on the subject but got lost
> >>>> in the details. A very simplified explanation with the main steps
> >>>> for the implementation in Moses would be helpful.
> >>>>
> >>>> On Tue, Jun 18, 2013, at 17:08, Hieu Hoang wrote:
> >>>>> there is histogram pruning in the chart decoder. It's implemented
> >>>>> in ChartHypothesisCollection::PruneToSize().
> >>>>> There is no threshold pruning.
> >>>>>
> >>>>>
> >>>>> On 17 June 2013 21:14, Per Tunedal <[email protected]> wrote:
> >>>>>
> >>>>>>
> >>>>>> Hi,
> >>>>>> I can't find any settings for e.g. histogram pruning for the
> >>>>>> chart-decoder. Is cube pruning the only pruning used by the
> >>>>>> Moses chart-decoder?
> >>>>>> When using cube pruning with a phrase model, I suppose both
> >>>>>> threshold pruning and histogram pruning are used as well.
> >>>>>> The cube pruning only affects which new hypotheses to add to
> >>>>>> the stack, if I've understood the Advanced Features page
> >>>>>> correctly.
> >>>>>> Have I missed something?
> >>>>>> Yours,
> >>>>>> Per Tunedal
> >>>>>> _______________________________________________
> >>>>>> Moses-support mailing list
> >>>>>> [email protected]
> >>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Hieu Hoang
> >>>>> Research Associate
> >>>>> University of Edinburgh
> >>>>> http://www.hoang.co.uk/hieu
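Phil's description of the per-cell stack mechanics above can be sketched in a few lines of Python. This is not Moses code: the class names, the LM-state representation, and the scores are all invented for illustration, and real recombination compares additional decoder state that is omitted here.

```python
# Illustrative sketch of a chart cell's stacks: popped hypotheses are
# grouped by LHS label, recombined when their LM states are identical
# (keeping the higher-scoring one), and each stack is capped at a fixed
# size (histogram pruning). Names and values are invented.

STACK_SIZE = 200  # the Moses default described in the thread


class Hypothesis:
    def __init__(self, label, lm_state, score):
        self.label = label        # LHS non-terminal, e.g. "X" or "S"
        self.lm_state = lm_state  # e.g. the last n-1 target words
        self.score = score


class ChartCell:
    def __init__(self):
        self.stacks = {}  # label -> {lm_state: best Hypothesis}

    def add(self, hyp):
        stack = self.stacks.setdefault(hyp.label, {})
        # Recombination: keep only the best hypothesis per LM state.
        best = stack.get(hyp.lm_state)
        if best is None or hyp.score > best.score:
            stack[hyp.lm_state] = hyp
        # Histogram pruning: never exceed the maximum number of entries.
        if len(stack) > STACK_SIZE:
            worst = min(stack, key=lambda s: stack[s].score)
            del stack[worst]


cell = ChartCell()
cell.add(Hypothesis("X", ("the", "cat"), -2.0))
cell.add(Hypothesis("X", ("the", "cat"), -1.5))  # recombined: replaces -2.0
cell.add(Hypothesis("S", ("a", "dog"), -3.0))    # separate stack for S
```

The key point the sketch shows is that recombination happens before pruning, so two derivations with the same LM state never cost two stack slots.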
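The O(n^2)-versus-O(n) point about the number of stacks can be checked with a quick count: chart decoding keeps stacks per source span [i, j], while phrase-based decoding keeps one stack per number of covered source words. The `max_chart_span` argument below is only meant to mirror the effect Phil attributes to -max-chart-span; these are illustrative counting functions, not Moses code.

```python
# Count chart cells (one or more stacks each) versus phrase-based stacks.

def num_chart_cells(n, max_chart_span=None):
    """Number of source spans [i, j) in a sentence of length n,
    optionally capped at a maximum span width."""
    spans = [(i, j) for i in range(n) for j in range(i + 1, n + 1)]
    if max_chart_span is not None:
        spans = [(i, j) for (i, j) in spans if j - i <= max_chart_span]
    return len(spans)

def num_phrase_based_stacks(n):
    """One stack per number of covered source words (0..n)."""
    return n + 1

for n in (10, 20, 40):
    print(n, num_phrase_based_stacks(n), num_chart_cells(n),
          num_chart_cells(n, max_chart_span=20))
```

For n = 40 this gives 41 phrase-based stacks against 820 chart cells (n(n+1)/2), dropping to 610 with the span capped at 20, which is why the memory cost of a large stack size bites harder in chart decoding.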
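The pop limit itself is simple to picture: of all candidate hypotheses for a cell, only the best `pop_limit` are ever popped and handed to the stacks. The sketch below fakes this with a plain priority queue over pre-scored candidates; real cube pruning generates candidates lazily from combinations of rules and sub-hypotheses, which this deliberately omits, and all names and scores here are invented.

```python
import heapq

def pop_candidates(scored_candidates, pop_limit=1000):
    """Pop the best `pop_limit` candidates, best-first.
    `scored_candidates` is a list of (candidate, score) pairs;
    higher scores are better."""
    # heapq is a min-heap, so push negated scores to pop best-first.
    queue = [(-score, cand) for cand, score in scored_candidates]
    heapq.heapify(queue)
    popped = []
    while queue and len(popped) < pop_limit:
        neg_score, cand = heapq.heappop(queue)
        popped.append((cand, -neg_score))
    return popped

# Five candidates with descending scores; only the best three survive.
cands = [("hyp%d" % i, -float(i)) for i in range(5)]
best = pop_candidates(cands, pop_limit=3)
```

This is why, as the Advanced Features page says, the pop limit rather than the stack size regulates the speed/quality trade-off: candidates beyond the limit are never explored at all, whereas the stack size only trims what has already been popped.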
