Thanks for the information, Kevin. Where would I find these feature
weights? I've found files in Moses that I suspect might be the weights,
but they're not labeled, and the file/directory names don't really help
either.
--
Taylor Rose
Machine Translation Intern
Language Intelligence
IRC: Handle: trose
Server: freenode
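(For later readers: in Moses of this vintage the trained feature weights
normally live in the decoder configuration file, moses.ini, rather than
in a separately labeled weights file; tuning with mert-moses.pl writes
out an updated moses.ini containing the learned values. A hypothetical
excerpt, with made-up numbers, might look like this:

    [weight-t]
    0.20
    0.20
    0.30
    0.20
    0.10

    [weight-l]
    0.50

    [weight-d]
    0.30

    [weight-w]
    -1.00

The five [weight-t] values would be the weights for the four
phrase-table probabilities plus the phrase penalty, in the order the
scores appear in the table; [weight-l], [weight-d], and [weight-w] are
the language model, distortion, and word penalty weights.)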
On Tue, 2011-09-20 at 23:32 -0400, Kevin Gimpel wrote:
> Hey Taylor,
> Sounds like you are trying to come up with a simple heuristic for
> scoring phrase table entries for purposes of pruning. Many choices are
> possible here, so it's good to check the literature as folks mentioned
> above. But as far as I know there's no single optimal answer for this.
> Typically researchers try a few things and use the approach that gives
> the best results on the task at hand. But while there's no single
> correct answer, here are some suggestions:
> If you have trained weights for the features, you should definitely
> use those weights (as Miles suggested). So this would involve
> computing the dot product of the features and weights as follows:
>
> score(f, e) = \theta_1 * log(p(e|f)) + \theta_2 * log(lex(e|f)) +
>               \theta_3 * log(p(f|e)) + \theta_4 * log(lex(f|e))
>
> where the thetas are the learned weights for each of the phrase table
> features.
> Note that the phrase table typically stores the feature values as
> probabilities, and Moses takes logs internally before computing the
> dot product. So you should take logs yourself before multiplying by
> the feature weights.
> If you don't have feature weights, using uniform weights is
> reasonable.
> And regarding your original question above: since the phrase penalty
> feature has the same value for all phrase pairs, it shouldn't affect
> pruning, right?
> HTH,
> Kevin
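(For concreteness, a minimal sketch of the dot product Kevin describes,
applied to a plain-text phrase table. The weight values are hypothetical
placeholders, and the column order in the comment is only the common
Moses layout; check your own table:

    import math

    # Hypothetical weights theta_1..theta_4, in the same order as the
    # four probability columns of the phrase table.
    WEIGHTS = [0.3, 0.2, 0.3, 0.2]

    def score_entry(line, weights=WEIGHTS):
        # One phrase-table line, in the common Moses layout:
        #   source ||| target ||| phi(f|e) lex(f|e) phi(e|f) lex(e|f) 2.718
        fields = line.strip().split(' ||| ')
        probs = [float(x) for x in fields[2].split()[:4]]  # drop the penalty
        # The table stores probabilities, so take logs before weighting.
        return sum(w * math.log(p) for w, p in zip(weights, probs))

Sorting each source phrase's entries by this score and keeping the top
few would be one simple pruning criterion; as Kevin notes, the constant
phrase penalty can be left out, since it can never change the ranking.)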
> On Tue, Sep 20, 2011 at 4:21 PM, Lane Schwartz <dowob...@gmail.com>
> wrote:
> Taylor,
>
> If you don't have a background in NLP or CL (or even if you do), I
> highly recommend taking a look at Philipp's book "Statistical Machine
> Translation".
>
> I hope this doesn't come across as RTFM. That's not what I mean. :)
>
> Cheers,
> Lane
>
> On Tue, Sep 20, 2011 at 3:45 PM, Taylor Rose
> <tr...@languageintelligence.com> wrote:
> > What would happen if I just multiplied the Direct Phrase Translation
> > probability φ(e|f) by the Direct Lexical weight Lex(e|f)? That seems
> > like it would work? Sorry if I'm asking dumb questions. I come from
> > the computational side of computational linguistics. I'm learning as
> > fast as I can.
> > --
> > Taylor Rose
> > Machine Translation Intern
> > Language Intelligence
> > IRC: Handle: trose
> > Server: freenode
> >
> > On Tue, 2011-09-20 at 12:11 -0400, Burger, John D. wrote:
> >> Taylor Rose wrote:
> >>
> >> > So what exactly can I infer from the metrics in the phrase table?
> >> > I want to be able to compare phrases to each other. From my
> >> > experience, multiplying them and sorting by that number has given
> >> > me more accurate phrases... Obviously calling that metric
> >> > "probability" is wrong. My question is: what is that metric best
> >> > indicative of?
> >>
> >> That product has no principled interpretation that I can think of.
> >> Phrase pairs with high values on all four features will obviously
> >> have high-value products, but that's only interesting because all
> >> the features happen to be roughly monotonic in phrase quality. If
> >> you wanted a more principled way to rank the phrases, I'd just use
> >> the MERT weights for those features, and combine them with a dot
> >> product.
> >>
> >> Pre-filtering the phrase table is something lots of people have
> >> looked at, and there are many approaches to this. I like this paper:
> >>
> >> Improving Translation Quality by Discarding Most of the Phrasetable
> >> Johnson, John Howard; Martin, Joel; Foster, George; Kuhn, Roland
> >> http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=shwart&index=an&req=5763542
> >>
> >> - JB
> >>
> >> > On Tue, 2011-09-20 at 16:14 +0100, Miles Osborne wrote:
> >> >> exactly, the only correct way to get real probabilities out would
> >> >> be to compute the normalising constant and renormalise the dot
> >> >> products for each phrase pair.
> >> >>
> >> >> remember that this is best thought of as a set of scores,
> >> >> weighted such that the relative proportions of each model are
> >> >> balanced.
> >> >>
> >> >> Miles
> >> >>
> >> >> On 20 September 2011 16:07, Burger, John D. <j...@mitre.org> wrote:
> >> >>> Taylor Rose wrote:
> >> >>>
> >> >>>> I am looking at pruning phrase tables for the experiment I'm
> >> >>>> working on. I'm not sure if it would be a good idea to include
> >> >>>> the 'penalty' metric when calculating probability. It is my
> >> >>>> understanding that multiplying 4 or 5 of the metrics from the
> >> >>>> phrase table would result in a probability of the phrase being
> >> >>>> correct. Is this a good understanding, or am I missing something?
> >> >>>
> >> >>> I don't think this is correct. At runtime, all the features from
> >> >>> the phrase table and a number of other features, some only
> >> >>> available during decoding, are combined in an inner product with
> >> >>> a weight vector to score partial translations. I believe it's
> >> >>> fair to say that at no point is there an explicit modeling of "a
> >> >>> probability of the phrase being correct", at least not in
> >> >>> isolation from the partially translated sentence. This is not to
> >> >>> say you couldn't model this yourself, of course.
> >> >>>
> >> >>> - John Burger
> >> >>> MITRE
>
> --
> When a place gets crowded enough to require ID's, social collapse is
> not far away. It is time to go elsewhere. The best thing about space
> travel is that it made it possible to go elsewhere.
>         -- R.A. Heinlein, "Time Enough For Love"

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
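(To make Miles's point concrete: real probabilities can be recovered by
computing the normalising constant per source phrase, i.e. a softmax
over each source phrase's candidate translations. A sketch, reusing the
hypothetical score_entry helper from earlier:

    import math
    from collections import defaultdict

    def renormalise(lines):
        # Group each source phrase's entries together with their
        # dot-product scores.
        by_source = defaultdict(list)
        for line in lines:
            by_source[line.split(' ||| ')[0]].append((line, score_entry(line)))
        # Softmax within each group; subtracting the max keeps exp() stable.
        probs = {}
        for entries in by_source.values():
            m = max(s for _, s in entries)
            z = sum(math.exp(s - m) for _, s in entries)  # normalising constant
            for line, s in entries:
                probs[line] = math.exp(s - m) / z
        return probs

A simple filter could then drop entries whose renormalised probability
falls below a threshold. Note that the Johnson et al. paper cited above
instead prunes by a significance test on co-occurrence counts rather
than by model score.)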