On a related note, you don't even have to use probabilities as features
in the phrase table.
For instance, using counts(e|f) and counts(f|e) instead of p(e|f) and
p(f|e) gives OK translations. The features really are just scores.
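To make the "features are just scores" point concrete, here's a minimal sketch of reading one Moses phrase-table entry; the decoder just reads whatever numbers are in the score field, so counts would slot in exactly where probabilities do. (Four score columns shown for brevity; real tables often carry a phrase-penalty column too, and the example line is made up.)

```python
def parse_phrase_table_line(line):
    """Split a Moses phrase-table entry 'src ||| tgt ||| s1 s2 ...'.

    The scores are treated as opaque feature values: nothing here
    requires them to be probabilities.
    """
    fields = [f.strip() for f in line.split("|||")]
    src, tgt = fields[0], fields[1]
    scores = [float(s) for s in fields[2].split()]
    return src, tgt, scores

# Hypothetical entry; the numbers could be p(e|f)-style probabilities
# or raw counts -- the parsing and decoding machinery doesn't care.
src, tgt, scores = parse_phrase_table_line(
    "das Haus ||| the house ||| 0.8 0.5 0.7 0.4")
```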
using probabilities (nc-de):
devtest2006: 27.55 BLEU-c; 28.29 BLEU
The weights are in the Moses config file that is produced by the MERT scripts.
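For orientation, in Moses configs of that era the tuned weights appear as sections in moses.ini; a sketch of what to look for (all values here are placeholders, and the number of [weight-t] lines must match your phrase-table's score columns):

```
[weight-d]
0.3
[weight-l]
0.5
[weight-t]
0.2
0.2
0.2
0.2
0.2
[weight-w]
-1
```

The MERT scripts write a tuned copy of this file (typically moses.ini in the tuning run's working directory).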
Cheers,
Lane
On Wed, Sep 21, 2011 at 9:45 AM, Taylor Rose
wrote:
Thanks for the information Kevin. Where would I find these feature
weights? I've found files in Moses that I suspect might be the weights
but they're not labeled and the file/directory names don't really help
either.
--
Taylor Rose
Machine Translation Intern
Language Intelligence
IRC: Handle: tros
Hey Taylor,
Sounds like you are trying to come up with a simple heuristic for scoring
phrase table entries for purposes of pruning. Many choices are possible
here, so it's good to check the literature as folks mentioned above. But as
far as I know there's no single optimal answer for this. Typicall
Taylor,
If you don't have a background in NLP or CL (or even if you do), I
highly recommend taking a look at Philipp Koehn's book "Statistical
Machine Translation".
I hope this doesn't come across as RTFM. That's not what I mean. :)
Cheers,
Lane
On Tue, Sep 20, 2011 at 3:45 PM, Taylor Rose
wrote:
What would happen if I just multiplied the Direct Phrase Translation
probability φ(e|f) by the Direct Lexical weight Lex(e|f)? It seems
like that would work. Sorry if I'm asking dumb questions; I come from
the computational side of computational linguistics. I'm learning as
fast as I can.
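The heuristic asked about above can be sketched in a few lines: rank the candidate translations of a source phrase by the product φ(e|f) · Lex(e|f). This gives a ranking score, not a probability; the candidate phrases and numbers below are made up for illustration.

```python
def heuristic_score(phi_e_given_f, lex_e_given_f):
    # Product of the direct phrase translation probability and the
    # direct lexical weight. Useful for ranking, but the result is
    # not itself a probability.
    return phi_e_given_f * lex_e_given_f

# (target phrase, phi(e|f), lex(e|f)) -- hypothetical values
candidates = [("the house", 0.7, 0.4),
              ("the home",  0.2, 0.5)]
ranked = sorted(candidates,
                key=lambda c: heuristic_score(c[1], c[2]),
                reverse=True)
```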
--
Taylor Rose
Some terminology: these are feature values, not metrics.
Feature values have a number of roles to play, e.g. P(e|f) indicates
the chance that phrase e should be the translation of phrase f. These
values are designed to be used together, and weighted to produce an
overall score for a translation candidate.
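The "used together, and weighted" combination is the standard log-linear model: the overall score is a weighted sum of log feature values, score = Σᵢ λᵢ · log hᵢ. A minimal sketch, with placeholder weights (in Moses the real weights come out of MERT tuning):

```python
import math

def loglinear_score(features, weights):
    # Weighted sum of log feature values; the decoder compares these
    # scores across translation candidates.
    return sum(w * math.log(h) for h, w in zip(features, weights))

# Hypothetical feature values for one phrase pair, e.g.
# phi(f|e), lex(f|e), phi(e|f), lex(e|f)
features = [0.8, 0.5, 0.7, 0.4]
weights = [0.2, 0.2, 0.2, 0.2]  # placeholder, not tuned values
score = loglinear_score(features, weights)
```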
Taylor Rose wrote:
So what exactly can I infer from the metrics in the phrase table? I want
to be able to compare phrases to each other. From my experience,
multiplying them and sorting by that number has given me more accurate
phrases... Obviously calling that metric "probability" is wrong. My
question is: What is t
Exactly. The only correct way to get real probabilities out would be
to compute the normalising constant and renormalise the dot products
for each phrase pair.
Remember that this is best thought of as a set of scores, weighted
such that the relative proportions of each model are balanced.
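A sketch of that renormalisation, assuming you have the (log-domain) model scores for all candidate translations of one source phrase: exponentiate, sum to get the normalising constant Z, and divide. The example scores are made up.

```python
import math

def renormalise(log_scores):
    """Turn unnormalised log-domain scores for one source phrase's
    candidates into a proper probability distribution."""
    z = sum(math.exp(s) for s in log_scores)  # normalising constant
    return [math.exp(s) / z for s in log_scores]

# Hypothetical log-linear scores for three candidate translations
probs = renormalise([-1.2, -2.0, -3.5])
```

Note that Z has to be computed per source phrase, over all of its candidates; there is no shortcut that turns a single phrase pair's score into a probability on its own.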
Miles
Taylor Rose wrote:
I am looking at pruning phrase tables for the experiment I'm working on.
I'm not sure if it would be a good idea to include the 'penalty' metric
when calculating probability. It is my understanding that multiplying 4
or 5 of the metrics from the phrase table would result in a probability
of the phr
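The pruning idea described above can be sketched as follows: group entries by source phrase, rank candidates by the product of their phrase-table scores (a ranking heuristic, not a probability, per the discussion in this thread), and keep the top k. Entry data and k are hypothetical.

```python
from collections import defaultdict
from functools import reduce
from operator import mul

def prune(entries, k=2):
    """entries: list of (src, tgt, [scores]); keep the top-k target
    phrases per source phrase, ranked by the product of the scores."""
    by_src = defaultdict(list)
    for src, tgt, scores in entries:
        by_src[src].append((tgt, reduce(mul, scores)))
    return {src: sorted(cands, key=lambda c: c[1], reverse=True)[:k]
            for src, cands in by_src.items()}

# Hypothetical phrase-table entries for one source phrase
entries = [("das Haus", "the house", [0.7, 0.4]),
           ("das Haus", "the home",  [0.2, 0.5]),
           ("das Haus", "house",     [0.1, 0.1])]
pruned = prune(entries, k=2)
```

Whether to fold the penalty feature into the product is exactly the judgment call raised above; since it is constant per entry type, including it mostly rescales rather than reranks.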