Hi,

On Mon, Sep 26, 2011 at 5:40 PM, marco turchi <marco.tur...@gmail.com> wrote:
> corpus-coverage-summary and ttable-coverage-summary:
> what does each column represent?

- n-gram order
- number of occurrences in corpus/t-table
- number of distinct phrases in test set with this number of
occurrences ("type")
- total number of phrases in test set with this number of occurrences ("token")

For low occurrence counts, these numbers are also reported at the top
of the web page.
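
To make the type/token distinction concrete, here is a minimal sketch of
how such a summary could be computed for one n-gram order. The function
name coverage_summary and its arguments are hypothetical, not the names
used in the Moses analysis scripts.

```python
from collections import Counter

def coverage_summary(test_phrases, reference_counts):
    """Group test-set phrases by how often they occur in the corpus/t-table.

    test_phrases: every phrase occurrence of one n-gram order in the test set.
    reference_counts: dict mapping phrase -> occurrences in corpus or t-table.
    Returns {occurrences: (types, tokens)}.
    """
    token_counts = Counter(test_phrases)  # occurrences of each phrase in the test set
    summary = {}
    for phrase, tokens in token_counts.items():
        occurrences = reference_counts.get(phrase, 0)  # unseen phrases count as 0
        types, total_tokens = summary.get(occurrences, (0, 0))
        # each distinct phrase adds one type and all its test-set occurrences as tokens
        summary[occurrences] = (types + 1, total_tokens + tokens)
    return summary
```

With test phrases "a", "a", "b", "c" and reference counts a=5, b=5, the
bucket for 5 occurrences holds 2 types and 3 tokens, and the bucket for
0 occurrences holds 1 type and 1 token.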

> ttable-coverage-by-phrase:
> I suppose that the second column is the number of source phrases in the tt
> table where that particular phrase appears, but what is it the third column?
> is the translation entropy?

Yes, translation entropy based on normalized forward phrase
translation probability.
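
As a rough illustration of what that entropy means, the sketch below
normalizes the forward probabilities of one source phrase's translations
and computes the entropy of the result. The function name and the choice
of log base 2 are assumptions; the analysis code may use a different base.

```python
import math

def translation_entropy(forward_probs):
    """Entropy of the forward translation probabilities of one source phrase.

    forward_probs: one probability per distinct translation; they are
    renormalized here so that they sum to 1 before computing the entropy.
    """
    total = sum(forward_probs)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for p in forward_probs:
        if p > 0:
            q = p / total               # normalized probability
            entropy -= q * math.log2(q)  # base 2 is an assumption
    return entropy
```

A phrase with a single translation gets entropy 0; a phrase with two
equally likely translations gets entropy 1 under base 2.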

> input-annotation:
> which information is reported after each sentence?

For each span over the input sentence:
- span range
- count in corpus
- count in ttable (number of distinct translations)
- translation table entropy

This is the basis of the colorful visualization over the input
sentence on the web page.
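
The per-span records can be pictured with the sketch below. It only
illustrates the four fields listed above; the function name, the lookup
dictionaries, the max_len limit, and the tuple layout are all hypothetical
and do not reproduce the actual file format.

```python
def annotate_spans(words, corpus_counts, ttable_counts, entropies, max_len=3):
    """Return one record per span of the input sentence.

    Each record is (start, end, corpus count, ttable count, entropy),
    with start and end as inclusive word positions. The three lookup
    dictionaries map a phrase string to its value; missing phrases get 0.
    """
    annotations = []
    for start in range(len(words)):
        last = min(start + max_len, len(words))
        for end in range(start, last):
            phrase = " ".join(words[start:end + 1])
            annotations.append((
                start,
                end,
                corpus_counts.get(phrase, 0),
                ttable_counts.get(phrase, 0),
                entropies.get(phrase, 0.0),
            ))
    return annotations
```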

-phi
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support