Re: [Moses-support] Finding the top 5 most ambiguous words

2016-05-13 Thread Barry Haddow

Hi Joe

You could also look at the entropy of each source word's translation 
distribution. I'll leave it to Matt to post the one-liner for that one.
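
For reference, here is a rough sketch of what such an entropy ranking might look like. It is not the one-liner from the thread. It assumes the usual Moses phrase-table layout ("source ||| target ||| scores ...") and that the third score is the direct translation probability p(e|f); check both against your own table. The function name source_entropy and the toy input are made up for illustration.

```shell
# Hypothetical helper, not part of Moses: reads phrase-table lines on stdin and
# prints "entropy<TAB>source word", highest entropy first.
# Assumption: fields are separated by " ||| " and the third score in the
# scores field is the direct translation probability p(e|f).
source_entropy() {
  awk -F' [|][|][|] ' '
    $1 !~ / / {                      # keep single-word source phrases only
      split($3, s, " ")              # s[3] is assumed to be p(e|f)
      p = s[3] + 0
      if (p > 0) h[$1] -= p * log(p) / log(2)   # accumulate -p*log2(p), in bits
    }
    END { for (w in h) printf "%.4f\t%s\n", h[w], w }
  ' | sort -nr
}

# Toy input, made up for illustration.
printf '%s\n' \
  'bank ||| Bank ||| 0.5 0.5 0.5 0.5' \
  'bank ||| Ufer ||| 0.5 0.5 0.5 0.5' \
  'house ||| Haus ||| 0.9 0.9 0.9 0.9' \
  'house ||| Heim ||| 0.1 0.1 0.1 0.1' | source_entropy | head -n5
```

On the toy input "bank" comes out on top with 1 bit. On a real table you would run something like: gzip -cd model/phrase-table.gz | source_entropy | head -n5. If the table has been pruned, the probabilities for a word may not sum to 1, so treat the numbers as approximate.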


cheers - Barry

On 13/05/16 15:10, Matt Post wrote:
gzip -cd model/phrase-table.gz | cut -d\| -f1 | sort | uniq -c | sort -nr | head -n5


(according to one definition of "ambiguous")

On May 11, 2016, at 2:53 AM, Joe Jean wrote:


Hello,

How would you go about finding the top 5 most ambiguous words in a 
translation system just by looking at the phrase table and the 
lexical translation tables? Thanks.



___
Moses-support mailing list
Moses-support@mit.edu 
http://mailman.mit.edu/mailman/listinfo/moses-support






The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Re: [Moses-support] Finding the top 5 most ambiguous words

2016-05-13 Thread Matt Post
gzip -cd model/phrase-table.gz | cut -d\| -f1 | sort | uniq -c | sort -nr | head -n5

(according to one definition of "ambiguous")
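
The command above counts phrase-table entries for every source phrase, multi-word phrases included. Since the question asked about words, a variant restricted to single-word source phrases might look like the sketch below. It assumes fields are separated by " ||| "; the function name count_word_translations and the toy input are made up for illustration.

```shell
# Hypothetical helper, not part of Moses: reads phrase-table lines on stdin and
# prints "count source-word", most entries first.
# Assumption: fields are separated by " ||| ".
count_word_translations() {
  awk -F' [|][|][|] ' '
    $1 !~ / / { n[$1]++ }            # skip multi-word source phrases
    END { for (w in n) print n[w], w }
  ' | sort -nr
}

# Toy input, made up for illustration.
printf '%s\n' \
  'bank ||| Bank ||| 0.4 0.4 0.4 0.4' \
  'bank ||| Ufer ||| 0.4 0.4 0.4 0.4' \
  'bank ||| Kreditinstitut ||| 0.2 0.2 0.2 0.2' \
  'the bank ||| die Bank ||| 0.5 0.5 0.5 0.5' \
  'the bank ||| das Ufer ||| 0.5 0.5 0.5 0.5' \
  'house ||| Haus ||| 1 1 1 1' | count_word_translations | head -n5
```

On a real table: gzip -cd model/phrase-table.gz | count_word_translations | head -n5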

> On May 11, 2016, at 2:53 AM, Joe Jean wrote:
> 
> Hello, 
> 
> How would you go about finding the top 5 most ambiguous words in a 
> translation system just by looking at the phrase table and the lexical 
> translation tables? Thanks.
> 
>  
> 