Jan, Terence, Jesús,

Thank you for your feedback. 

I'm glad to know that research is being pursued in this direction as well. It 
is quite surprising (to me, at least) that BLEU is actually recommended only 
for tuning SMT systems (by Philipp Koehn himself!). 

 Best regards,
 Vadim
  ----- Original Message ----- 
  From: Vadim Berman 
  To: [email protected] 
  Sent: Wednesday, October 27, 2010 10:51 PM
  Subject: [Mt-list] Meaning-based MT evaluation


  Hello group,

  We all know well that words are not just sets of characters. They stand for 
real-world entities. Sadly, it appears that the mainstream metrics have little 
regard for semantics. OK, NIST has something like semantic weight, and METEOR 
tries to match synonyms. I am intrigued by the less-known BADGER, but it 
appears to have adopted more or less the same approach. 

  However, it appears that the metrics themselves treat the content as string 
data with some semantic characteristics. (I am sure the developers of the 
metrics are present in this group, and I will be happy to hear their 
comments.)

  The "similarity to human translation" is not an exact science: there can be 
many different correct (or equally incorrect) human translations. J. L. 
Borges's essay "The Translators of the One Thousand and One Nights" 
(http://books.google.com.au/books?id=vLC5luAnbSUC&lpg=PR8&dq=%22The%20Translators%20of%20the%20One%20Thousand%20and%20One%20Nights%22%20borges&pg=PA94#v=onepage&q=%22The%20Translators%20of%20the%20One%20Thousand%20and%20One%20Nights%22%20borges&f=false
 - sorry, that's the best I could find) is an excellent illustration. A 
professional translator is not a mathematical unit. 


  What's constant then? The main idea, the semantic relations. The agent, the 
patient, the circumstance. These should stay put in the source and the target 
content. 

  Back in the 1960s, the ALPAC people noted that there are two variables in 
play: intelligibility and fidelity / trustworthiness. For some reason, 
fidelity seems to be largely overlooked. That's bizarre, because MT is about 
gisting, isn't it? 

  OK, everybody has heard about "Heath Ledger died" becoming "Tom Cruise died" 
in English -> Spanish translation by Google Translate. 100% readability, 0% 
fidelity. A pretty dangerous combination. (Worse than the opposite, BTW: 
disinformation is worse than no information.) This actually happens quite 
often in SMT: when the agent / patient must be swapped in translation, more 
often than not they stay put, resulting in the opposite meaning. Yet the 
metrics do not really reflect this well enough. So what if "not" is missing? 
It's just one small particle. 
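
  To make the point concrete, here is a minimal sketch of a purely 
string-based overlap score. It is clipped unigram precision only, a deliberate 
simplification and not BLEU or any other published metric (real metrics also 
use longer n-grams and length penalties, which soften but do not remove the 
effect). The function name and the example sentences are made up for 
illustration.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also occur in the reference.

    Counts are clipped, so a token repeated in the candidate is credited
    at most as many times as it appears in the reference.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    matched = sum(min(n, ref[tok]) for tok, n in cand.items())
    total = sum(cand.values())
    return matched / total if total else 0.0


# Agent and patient swapped: the meaning is reversed, the word counts are not.
swapped = clipped_unigram_precision("the man bit the dog",
                                    "the dog bit the man")

# Negation dropped: every remaining candidate word is still in the reference.
negation_lost = clipped_unigram_precision("he did sign the contract",
                                          "he did not sign the contract")

print(swapped, negation_lost)
```

  Both candidates get a perfect score on this toy measure, although each one 
says the opposite of its reference.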

  I'm just throwing ideas around. Any critique / comments are welcome:

  1. What if there were a metric that would output not a single figure, but a 
multidimensional vector of values? 
  2. The metric would try to parse the source and target content, and convert 
it to a set of semantic nodes, which would then be inspected. 
  3. The values would be: 
    a. semantic distance from the agent and patient semantic nodes in the 
source content 
    b. semantic distance from the action 
    c. same for circumstances (time, location), and indirect objects 
    d. attributes like adjectives, adverbs, and some determiners, and the 
correctness of their relative positions 
    e. intelligibility: ratio of the "parseability" (?) of the target content 
to the "parseability" of the source content 
    f. choice of lexicon: the choice of words in the current domain. For 
instance, the same concept may have a synonym that is not appropriate in the 
current context (e.g. a racial slur)
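
  A rough sketch of what items 1 to 3 (a to c) might look like in code, under 
heavy assumptions: a hypothetical parser has already turned both the source 
and the target sentence into frames whose role fillers are 
language-independent concept labels, and the "semantic distance" is a 
placeholder that returns 0.0 for identical concepts and 1.0 otherwise. The 
names Frame, exact_distance and fidelity_vector are invented for this example. 
The "polarity" component is not in the list above; it is added only to cover 
the missing "not" case. Items d to f are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Frame:
    """Hypothetical parser output: one concept label per semantic role."""
    agent: Optional[str] = None
    action: Optional[str] = None
    patient: Optional[str] = None
    # e.g. {"time": "YESTERDAY", "location": "PARK"}
    circumstances: Dict[str, str] = field(default_factory=dict)
    negated: bool = False


def exact_distance(a: Optional[str], b: Optional[str]) -> float:
    """Placeholder distance: 0.0 for the same concept, 1.0 otherwise.

    A real metric would plug in a graded semantic distance here.
    """
    return 0.0 if a == b else 1.0


def fidelity_vector(
    source: Frame,
    target: Frame,
    distance: Callable[[Optional[str], Optional[str]], float] = exact_distance,
) -> Dict[str, float]:
    """Return one distance per dimension instead of a single figure."""
    keys = set(source.circumstances) | set(target.circumstances)
    circ = [
        distance(source.circumstances.get(k), target.circumstances.get(k))
        for k in keys
    ]
    return {
        "agent": distance(source.agent, target.agent),
        "patient": distance(source.patient, target.patient),
        "action": distance(source.action, target.action),
        # Mean distance over all circumstance slots seen on either side.
        "circumstances": sum(circ) / len(circ) if circ else 0.0,
        # Assumption added for the sketch: flag a lost or spurious negation.
        "polarity": 0.0 if source.negated == target.negated else 1.0,
    }


# Source says "the dog bit the man"; the target swapped agent and patient.
src = Frame(agent="DOG", action="BITE", patient="MAN")
tgt = Frame(agent="MAN", action="BITE", patient="DOG")
swapped_roles = fidelity_vector(src, tgt)
print(swapped_roles)
```

  In this toy case the agent and patient components are both at the maximum 
distance while the action component is zero, so the swap shows up directly in 
the vector instead of being averaged away into one number.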

  Does any of this make sense?

  Best regards,
  Vadim Berman

  Digital Sonata Pty Ltd
  www.digitalsonata.com
  Australian Business Number: 54 122 188 998

  Address: PO Box 803, Camberwell, Vic 3124, Australia
  Phone: +61 (0)3 98094461
  Mobile: +61 (0)432 894 862
  Skype: vadimberman




------------------------------------------------------------------------------


  _______________________________________________
  Mt-list mailing list