[Corpora-List] Call for Participation: "Help us break LLMs" - Test suite sub-task of the Ninth Conference on Machine Translation (WMT24)

Eleftherios Avramidis via Corpora Fri, 22 Mar 2024 08:31:53 -0700

The “test suites” sub-task will be included for the sixth time in the General MT Shared Task of the Conference on Machine Translation (WMT24).


*OVERVIEW*

Test suites are custom extensions to the test sets of the General MT Shared Task, constructed so that they can focus on concrete aspects of the MT output. They consist of a source-side test-set and a customized evaluation service. As opposed to the standard evaluation process, which produces generic quality scores, test suites often produce separate fine-grained results for each phenomenon.

Since the usage of LLMs for translation is getting more popular, and we are expecting more LLM participation in WMT this year, the theme of this year’s test suite sub-task is "Help us break LLMs", i.e. to reveal weaknesses and serious flaws of LLMs when translating, hidden within the overall high-quality generation.


*IMPORTANT DATES*

 * 11th April: Test suite source texts may be submitted for a pre-run
   on SoTA MT systems
 * 12th June: Test suite source texts must reach us
 * 11th July: Translated test suites shipped back to test suite authors
 * TBC - August: Test suite description and analysis paper
 * 12th-13th November: Conference

Potential participants are kindly requested to fill in this form

https://forms.office.com/e/e4JuMTSWFF


Further information can be found on the dedicated page of the WMT website:

http://www2.statmt.org/wmt24/testsuite-subtask.html


--
Eleftherios Avramidis, senior researcher
German Research Center for Artificial Intelligence (DFKI)
departments: Design Research eXplorations, Speech and Language Technology

short name: Lefteris, (pronouns: he/him), languages: English, German, Greek

Website: https://www.dfki.de/~elav01
Address: Alt Moabit 91c, 10559 Berlin, Germany
Tel.:          +49 30 23895 1806
Sec.:          +49 30 23895 1800
Fax.:          +49 30 23895 1810

Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Geschäftsführung: Prof. Dr. Antonio Krüger, Helmut Ditzer
Vorsitzender des Aufsichtsrats: Dr. Ferri Abolhassan
Amtsgericht Kaiserslautern, HRB 2313

_______________________________________________
Corpora mailing list -- corpora@list.elra.info
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to corpora-le...@list.elra.info
