Message from Tanmai Khanna <khanna.tan...@gmail.com> on Wed, 9 Sep 2020
at 11:34:

> Hey guys,
> I'm writing a system demonstration to be submitted at LowResMT 2020 about
> the recent project that was done as part of GSoC, titled "Translating the
> internet into low resource languages with Apertium" (Accepting snazzier
> title suggestions).
>
> As part of this demonstration, I want to show some real world examples of
> how the new system of markup handling will help the translation of webpages
> and formatted documents - odt, pptx, rtx, etc. To show this effectively, I
> need to choose 3-4 released language pairs that are sufficiently
> syntactically divergent that they show the effect of markup reordering in
> the translation output. As far as I know, spa-cat is one of our most mature
> pairs, however I'm not sure how syntactically divergent it is. If it is,
> then I'm happy to be corrected. If your language pair has had issues with
> webpage translation and those issues are now solved (ish), then some
> examples would be really helpful.
>
>
Spanish and Catalan are very similar in terms of syntax. We could
definitely try to find examples where they diverge the most, but those
examples would need to be completely synthetic.

The new markup handling does help in other areas, though: in formats where
inline tags are common (like ODT), the previous formatter/deformatter split
words wherever tags appeared, so translation of those documents has improved
quite a lot.
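
To make the word-splitting problem concrete, here is a toy sketch in Python.
It is not Apertium's actual formatter/deformatter code; the function names,
the regular expression and the sample string are all made up for
illustration. It only contrasts treating every inline tag as a word boundary
with lifting the tags out before tokenising.

```python
import re

# Hypothetical, simplified tag pattern; real ODT/HTML markup is more complex.
TAG = re.compile(r"<[^>]+>")

def split_on_tags(markup: str) -> list[str]:
    """Toy model of the old behaviour: every tag acts as a word boundary."""
    return TAG.sub(" ", markup).split()

def keep_words_whole(markup: str) -> list[str]:
    """Toy model of the improved behaviour: tags are removed before tokenising."""
    return TAG.sub("", markup).split()

# Made-up sample: an inline tag wraps only part of the word "casa".
sample = "una <b>ca</b>sa gran"

print(split_on_tags(sample))     # "casa" is broken into two fragments
print(keep_words_whole(sample))  # "casa" survives as one word
```

In the first case the translator sees two unknown fragments instead of one
known word, which is roughly why documents with many inline tags used to
translate poorly.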

-- 
< Xavi Ivars >
< http://xavi.ivars.me >
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
