[ https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337882#comment-17337882 ]
ASF GitHub Bot commented on TIKA-3329:
--------------------------------------

chrismattmann commented on pull request #419:
URL: https://github.com/apache/tika/pull/419#issuecomment-830692404

   OK, I had to make some changes so that it would pass the forbidden APIs check (RTG and RTGTest), and also to the tika-server classic modules that were failing checkstyle (they probably have been for a while, but my Maven version seems to care and treats checkstyle as a failure case, which I haven't seen before). I also had to update the test because the translation returned was slightly different from the one in the original PR (it had an extra comma and a period). Anyway, it's fixed and works!

   ```
   [INFO] tika-server-classic ................................ SUCCESS [ 16.967 s]
   [INFO] tika-server-client ................................. SUCCESS [  1.039 s]
   [INFO] Apache Tika eval ................................... SUCCESS [  0.063 s]
   [INFO] tika-eval-core ..................................... SUCCESS [ 13.070 s]
   [INFO] tika-eval-app ...................................... SUCCESS [ 21.834 s]
   [INFO] Apache Tika fuzzing ................................ SUCCESS [  0.938 s]
   [INFO] Apache Tika examples ............................... SUCCESS [  9.334 s]
   [INFO] Apache Tika Java-7 Components ...................... SUCCESS [  1.628 s]
   [INFO] Apache Tika ........................................ SUCCESS [  0.024 s]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  10:42 min
   [INFO] Finished at: 2021-05-01T13:41:08-07:00
   [INFO] ------------------------------------------------------------------------
   [2]-  Done                    emacs tika-translate/src/test/java/org/apache/tika/language/translate/RTGTranslatorTest.java
   [3]+  Done                    emacs RTGTranslator.java  (wd: ~/git/tika/tika-translate/src/main/java/org/apache/tika/language/translate)
   (wd now: ~/git/tika)
   (base) mattmann@proscuitto:~/git/tika$
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> RTG Translator with many-to-eng translation
> -------------------------------------------
>
>                 Key: TIKA-3329
>                 URL: https://issues.apache.org/jira/browse/TIKA-3329
>             Project: Tika
>          Issue Type: Improvement
>          Components: translation
>            Reporter: Thamme Gowda
>            Assignee: Chris Mattmann
>            Priority: Major
>
> The existing translation services in tika-translate are either
> commercial/paid engines (e.g. Google, Microsoft, etc.) or not state of the
> art (such as Joshua, Moses, etc.).
> Reader Translator Generator (RTG) is a neural machine translation toolkit
> [https://isi-nlp.github.io/rtg/]
> that includes an implementation of the Transformer NMT model (the current
> state of the art).
> It also has a massively multilingual pretrained NMT model (many-to-English
> translation direction)
> [https://hub.docker.com/repository/docker/tgowda/rtg-model]
> in which about 500 source languages are represented, of which at least ~300
> have good enough quality (for comparison, Google Translate has ~106
> languages and Microsoft has about 80).
> This issue is for integrating RTG Translator into tika-translate

--
This message was sent by Atlassian Jira
(v8.3.4#803005)