[ 
https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337882#comment-17337882
 ] 

ASF GitHub Bot commented on TIKA-3329:
--------------------------------------

chrismattmann commented on pull request #419:
URL: https://github.com/apache/tika/pull/419#issuecomment-830692404


   OK, I had to make some changes so that it would pass the forbidden APIs
check (RTG and RTGTest), and also to the tika-server classic modules that were
failing checkstyle (they probably have been for a while, but my Maven version
seems to care and treats checkstyle as a failure case, which I haven't seen
before). I also had to update the test because the translation returned was
slightly different from the one in the original PR (it had an extra comma and
a period). Anyway, it's fixed and works!
   
   ```
   [INFO] tika-server-classic ................................ SUCCESS [ 16.967 s]
   [INFO] tika-server-client ................................. SUCCESS [  1.039 s]
   [INFO] Apache Tika eval ................................... SUCCESS [  0.063 s]
   [INFO] tika-eval-core ..................................... SUCCESS [ 13.070 s]
   [INFO] tika-eval-app ...................................... SUCCESS [ 21.834 s]
   [INFO] Apache Tika fuzzing ................................ SUCCESS [  0.938 s]
   [INFO] Apache Tika examples ............................... SUCCESS [  9.334 s]
   [INFO] Apache Tika Java-7 Components ...................... SUCCESS [  1.628 s]
   [INFO] Apache Tika ........................................ SUCCESS [  0.024 s]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  10:42 min
   [INFO] Finished at: 2021-05-01T13:41:08-07:00
   [INFO] ------------------------------------------------------------------------
   [2]-  Done                    emacs tika-translate/src/test/java/org/apache/tika/language/translate/RTGTranslatorTest.java
   [3]+  Done                    emacs RTGTranslator.java  (wd: ~/git/tika/tika-translate/src/main/java/org/apache/tika/language/translate)
   (wd now: ~/git/tika)
   (base) mattmann@proscuitto:~/git/tika$ 
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> RTG Translator with many-to-eng translation
> -------------------------------------------
>
>                 Key: TIKA-3329
>                 URL: https://issues.apache.org/jira/browse/TIKA-3329
>             Project: Tika
>          Issue Type: Improvement
>          Components: translation
>            Reporter: Thamme Gowda
>            Assignee: Chris Mattmann
>            Priority: Major
>
> The existing translation services in tika-translate are either
> commercial/paid engines (e.g. Google, Microsoft, etc.) or not state of the
> art (such as Joshua, Moses, etc.).
> Reader Translator Generator (RTG) is a neural machine translation toolkit
> [https://isi-nlp.github.io/rtg/]
> that implements the Transformer NMT model (current state of the art).
> It also has a massively multilingual pretrained NMT model (many-to-English
> translation direction)
> [https://hub.docker.com/repository/docker/tgowda/rtg-model]
> in which about 500 source languages are represented, of which at least ~300
> have good enough quality (for comparison, Google Translate has
> ~106 languages and Microsoft has about 80 languages).
> This issue is for integrating RTG Translator into tika-translate.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
