[ 
https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated TIKA-1425:
------------------------------------
    Fix Version/s:     (was: 1.7)
                   1.8

- push to 1.8

> Automatic batching of Microsoft service calls
> ---------------------------------------------
>
>                 Key: TIKA-1425
>                 URL: https://issues.apache.org/jira/browse/TIKA-1425
>             Project: Tika
>          Issue Type: Improvement
>          Components: translation
>    Affects Versions: 1.6
>            Reporter: Lewis John McGibbney
>             Fix For: 1.8
>
>
> Right now when I use the following code I get the stack trace at the bottom 
> of this description. This seems to be because the Request URI is too large 
> for the service to accept (HTTP 414). We need a mechanism within the call to 
> Tika.translate which will, on a service-by-service basis, determine the 
> maximum Request URI length which can be sent. I believe that this should be 
> on the Tika side, as how else am I meant to know the maximum request size?
> {code:title=translator.java|borderStyle=solid}
> +    Translator translate = new MicrosoftTranslator();
> +    ((MicrosoftTranslator) translate).setId("...");
> +    ((MicrosoftTranslator) translate).setSecret("...");
>      for (java.util.Map.Entry<Text, Parse> entry : parseResult) {
>        Parse parse = entry.getValue();
>        LOG.info("---------\nUrl\n---------------\n");
> @@ -201,7 +207,7 @@
>        System.out.print(parse.getData().toString());
>        if (dumpText) {
>          LOG.info("---------\nParseText\n---------\n");
> -        System.out.print(parse.getText());
> +        System.out.print(translate.translate(parse.getText(), "fr"));
>        }
> {code}
> {code:title=stacktrace.log|borderStyle=solid}
> Exception in thread "main" java.lang.Exception: [microsoft-translator-api] Error retrieving translation : Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0...
> ...
>       at com.memetix.mst.MicrosoftTranslatorAPI.retrieveString(MicrosoftTranslatorAPI.java:202)
>       at com.memetix.mst.translate.Translate.execute(Translate.java:61)
>       at com.memetix.mst.translate.Translate.execute(Translate.java:76)
>       at org.apache.tika.language.translate.MicrosoftTranslator.translate(MicrosoftTranslator.java:104)
>       at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:210)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>       at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:228)
> Caused by: java.io.IOException: Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0%BE%D1%80%D1%83%D0%B...
> ...
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>       at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)
>       at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)
>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)
>       at com.memetix.mst.MicrosoftTranslatorAPI.retrieveResponse(MicrosoftTranslatorAPI.java:178)
>       at com.memetix.mst.MicrosoftTranslatorAPI.retrieveString(MicrosoftTranslatorAPI.java:199)
>       ... 6 more
> Caused by: java.io.IOException: Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0%BE...
> ...
>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626)
>       at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
>       at com.memetix.mst.MicrosoftTranslatorAPI.retrieveResponse(MicrosoftTranslatorAPI.java:177)
>       ... 7 more
> {code}
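
One way the batching requested in the description could work, as a minimal sketch only: split the input on whitespace into chunks whose URL-encoded length stays under a per-service limit, then translate each chunk separately. The class name RequestBatcher, the method names, and the limit passed in are hypothetical and are not part of Tika; the issue does not state the actual maximum Request URI size for the Microsoft service, so the caller would have to supply it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika or the Microsoft translator client.
final class RequestBatcher {

    // Length of s after URL-encoding as UTF-8, i.e. the space it
    // would take up inside the query string of the request URI.
    static int encodedLength(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").length();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Greedily packs whitespace-separated words into chunks whose encoded
    // length is at most maxEncodedLength (an assumed, caller-supplied limit).
    static List<String> split(String text, int maxEncodedLength) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields no chunks
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (encodedLength(candidate) <= maxEncodedLength) {
                current.setLength(0);
                current.append(candidate);
            } else {
                if (current.length() > 0) {
                    chunks.add(current.toString());
                    current.setLength(0);
                }
                // A single word over the limit becomes its own chunk here;
                // a real implementation would need to split it further.
                current.append(word);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

With something like this in place, the loop in the snippet above could call translate.translate(chunk, "fr") once per chunk and join the results, instead of sending parse.getText() in a single request. Splitting on whitespace may cut sentences in half and affect translation quality, so a sentence-aware splitter would likely be a better fit in practice.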



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
