[jira] [Commented] (TIKA-3073) Add compression option to /rmeta output

2020-03-19 Thread Tim Allison (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062591#comment-17062591
 ] 

Tim Allison commented on TIKA-3073:
---

Thank you, [~grossws]!  That gave me the right words to google my way to the 
answer. It turns out CXF handles this elegantly:

{noformat}
// set compression interceptors
List<Interceptor<? extends Message>> outInterceptors = new ArrayList<>();
outInterceptors.add(new GZIPOutInterceptor());
sf.setOutInterceptors(outInterceptors);

List<Interceptor<? extends Message>> inInterceptors = new ArrayList<>();
inInterceptors.add(new GZIPInInterceptor());
sf.setInInterceptors(inInterceptors);
{noformat}
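
For a rough sense of what gzip buys on /rmeta-style output, here is a minimal JDK-only sketch. It does not use CXF: the class name {{RmetaGzipSizeSketch}}, the {{samplePayload}} helper, and the metadata values in the payload are invented for illustration. It compresses a repetitive JSON array with {{java.util.zip}} and restores it, which is roughly the transformation the out/in interceptors above apply to message bodies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not part of tika-server or CXF.
final class RmetaGzipSizeSketch {

    /** Builds an invented, repetitive JSON array standing in for /rmeta output. */
    static String samplePayload(int documents) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < documents; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"Content-Type\":\"application/pdf\",")
              .append("\"X-TIKA:content\":\"page ").append(i).append("\"}");
        }
        return sb.append(']').toString();
    }

    /** Compresses bytes into the gzip format. */
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Restores the original bytes from gzip-compressed input. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] plain = samplePayload(200).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(plain);
        System.out.println("plain bytes:   " + plain.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```

Real savings depend on the documents being parsed; this payload is deliberately repetitive, so treat the printed numbers as an upper-end illustration rather than a benchmark.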


> Add compression option to /rmeta output
> ---
>
> Key: TIKA-3073
> URL: https://issues.apache.org/jira/browse/TIKA-3073
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
>
> On TIKA-3069, [~carina.antunes] requested compressing /rmeta output. This 
> makes sense as a start...we might also look into allowing more 
> configurability around which metadata fields and file types to send back over 
> the wire.  Few people need everything...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TIKA-3073) Add compression option to /rmeta output

2020-03-18 Thread Konstantin Gribov (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061672#comment-17061672
 ] 

Konstantin Gribov commented on TIKA-3073:
-

[~tallison], usually a web server should honor the HTTP {{Accept-Encoding: gzip, 
deflate}} request header (you can set it with curl's {{--compressed}}), but I 
don't know how this should be configured in CXF. It seems tika-server ignores 
the header and just uses {{chunked}} transfer encoding. So, IMHO, this is out 
of scope for JAX-RS and has more to do with CXF/Jetty. Jetty itself has 
https://www.eclipse.org/jetty/documentation/current/gzip-filter.html, which can 
be enabled for the whole server by adding it with 
{{org.eclipse.jetty.server.Server#insertHandler}}.

Some servers return {{Content-Encoding}} rather than {{Transfer-Encoding}}, 
and curl supports both. To test, just call {{curl --compressed --http1.1 -v 
https://code.jquery.com/jquery-3.3.1.slim.min.js}} with and without the 
{{--compressed}} flag.
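
The negotiation described above can be sketched with the JDK alone, assuming Java 11 or later. This is not tika-server, CXF, or Jetty code: the class {{AcceptEncodingSketch}}, the stub {{/rmeta}} handler, and the response body are made up for illustration. The stub gzips its response only when the request advertises {{Accept-Encoding: gzip}}, and the client decodes based on the {{Content-Encoding}} it gets back.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; the endpoint and payload are invented.
final class AcceptEncodingSketch {
    static final String BODY = "[{\"X-TIKA:content\":\"hello hello hello\"}]";

    /** Starts a stub server on an ephemeral loopback port that gzips only on request. */
    static HttpServer startStub() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/rmeta", exchange -> {
                byte[] payload = BODY.getBytes(StandardCharsets.UTF_8);
                String accept = exchange.getRequestHeaders().getFirst("Accept-Encoding");
                // Naive check: a real server would also parse q-values.
                if (accept != null && accept.contains("gzip")) {
                    payload = gzip(payload);
                    exchange.getResponseHeaders().set("Content-Encoding", "gzip");
                }
                exchange.sendResponseHeaders(200, payload.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(payload);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {Content-Encoding or "identity", decoded body} for one GET request. */
    static String[] fetch(int port, boolean askForGzip) {
        HttpRequest.Builder req =
                HttpRequest.newBuilder(URI.create("http://127.0.0.1:" + port + "/rmeta"));
        if (askForGzip) {
            req.header("Accept-Encoding", "gzip");
        }
        HttpClient client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
        try {
            HttpResponse<byte[]> resp =
                    client.send(req.build(), HttpResponse.BodyHandlers.ofByteArray());
            String encoding = resp.headers().firstValue("Content-Encoding").orElse("identity");
            byte[] decoded = encoding.equals("gzip") ? gunzip(resp.body()) : resp.body();
            return new String[] {encoding, new String(decoded, StandardCharsets.UTF_8)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Starts the stub, performs one request, and always stops the stub again. */
    static String[] demo(boolean askForGzip) {
        HttpServer server = startStub();
        try {
            return fetch(server.getAddress().getPort(), askForGzip);
        } finally {
            server.stop(0);
        }
    }

    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", demo(false)));
        System.out.println(String.join(" | ", demo(true)));
    }
}
```

The client here plays the role of curl with and without {{--compressed}}: the decoded body is the same either way, and only the {{Content-Encoding}} response header differs.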



[jira] [Commented] (TIKA-3073) Add compression option to /rmeta output

2020-03-16 Thread Tim Allison (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060344#comment-17060344
 ] 

Tim Allison commented on TIKA-3073:
---

[~sergey_beryozkin] [~sergeyb], is there a standard way to ask JAX-RS to 
compress a stream that it returns?
