[ 
https://issues.apache.org/jira/browse/OAK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072597#comment-16072597
 ] 

Thomas Mueller commented on OAK-5048:
-------------------------------------

> It triggers opening of the underlying stream which in case of S3DataStore 
> would trigger fetching of whole file

Do you know whether this is LazyInputStream? I see that even though this one is 
called "Lazy", it's not lazy when markSupported is called... 

(Just an idea, not sure if it can be done) Maybe if we wrap the LazyInputStream 
in a regular BufferedInputStream, this is resolved, because 
BufferedInputStream.markSupported always returns true (without calling the 
underlying input stream).
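
A rough sketch of the wrapping idea, in case it helps the discussion. The 
HypotheticalLazyStream class below is only a stand-in written for this example 
(it is not Oak's LazyInputStream); it assumes the lazy stream opens its 
delegate when markSupported is called, as described above. The byte array 
stands in for the expensive S3 fetch.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

public class LazyWrapDemo {

    /** Hypothetical stand-in for a lazy stream; NOT the actual Oak class. */
    static class HypotheticalLazyStream extends InputStream {
        private final Supplier<InputStream> opener;
        private InputStream delegate;

        HypotheticalLazyStream(Supplier<InputStream> opener) {
            this.opener = opener;
        }

        /** Opens the (potentially expensive) underlying stream on first use. */
        private InputStream delegate() {
            if (delegate == null) {
                delegate = opener.get();
            }
            return delegate;
        }

        boolean isOpened() {
            return delegate != null;
        }

        @Override
        public int read() throws IOException {
            return delegate().read();
        }

        /** Assumed behaviour from the comment: this call is not lazy. */
        @Override
        public boolean markSupported() {
            return delegate().markSupported();
        }

        @Override
        public void close() throws IOException {
            if (delegate != null) {
                delegate.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        HypotheticalLazyStream lazy = new HypotheticalLazyStream(
                () -> new ByteArrayInputStream(new byte[] {42}));

        // BufferedInputStream answers markSupported itself, so the wrapped
        // stream is not touched by this call.
        try (InputStream wrapped = new BufferedInputStream(lazy)) {
            System.out.println(wrapped.markSupported()); // true
            System.out.println(lazy.isOpened());         // false
            System.out.println(wrapped.read());          // 42
            System.out.println(lazy.isOpened());         // true
        }
    }
}
```

With this wrapper the underlying stream would only be opened on the first 
actual read, so the approach helps only if the caller does not go on to read 
the content anyway.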

> Upgrade to Tika 1.15 version
> ----------------------------
>
>                 Key: OAK-5048
>                 URL: https://issues.apache.org/jira/browse/OAK-5048
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Tommaso Teofili
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
>
>
> Oak Lucene index is currently using Tika 1.5 version while current latest 
> release of Apache Tika is 1.14, I think there're lots of "interesting" bugs 
> fixed, and possibly improvements (performance, more accurate text extraction, 
> etc.) we could get at almost 0 cost by just bumping the version number.
> Release notes https://tika.apache.org/1.15/index.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
