[ https://issues.apache.org/jira/browse/TIKA-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847509#action_12847509 ]
Daan de Wit commented on TIKA-388:
----------------------------------
I have not tested this, and it may be a premature optimization, but wouldn't it
be better to check whether the stream is already a BufferedInputStream before
wrapping it?
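A minimal sketch of the check suggested here, for illustration only. The class name MarkSupport and the helper ensureReliableMark are hypothetical and are not taken from Tika's code base; only the java.io classes are real.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class MarkSupport {
    // Hypothetical helper (not Tika API): returns a stream whose mark/reset
    // is backed by BufferedInputStream, wrapping only when necessary.
    static InputStream ensureReliableMark(InputStream stream) {
        if (stream instanceof BufferedInputStream) {
            // Already buffered, so skip the extra layer of buffering.
            return stream;
        }
        // Ignore whatever markSupported() claims and wrap unconditionally.
        return new BufferedInputStream(stream);
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        InputStream wrapped = ensureReliableMark(raw);
        InputStream again = ensureReliableMark(wrapped);
        System.out.println(wrapped != raw);   // a new wrapper was created
        System.out.println(again == wrapped); // no second wrapper was added
    }
}
```

One caveat with this sketch: instanceof also matches subclasses of BufferedInputStream, which could override mark() or reset(). Comparing stream.getClass() against BufferedInputStream.class would be stricter, at the cost of wrapping such subclasses again.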
> Don't trust streams that claim mark support
> -------------------------------------------
>
> Key: TIKA-388
> URL: https://issues.apache.org/jira/browse/TIKA-388
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Reporter: Jukka Zitting
> Assignee: Jukka Zitting
> Priority: Minor
> Fix For: 0.7
>
>
> As seen on tika-dev@ and in JCR-2576, there are some InputStream
> implementations that claim to support the mark feature, but lose the mark as
> soon as the end of stream has been reached. There's no way for a client to
> detect such behaviour, so it's probably best for Tika to always use
> BufferedInputStream to wrap incoming streams when mark support is needed.
> This may cause one layer of extra buffering, but avoids problems with such
> broken streams.
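The behaviour described in the issue can be reproduced with a small sketch. ForgetfulStream below is a made-up stream written for this example: it claims mark support but drops the mark once end of stream is reached. It is not one of the implementations referred to in JCR-2576, and canRewindAfterEof is likewise a hypothetical helper.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BrokenMarkDemo {
    /** Hypothetical stream: claims mark support but loses the mark at EOF. */
    static class ForgetfulStream extends InputStream {
        private final byte[] data;
        private int pos = 0;
        private int mark = -1; // -1 means there is no valid mark

        ForgetfulStream(byte[] data) { this.data = data; }

        @Override public boolean markSupported() { return true; }

        @Override public void mark(int readlimit) { mark = pos; }

        @Override public int read() {
            if (pos >= data.length) {
                mark = -1; // the simulated bug: the mark is discarded at EOF
                return -1;
            }
            return data[pos++] & 0xff;
        }

        @Override public void reset() throws IOException {
            if (mark < 0) throw new IOException("mark lost");
            pos = mark;
        }
    }

    /** Marks, reads to EOF, then resets; true if data is readable again. */
    static boolean canRewindAfterEof(InputStream in) {
        try {
            in.mark(1024);
            while (in.read() != -1) { /* drain the stream */ }
            in.reset();
            return in.read() != -1;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30};
        // Using the broken stream directly: reset() fails after EOF.
        System.out.println(canRewindAfterEof(new ForgetfulStream(data)));
        // Wrapped: BufferedInputStream keeps the marked bytes in its own buffer.
        System.out.println(canRewindAfterEof(
                new BufferedInputStream(new ForgetfulStream(data))));
    }
}
```

Run as written, this prints false and then true. The wrapper never calls mark() or reset() on the underlying stream, so the flaw in the wrapped stream does not matter. The mark on a BufferedInputStream is only guaranteed while no more than the readlimit passed to mark() has been read, so callers still need to pick that limit with care.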