[
https://issues.apache.org/jira/browse/TIKA-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879807#comment-15879807
]
Tim Allison edited comment on TIKA-2273 at 2/23/17 3:54 AM:
------------------------------------------------------------
First draft of a patch. Not all tests are passing.
If anyone has a chance to review, I'd appreciate it!
Parsers that use the AutoDetectReader have to grab a TikaConfig from
somewhere, and I don't much like this.
Without one in hand, TXTParser, HtmlParser and others could end up building an
entire TikaConfig on every parse, which is inefficient. I've mitigated this for
those using AutoDetectParser by including a TikaConfig in the ParseContext if
the user hasn't already specified one.
Are there better options?
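
To make the mitigation above concrete, here is a minimal sketch of the pattern, not the code in the attached patch. Config, Context, OuterParser and TextParser are hypothetical stand-ins for TikaConfig, ParseContext, AutoDetectParser and TXTParser; none of them is a Tika class, and the real signatures may differ.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextConfigSketch {

    /** Hypothetical stand-in for a config object that is expensive to build. */
    static final class Config { }

    /** Hypothetical stand-in for a per-parse context keyed by class. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when no entry is present.
            return key.cast(entries.get(key));
        }
    }

    /** Inner parser: prefers the config from the context, else builds its own. */
    static final class TextParser {
        Config parse(Context context) {
            Config config = context.get(Config.class);
            if (config == null) {
                // The costly fallback: a fresh config for this single parse.
                config = new Config();
            }
            return config;
        }
    }

    /** Outer parser: builds one config up front and shares it via the context. */
    static final class OuterParser {
        private final Config shared = new Config();

        Config parse(Context context) {
            // Only fill in a config if the caller has not supplied one.
            if (context.get(Config.class) == null) {
                context.set(Config.class, shared);
            }
            return new TextParser().parse(context);
        }
    }

    public static void main(String[] args) {
        OuterParser outer = new OuterParser();
        Config a = outer.parse(new Context());
        Config b = outer.parse(new Context());
        System.out.println("config reused across parses: " + (a == b));
    }
}
```

In this sketch a caller who goes through the outer parser pays for the config once, while a caller who invokes the inner parser directly with an empty context still pays on every parse, which is the case the question above is about.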
was (Author: [email protected]):
First draft of a patch. If anyone has a chance to review, I'd appreciate it!
Parsers that use the AutoDetectReader have to grab TikaConfig from
somewhere...I don't much like this.
This could lead to inefficiencies of creating the entire TikaConfig at each
parse for TXTParser and others. I've mitigated this for those using
AutoDetectParser by including a TikaConfig in the ParseContext if a user hasn't
already specified one.
Are there better options?
> Enable configuration of EncodingDetectors via TikaConfig
> --------------------------------------------------------
>
> Key: TIKA-2273
> URL: https://issues.apache.org/jira/browse/TIKA-2273
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Priority: Minor
> Attachments: TIKA_2273_first_draft.patch
>
>
> It would be nice to allow easier configuration of encoding detectors. It
> should be straightforward to follow the example of detectors...(famous last
> words).