[
https://issues.apache.org/jira/browse/CONNECTORS-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033092#comment-16033092
]
Julien Massiera commented on CONNECTORS-1428:
---------------------------------------------
A parser configuration may be very specific to the repository or folder being
crawled, especially if you want, for example, to replace the standard DcXML
parser with your own in order to get very different metadata/content
extraction behaviour for XML files.
Having the configuration at the connector level means creating a separate
Tika connector for each configuration. Maybe it is still best to leave the
configuration on this side; what do you recommend?
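To illustrate the trade-off, here is a rough sketch of how a single connector
could serve several crawls with different config files by keeping one parser
per config path. The class name PerConfigParserCache and the loader function
are hypothetical and not part of the attached patch. The Tika-specific loading
is only indicated in a comment so that the sketch stays self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper: keeps one parser instance per Tika config path,
 * so crawls with different configs can share a single connector.
 */
class PerConfigParserCache<P> {
    private final Map<String, P> parsers = new ConcurrentHashMap<>();
    private final Function<String, P> loader;

    /**
     * @param loader builds a parser from a config path. With Tika this could be
     *               something like
     *               path -> new AutoDetectParser(new TikaConfig(new File(path)))
     *               wrapped so that checked exceptions are converted, and with a
     *               fallback to new AutoDetectParser() for the empty path.
     */
    PerConfigParserCache(Function<String, P> loader) {
        this.loader = loader;
    }

    /** Returns the cached parser for this config path, loading it on first use. */
    P parserFor(String configPath) {
        // A null path is mapped to the empty key, meaning "default configuration".
        String key = (configPath == null) ? "" : configPath.trim();
        return parsers.computeIfAbsent(key, loader);
    }

    /** Number of distinct configurations loaded so far. */
    int size() {
        return parsers.size();
    }
}
```

With something like this, two crawls pointing at the same file would reuse one
parser, and a crawl without a config file would fall back to the default.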
> Allow tika config parameter
> ---------------------------
>
> Key: CONNECTORS-1428
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1428
> Project: ManifoldCF
> Issue Type: Wish
> Components: Tika extractor
> Affects Versions: ManifoldCF 2.7
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Minor
> Fix For: ManifoldCF 2.8
>
> Attachments: CONNECTORS-1428.patch
>
>
> It would be nice to have an option to pass a tika config file to the
> connector through the UI.
> The connector would load it in the "TikaParser" class like this:
> private static Parser parser =
>     new AutoDetectParser(new TikaConfig(new File("path/to/file")));
> This is just an example, of course; it would have to be done properly.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)