[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847931#comment-17847931 ]

Tim Allison commented on TIKA-4243:
-----------------------------------

Separately, but related to this and also to TIKA-4252 -- should we allow for 
the serialization of ParseContext?

That would be the more natural way to set per-parse settings. It would also 
let us pass in an object that a fetcher could use for authentication, and 
we'd keep the Metadata object for, well, Metadata... in TIKA-4252. :D
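
To make the idea concrete, here is a minimal sketch of a class-keyed, 
per-parse context that survives a serialization round trip. 
SketchParseContext and SketchFetcherAuth are hypothetical stand-ins, not Tika 
classes, and Java's built-in object serialization is used only as a 
placeholder for whatever wire format (JSON, say) we would actually choose.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a serializable ParseContext: a bag of per-parse
// objects keyed by class name. Not the real Tika class.
final class SketchParseContext implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Map<String, Serializable> entries = new HashMap<>();

    <T extends Serializable> void set(Class<T> key, T value) {
        entries.put(key.getName(), value);
    }

    // Returns null when nothing was registered for the given class.
    <T extends Serializable> T get(Class<T> key) {
        return key.cast(entries.get(key.getName()));
    }

    // Serialize the whole context so it can travel with a parse request.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the context on the receiving side.
    static SketchParseContext fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SketchParseContext) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical credentials object that a fetcher could look up in the context,
// leaving the Metadata object free for actual metadata.
final class SketchFetcherAuth implements Serializable {
    private static final long serialVersionUID = 1L;

    final String user;
    final String token;

    SketchFetcherAuth(String user, String token) {
        this.user = user;
        this.token = token;
    }
}
```

A caller would set a SketchFetcherAuth on the context, send the bytes along 
with the request, and the fetcher would read it back with 
get(SketchFetcherAuth.class) on the other side.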

> tika configuration overhaul
> ---------------------------
>
>                 Key: TIKA-4243
>                 URL: https://issues.apache.org/jira/browse/TIKA-4243
>             Project: Tika
>          Issue Type: New Feature
>          Components: config
>    Affects Versions: 3.0.0
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>
> In 3.0.0, it would greatly help to have a typed configuration schema for 
> Tika. 
> In 3.x, can we remove the old way of doing configs and replace it with JSON 
> Schema?
> JSON Schema can be converted to POJOs using a Maven plugin 
> [https://github.com/joelittlejohn/jsonschema2pojo]
> This automatically creates a Java POJO model we can use for the configs. 
> The legacy tika-config XML could then be read and converted to the new POJOs 
> easily using an XML mapper, so that users don't have to use JSON 
> configurations yet if they do not want to.
> When complete, configurations can be set as XML, JSON or YAML:
> tika-config.xml
> tika-config.json
> tika-config.yaml
> Replace all instances of Tika config annotations that use the old syntax 
> with the POJO model deserialized from the XML/JSON/YAML.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
