On Mon, 15 Jun 2015, Mattmann, Chris A (3980) wrote:
Hey Nick, I guess my point is that parser context (aka config properties for parsers) and custom config files, e.g. x.properties loaded from the classpath, aren't configurable from the Tika app or server

Ah, good point. In my ideal world, you'd set the "all documents of this kind" settings (eg paths) in the config, then set the "this document only" settings (eg pdf column count, pdf inline image settings) via a command line option to the app / request header to the server, converted into ParseContext options[1]. That would then be largely the same as for the pure-Java users.
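To make the layering concrete, here is a rough sketch of how "this document only" overrides from request headers could sit on top of the config-wide defaults. It is plain Java and deliberately does not touch any Tika classes: the class name, the `x-parse-` header prefix and the setting keys are all made up for illustration, and the step that would turn the merged map into ParseContext options is left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: per-request overrides layered over config-file defaults. */
final class PerDocumentSettings {

    // Assumed header prefix, invented for this example; not an existing convention.
    static final String PREFIX = "x-parse-";

    /**
     * Returns a copy of the config-wide defaults with any prefixed request
     * headers applied on top. Header names are lower-cased and the prefix is
     * stripped, so the defaults are expected to use lower-case keys.
     * Headers without the prefix are ignored.
     */
    static Map<String, String> merge(Map<String, String> configDefaults,
                                     Map<String, String> requestHeaders) {
        Map<String, String> effective = new LinkedHashMap<>(configDefaults);
        for (Map.Entry<String, String> header : requestHeaders.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            if (name.startsWith(PREFIX)) {
                effective.put(name.substring(PREFIX.length()), header.getValue());
            }
        }
        return effective;
    }
}
```

With defaults of `some-tool-path=/usr/bin` and `pdf-inline-images=false`, a request carrying `X-Parse-pdf-inline-images: true` would end up with the path untouched and inline images switched on for that one document only.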

Hopefully there aren't too many settings which are debatable as to what they are!

Not sure how huge a Tika config file this would all lead to...

I could see some value in properties files, for things that don't change between machines but do need configuration, eg the mappings for external parsers. Since it isn't obvious if you've missed one, I'm not sure we want to use them heavily for customisations for paths etc.


Also, since you mention having been caught out by missing jars or missing service files, maybe we need to put something on the wiki about how to check if you have what you expected? (IIRC we log if a parser can't be found or can't be loaded, so mostly it's about how to enable that)
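As a starting point for such a wiki page, something along these lines could be used to see which service files are actually on the classpath and whether the classes they list can be loaded. It only uses the JDK; the class and method names are invented for this sketch, and it is a standalone check rather than anything Tika itself provides.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical helper: lists service-file entries and whether each class loads. */
final class ServiceFileCheck {

    /** Reads every META-INF/services/<serviceName> file visible on the classpath. */
    static List<String> report(String serviceName) {
        List<String> lines = new ArrayList<>();
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            Enumeration<URL> files = loader.getResources("META-INF/services/" + serviceName);
            while (files.hasMoreElements()) {
                URL file = files.nextElement();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // Service files allow '#' comments and blank lines.
                        int hash = line.indexOf('#');
                        String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
                        if (!name.isEmpty()) {
                            lines.add(name + " : " + status(name, loader) + " (" + file + ")");
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Tries to load the class without initialising it. */
    static String status(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return "OK";
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers a class that is present but whose dependencies are missing.
            return "MISSING (" + e + ")";
        }
    }
}
```

Calling `ServiceFileCheck.report("org.apache.tika.parser.Parser")` from the same classpath as your application should print one line per parser listed in a service file. An empty result would suggest the service files themselves did not make it into the jar.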

Nick

[1] Do we have tickets for adding these in yet?
