Hi

A CXF colleague of mine has started experimenting with a lightweight search utility for the CXF Search API. It would use the Lucene handler (shipped with CXF for a while now) to translate FIQL or OData queries into a composite Lucene Query and run it against Tika-provided metadata and content. The idea is not new; I believe Solr does some very advanced Tika-based search. CXF users would use it as part of their regular JAX-RS applications.

The problem seems to be that the Tika Parsers module pulls in many dependencies that a specific custom JAX-RS application may not need.

For example, we'd expect a given application to deal with PDF only, or a certain set of image formats only, or Word documents only, etc.

I'm not sure which of the Tika Parsers dependencies are strictly required by every Tika application and which could be made optional.

If Tika Parsers does have some dependencies that could be optional, would it make sense to mark them as such, so that external Tika consumers do not have to download all of them? It would make a difference, IMHO.

Thanks, Sergey

