On Wed, 18 Jun 2014, Ray Gauss wrote:
I think for 2.0 we should consider splitting out parsers into their own projects for a streamlined dependency hierarchy then reassembling them with something like a tika-parsers-all artifact.

We had another thread on that not long ago, where someone cautioned against breaking it up into too many pieces. We also get fairly frequent posts on the users list from people who aren't getting any content returned because they've forgotten to include a dependency on tika-parsers.

I'm not convinced that splitting the Tika parsers into 20-odd dependencies is really going to help more than it hinders. More people will get confused by missing dependencies they really wanted, and anyone with special needs about what does or doesn't get parsed is probably already taking enough care that they can exclude everything by default and pull in just what they need. I'd rather we just gave an example pom snippet that shows how to exclude everything except one thing, and let people with special cases work from there.
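
As a rough illustration of the kind of snippet meant here, the following is a sketch only, not a tested recipe. It assumes a Maven version that supports wildcard exclusions (3.2.1 or later, as far as I know), uses PDF parsing via PDFBox as the "one thing" purely as an example, and relies on hypothetical `tika.version` and `pdfbox.version` properties that you would define yourself to match your Tika release.

```xml
<dependencies>
  <!-- Keep the parser classes, but drop every transitive dependency. -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <!-- Assumption: tika.version is a property defined elsewhere in your pom. -->
    <version>${tika.version}</version>
    <exclusions>
      <exclusion>
        <!-- Wildcard exclusion: removes all transitive dependencies. -->
        <groupId>*</groupId>
        <artifactId>*</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- The wildcard also removes tika-core, so add it back explicitly. -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>${tika.version}</version>
  </dependency>

  <!-- Add back only the library for the format you care about (PDF here). -->
  <dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <!-- Assumption: pdfbox.version should match what your Tika release expects. -->
    <version>${pdfbox.version}</version>
  </dependency>
</dependencies>
```

Formats whose libraries are excluded this way will not be parsed, so it is worth checking the output of `mvn dependency:tree` and testing against real documents before relying on it.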

Nick
