[ https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716318#comment-14716318 ]
Jukka Zitting commented on TIKA-1706:
-------------------------------------

Note that o.a.tika.io is a part of the public API of tika-core, so even if we restore the commons-io dependency we should keep these classes for backwards compatibility (perhaps as dummies that just inherit the relevant commons-io classes or redirect static calls to there).

I don't have a strong opinion here. I do think that the "no dependencies" principle of tika-core is useful and worth the overhead of a dozen duplicated classes. And a 30% increase in the tika-core footprint because of the added dependency would still be non-trivial. On the other hand the argument about missing out on improvements in commons-io is valid.

Personally I'd start here by checking what exactly has changed in the classes we duplicate from commons-io. If it's just a few lines then I'd just merge those changes to Tika and be happy with that for the next five years. If there are more substantial improvements, switching back to a dependency is probably worth it.

> Bring back commons-io to tika-core
> ----------------------------------
>
>                 Key: TIKA-1706
>                 URL: https://issues.apache.org/jira/browse/TIKA-1706
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Yaniv Kunda
>            Priority: Minor
>             Fix For: 1.11
>
>
> TIKA-249 inlined select commons-io classes in order to simplify the
> dependency tree and save some space.
> I believe these arguments are weaker nowadays due to the following concerns:
> - Most of the non-core modules already use commons-io, and since tika-core is
> usually not used by itself, commons-io is already included with it
> - Since some modules use both tika-core and commons-io, it's not clear which
> code should be used
> - Having the inlined classes causes more maintenance and/or technology debt
> (which in turn causes more maintenance)
> - Newer commons-io code utilizes newer platform code, e.g. using Charset
> objects instead of encoding names, being able to use StringBuilder instead of
> StringBuffer, and so on.
> I'll be happy to provide a patch to replace usages of the inlined classes
> with commons-io classes if this is accepted.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
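
For illustration only, here is a rough sketch of the "redirect static calls" option mentioned in the comment above. The class names ShimSketch, UpstreamIOUtils and LegacyIOUtils are hypothetical stand-ins, not the actual o.a.tika.io or commons-io classes. The stand-in for the upstream library is written against the JDK alone so that the example compiles without any dependency. It also shows the Charset-versus-encoding-name and StringBuilder points from the issue description.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShimSketch {

    /** Hypothetical stand-in for the upstream utility class (assumption, not the real API). */
    static class UpstreamIOUtils {
        static String toString(InputStream in, Charset charset) throws IOException {
            // StringBuilder rather than StringBuffer: no synchronization is needed here.
            StringBuilder sb = new StringBuilder();
            Reader reader = new InputStreamReader(in, charset);
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    /** Hypothetical legacy class kept only so that existing callers keep compiling. */
    static class LegacyIOUtils {
        /** Old-style signature taking an encoding name; redirects to the upstream class. */
        static String toString(InputStream in, String encoding) throws IOException {
            return UpstreamIOUtils.toString(in, Charset.forName(encoding));
        }

        /** Newer-style signature taking a Charset; a plain pass-through. */
        static String toString(InputStream in, Charset charset) throws IOException {
            return UpstreamIOUtils.toString(in, charset);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String viaName = LegacyIOUtils.toString(new ByteArrayInputStream(data), "UTF-8");
        String viaCharset = LegacyIOUtils.toString(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        // Both entry points end up in the same upstream implementation.
        System.out.println(viaName.equals(viaCharset));
    }
}
```

With a shim like this the legacy class holds no logic of its own, so upstream fixes would reach existing callers without a second copy to maintain. Whether that is worth an added dependency is the trade-off discussed in the comment.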