[ 
https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716318#comment-14716318
 ] 

Jukka Zitting commented on TIKA-1706:
-------------------------------------

Note that o.a.tika.io is part of the public API of tika-core, so even if we 
restore the commons-io dependency we should keep these classes for backwards 
compatibility (perhaps as dummies that just inherit the relevant commons-io 
classes or redirect static calls to them).
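
A minimal sketch of what such compatibility dummies could look like. It is 
self-contained and uses stand-in classes rather than the real commons-io or 
Tika sources, so every class name below (UpstreamIOUtils, 
UpstreamCountingInputStream, LegacyIOUtils, LegacyCountingInputStream) is 
hypothetical and only illustrates the two patterns: inheriting the upstream 
class, and redirecting a static call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CompatShimSketch {

    /** Stand-in for an upstream utility class (assumption, not real commons-io code). */
    static class UpstreamIOUtils {
        static String toString(InputStream in, Charset charset) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), charset);
        }
    }

    /** Stand-in for an upstream stream class that counts the bytes read. */
    static class UpstreamCountingInputStream extends FilterInputStream {
        private long count;

        UpstreamCountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                count++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }

        long getCount() {
            return count;
        }
    }

    /** Pattern 1: the legacy class name survives as an empty subclass. */
    @Deprecated
    static class LegacyCountingInputStream extends UpstreamCountingInputStream {
        LegacyCountingInputStream(InputStream in) {
            super(in);
        }
    }

    /** Pattern 2: the legacy static method redirects to the upstream one. */
    @Deprecated
    static class LegacyIOUtils {
        static String toString(InputStream in, String encoding) throws IOException {
            // Old callers pass an encoding name; upstream takes a Charset.
            return UpstreamIOUtils.toString(in, Charset.forName(encoding));
        }
    }

    /** Exercises both shims through the legacy names only. */
    public static String demo() {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        try (LegacyCountingInputStream in =
                 new LegacyCountingInputStream(new ByteArrayInputStream(data))) {
            String text = LegacyIOUtils.toString(in, "UTF-8");
            return text + ":" + in.getCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello:5"
    }
}
```

Existing callers keep compiling against the legacy names while the actual 
behaviour lives in one place. In a real change the upstream types would be the 
corresponding commons-io classes, whose exact signatures would need checking.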

I don't have a strong opinion here. I do think that the "no dependencies" 
principle of tika-core is useful and worth the overhead of a dozen duplicated 
classes, and a 30% increase in the tika-core footprint because of the added 
dependency would still be non-trivial. On the other hand, the argument about 
missing out on improvements in commons-io is valid.

Personally I'd start here by checking what exactly has changed in the classes 
we duplicate from commons-io. If it's just a few lines then I'd just merge 
those changes into Tika and be happy with that for the next five years. If there 
are more substantial improvements, switching back to a dependency is probably 
worth it.

> Bring back commons-io to tika-core
> ----------------------------------
>
>                 Key: TIKA-1706
>                 URL: https://issues.apache.org/jira/browse/TIKA-1706
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Yaniv Kunda
>            Priority: Minor
>             Fix For: 1.11
>
>
> TIKA-249 inlined select commons-io classes in order to simplify the 
> dependency tree and save some space.
> I believe these arguments are weaker nowadays due to the following concerns:
> - Most of the non-core modules already use commons-io, and since tika-core is 
> usually not used by itself, commons-io is already included with it
> - Since some modules use both tika-core and commons-io, it's not clear which 
> code should be used
> - Having the inlined classes causes more maintenance and/or technical debt 
> (which in turn causes more maintenance)
> - Newer commons-io code utilizes newer platform code, e.g. using Charset 
> objects instead of encoding names, being able to use StringBuilder instead of 
> StringBuffer, and so on.
> I'll be happy to provide a patch to replace usages of the inlined classes 
> with commons-io classes if this is accepted.



