There is also
org.apache.poi.util.StringUtil (in cad module)
and
org.apache.poi.util.LittleEndian (in code module)
Neither of these seems to have a commons library replacement, from what
I can see. Given the small amount of code in the methods that are
actually used, would it make sense to move these into tika-core? Or
perhaps suggest a patch to commons to take on some of these? Other
suggestions?
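To give a sense of how small this could be, here is a rough sketch of a little-endian helper built only on java.nio. The class name LittleEndianSketch is made up for illustration, and I'm assuming the methods needed are along the lines of getInt, getUShort and getUInt; which POI methods the modules actually call would need to be checked first.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not existing Tika or POI code.
final class LittleEndianSketch {
    private LittleEndianSketch() {}

    // Reads a signed 32-bit little-endian value starting at offset.
    static int getInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    // Reads an unsigned 16-bit little-endian value, widened to int.
    static int getUShort(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;
    }

    // Reads an unsigned 32-bit little-endian value, widened to long.
    static long getUInt(byte[] data, int offset) {
        return getInt(data, offset) & 0xFFFFFFFFL;
    }
}
```

If the set of methods in use really is this small, a helper like this would avoid pulling in POI just for byte-order handling.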
- Bob
On 3/27/2016 9:39 AM, Bob Paulin wrote:
Hi,
Currently the Apache POI dependency is pulled into several modules, and
it's sort of a beast (> 2 MB in size). It appears many of the modules
are only using the IOUtils class. The big exception is the office
module, which is responsible for parsing documents. The IOUtils methods
appear to also exist in commons-io, which is only ~180 KB. Any
concerns with replacing this POI usage with commons-io? Does POI
offer anything above the commons-io functionality in IOUtils? If not, I
think it would be great to isolate the POI dependency to the office
module only.
- Bob