There is also

org.apache.poi.util.StringUtil (in cad module)
and
org.apache.poi.util.LittleEndian (in code module)

Neither of these seems to have a commons library replacement from what I can see. Given the small amount of code in the methods that are actually used, would it make sense to move these into tika-core? Or perhaps suggest a patch to commons to take on some of these? Other suggestions?
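To give a sense of how little code is involved, here is a minimal sketch of a little-endian helper built only on the JDK. The class name LittleEndianSketch and its two methods are hypothetical and chosen for illustration. I have not confirmed that these are the exact methods the modules call, so treat it as an assumption about the kind of code that would move, not as the actual patch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: illustrates JDK-only little-endian reads.
// Not the POI implementation and not an existing tika-core class.
public final class LittleEndianSketch {

    private LittleEndianSketch() {
        // static helpers only
    }

    // Reads a signed 32-bit little-endian value starting at offset.
    public static int getInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    // Reads an unsigned 16-bit little-endian value starting at offset,
    // widened to int so the result is never negative.
    public static int getUShort(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] data = {0x78, 0x56, 0x34, 0x12, (byte) 0xFF, (byte) 0xFF};
        System.out.println(Integer.toHexString(getInt(data, 0))); // prints 12345678
        System.out.println(getUShort(data, 4));                   // prints 65535
    }
}
```

If the methods in use are all of this shape, carrying them in tika-core would cost a few dozen lines rather than a 2 MB dependency.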

- Bob

On 3/27/2016 9:39 AM, Bob Paulin wrote:
Hi,

Currently the Apache POI dependency is pulled into several modules, and it's sort of a beast (> 2 MB in size). It appears many of the modules are only using POI's IOUtils class; the big exception is the office module, which is responsible for parsing documents. The IOUtils methods appear to also exist in commons-io, which is only ~180 KB. Any concerns with replacing the POI usages with commons-io? Does POI offer anything above the commons-io functionality in IOUtils? If not, I think it would be great to isolate the POI dependency to the office module only.

- Bob
