On Sun, 27 Mar 2016, Bob Paulin wrote:
> Currently the Apache POI dependency is in several modules and it's sort of a beast (> 2 MB in size).

You should've seen it before Jukka and Yegor spent a crazy ApacheCon hacking up the ooxml-lite support... ;-)

> It appears many of the modules are only using the IOUtils library.

I suspect a strong overlap with the parser classes I've helped write...

> Any concerns with replacing this POI stuff with commons-io? Does POI offer anything above the commons-io functionality in IOUtils? If not I think it would be great to isolate the POI dependency to the office module only.

A lot of the use is for endian-specific reading of numbers and strings. Might be a bit of stream stuff, but mostly that can be passed off to the Tika IO utils classes.
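
For illustration only, the kind of endian-specific reading meant here can be sketched with nothing but the JDK. The class and method names below are hypothetical and are not taken from POI, Tika, or commons-io; this is a minimal sketch of little-endian number and UTF-16LE string reads from a byte array, not a drop-in replacement for anything.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of POI, Tika, or commons-io.
final class LittleEndianSketch {
    private LittleEndianSketch() {}

    // Reads an unsigned 16-bit little-endian value starting at offset.
    static int readUInt16LE(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;
    }

    // Reads a signed 32-bit little-endian value starting at offset.
    static int readInt32LE(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    // Decodes charCount UTF-16LE code units (2 bytes each) starting at offset.
    static String readUtf16LE(byte[] data, int offset, int charCount) {
        return new String(data, offset, charCount * 2, StandardCharsets.UTF_16LE);
    }
}
```

Something of this size would cover simple cases, but it does not address stream-based reads or the other string encodings that parsers may need.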

From a quick check, I can't see any endian number stuff in commons IO, but I might have missed it, or it might be in a different commons module. If not, there might be something to be said for popping that POI logic, along with some of the Ogg-Vorbis utils stuff (another one with my grubby mitts all over it), into a more helpful general utils grouping.

Nick
