On Fri, 10 Sep 2010, Ken Krugler wrote:
> The issue is that the definitions of the types that are supported come from
> POI:
>
>   Collections.unmodifiableSet(new HashSet<MediaType>(Arrays.asList(
>       POIFSDocumentType.WORKBOOK.type,
>       POIFSDocumentType.OLE10_NATIVE.type,
POIFSDocumentType is actually a Tika class, not a POI one. However, it
does depend on several POI classes, as it contains both a list of POI
types and a detector for them.
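
To make that split concrete, here is a rough, self-contained sketch of the
shape I mean: an enum on the Tika side that carries a media type per entry
plus a detector. It is not the real code. The names DocumentTypeSketch,
DocType, markerEntry and detect are made up for illustration, plain strings
stand in for MediaType, and a set of entry names stands in for the POI
directory classes that the real detector would use.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class DocumentTypeSketch {

    // Hypothetical stand-in for a Tika-side enum such as POIFSDocumentType.
    // Each constant pairs a media type with the OLE2 entry name that is
    // assumed (for this sketch) to identify it.
    enum DocType {
        WORKBOOK("application/vnd.ms-excel", "Workbook"),
        WORDDOCUMENT("application/msword", "WordDocument"),
        UNKNOWN("application/octet-stream", null);

        final String type;                 // a String here; MediaType in Tika
        private final String markerEntry;  // assumed identifying entry name

        DocType(String type, String markerEntry) {
            this.type = type;
            this.markerEntry = markerEntry;
        }

        // The "detector" half: pick the first type whose marker entry is
        // present. The real detector would inspect POI classes instead of
        // a plain set of names, which is where the POI dependency comes in.
        static DocType detect(Set<String> entryNames) {
            for (DocType candidate : values()) {
                if (candidate.markerEntry != null
                        && entryNames.contains(candidate.markerEntry)) {
                    return candidate;
                }
            }
            return UNKNOWN;
        }
    }

    // The "list of types" half, built the same way as the quoted snippet.
    static final Set<String> SUPPORTED_TYPES =
            Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(
                    DocType.WORKBOOK.type,
                    DocType.WORDDOCUMENT.type)));

    public static void main(String[] args) {
        Set<String> entries =
                new HashSet<String>(Arrays.asList("Workbook", "SummaryInformation"));
        DocType detected = DocType.detect(entries);
        System.out.println(detected + " -> " + detected.type);
        System.out.println("supported: " + SUPPORTED_TYPES.contains(detected.type));
    }
}
```

The point of the sketch is only that the enum itself can live outside POI,
while its detector is the part that pulls in the POI dependency.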
Quite a lot of OfficeParser does depend on POIFS code though, and a few
bits depend on some of the less common POI text extractors.
Nick