On Fri, 10 Sep 2010, Ken Krugler wrote:
> The issue is that the definitions of the types that are supported come from POI:
>
>       Collections.unmodifiableSet(new HashSet<MediaType>(Arrays.asList(
>         POIFSDocumentType.WORKBOOK.type,
>         POIFSDocumentType.OLE10_NATIVE.type,

POIFSDocumentType is actually a Tika class, not a POI one. However, it does depend on several POI classes, as it contains both a list of POI types and a detector for them.

Quite a lot of OfficeParser does depend on POIFS code, though, and a few parts of it depend on some of the less common POI text extractors.

Nick
