https://bz.apache.org/bugzilla/show_bug.cgi?id=61267
Javen O'Neal changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Meta data of attached word  |Extract text from Microsoft
                   |file gets parsed. However,  |Word 2.0 (pre-OLE2)
                   |content of file is not      |document
                   |parsed and is blank         |
           Severity|major                       |enhancement

--- Comment #3 from Javen O'Neal ---
There are several entry points into POI. We should figure out which class
should be responsible for checking the first few bytes (magic number) of a
file to figure out what file format it is (Tika style).

We could continue adding known magic numbers to o.a.p.poifs.HeaderBlock, but
we may want to reuse that code elsewhere, such as
WorkbookFactory/DocumentFactory/SlideshowFactory, the Extractor classes for
Tika, etc.
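To make the suggestion in Comment #3 concrete, here is a minimal sketch of what a shared magic-number check might look like if it lived outside HeaderBlock. The class name `MagicSniffer`, its `Format` enum, and both `detect` methods are hypothetical and not part of POI. The OLE2 and ZIP signatures are well known; the Word 2.0 signature (DB A5 2D 00) is an assumption that should be verified against sample files such as the one attached to this bug.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses a file format from its leading bytes. */
final class MagicSniffer {

    /** Known signatures. WORD2 is an assumed value; verify it against real Word 2.0 files. */
    enum Format {
        OLE2(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1),
        ZIP(0x50, 0x4B, 0x03, 0x04),      // container used by OOXML files
        WORD2(0xDB, 0xA5, 0x2D, 0x00),    // assumption: pre-OLE2 Word 2.0 header
        UNKNOWN();

        private final byte[] magic;

        Format(int... bytes) {
            magic = new byte[bytes.length];
            for (int i = 0; i < bytes.length; i++) {
                magic[i] = (byte) bytes[i];
            }
        }

        /** True if the header starts with this format's signature. */
        boolean matches(byte[] header) {
            if (magic.length == 0 || header.length < magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (header[i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Length of the longest signature above. */
    static final int MAX_MAGIC_LENGTH = 8;

    /** Classifies a buffer that holds at least the first bytes of a file. */
    static Format detect(byte[] header) {
        for (Format f : Format.values()) {
            if (f.matches(header)) {
                return f;
            }
        }
        return Format.UNKNOWN;
    }

    /**
     * Peeks at the start of a stream and rewinds it, so the caller can hand
     * the same stream to whichever factory or extractor handles the format.
     */
    static Format detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(MAX_MAGIC_LENGTH);
        byte[] header = new byte[MAX_MAGIC_LENGTH];
        int n = 0;
        while (n < header.length) {
            int r = in.read(header, n, header.length - n);
            if (r < 0) {
                break; // file shorter than the longest signature
            }
            n += r;
        }
        in.reset();
        return detect(Arrays.copyOf(header, n));
    }
}
```

A caller such as a factory or an extractor could wrap its input in a `BufferedInputStream`, call `MagicSniffer.detect(stream)`, and dispatch on the result. Keeping the signatures in one place is the point of the sketch: each entry point would reuse the same check instead of duplicating it.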
