https://bz.apache.org/bugzilla/show_bug.cgi?id=69314
Bug ID: 69314
Summary: Header content from .doc not extracted
Product: POI
Version: 5.2.3-FINAL
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: HWPF
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
Over on https://issues.apache.org/jira/browse/TIKA-4307, August Valera shared a
.doc file whose header content is not being extracted.
The content is extracted when he converts the .doc to a .docx, and I can see
the content when I open the file in LibreOffice.
The debug log that August shared shows POI reporting some issues during the
initial parse, so the file itself may simply be malformed.
I can confirm in the debugger that the header content is present in the
document string, but the ranges for the HeaderStories do not seem to include it.
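
To make the check I did in the debugger concrete, here is a rough, POI-free
sketch of it: find the offset of a known header snippet in the full document
string, then see whether any header-story span covers that offset. The class
HeaderRangeCheck, its Span type, and the sample offsets are all hypothetical
stand-ins for illustration. They are not POI classes, and the real start/end
offsets would have to be read from the POI ranges in a debugger.

```java
import java.util.List;

/** Illustrative only: hypothetical helper, not part of Apache POI. */
final class HeaderRangeCheck {

    /** Half-open character span [start, end) into the document string. */
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        /** True if [from, to) lies entirely inside this span. */
        boolean covers(int from, int to) {
            return start <= from && to <= end;
        }
    }

    /**
     * Returns the label of the first span that fully covers the first
     * occurrence of needle, "NOT_FOUND" if needle is not in the text at all,
     * or "UNCOVERED" if it is in the text but no span covers it.
     */
    static String locate(String documentText, String needle, List<Span> spans) {
        int from = documentText.indexOf(needle);
        if (from < 0) {
            return "NOT_FOUND";
        }
        int to = from + needle.length();
        for (Span s : spans) {
            if (s.covers(from, to)) {
                return s.label;
            }
        }
        return "UNCOVERED";
    }

    public static void main(String[] args) {
        // Made-up document string: body paragraph followed by a header story.
        String text = "Body text.\rConfidential header\r";
        // Made-up spans mimicking the symptom: the header span misses the text.
        List<Span> spans = List.of(
                new Span("body", 0, 11),
                new Span("oddHeader", 30, 31));
        System.out.println(locate(text, "Confidential header", spans));
    }
}
```

An "UNCOVERED" result corresponds to what I am seeing with this file: the text
is in the document string, but no header range spans it.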
Any help would be appreciated. Thank you!