https://issues.apache.org/bugzilla/show_bug.cgi?id=53810
Priority: P2
Bug ID: 53810
Assignee: [email protected]
Summary: [PATCH] fix for incorrect loop detection in NPOIFS
Severity: normal
Classification: Unclassified
OS: Mac OS X 10.4
Reporter: [email protected]
Hardware: PC
Status: NEW
Version: 3.8
Component: POIFS
Product: POI
While upgrading our application to use Tika 1.2 (previously Tika 0.9), a few
PowerPoint 97-2003 (PPT) files which previously parsed correctly started failing
with exceptions in NPOIFS.
The root cause appears to be a difference in the way that BAT entries are read
from XBAT blocks between POIFSFileSystem and NPOIFSFileSystem. In POIFS, the
header's getBATCount is used as a hard limit on the number of BATs that are
read; in NPOIFS, XBATEntriesPerBlock entries are read from every XBAT, even if
this causes more total BAT entries to be read than header.getBATCount. In some
files, the extraneous BAT blocks are all initialized to the same value, which
is then detected as a possible cycle.
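
To illustrate the difference described above, here is a minimal, self-contained
sketch. It does not use the POI classes: the XBAT blocks are modelled as plain
int arrays, the class and method names (XbatReadSketch, readBatIndexes,
hasRepeatedBlock) are hypothetical, the filler value 0 is an assumption, and the
chain pointer at the end of a real XBAT block is ignored. The repeated-index
check is only a simplified stand-in for the loop detection in NPOIFS.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class XbatReadSketch {

    /**
     * Collects BAT block indexes from the given XBAT blocks, stopping as soon
     * as maxBats entries have been read (the POIFS-style hard limit).
     */
    public static List<Integer> readBatIndexes(int[][] xbatBlocks, int maxBats) {
        List<Integer> bats = new ArrayList<>();
        for (int[] xbat : xbatBlocks) {
            for (int entry : xbat) {
                if (bats.size() >= maxBats) {
                    return bats; // limit reached: ignore the remaining slots
                }
                bats.add(entry);
            }
        }
        return bats;
    }

    /** Simplified stand-in for loop detection: true if any index repeats. */
    public static boolean hasRepeatedBlock(List<Integer> bats) {
        Set<Integer> seen = new HashSet<>();
        for (int bat : bats) {
            if (!seen.add(bat)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One XBAT block with five slots; only the first three are real BATs,
        // the unused slots all hold the same (assumed) filler value.
        int[][] xbatBlocks = { {10, 11, 12, 0, 0} };
        int headerBatCount = 3; // stands in for header.getBATCount

        // Reading every slot picks up the filler entries as well.
        List<Integer> uncapped = readBatIndexes(xbatBlocks, Integer.MAX_VALUE);
        // Capping at the header count reads only the real entries.
        List<Integer> capped = readBatIndexes(xbatBlocks, headerBatCount);

        System.out.println(hasRepeatedBlock(uncapped)); // true
        System.out.println(hasRepeatedBlock(capped));   // false
    }
}
```

With the cap in place the filler slots are never read, so the repeated value
cannot be mistaken for a cycle.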
The attached PPT file demonstrates this problem (it was found via a web-crawler
search for test content, so I cannot grant Apache a license to redistribute
it). The attached patch makes NPOIFS behave like POIFS in this respect, and
allows the file to parse without an exception.