Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)
------------------------------------------------------------------------------------------------
Key: TIKA-316
URL: https://issues.apache.org/jira/browse/TIKA-316
Project: Tika
Issue Type: Bug
Affects Versions: 0.4, 0.5
Environment: Windows Server 2003 SP2, JRE 1.6.0_16, tika-app, Visio 2003
Reporter: Mike Hays
tika-app (0.4 and the 0.5 nightly) fails with the following exception when
attempting to parse a Visio 2003 file (other versions may be affected):
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.officepar...@145e044
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:123)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:103)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:176)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:63)
Caused by: java.lang.IllegalArgumentException: Found a chunk with a negative length, which isn't allowed
    at org.apache.poi.hdgf.chunks.ChunkFactory.createChunk(ChunkFactory.java:120)
    at org.apache.poi.hdgf.streams.ChunkStream.findChunks(ChunkStream.java:59)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:93)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
    at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:95)
    at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:52)
    at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:49)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:118)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
    ... 3 more
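
The trace shows the POI IllegalArgumentException wrapped inside a TikaException. A caller that wants to tell this failure apart from other parse errors could walk the cause chain of the caught exception down to its innermost cause. The following is only a sketch of that idea using the JDK alone: the class name RootCause and its rootCause method are hypothetical and not part of Tika or POI, and the exceptions built in main are stand-ins for the real ones.

```java
public class RootCause {

    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause; the second check guards
        // against a throwable that reports itself as its own cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins (assumption): a plain Exception plays the role of the
        // outer TikaException, wrapping the inner IllegalArgumentException.
        Exception inner = new IllegalArgumentException(
                "Found a chunk with a negative length, which isn't allowed");
        Exception outer = new Exception("Unexpected RuntimeException", inner);

        Throwable root = rootCause(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In real use the argument would be the TikaException caught around the parse call, and the returned throwable could be checked with instanceof IllegalArgumentException before deciding how to report the file.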