Hi all,
We are using Tika 3.0.0 with IBM Semeru Runtime Open Edition 21.0.5.11 (build
21.0.5+11-LTS) and doing the following:
    try (final InputStream is = Files.newInputStream(file.toPath())) {
        final DefaultDetector detector = new DefaultDetector();
        final Metadata metadata = new Metadata();
        metadata.add(RESOURCE_NAME_KEY, filename);
        // Wrap the stream opened above so the detector can mark/reset it
        try (TikaInputStream doc = TikaInputStream.get(is)) {
            return detector.detect(doc, metadata);
        }
    }
We had an issue with a .ppt file which resulted in the following stack trace:
IOException: Resetting to invalid mark
• java.io.BufferedInputStream in implReset
• java.io.BufferedInputStream in reset
• org.apache.commons.io.input.ProxyInputStream in reset at line 293
• org.apache.tika.io.TikaInputStream in reset at line 822
• org.apache.tika.io.TikaInputStream in getPath at line 710
• org.apache.tika.detect.microsoft.POIFSContainerDetector in getTopLevelNames at line 566
• org.apache.tika.detect.microsoft.POIFSContainerDetector in detect at line 629
• org.apache.tika.detect.CompositeDetector in detect at line 84
Unfortunately, I can neither share the file nor access it myself.
Could you help me figure out whether this is a bug in Tika or a user error on
my side?
Kind Regards,
Patrick Langer