sbathrutheen created TIKA-2143:
----------------------------------
Summary: POI deprecated method used in TIKA 1.13
Key: TIKA-2143
URL: https://issues.apache.org/jira/browse/TIKA-2143
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.13, 1.9
Environment: Windows java application
Reporter: sbathrutheen
Priority: Trivial
Fix For: 1.13
We see that Tika throws a long list of errors when extracting text from PPT
files. We tested with the standalone Tika application (1.13) and could not
reproduce the issue there. We took a look at the POI source code and observed
that the class "HSLFSlideShow" defines the deprecated method below:
*
/**
- * Get the lookup from slide numbers to their offsets inside
- * _ptrData, used when adding or moving slides.
- *
- * @deprecated since POI 3.11, not supported anymore
- */
- @Deprecated
- public Hashtable<Integer,Integer> getSlideOffsetDataLocationsLookup() {
- throw new UnsupportedOperationException("PersistPtrHolder.getSlideOffsetDataLocationsLookup() is not supported since 3.12-Beta1");
- }
*
We suspect that the Tika library is still calling this deprecated method, which
causes this runtime exception:
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@204c3b78
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:283)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at com.searchtechnologies.aspire.docprocessing.extracttext.ExtractTextStage.process(ExtractTextStage.java:140)
... 14 more
Caused by: java.lang.UnsupportedOperationException
at java.util.AbstractMap$SimpleImmutableEntry.setValue(Unknown Source)
at org.apache.poi.hslf.HSLFSlideShow.read(HSLFSlideShow.java:293)
at org.apache.poi.hslf.HSLFSlideShow.buildRecords(HSLFSlideShow.java:273)
at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:188)
at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:61)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
... 17 more
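
Note that the innermost frame in the trace above is java.util.AbstractMap$SimpleImmutableEntry.setValue, called from HSLFSlideShow.read, rather than getSlideOffsetDataLocationsLookup. As a minimal sketch of that JDK behavior only (the class name ImmutableEntryDemo and the sample values are made up for illustration, and this is not POI or Tika code), calling setValue on a SimpleImmutableEntry always throws UnsupportedOperationException:

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical demo class; it only exercises the JDK type named in the trace.
public class ImmutableEntryDemo {

    /** Returns true if setValue on a SimpleImmutableEntry throws UnsupportedOperationException. */
    static boolean setValueThrows() {
        // Arbitrary sample key and value, not taken from any real PPT file.
        Map.Entry<Integer, Integer> entry = new AbstractMap.SimpleImmutableEntry<>(1, 100);
        try {
            entry.setValue(200); // immutable entries reject modification
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("setValue throws: " + setValueThrows());
    }
}
```

This does not show why HSLFSlideShow.read ends up holding an immutable entry in our application; it only illustrates where the exception in the trace originates.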
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)