[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-22 Thread sbathrutheen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687243#comment-15687243
 ] 

sbathrutheen commented on TIKA-2143:


We have asked our client for details of the JVM options they use and will 
update you as soon as we receive them.

> POI deprecated method used in TIKA 1.13 
> 
>
> Key: TIKA-2143
> URL: https://issues.apache.org/jira/browse/TIKA-2143
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.9, 1.13
> Environment: Windows java application
> Reporter: sbathrutheen
> Fix For: 1.13
>
>
> We see that Tika throws a long list of errors when extracting PPT files. When 
> we test with the standalone Tika application (1.13), we cannot reproduce the 
> issue. We looked at the POI source code and observed that the class 
> "HSLFSlideShow" defines the deprecated method below:
>
>     /**
>      * Get the lookup from slide numbers to their offsets inside
>      *  _ptrData, used when adding or moving slides.
>      *
>      * @deprecated since POI 3.11, not supported anymore
>      */
>     @Deprecated
>     public Hashtable getSlideOffsetDataLocationsLookup() {
>         throw new UnsupportedOperationException(
>             "PersistPtrHolder.getSlideOffsetDataLocationsLookup() is not supported since 3.12-Beta1");
>     }
>
> We suspect that the Tika library is still calling this deprecated method, 
> causing this runtime exception:
>
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@204c3b78
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:283)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at com.searchtechnologies.aspire.docprocessing.extracttext.ExtractTextStage.process(ExtractTextStage.java:140)
>     ... 14 more
> Caused by: java.lang.UnsupportedOperationException
>     at java.util.AbstractMap$SimpleImmutableEntry.setValue(Unknown Source)
>     at org.apache.poi.hslf.HSLFSlideShow.read(HSLFSlideShow.java:293)
>     at org.apache.poi.hslf.HSLFSlideShow.buildRecords(HSLFSlideShow.java:273)
>     at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:188)
>     at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:61)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
>     ... 17 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-22 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15686939#comment-15686939
 ] 

Tim Allison commented on TIKA-2143:
---

Any further info on this issue, [~sbathrutheen]?



[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15667086#comment-15667086
 ] 

Tim Allison commented on TIKA-2143:
---

What other opts are you using?  I'm happy to experiment if you share what 
you're using.  Perhaps there's another that causes problems with setting a 
value in TreeMap?



[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-11 Thread sbathrutheen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658238#comment-15658238
 ] 

sbathrutheen commented on TIKA-2143:


[~talli...@apache.org] the problem is caused by this JVM option: 
-XX:+AggressiveOpts. However, the exception appears on two servers, while the 
JVM option is configured on only one of them.
Do you have any additional advice for us regarding this issue?
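
To confirm which options a running JVM actually received on each server, the 
arguments can be read from inside the process. The following is a minimal 
sketch, not part of Tika or POI; the class name JvmFlagCheck and the helper 
hasFlag are made up for illustration. It relies only on the standard 
RuntimeMXBean.getInputArguments() call.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmFlagCheck {

    // Returns true if the given flag appears verbatim in the argument list.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (for example -Xmx or -XX flags),
        // not the arguments passed to main().
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("AggressiveOpts enabled: "
                + hasFlag(jvmArgs, "-XX:+AggressiveOpts"));
    }
}
```

Running the same check inside the failing application on both servers would 
show whether the flag reaches the JVM through a wrapper script or service 
configuration that is easy to overlook.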

> POI deprecated method used in TIKA 1.13 
> 
>
> Key: TIKA-2143
> URL: https://issues.apache.org/jira/browse/TIKA-2143
> Project: Tika
>  Issue Type: Bug
>  Components: parser
>Affects Versions: 1.9, 1.13
> Environment: Windows java application
>Reporter: sbathrutheen
>Priority: Trivial
> Fix For: 1.13


[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644880#comment-15644880
 ] 

Tim Allison commented on TIKA-2143:
---

I think this may be explained by the following from 
[StackOverflow|http://stackoverflow.com/questions/12181872/why-java-treemap-entryset-returns-a-set-of-simpleimmutableentry].
  A user was getting exceptions on one server but not the other with the same 
code when setting a value in a TreeMap.

{noformat}
Finally found that the problem was caused by this JVM option : 
-XX:+AggressiveOpts. It was only present on the server which was raising the 
exception. – Jan Aug 30 '12 at 13:59 
{noformat}

However, another issue is that if you think you're using Tika 1.13, there's 
still an old version of POI somewhere in the class path.  Tika 1.9 used POI 
3.12, and that version of 
[POI|http://svn.apache.org/viewvc/poi/tags/REL_3_12_FINAL/src/scratchpad/src/org/apache/poi/hslf/HSLFSlideShow.java?revision=1678500&view=markup]
 lines up with the line numbers in the stacktrace.
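
For reference, the innermost frame of the stacktrace 
(AbstractMap$SimpleImmutableEntry.setValue) can be reproduced on any standard 
JDK with a TreeMap. This is only an illustrative sketch, not the POI code: the 
class and method names are invented, and it uses TreeMap.firstEntry(), which is 
documented to return snapshot entries that do not support setValue(). Whether 
-XX:+AggressiveOpts makes entrySet() hand out the same kind of entry depends on 
the JVM version, as the StackOverflow comment above suggests.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.TreeMap;

public class TreeMapEntryDemo {

    // Tries to update a value through the entry returned by firstEntry().
    // Returns true if the write is rejected with UnsupportedOperationException.
    static boolean snapshotEntryIsReadOnly() {
        TreeMap<Integer, Integer> offsets = new TreeMap<>();
        offsets.put(1, 100);
        Map.Entry<Integer, Integer> snapshot = offsets.firstEntry();
        try {
            snapshot.setValue(200);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Updates every value through put(), which does not rely on
    // Map.Entry.setValue() and so works with read-only entries too.
    static void shiftAll(TreeMap<Integer, Integer> offsets, int delta) {
        // Iterate over a copy of the keys so the map is never modified
        // while one of its own views is being traversed.
        for (Integer key : new ArrayList<>(offsets.keySet())) {
            offsets.put(key, offsets.get(key) + delta);
        }
    }

    public static void main(String[] args) {
        System.out.println("firstEntry() entry is read-only: "
                + snapshotEntryIsReadOnly());

        TreeMap<Integer, Integer> offsets = new TreeMap<>();
        offsets.put(1, 100);
        offsets.put(2, 250);
        shiftAll(offsets, 8);
        System.out.println("shifted offsets: " + offsets);
    }
}
```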





[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-07 Thread sbathrutheen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644646#comment-15644646
 ] 

sbathrutheen commented on TIKA-2143:


Hi Tim Allison,
Do you have any other suggestions on this issue?
Thanks




[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-03 Thread sbathrutheen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632838#comment-15632838
 ] 

sbathrutheen commented on TIKA-2143:


Hi,
Actually, this issue was reported by one of our clients; no classpath is set 
on their servers.
Thanks




[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-11-01 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625484#comment-15625484
 ] 

Tim Allison commented on TIKA-2143:
---

Hi [~sbathrutheen], any luck finding an older version of POI on your classpath?
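
One way to look for a second copy of POI is to ask the class loader for every 
location that provides the class file. The sketch below is an assumption about 
how such a check could be written, not an existing Tika tool; the class name 
FindDuplicateClasses and its helpers are hypothetical. It uses only the 
standard ClassLoader.getResources() call and should be run inside the affected 
application (or with the same classpath) to be meaningful.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class FindDuplicateClasses {

    // Converts a fully qualified class name to its class-file resource path.
    static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Lists every location visible to the class loader that provides the class.
    static List<URL> copiesOf(String className) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(toResourceName(className)));
    }

    public static void main(String[] args) throws IOException {
        // Class name taken from the stacktrace in this issue; pass another
        // name as the first argument to check a different class.
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.hslf.HSLFSlideShow";

        List<URL> copies = copiesOf(name);
        System.out.println(copies.size() + " location(s) provide " + name);
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

More than one location in the output would indicate that several jars provide 
the same class.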



[jira] [Commented] (TIKA-2143) POI deprecated method used in TIKA 1.13

2016-10-25 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605797#comment-15605797
 ] 

Tim Allison commented on TIKA-2143:
---

Hi [~sbathrutheen], thank you for raising this.  From the stacktrace, it looks 
like the problem is initiated by Tika calling ... I don't think this is a 
problem with Tika calling a deprecated method.  Is there a chance you have a 
version of POI on your classpath that is older than 3.15-beta1?  The line 
numbers in the stacktrace don't line up at all with POI 3.15-beta1's 
[HSLFSlideShow|http://svn.apache.org/viewvc/poi/tags/REL_3_15_BETA1/src/scratchpad/src/org/apache/poi/hslf/usermodel/HSLFSlideShow.java?view=markup#l188]
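
To see which jar a class was actually loaded from at runtime, its code source 
can be printed. This is a rough sketch under stated assumptions: the class name 
WhereIsClass and both helpers are invented for illustration, and the version 
lookup only works when the jar's manifest carries an Implementation-Version 
entry.

```java
import java.security.CodeSource;

public class WhereIsClass {

    // Returns the jar or directory the class was loaded from, if known.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source; probably a JDK bootstrap class)";
        }
        return source.getLocation().toString();
    }

    // Returns the Implementation-Version from the jar manifest, if present.
    static String versionOf(Class<?> type) {
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null)
                ? "(no Implementation-Version in manifest)"
                : version;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to this class so the sketch runs without POI installed.
        String name = args.length > 0 ? args[0] : WhereIsClass.class.getName();
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(type));
        System.out.println("version: " + versionOf(type));
    }
}
```

Running it with the application's classpath and the argument 
org.apache.poi.hslf.HSLFSlideShow (the class named in the stacktrace) would 
print the jar that supplies that class.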
 
