[ https://issues.apache.org/jira/browse/TIKA-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095133#comment-17095133 ]

Abhijit Rajwade edited comment on TIKA-3094 at 4/29/20, 7:37 AM:
-----------------------------------------------------------------

I am working with [~abchauha] on this issue.

One question.
I do not see a reference to SparseBitSet in the Tika 1.24 sources.
Is it required because Tika 1.24 uses POI 4.1.2, and POI added a dependency on 
SparseBitSet 1.2?

Does the same issue exist with Tika 1.24.1 as well?
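
One way to check whether the SparseBitSet jar is actually visible at runtime is a quick classpath probe. The following is only a minimal sketch: it assumes the class to look for is com.zaxxer.sparsebits.SparseBitSet (the class shipped in the com.zaxxer:SparseBitSet artifact), and the SparseBitSetCheck class and isOnClasspath helper are hypothetical names, not part of Tika or POI. In an OSGi bundle the relevant class loader may differ from the one used here.

```java
public class SparseBitSetCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, SparseBitSetCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name; taken from the com.zaxxer:SparseBitSet artifact.
        String name = "com.zaxxer.sparsebits.SparseBitSet";
        System.out.println(name + (isOnClasspath(name) ? " is available" : " is MISSING"));
    }
}
```

Running this with the same classpath as the failing application should show whether the dependency is missing, independent of any .pptx parsing.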


> Apache Tika fails to extract text for pptx extension.
> -----------------------------------------------------
>
>                 Key: TIKA-3094
>                 URL: https://issues.apache.org/jira/browse/TIKA-3094
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.24
>            Reporter: Abhishek Chauhan
>            Priority: Major
>         Attachments: Sample PPT.pptx
>
>
> This is a regression from Apache Tika 1.23. Text extraction for the .pptx 
> extension, which worked with Apache Tika 1.23, no longer works in 
> version 1.24.
> For the .ppt extension, extraction works in both 1.23 and 1.24.
>  
> According to the release notes [https://tika.apache.org/1.24/index.html], you 
> have updated POI to 4.1.2. That might be the root cause of this problem. 
> POI requires [https://mvnrepository.com/artifact/com.zaxxer/SparseBitSet/1.2], 
> which I guess is not present in the bundle.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
