[ https://issues.apache.org/jira/browse/TIKA-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417703#comment-16417703 ]

Tim Allison commented on TIKA-2569:
-----------------------------------

Whoa!  This added a huge amount of newly extracted text in our regression 
corpus.  Thank you, [~BAEApache]!

> Grouped Text boxes in .ppt
> --------------------------
>
>                 Key: TIKA-2569
>                 URL: https://issues.apache.org/jira/browse/TIKA-2569
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Richard A
>            Assignee: Tim Allison
>            Priority: Major
>              Labels: easyfix
>             Fix For: 1.18, 2.0.0
>
>         Attachments: Presentation1.ppt, Presentation1.pptx
>
>
> Text boxes that have been grouped together are not parsed, so no content is 
> returned for them. This issue appears to affect only .ppt files, not .pptx. 
> The attached documents are identical except for the file format. They provide 
> a very simple example of a .ppt document from which no content is returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
