[ 
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112202#comment-13112202
 ] 

Nick Burch commented on TIKA-712:
---------------------------------

I'd suggest you take the pptx file (it'll be simpler to poke around in than the 
ppt one), and unzip it. Then, look at the XML file for the master slide, and 
see how the text you've added differs from the boilerplate parts. Are there any 
obvious differences between the two? Are they in different sections? Different 
XML elements? Anything we could filter on? 
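
As a rough sketch of the unzip-and-inspect step, something like the following could dump the master slide XML without extracting the file by hand. It assumes the masters sit under the conventional ppt/slideMasters/ folder of the package and that the parts are UTF-8 encoded; the function name master_slide_parts is made up for illustration and is not part of Tika or POI.

```python
import zipfile

# Assumption: slide masters live under this conventional OOXML folder.
MASTER_PREFIX = "ppt/slideMasters/"


def master_slide_parts(pptx):
    """Return {part name: XML text} for each slide master part in a .pptx.

    `pptx` may be a file path or a binary file-like object.
    """
    parts = {}
    with zipfile.ZipFile(pptx) as zf:
        for name in zf.namelist():
            # Relationship files end in ".rels", so the ".xml" suffix
            # check leaves them out.
            if name.startswith(MASTER_PREFIX) and name.endswith(".xml"):
                # Assumption: the part is UTF-8 encoded.
                parts[name] = zf.read(name).decode("utf-8")
    return parts
```

Calling master_slide_parts("testPPT_masterFooter.pptx") and printing each value should make it easier to compare the added footer text against the boilerplate placeholders side by side.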

> Master slide text isn't extracted
> ---------------------------------
>
>                 Key: TIKA-712
>                 URL: https://issues.apache.org/jira/browse/TIKA-712
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Michael McCandless
>         Attachments: TIKA-712.patch, testPPT_masterFooter.ppt, 
> testPPT_masterFooter.pptx, testPPT_masterFooter2.ppt, 
> testPPT_masterFooter2.pptx
>
>
> It looks like we are not getting text from the master slide for PPT
> and PPTX.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
