[ https://issues.apache.org/jira/browse/TIKA-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-1067:
------------------------------------

    Component/s: parser
    
> Tika extracts non-existent asterisks (*) from .ppt files
> --------------------------------------------------------
>
>                 Key: TIKA-1067
>                 URL: https://issues.apache.org/jira/browse/TIKA-1067
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Michael McCandless
>
> I created a new blank presentation, put in title + subtitle, saved it as 
> .ppt, and then ran TikaCLI -t:
> {noformat}
> <body><div class="slideShow"><div class="slide"><p class="slide-master-content">*<br/>
> *<br/>
> </p>
> <p class="slide-content">Testing<br/>
> testing<br/>
> </p>
> </div>
> </div>
> <div class="slideNotes"/>
> {noformat}
> The two extra *'s seem to be coming from the master slide, but I'm not sure 
> which text runs they are and how to stop them ...
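
One way to confirm the symptom from a saved TikaCLI dump is to pull out only the text that sits inside the `slide-master-content` paragraphs. The following is a minimal sketch under stated assumptions: it uses only the Python standard library, not the Tika API; the names `MasterTextCollector`, `stray_master_text`, and `SAMPLE` are hypothetical; and the sample string is the output quoted above with the wrapped `<p` tag rejoined.

```python
from html.parser import HTMLParser


class MasterTextCollector(HTMLParser):
    """Collect text found inside <p class="slide-master-content"> elements."""

    def __init__(self):
        super().__init__()
        self.in_master = False  # True while inside a master-content paragraph
        self.chunks = []        # non-blank text fragments seen there

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_master = dict(attrs).get("class") == "slide-master-content"

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_master = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_master and text:
            self.chunks.append(text)


def stray_master_text(xhtml):
    """Return the text fragments emitted from master-slide paragraphs."""
    collector = MasterTextCollector()
    collector.feed(xhtml)
    collector.close()
    return collector.chunks


# Output quoted in the report (assumed to be captured as a string).
SAMPLE = """<body><div class="slideShow"><div class="slide"><p class="slide-master-content">*<br/>
*<br/>
</p>
<p class="slide-content">Testing<br/>
testing<br/>
</p>
</div>
</div>
<div class="slideNotes"/>"""

print(stray_master_text(SAMPLE))
```

For the sample above, only the two asterisks are collected; the "Testing" / "testing" text in the `slide-content` paragraph is ignored. This only detects the extra text in the output and says nothing about which master-slide text runs produce it.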

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
