[ 
https://issues.apache.org/jira/browse/TIKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264129#comment-13264129
 ] 

Nick Burch commented on TIKA-905:
---------------------------------

Are you able to identify where in the file these text boxes occur, and what 
sort of tags hold the text? If the text boxes don't occur in the main text 
area, can you identify how to link back from the main text to the text box? 
(You might find it helpful to review how annotations work, which we support 
as of r1331640, for an idea of how this might work.)
                
> Embedded text boxes and shapes with text not supported
> ------------------------------------------------------
>
>                 Key: TIKA-905
>                 URL: https://issues.apache.org/jira/browse/TIKA-905
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.0
>         Environment: Windows 7
>            Reporter: Gabriel Valencia
>              Labels: iWork
>         Attachments: testPagesEmbeddedJIRA.pages
>
>
> This is similar to TIKA-904 but for normal word processing documents. In 
> those, text contained in text boxes and shapes is not extracted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
