[ 
https://issues.apache.org/jira/browse/TIKA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226461#comment-13226461
 ] 

Nick Burch commented on TIKA-873:
---------------------------------

Tika has a number of unit tests for the extraction of embedded resources from
Word documents, in POIContainerExtractionTest.

Are you having this problem for only some files, or all? Do you get some, all 
or none of the embedded resources out?
                
> Tika --extract fails for DOC
> ----------------------------
>
>                 Key: TIKA-873
>                 URL: https://issues.apache.org/jira/browse/TIKA-873
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Windows 7 + Java v1.6
>            Reporter: Albert L.
>             Fix For: 1.2
>
>         Attachments: embedded.doc
>
>
> A file that is embedded in a DOC file doesn't get extracted to disk.
> To "embed" a file into a DOC, simply drag and drop it into a DOC document when
> using MS Word 2010.  Word will then create an EMF of the embedded file's
> preview.
> See attached file "embedded.doc" for an example input file that fails with 
> Tika v1.0.


        
