[ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-819:
-----------------------------------
    Fix Version/s:     (was: 1.7)
                   1.8

- push to 1.8

> Make Option to Exclude Embedded Files' Text for Text Content
> ------------------------------------------------------------
>
>                 Key: TIKA-819
>                 URL: https://issues.apache.org/jira/browse/TIKA-819
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 1.0
>         Environment: Windows-7 + JDK 1.6 u26
>            Reporter: Albert L.
>             Fix For: 1.8
>
>
> It would be nice to be able to disable text content from embedded files.
> For example, if I have a DOCX with an embedded PPTX, then I would like the
> option to disable text from the PPTX from showing up when asking for the text
> content from DOCX. In other words, it would be nice to have the option to
> get text content *only* from the DOCX instead of the DOCX+PPTX.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)