[ https://issues.apache.org/jira/browse/TIKA-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch resolved TIKA-637.
-----------------------------

    Resolution: Not A Problem

Closing as "Not A Problem", as this is handled by supplying a recursing parser on the ParseContext. For an example of this, see how the -z option in TikaCLI works.

> Need API to get list of embedded documents
> ------------------------------------------
>
>                 Key: TIKA-637
>                 URL: https://issues.apache.org/jira/browse/TIKA-637
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 0.10
>            Reporter: Manish
>
> Apache Tika works great for extracting the content and metadata of
> documents.
> It would be even better if it had an API that returned each embedded
> document's input stream along with its content and metadata.
> For example, when extracting a zip file, the output could be a list of
> <text, metadata, inputstream> for each document, or Tika could invoke a
> callback for each <text, metadata, inputstream>. That would serve both
> text extraction and the extraction of individual documents from
> container files.
> I have already done this for zip and PST, but a standard API would be
> great.
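
The callback shape requested in the issue can be sketched with the JDK alone. The following is a minimal illustration under stated assumptions, not Tika's API: `EmbeddedDocumentWalker`, `EmbeddedDocumentHandler`, `walkZip`, `readFully`, and `sampleZip` are hypothetical names, only zip containers are handled, and each entry is buffered in memory for simplicity. In Tika itself the corresponding hook is the recursing parser supplied on the ParseContext, as the resolution notes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class EmbeddedDocumentWalker {

    /** Hypothetical callback, invoked once per embedded document. */
    interface EmbeddedDocumentHandler {
        void handle(String name, InputStream content) throws IOException;
    }

    /** Walks a zip container and passes every non-directory entry to the handler. */
    static void walkZip(InputStream container, EmbeddedDocumentHandler handler) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(container)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    // Buffer the entry so the handler gets an independent stream
                    // that it can close without disturbing the container stream.
                    byte[] bytes = readFully(zip);
                    handler.handle(entry.getName(), new ByteArrayInputStream(bytes));
                }
                zip.closeEntry();
            }
        }
    }

    /** Reads the stream until it reports end of data (for a zip, end of the current entry). */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Builds a small in-memory zip with two text entries, used only for the demo. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/b.txt"));
            zip.write("world".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<String> seen = new ArrayList<>();
        // Collect "name=text" for each embedded document the walker reports.
        walkZip(new ByteArrayInputStream(sampleZip()),
                (name, content) -> seen.add(
                        name + "=" + new String(readFully(content), StandardCharsets.UTF_8)));
        System.out.println(seen);
    }
}
```

A handler could equally write each stream to disk, which is the kind of behaviour the resolution points to with the -z option of TikaCLI. Per-entry metadata and nested containers are left out of this sketch.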