See my response to your question on the Solr users’ list here: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201602.mbox/%3CCY1PR09MB0795E8DBA7B2B6603A45820EC7A80%40CY1PR09MB0795.namprd09.prod.outlook.com%3E
I don’t think this is a Tika problem. This is the standard way that Solr’s DIH handles embedded documents: it concatenates all embedded documents into one String. If you want to treat each individual attachment as a separate file, you’ll have to preprocess your pst or run Tika on your own (see the RecursiveParserWrapper, perhaps) and send the documents to Solr via SolrJ (https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/).

From: Sreenivasa Kallu [mailto:sreenivasaka...@gmail.com]
Sent: Tuesday, February 16, 2016 6:35 PM
To: user@tika.apache.org
Subject: tika is unable to extract outlook messages

Hi,

I am currently indexing individual Outlook messages, and searching is working fine. I created a Solr core using the following command:

./solr create -c sreenimsg1 -d data_driven_schema_configs

I am using the following command to index individual messages:

curl "http://localhost:8983/solr/sreenimsg/update/extract?literal.id=msg9&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@/home/ec2-user/msg9.msg"

This setup is working fine. But the new requirement is to extract messages from an Outlook pst file. I tried the following command to extract messages from a pst file:

curl "http://localhost:8983/solr/sreenimsg1/update/extract?literal.id=msg7&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@/home/ec2-user/sateamc_0006.pst"

This command extracts only the high-level tags and puts all of the messages into one message. I am not getting all of the tags that I get when I extract individual messages.

Is the above command correct? Is the problem that it is not using recursion? How do I add recursion to the above command? Is it a Tika library problem?

Please help me solve the above problem. Thanks in advance.

--sreenivasa kallu
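As a rough illustration of the "run Tika on your own" route suggested in the reply at the top of this message, here is a minimal Python sketch. It is not the Java RecursiveParserWrapper/SolrJ approach itself: it swaps in tika-server's /rmeta endpoint, which returns one metadata object per container or embedded document, and Solr's JSON update handler. The tika-server address (localhost:9998), the "<base id>-<index>" id scheme, and the attr_ field naming (mirroring uprefix=attr_ and fmap.content=attr_content from the curl commands) are all assumptions. The core name is taken from the original message.

```python
import json
import re
import urllib.request

# Assumption: a tika-server instance is running locally on its default port.
TIKA_RMETA_URL = "http://localhost:9998/rmeta/text"
# Core name taken from the original message; adjust to your own core.
SOLR_UPDATE_URL = "http://localhost:8983/solr/sreenimsg1/update?commit=true"
# Key under which /rmeta stores the extracted text of each item.
CONTENT_KEY = "X-TIKA:content"


def fetch_rmeta(path):
    """PUT a file to tika-server; return a list with one metadata dict
    for the container and one for each embedded document."""
    with open(path, "rb") as fh:
        # Reads the whole file into memory; a large pst may need streaming.
        req = urllib.request.Request(TIKA_RMETA_URL, data=fh.read(), method="PUT")
    req.add_header("Accept", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def to_solr_docs(rmeta, base_id):
    """Turn each metadata dict into its own Solr document instead of
    concatenating everything into a single document."""
    docs = []
    for i, meta in enumerate(rmeta):
        doc = {"id": f"{base_id}-{i}"}  # hypothetical id scheme
        for key, value in meta.items():
            if key == CONTENT_KEY:
                doc["attr_content"] = value.strip()
            else:
                # Sanitize keys such as "dc:title" into "attr_dc_title".
                name = re.sub(r"\W+", "_", key).strip("_").lower()
                doc["attr_" + name] = value
        docs.append(doc)
    return docs


def post_to_solr(docs):
    """Send the documents to Solr's JSON update handler."""
    body = json.dumps(docs).encode("utf-8")
    req = urllib.request.Request(SOLR_UPDATE_URL, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Example use (needs running tika-server and Solr instances):
#   rmeta = fetch_rmeta("/home/ec2-user/sateamc_0006.pst")
#   post_to_solr(to_solr_docs(rmeta, "msg7"))
```

With this shape, every message or attachment in the pst becomes a separate Solr document carrying its own metadata fields, rather than one document holding all of the text. Whether the attr_ fields are accepted as-is depends on the schema of your core.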