Hi Tim,
Regarding the returned list of Metadata objects, is the code
supposed to include the number of attachments in the
particular email and/or the names of the attachments?
For example, if there are 3 attachments in the email, we should be able to
see immediately from th
Thanks for the reply, will find out more about it.
Currently I am able to retrieve the normal Metadata of the email, but not
the Metadata of the attachments, which are part of the contents of the EML
file. The file looks something like this:
--d8b77b057d59ca19--
--d8b77e057d5
I'd strongly recommend rolling your own ingest code. See Erick's
superb post: https://lucidworks.com/post/indexing-with-solrj/
You can easily get attachments via the RecursiveParserWrapper, e.g.
https://github.com/apache/tika/blob/master/tika-parsers/src/test/java/org/apache/tika/parser/RecursiveParse
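As a rough illustration of what "rolling your own ingest code" could look like for the attachment count and names, here is a minimal sketch. It is not the Tika RecursiveParserWrapper approach linked above; it swaps in Python's standard-library email package to read the EML file directly. The function name attachment_summary and the field names attachment_count and attachment_names are hypothetical, chosen only for this example, and parts are counted only when they carry a Content-Disposition of "attachment" (inline parts are skipped).

```python
from email import message_from_bytes, policy


def attachment_summary(raw_eml: bytes) -> dict:
    """Count the attachments in one raw EML message and collect their names.

    The returned keys are hypothetical field names; map them to whatever
    fields your own Solr schema defines.
    """
    # policy.default yields EmailMessage objects and decodes encoded filenames.
    msg = message_from_bytes(raw_eml, policy=policy.default)

    names = []
    # walk() visits every MIME part, including nested multiparts.
    for part in msg.walk():
        # Only parts explicitly marked as attachments are counted.
        if part.get_content_disposition() == "attachment":
            names.append(part.get_filename() or "(unnamed)")

    return {"attachment_count": len(names), "attachment_names": names}
```

The resulting dictionary could then be merged into the document that your ingest code sends to Solr, alongside the text extracted by Tika.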
Try the Apache Tika mailing list.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> On 2 Aug 2019, at 05:01, Zheng Lin Edwin Yeo wrote:
>
> Hi,
>
> Does anyone know if this can be done on the Solr side?
> Or does it have to be done on the Tika side?
>
> Regards,
> Edwin
>
Hi,
Does anyone know if this can be done on the Solr side?
Or does it have to be done on the Tika side?
Regards,
Edwin
On Thu, 1 Aug 2019 at 09:38, Zheng Lin Edwin Yeo
wrote:
> Hi,
>
> I would like to check: is there any way we can detect the number of
> attachments and their names during indexi
Hi,
I would like to check: is there any way we can detect the number of
attachments and their names during indexing of EML files in Solr, and index
that information into Solr?
Currently, Solr is able to use Tika and Tesseract OCR to extract the
contents of the attachments. However, I could no