Re: Indexing information on number of attachments and their names in EML file

2019-08-14 Thread Zheng Lin Edwin Yeo
Hi Tim, Regarding the returning of the list of Metadata objects, is the code supposed to include the information on the number of attachments in the particular email and/or the name of the attachment? For example, if there are 3 attachments in the email, we should be able to see immediately from th

Re: Indexing information on number of attachments and their names in EML file

2019-08-02 Thread Zheng Lin Edwin Yeo
Thanks for the reply, will find out more about it. Currently I am able to retrieve the normal Metadata of the email, but not the Metadata of the attachments which are part of the contents in the EML file, which looks something like this. --d8b77b057d59ca19-- --d8b77e057d5

Re: Indexing information on number of attachments and their names in EML file

2019-08-02 Thread Tim Allison
I'd strongly recommend rolling your own ingest code. See Erick's superb: https://lucidworks.com/post/indexing-with-solrj/ You can easily get attachments via the RecursiveParserWrapper, e.g. https://github.com/apache/tika/blob/master/tika-parsers/src/test/java/org/apache/tika/parser/RecursiveParse
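
As a rough illustration of what a custom ingest step could compute before sending a document to Solr, here is a minimal sketch that counts attachments and collects their file names from a raw EML message. It uses Python's standard `email` package rather than Tika's RecursiveParserWrapper, so it is a stand-in for the idea and not the approach linked above. The function names and the field names `attachment_count` and `attachment_names` are hypothetical and would have to match your own Solr schema.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_summary(raw_eml: bytes) -> dict:
    """Return the attachment count and file names found in a raw EML message."""
    # policy.default yields EmailMessage objects, which provide is_attachment().
    msg = message_from_bytes(raw_eml, policy=policy.default)
    names = [
        part.get_filename() or "(unnamed)"
        for part in msg.walk()       # visits the message and all nested parts
        if part.is_attachment()      # true for Content-Disposition: attachment
    ]
    # Hypothetical field names; adjust to whatever your Solr schema defines.
    return {"attachment_count": len(names), "attachment_names": names}


def build_sample_eml() -> bytes:
    """Build a small in-memory message with two attachments for demonstration."""
    msg = EmailMessage()
    msg["Subject"] = "Quarterly report"
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"%PDF-1.4 sample",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    msg.add_attachment("a,b\n1,2\n", subtype="csv", filename="data.csv")
    return msg.as_bytes()


if __name__ == "__main__":
    print(attachment_summary(build_sample_eml()))
```

The resulting dictionary could be merged into the document your ingest code builds for Solr. Note that this sketch only counts parts explicitly marked as attachments, so inline images and similar parts are ignored.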

Re: Indexing information on number of attachments and their names in EML file

2019-08-02 Thread Jan Høydahl
Try the Apache Tika mailing list. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > On 2 Aug 2019, at 05:01, Zheng Lin Edwin Yeo wrote: > > Hi, > > Does anyone know if this can be done on the Solr side? > Or does it have to be done on the Tika side? > > Regards, > Edwin >

Re: Indexing information on number of attachments and their names in EML file

2019-08-01 Thread Zheng Lin Edwin Yeo
Hi, Does anyone know if this can be done on the Solr side? Or does it have to be done on the Tika side? Regards, Edwin On Thu, 1 Aug 2019 at 09:38, Zheng Lin Edwin Yeo wrote: > Hi, > > Would like to check, is there any way we can detect the number of > attachments and their names during indexi

Indexing information on number of attachments and their names in EML file

2019-07-31 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: is there any way we can detect the number of attachments and their names during indexing of EML files in Solr, and index that information into Solr? Currently, Solr is able to use Tika and Tesseract OCR to extract the contents of the attachments. However, I could no