Also, check this link for SolrJ example code (including the recursion):
https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/

Geraint


Geraint Duck
Data Scientist
Toxicology and Health Sciences
Syngenta UK
Email: geraint.d...@syngenta.com

-----Original Message-----
From: Jan Høydahl [mailto:jan....@cominvent.com]
Sent: 16 October 2015 12:14
To: solr-user@lucene.apache.org
Subject: Re: Recursively scan documents for indexing in a folder in SolrJ

SolrJ does not have any file crawler built in.
But you are free to steal code from SimplePostTool.java related to directory 
traversal, and then index each document found using SolrJ.

Note that SimplePostTool.java tries to be smart about which endpoint to post 
files to: XML, CSV and JSON content is posted to /update, while office docs 
go to /update/extract.
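
As a rough illustration of the traversal and endpoint choice described above, 
here is a minimal sketch using only the JDK. It is not code taken from 
SimplePostTool.java; the class name FolderCrawler, the helper methods, and the 
extension-based routing rule are assumptions made for the example, and the 
actual SolrJ call is left as a comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of SolrJ or SimplePostTool.
class FolderCrawler {

    // Assumption: route by file extension. xml, csv and json go to /update,
    // everything else (e.g. office docs, PDFs) goes to /update/extract.
    private static final Set<String> UPDATE_EXTENSIONS =
            new HashSet<>(Arrays.asList("xml", "csv", "json"));

    // Pick the request handler path for a single file.
    public static String endpointFor(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1);
        return UPDATE_EXTENSIONS.contains(ext) ? "/update" : "/update/extract";
    }

    // Recursively collect all regular files below root, in sorted order.
    public static List<Path> listFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "afolder");
        for (Path file : listFiles(root)) {
            // This is where each file would be sent with SolrJ, for example
            // via a ContentStreamUpdateRequest built for the chosen endpoint.
            // Retry and error handling would also live here.
            System.out.println(endpointFor(file) + " <- " + file);
        }
    }
}
```

Keeping the traversal and the endpoint decision in separate methods means the 
indexing step in the loop can be wrapped in whatever retry logic is needed 
without touching the crawl itself.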

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 16 Oct 2015, at 05:22, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
> Hi,
>
> I understand that in SimplePostTool (post.jar), there is this command
> to automatically detect content types in a folder, and recursively
> scan it for documents for indexing into a collection:
> bin/post -c gettingstarted afolder/
>
> This has been useful for me to do mass indexing of all the files that
> are in the folder. Now that I'm moving to production, I plan to use
> SolrJ to do the indexing, as it can do more things like robustness
> checks and retries for index operations that fail.
>
> However, I can't seem to find a way to do the same in SolrJ. Is it
> possible for this to be done in SolrJ? I'm using Solr 5.3.0.
>
> Thank you.
>
> Regards,
> Edwin


