Dear Team,

I have a query; I'm not sure if this is the right place to ask, but here goes:
I have to crawl and index my website. These are the steps I have been asked to follow:

1. Delete the crawl folders (apache-nutch-1.10\crawl).
2. Remove the existing indexes: Solr Admin -> Skyweb -> Documents -> Document Type (xml), and execute.
3. Go to Solr Admin -> Core Admin, click 'Reload', and then 'Optimize'.
4. Run the crawl job using the following command:

   bin/crawl -i -D solr.server.url=http://IP:8080/solr/website/ urls/ crawl/ 5

I did some research and felt that doing these tasks manually is unnecessary extra work, and that the crawl script should take care of all of the above. So my questions/concerns are:

1. Doesn't the above script take care of the entire process? Do I still need to delete the crawl folders and clear the existing indexes manually?
2. What is the relevance of the Admin tasks 'Reload' and 'Optimize'?
3. Can I schedule the crawl script as a weekly cron job, and will that take care of the entire process?
4. How else can I automate the crawling and indexing to run periodically?

Regards,
Mohammed Ajmal Rahman
Tata Consultancy Services
Mailto: ajmal.rah...@tcs.com
Website: http://www.tcs.com
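P.S. In case it helps to show what I mean by automating this: below is a rough wrapper script I sketched that performs all the manual steps as one job. It is only a sketch under my assumptions: the NUTCH_HOME path and log location are made up, the Solr URL and core name 'website' are taken from the crawl command above, and the curl calls to the update and CoreAdmin handlers are my reading of the standard Solr HTTP API. If the answer to question 1 is that bin/crawl already handles the cleanup, most of this would be unnecessary.

    #!/bin/bash
    # recrawl.sh -- rough sketch of the full manual procedure as one job.
    # NUTCH_HOME is an assumed install location; the Solr URL matches the
    # crawl command in my mail above.
    set -e

    NUTCH_HOME=/opt/apache-nutch-1.10          # assumption
    SOLR_URL="http://IP:8080/solr/website"

    # Step 1: delete the old crawl folders
    rm -rf "$NUTCH_HOME/crawl"

    # Step 2: remove the existing index (delete-by-query via the
    # update handler, with an immediate commit)
    curl "$SOLR_URL/update?commit=true" \
         -H "Content-Type: text/xml" \
         --data-binary "<delete><query>*:*</query></delete>"

    # Step 3: reload the core (CoreAdmin equivalent of the 'Reload' button)
    curl "http://IP:8080/solr/admin/cores?action=RELOAD&core=website"

    # Step 4: run the crawl, indexing into Solr as it goes (5 rounds)
    cd "$NUTCH_HOME"
    bin/crawl -i -D solr.server.url="$SOLR_URL/" urls/ crawl/ 5

    # Step 5: optimize the index (equivalent of the 'Optimize' button)
    curl "$SOLR_URL/update?optimize=true"

And the crontab entry I had in mind, running it every Sunday at 02:00 (paths again assumed):

    0 2 * * 0 /opt/apache-nutch-1.10/recrawl.sh >> /var/log/recrawl.log 2>&1

Does this look like the right direction, or is some of it redundant?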