Re: Gone content not reported to Solr

2015-07-22 Thread Steven Hayles
Hi Sebastian, thanks for the explanation. If db.update.purge.404 is not set, would records with status DB_GONE stay forever, and would Solr be repeatedly told to remove them? Steven Hayles, Systems Analyst, IT Services, University of Leicester, Prospect House, 94 Regent Rd, Leicester, LE1 7DA, UK T

Nutch on the cloud

2015-07-22 Thread Ankit Goel
Hi, After my runs on my laptop, I'm ready to port my work to the cloud. I'm planning to use Amazon. One thing I noticed when I started with Nutch was that a lot of things were left unsaid on the site/wiki, and it took me a lot of time to figure them out. Pitfalls, if I may call them that. I don't really have code or scrip

Re: Nutch on the cloud

2015-07-22 Thread Mattmann, Chris A (3980)
Thanks, Ankit, for the honest feedback. Would you be willing to update our wiki and improve the instructions for our gotchas based on your experiences? We have a guide we have been working on ourselves for getting Nutch running and churning on Elastic MapReduce. That’s where I’d recommend starting.