Thanks, Ankit, for the honest feedback. Would you be willing to update
our wiki and improve the instructions there, based on the gotchas you
ran into?

We have a guide that we have been working on ourselves for getting Nutch
running and churning on Elastic MapReduce. That’s where I’d recommend
starting.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Ankit Goel <ankitgoel2...@gmail.com>
Reply-To: "user@nutch.apache.org" <user@nutch.apache.org>
Date: Wednesday, July 22, 2015 at 5:51 PM
To: "user@nutch.apache.org" <user@nutch.apache.org>
Subject: Nutch on the cloud

>Hi,
>After my runs on my laptop, I'm ready to port my work to the cloud. I'm
>planning to use Amazon. One thing I noticed when I started with Nutch was
>that a lot of things were left unsaid on the site/wiki, and they took me a
>lot of time to figure out. Pitfalls, if I may call them that. I don't
>really have code or scripts, but I need Nutch to run all the time on the
>cloud.
>
>So before I port to the cloud, are there any things I should beware of or
>look out for? Is AWS fine with Nutch? Are there any configurations I
>should remember? Any advice on implementation to ease my transition and
>run Nutch 24 hours a day? I will be running a seed file and crawling the
>net in general.
>Thanks
>
>-- 
>Regards,
>Ankit Goel
>http://about.me/ankitgoel
