Hello Sebastian! I have now installed Hadoop, but unfortunately there are problems. I will make a post about them.
Thanks,
Mike

On Tue, Jan 17, 2023 at 09:49, Sebastian Nagel <wastl.na...@googlemail.com.invalid> wrote:
> Hi Mike,
>
> the Nutch configuration files are included in the job file found in
> runtime/deploy after the build. This means you need to compile Nutch yourself
> if it is used in "distributed" mode.
>
> For practice, you can first work in "pseudo-distributed" mode, i.e.
> on a single-node Hadoop cluster. All commands are the same as in fully
> distributed mode.
>
> If it helps, I prepared some setup scripts to run Nutch in
> pseudo-distributed mode:
> https://github.com/sebastian-nagel/nutch-test-single-node-cluster
>
> Best,
> Sebastian
>
> On 1/15/23 04:26, Mike wrote:
> > I will now try to configure the bot URL etc. before building,
> > but how and where do I configure settings between crawls, e.g. the number
> > of pages per host?
> >
> > Where do I configure Nutch in cluster mode?
> >
> > thx, mike