Phew, I was about to ignore this one, as it was hidden among a lot of other
auto-generated emails for wiki updates!

On Wed, Mar 20, 2013 at 12:01 PM, kiran chitturi
<chitturikira...@gmail.com> wrote:

> Hi!
>
> I want to update the Nutch tutorials in the wiki to use the crawl script
> (./bin/crawl). Because the tutorials still show the crawl command, users
> run it and hit issues, and we end up suggesting that they use the crawl
> script instead of the command.
>
> Can we make it uniform across the wiki that the crawl command is deprecated
> and that the crawl script is recommended?
>

Yes. The references to the crawl command must be replaced with the crawl
script in the tutorials.

> Second, for a user running Nutch on a single node or in local mode, the
> default size of topN (50,000) makes the crawl run for a long time. Can we
> make the topN parameter configurable through the script?
>

I think the crawl script has a lot of hard-coding, and people can use it to
get started with crawling without being bothered by the params, their
explanations, and the optimal values to set. The script says "MODIFY THE
PARAMETERS BELOW TO YOUR NEEDS", so anyone who wants to change these values
can modify it. I prefer keeping it as it is for now. Let's see what others
have to say about this.
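
For reference, if we did decide to expose it, here is a minimal sketch of one
way the value could be overridden without editing the script. It is not taken
from bin/crawl: the function name fetchlist_size and the TOPN_PER_NODE
environment variable are made up for illustration, and only the 50,000
default comes from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the actual bin/crawl code.
# TOPN_PER_NODE is an invented environment variable; 50000 is the
# default topN discussed in this thread.

# Print the topN value to use for a given number of nodes.
fetchlist_size() {
  num_nodes="${1:-1}"                    # default to a single node
  per_node="${TOPN_PER_NODE:-50000}"     # fall back to 50000 if unset
  echo $(( num_nodes * per_node ))
}

# Local mode with the default value.
fetchlist_size 1

# Local mode with a smaller value for a quick test crawl.
# The subshell keeps the override from leaking into later calls.
( TOPN_PER_NODE=1000; fetchlist_size 1 )
```

With something like this the default behaviour stays the same, and a
local-mode user could export a smaller value before running the script.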


> Thank you,
>
> --
> Kiran Chitturi
>
> <http://www.linkedin.com/in/kiranchitturi>
>
>
>
