Re: problem with URLS/nutch

2008-06-23 Thread Drew Hite
My understanding is that Nutch is designed to use Hadoop to run in a distributed fashion across many machines. In order to scale across those machines, Nutch needs to accept its inputs through shared storage that all nodes in the cluster can read (this usually means files on the Hadoop file system).
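To make that concrete, here is a minimal sketch of writing an in-memory URL list out as a seed file on the Hadoop file system, so every node in the cluster can read it. The seed URLs, the "urls/seeds.txt" path, and the class name are illustrative assumptions, not anything Nutch itself defines; only the Hadoop FileSystem API calls are real.

import java.io.OutputStreamWriter;
import java.io.PrintWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteSeeds {
    public static void main(String[] args) throws Exception {
        // Hypothetical seed list; in practice this could come from any in-memory source.
        String[] seeds = { "http://example.com/", "http://example.org/" };

        Configuration conf = new Configuration();    // picks up core-site.xml etc.
        FileSystem fs = FileSystem.get(conf);        // resolves to HDFS when so configured
        Path seedFile = new Path("urls/seeds.txt");  // the directory Nutch is pointed at

        // Write one URL per line, the plain-text format the seed directory holds.
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(fs.create(seedFile, true), "UTF-8"))) {
            for (String url : seeds) {
                out.println(url);
            }
        }
        System.out.println("Wrote " + seeds.length + " seeds to " + seedFile);
    }
}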

Re: problem with URLS/nutch

2008-06-23 Thread All day coders
Well, if you want to add URLs using the Nutch API, then you should trace the program until you find the point where the directory containing the URL list is used to load the URLs. On Mon, Jun 23, 2008 at 5:27 AM, yogesh somvanshi <[EMAIL PROTECTED]> wrote: > Hello all > > i m worrki
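Tracing the crawl tends to lead to the injector, which is the point where the seed directory gets read. As a hedged sketch of what a direct call to that entry point could look like, assuming the Injector#inject(Path crawlDb, Path urlDir) signature found in Nutch 1.x (check your version's source before relying on it), with illustrative paths:

import org.apache.hadoop.fs.Path;
import org.apache.nutch.crawl.Injector;
import org.apache.nutch.util.NutchConfiguration;

public class InjectSeeds {
    public static void main(String[] args) throws Exception {
        // NutchConfiguration.create() loads nutch-default.xml / nutch-site.xml.
        Injector injector = new Injector(NutchConfiguration.create());

        // "urls" is the directory of plain-text seed lists;
        // "crawl/crawldb" is created or updated by the injector.
        injector.inject(new Path("crawl/crawldb"), new Path("urls"));
    }
}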

problem with URLS/nutch

2008-06-23 Thread yogesh somvanshi
Hello all, I am working on Nutch. When you use the standard crawl command, like: bin/nutch crawl urls -dir crawl -depth 3 -topN 50, crawling works well, but I want to remove the need for that urls folder. I want to change or replace the urls folder with some array or map, but when I try to make some changes to the code then
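One way to approach what the question asks, without patching Nutch internals, is to dump the in-memory array to a temporary seed directory and hand that directory to the standard crawl driver. This is only a sketch: it assumes the org.apache.nutch.crawl.Crawl#main driver from the 0.9/1.x line, and the temp path and seed URLs are made up for illustration.

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CrawlFromArray {
    public static void main(String[] args) throws Exception {
        // Hypothetical in-memory seed list replacing the hand-maintained urls folder.
        String[] seeds = { "http://example.com/", "http://example.org/" };

        // Materialize the array as the one-URL-per-line file the crawl expects.
        File urlDir = Files.createTempDirectory("nutch-urls").toFile();
        Files.write(Paths.get(urlDir.getPath(), "seeds.txt"),
                String.join("\n", seeds).getBytes(StandardCharsets.UTF_8));

        // Equivalent to: bin/nutch crawl <urlDir> -dir crawl -depth 3 -topN 50
        org.apache.nutch.crawl.Crawl.main(new String[] {
                urlDir.getPath(), "-dir", "crawl", "-depth", "3", "-topN", "50" });
    }
}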