Thanks, Julien!
Your first recommendation worked great!
On Jan 27, 2012 5:35 PM, "Julien Nioche" <[email protected]>
wrote:

> Of course, you can also copy nutch-site.xml over to the hadoop conf dir on
> the master node
>
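The copy step Julien describes might look like the sketch below. The real paths would be your own Nutch checkout and Hadoop install; here temp directories stand in for them (all paths are assumptions, not settings from this thread) so the commands are runnable as-is:

```shell
# Sketch of the alternative: copy the edited nutch-site.xml into Hadoop's
# conf dir on the master node.
NUTCH_HOME=$(mktemp -d)    # stand-in for your Nutch install (hypothetical)
HADOOP_HOME=$(mktemp -d)   # stand-in for your Hadoop install (hypothetical)
mkdir -p "$NUTCH_HOME/conf" "$HADOOP_HOME/conf"
printf '<?xml version="1.0"?>\n<configuration/>\n' > "$NUTCH_HOME/conf/nutch-site.xml"

# The actual copy step: Hadoop picks up the file from its conf dir
cp "$NUTCH_HOME/conf/nutch-site.xml" "$HADOOP_HOME/conf/"
```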
> On 26 January 2012 10:33, Julien Nioche <[email protected]>wrote:
>
>> Hi Ali
>>
>> You need to modify $NUTCH_HOME/conf/nutch-site.xml and rebuild the job
>> file with 'ant job'. In distributed mode the conf files are taken from
>> within the job file
>>
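For reference, a minimal nutch-site.xml override might look like the fragment below (http.agent.name is just an example of a commonly set property, and the value is made up). After editing, rerun 'ant job' so the change is packaged into the job file that gets shipped to the Hadoop nodes:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- any property set here overrides the default from nutch-default.xml -->
  <property>
    <name>http.agent.name</name>
    <value>MyTestCrawler</value>
  </property>
</configuration>
```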
>> HTH
>>
>> Julien
>>
>>
>>
>>  The configuration files for the "local" mode are set up fine (since a
>>> crawl
>>> in local mode succeeded). However, for running in deploy mode (as output
>>> above), since the "deploy" folder did not have any "conf" subdirectory, I
>>> assumed that either:
>>> a) the conf files need to be copied over under "deploy/conf", OR
>>> b) the conf files need to be placed onto HDFS.
>>>
>>> I have verified that option (a) above does not fix the issue. So I'm
>>> assuming that the Nutch configuration files need to exist in HDFS for the
>>> fetcher to run successfully? However, I don't know at what path within
>>> HDFS I should place these Nutch conf files, or perhaps I'm barking up the
>>> wrong tree?
>>>
>>> If Nutch reads config files during "deploy" mode from the files under
>>> "local/conf", then why is it that the local crawl worked fine, but the
>>> deploy-mode crawl isn't?
>>>
>>
>>
>>
>> --
>> Open Source Solutions for Text Engineering
>>
>> http://digitalpebble.blogspot.com/
>> http://www.digitalpebble.com
>>
>
>
>
