[ 
http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363998 ] 

Doug Cutting commented on NUTCH-186:
------------------------------------

The config rules at present are:

1. All user-settable values should be in nutch-default.xml, as documentation 
that they exist.  Any other config will override this.  This file should not be 
altered by users.

2. nutch-site.xml is always loaded last, overriding all other options.  This is 
empty by default.

mapred-default.xml was added specifically to permit the specification of things 
that a job can override.
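
A minimal sketch of the precedence described above, for illustration only and
not Nutch's actual loading code.  It assumes the files are applied in the order
nutch-default.xml, mapred-default.xml, per-job settings, nutch-site.xml, with
later layers winning.  The position of the per-job layer is inferred from the
rules above, and the function name and all values are hypothetical.

```python
# Hypothetical model of the load order; NOT Nutch's real configuration code.
def resolve(key, layers):
    """Return (value, source) from the last layer that defines key, or None."""
    found = None
    for name, values in layers:
        if key in values:
            found = (values[key], name)
    return found

# Layers in assumed load order: later layers override earlier ones.
# All values below are made up for illustration.
layers = [
    ("nutch-default.xml", {"mapred.map.tasks": "2", "mapred.reduce.tasks": "1"}),
    ("mapred-default.xml", {"mapred.map.tasks": "20"}),
    ("job", {"mapred.map.tasks": "50"}),
    ("nutch-site.xml", {}),  # empty by default
]

# With an empty nutch-site.xml, the job's own setting wins.
print(resolve("mapred.map.tasks", layers))

# If the entry is copied into nutch-site.xml, it masks the job's setting.
with_site = layers[:-1] + [("nutch-site.xml", {"mapred.map.tasks": "2"})]
print(resolve("mapred.map.tasks", with_site))
```

Under these assumptions, an entry copied into nutch-site.xml silently takes
precedence over whatever a job requests, which is the behavior reported in the
issue.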

I think the fix that's needed here is documentation.  The documentation for 
these parameters should perhaps caution against putting them in nutch-site.xml, 
and point folks towards mapred-default.xml.

We might eventually move to a more complex configuration, where we break things 
into modules, each with three parts: base, default, final.  So there could be a 
mapred-base.xml that listed all of the settable mapred parameters.  Then the 
overridable default value could be set in mapred-default.xml.  And 
non-overridable values (e.g., the jobtracker host) could be specified in 
mapred-final.xml.
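
One way the proposed base/default/final split might behave, sketched under
assumptions: base only declares which parameters exist, default supplies values
a job may override, and final supplies values nothing may override.  The
function name, the dictionaries, the host names, and the use of
mapred.job.tracker as the example key are illustrative, not an existing API.

```python
# Hypothetical sketch of a base/default/final module; not an existing Nutch API.
def resolve_module(key, base, default, job, final):
    """Look up key with precedence final > job > default.

    base only declares which parameters are settable (name -> description).
    Returns None if the parameter is declared but has no value anywhere.
    """
    if key not in base:
        raise KeyError(f"{key} is not a declared parameter")
    for layer in (final, job, default):
        if key in layer:
            return layer[key]
    return None

# Illustrative contents of mapred-base.xml, mapred-default.xml, mapred-final.xml.
base = {
    "mapred.map.tasks": "number of map tasks per job",
    "mapred.job.tracker": "jobtracker host and port",
}
default = {"mapred.map.tasks": "20"}
final = {"mapred.job.tracker": "master.example.com:9001"}

# A job tries to override both parameters.
job = {"mapred.map.tasks": "50", "mapred.job.tracker": "other.example.com:9001"}

# The overridable default yields to the job; the final value does not.
print(resolve_module("mapred.map.tasks", base, default, job, final))
print(resolve_module("mapred.job.tracker", base, default, job, final))
```

In this sketch the job's map-task count is honored while its jobtracker setting
is ignored, which is the distinction between default and final drawn above.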

> mapred-default.xml is over ridden by nutch-site.xml
> ---------------------------------------------------
>
>          Key: NUTCH-186
>          URL: http://issues.apache.org/jira/browse/NUTCH-186
>      Project: Nutch
>         Type: Bug
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Gal Nitzan
>     Priority: Minor
>  Attachments: myBeautifulPatch.patch
>
> If mapred.map.tasks and mapred.reduce.tasks are defined in nutch-site.xml and 
> also in mapred-default.xml the definitions from nutch-site.xml are those that 
> will take effect.
> So if a user mistakenly copies those entries into nutch-site.xml from 
> nutch-default.xml, she will not understand what happens.
> I would like to propose removing these settings completely from 
> nutch-default.xml and putting them only in mapred-default.xml, where they 
> belong.
> I will be happy to supply a patch for that if the proposal is accepted.
