I don't know if this bug is in the way ToolRunner works, ToolBase works, or the way Nutch implements some of its jobs, but here is the scenario.

Many Nutch jobs (Injector, for instance) use ToolBase and call the doMain(Configuration conf, String[] args) method to run. ToolBase now delegates to ToolRunner with the statement return ToolRunner.run(this, args); The problem is that the Configuration object passed in to doMain is never set as the conf object on the ToolBase instance, so ToolRunner essentially ignores it. As a result, any Nutch resources in that configuration are ignored.

The solution to this is pretty simple:

  public final int doMain(Configuration conf, String[] args) throws Exception {
    setConf(conf);
    return ToolRunner.run(this, args);
  }
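
To make the scenario concrete, here is a minimal, self-contained sketch of the behaviour as I understand it. It does not use the real Hadoop classes: Conf, Tool, runTool, SketchToolBase and SketchJob are hypothetical stand-ins, and runTool only assumes that the runner takes its configuration from the tool's own getConf() and falls back to a fresh default one when nothing has been set. The "nutch.resource" property is also just an illustrative name.

```java
import java.util.HashMap;
import java.util.Map;

public class ToolBaseConfSketch {

    // Hypothetical stand-in for a Hadoop Configuration: a plain bag of properties.
    static class Conf {
        final Map<String, String> props = new HashMap<>();
    }

    // Hypothetical stand-in for the Tool contract (configurable and runnable).
    interface Tool {
        Conf getConf();
        void setConf(Conf conf);
        int run(String[] args) throws Exception;
    }

    // Stand-in for the runner. Assumption: it uses the tool's own conf and
    // falls back to a fresh default conf when the tool has none.
    static int runTool(Tool tool, String[] args) throws Exception {
        Conf conf = tool.getConf();
        if (conf == null) {
            conf = new Conf();
        }
        tool.setConf(conf);
        return tool.run(args);
    }

    // Stand-in for ToolBase with both variants of doMain side by side.
    static abstract class SketchToolBase implements Tool {
        private Conf conf;

        public Conf getConf() { return conf; }
        public void setConf(Conf conf) { this.conf = conf; }

        // Behaviour described in the report: the conf argument is never stored.
        public final int doMainBroken(Conf conf, String[] args) throws Exception {
            return runTool(this, args);
        }

        // Proposed fix: store the conf on the tool before handing off to the runner.
        public final int doMainFixed(Conf conf, String[] args) throws Exception {
            setConf(conf);
            return runTool(this, args);
        }
    }

    // Hypothetical job: returns 0 if it can see the Nutch-style property, 1 otherwise.
    static class SketchJob extends SketchToolBase {
        public int run(String[] args) {
            return getConf().props.containsKey("nutch.resource") ? 0 : 1;
        }
    }

    // Builds a conf carrying one illustrative Nutch-style property.
    static Conf nutchLikeConf() {
        Conf conf = new Conf();
        conf.props.put("nutch.resource", "nutch-default.xml");
        return conf;
    }

    public static void main(String[] args) throws Exception {
        String[] noArgs = new String[0];
        // The conf passed in is dropped, so the job cannot see the property.
        System.out.println("broken: " + new SketchJob().doMainBroken(nutchLikeConf(), noArgs));
        // With setConf(conf) first, the job sees the property.
        System.out.println("fixed: " + new SketchJob().doMainFixed(nutchLikeConf(), noArgs));
    }
}
```

In this sketch the broken variant returns 1 because the job only ever sees the runner's default conf, while the fixed variant returns 0.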

But since we are moving away from ToolBase, I didn't know if there is a better solution. For example, should the current Nutch jobs be moved over to ToolRunner instead, or should we make this simple change now for compatibility while we move the jobs to ToolRunner? Any guidance is appreciated.

Dennis Kubes
