Hi Stefan,

I think these are fine things to be doing. Just two points:

(1) Why not always pass the NutchConf to the constructor of any class that needs it, instead of distinguishing between classes that use one or two configuration parameters and classes that use more? That would be more consistent. Also, a class that CURRENTLY uses only 2 configuration parameters may well use 3 or 4 at some point in the future, and it would be a shame to have to rewrite its constructor when that happens.

(2) What I'd REALLY like to see is NutchConf as an interface, with methods that allow properties to be retrieved from any source. There could be a class NutchXmlConf that implements the NutchConf interface and works the current way (with nutch-default.xml, nutch-site.xml and so on). Where we need to create a NutchConf, we actually create a NutchXmlConf, but pass it to class constructors whose arguments are of type NutchConf. That way, if I want to use a non-standard mechanism for storing my Nutch parameters (e.g. a properties file, a relational database, the Windows Registry, whatever), I can write my own class that implements the NutchConf interface, then instantiate it and pass it around, without having to rewrite every Nutch class that uses it.

The benefits of (2) are legion, particularly for people who want to use a Nutch search engine as part of an existing web application that uses a specific (non-XML) mechanism for storing configuration parameters. It would also give extra flexibility to people working on Nutch installations that sit in multiple environments (Development, System Test, UAT, Production etc.) and get deployed from one environment to the next.

Regards,
David.

From: Stefan Groschupf <[EMAIL PROTECTED]>
Date: Wed, 4 Jan 2006 15:39:38 +0100
Subject: [Nutch-dev] no static NutchConf
Hi,

to move forward in the direction of having a Nutch GUI, I would love to start removing the static access to NutchConf. Based on experience, I would first love to get a kind of general agreement and a 'go' before wasting too much time on a solution that is not accepted.

I suggest:
+ removing NutchConf.get().
+ in case a lower-level object uses only one or two, but not more than 3, parameters from the Nutch configuration, we add these parameters to the constructor of this object (e.g. MapFile.Reader needs only the parameter INDEX_SKIP).
+ for higher-level objects like the fetcher tool, which need more than 3 parameters for their lower-level objects, we add an instance of NutchConf to the constructor.
+ for all dynamically used objects that implement a specific interface (interface > no control over the object constructor), we use the Configurable interface to set the NutchConf in an inversion-of-control-like style (e.g. plugin extension implementations like Parser or Protocols).
+ PluginRegistry will no longer be a singleton but will get a constructor with a NutchConf instance.
+ Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements the Configurable interface.

Any comments, improvement suggestions, more use cases? I would love to do this job; can I get a go from the other developers?
From my point of view NutchConf is actually a showstopper, since a lot of people run into trouble integrating Nutch into other projects; my suggestions are also required to write a Nutch GUI.

Stefan
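
To make the two proposals in this thread concrete, here is a minimal Java sketch, not actual Nutch code. It assumes NutchConf becomes an interface as David suggests, and that Configurable exposes a single setter as Stefan describes. NutchPropertiesConf, the method signatures, the simplified Fetcher and TextParser classes, the newExtension method, and the property names used are all hypothetical and chosen only for illustration.

```java
import java.util.Properties;

// Assumption: NutchConf as an interface (David's point 2). Method names are illustrative.
interface NutchConf {
    String get(String name);                       // null when the property is unset
    String get(String name, String defaultValue);
    int getInt(String name, int defaultValue);
}

// Hypothetical non-XML backend: any source works as long as it implements NutchConf.
final class NutchPropertiesConf implements NutchConf {
    private final Properties props;

    NutchPropertiesConf(Properties props) { this.props = props; }

    public String get(String name) { return props.getProperty(name); }

    public String get(String name, String defaultValue) {
        return props.getProperty(name, defaultValue);
    }

    public int getInt(String name, int defaultValue) {
        String value = props.getProperty(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }
}

// Assumption: Configurable is a one-method setter interface for objects
// whose constructors the caller does not control (plugin extensions).
interface Configurable {
    void setConf(NutchConf conf);
}

// Constructor injection for a higher-level object (simplified stand-in for the fetcher tool).
final class Fetcher {
    private final int threads;

    Fetcher(NutchConf conf) {
        // Property name is illustrative only.
        this.threads = conf.getInt("fetcher.threads.fetch", 10);
    }

    int getThreads() { return threads; }
}

// A plugin extension that receives its configuration through Configurable.
final class TextParser implements Configurable {
    private NutchConf conf;

    public void setConf(NutchConf conf) { this.conf = conf; }

    String encoding() {
        // Property name and default are illustrative only.
        return conf.get("parser.character.encoding.default", "windows-1252");
    }
}

// No longer a singleton: each registry instance carries the NutchConf it was built with.
final class PluginRegistry {
    private final NutchConf conf;

    PluginRegistry(NutchConf conf) { this.conf = conf; }

    // Creates an extension and injects the configuration if the extension asks for it.
    <T> T newExtension(Class<T> type) throws ReflectiveOperationException {
        T instance = type.getDeclaredConstructor().newInstance();
        if (instance instanceof Configurable) {
            ((Configurable) instance).setConf(conf);
        }
        return instance;
    }
}

final class ConfSketchDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("fetcher.threads.fetch", "25");
        NutchConf conf = new NutchPropertiesConf(props);

        System.out.println(new Fetcher(conf).getThreads());   // prints 25

        TextParser parser = new PluginRegistry(conf).newExtension(TextParser.class);
        System.out.println(parser.encoding());                // prints windows-1252
    }
}
```

In this sketch no class reads configuration through a static accessor, so two differently configured registries or fetchers could coexist in one JVM, which is the property a GUI or an embedding web application would rely on.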
