Re: no static NutchConf
> Yes. I thought that the call of the method setConf only set the NutchConf. This is a philosophical question. All right, the implementation class can also load/set the fields.

Some Map/Reduce classes already use this mechanism. For example, see CrawlDbReducer: it has a configure method. There, however, the JobConfigurable interface is used.

Stefan
Re: no static NutchConf
On 08.01.2006 at 16:08, Stefan Groschupf wrote:

> Marko, as mentioned... All these classes will implement the NutchConfigurable interface. The plugin system will instantiate these objects and inject the nutch configuration object *BEFORE* it returns the object instance to the caller. So we can be sure that setConf is called before any other method, e.g. parse, is called.

That's right.

> So the answer is that the fields will be set / initialized in the setConf method, which needs to be implemented by each extension class, and we have the agreement that this method is called directly after the constructor but before any other call. Does that clarify my suggestion?

Yes. I thought that the call of the method setConf only set the NutchConf. This is a philosophical question. All right, the implementation class can also load/set the fields.

Thanks,
Marko
Re: no static NutchConf
Marko,

as mentioned... All these classes will implement the NutchConfigurable interface. The plugin system will instantiate these objects and inject the nutch configuration object *BEFORE* it returns the object instance to the caller. So we can be sure that setConf is called before any other method, e.g. parse, is called.

So the answer is that the fields will be set / initialized in the setConf method, which needs to be implemented by each extension class, and we have the agreement that this method is called directly after the constructor but before any other call. Does that clarify my suggestion?

Stefan

On 08.01.2006 at 15:49, Marko Bauhardt wrote:

>> + Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements a Configurable interface.
>
> I think this is a good idea. But many plugins like BasicIndexingFilter or ExtParse require some fields in the "parse" or "filter" method. These fields are loaded statically (via the static NutchConf or static blocks). And this is OK, because the fields are loaded only once. If we load the fields in the "parse" or "filter" methods, they would be loaded many times, and this is a performance problem. Initializing the fields in the constructor does not work, because setConf() is called after the constructor.
>
> Should we add a method like "loadNutchConfiguration()" to the NutchConfigurable interface, to load the NutchConfiguration parameters? Hm, I don't know. Should the fields be loaded in the setConf() method? Hm, the name of the method says: set the NutchConf, not load the required NutchConfiguration parameters.
>
> Has anyone another elegant solution?
>
> Marko

---
company: http://www.media-style.com
forum: http://www.text-mining.org
blog: http://www.find23.net
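The agreement described above (construct, inject via setConf, only then hand the object to the caller) can be sketched roughly as follows. This is a minimal illustration, not the actual Nutch code: `DemoConf`, `DemoConfigurable`, `DemoParser`, `DemoPluginSystem` and the key `demo.parser.max.length` are all hypothetical stand-ins, and the real NutchConfigurable signature may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for NutchConf: an instance-based key/value store. */
class DemoConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    int getInt(String key, int defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}

/** Assumed shape of the interface discussed in the thread. */
interface DemoConfigurable {
    void setConf(DemoConf conf);
}

/** Hypothetical extension that loads its fields once, inside setConf. */
class DemoParser implements DemoConfigurable {
    private int maxContentLength;   // read once per instance, not per parse() call
    int loadCount = 0;              // only here to show how often the field is loaded

    @Override
    public void setConf(DemoConf conf) {
        this.maxContentLength = conf.getInt("demo.parser.max.length", 1024);
        loadCount++;
    }

    String parse(String content) {
        return content.length() > maxContentLength
                ? content.substring(0, maxContentLength)
                : content;
    }
}

/** Stand-in for the plugin system: instantiate, inject, then return. */
class DemoPluginSystem {
    static DemoParser newParser(DemoConf conf) {
        DemoParser parser = new DemoParser();   // constructor runs first
        parser.setConf(conf);                   // injection happens before the caller sees it
        return parser;
    }
}

class SetConfDemo {
    public static void main(String[] args) {
        DemoConf conf = new DemoConf();
        conf.set("demo.parser.max.length", "5");
        DemoParser parser = DemoPluginSystem.newParser(conf);
        System.out.println(parser.parse("configuration"));
        System.out.println(parser.parse("abc"));
        System.out.println("fields loaded " + parser.loadCount + " time(s)");
    }
}
```

Because the field is read in setConf, repeated parse calls do not touch the configuration again, which addresses the performance concern without a static block.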
Re: no static NutchConf
> + Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements a Configurable interface.

I think this is a good idea. But many plugins like BasicIndexingFilter or ExtParse require some fields in the "parse" or "filter" method. These fields are loaded statically (via the static NutchConf or static blocks). And this is OK, because the fields are loaded only once. If we load the fields in the "parse" or "filter" methods, they would be loaded many times, and this is a performance problem. Initializing the fields in the constructor does not work, because setConf() is called after the constructor.

Should we add a method like "loadNutchConfiguration()" to the NutchConfigurable interface, to load the NutchConfiguration parameters? Hm, I don't know. Should the fields be loaded in the setConf() method? Hm, the name of the method says: set the NutchConf, not load the required NutchConfiguration parameters.

Has anyone another elegant solution?

Marko
Re: no static NutchConf
Doug Cutting wrote:
> Stefan Groschupf wrote:
>
>>> I have two more ideas:
>>> 1) create NutchConf as interface (not class)
>>> 2) make it work as plugin
>>
>> I like the idea to make the conf as a singleton and understand the
>> need to be able to integrate nutch.
>> However I would love to do one first step and later on we can make
>> this second step. I made the experience that if you change too much
>> people do not accept your patch.
>
> +1
>
> I don't see a big advantage in trying to make both of these changes at
> the same time. And, when possible, small incremental changes are easier
> for the community to process.

I never thought to make these changes at once. These were just some thoughts on how to improve the nutch configuration. I agree with Stefan on this point.

Thomas
Re: no static NutchConf
Stefan Groschupf wrote:
>> I have two more ideas:
>> 1) create NutchConf as interface (not class)
>> 2) make it work as plugin
>
> I like the idea to make the conf as a singleton and understand the need to be able to integrate nutch.
> However I would love to do one first step and later on we can make this second step. I made the experience that if you change too much people do not accept your patch.

+1

I don't see a big advantage in trying to make both of these changes at the same time. And, when possible, small incremental changes are easier for the community to process.

Doug
Re: no static NutchConf
Stefan Groschupf wrote:
> Hi Andrzej,
> maybe I am coming closer to your idea of caching some objects.
>
>> Yes. If you remember our discussion, I'd like also to follow a pattern where such instances are cached inside this NutchConf instance, if appropriate (i.e. if they are reusable and multi-threaded).
>
> As mentioned, I think it makes no sense to cache things like plugin extension objects, but what do you think about caching the PluginRepository that was already created with this specific configuration instance? Of course we cannot serialize this, but I guess this will improve the performance somewhat, since we do not need to scan the plugin folder every time.

Yes, I agree on both accounts. :-)

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: no static NutchConf
Hi Andrzej,

maybe I am coming closer to your idea of caching some objects.

> Yes. If you remember our discussion, I'd like also to follow a pattern where such instances are cached inside this NutchConf instance, if appropriate (i.e. if they are reusable and multi-threaded).

As mentioned, I think it makes no sense to cache things like plugin extension objects, but what do you think about caching the PluginRepository that was already created with this specific configuration instance? Of course we cannot serialize this, but I guess this will improve the performance somewhat, since we do not need to scan the plugin folder every time.

Stefan
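One way to read the caching idea above is a per-configuration object cache that is excluded from serialization. The sketch below is only an assumption about how that could look: `CachingConf`, `getOrCreate` and `DemoRepository` are hypothetical names, and the "scan" is simulated by a counter rather than by reading a plugin folder.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical configuration object with a non-serialized object cache. */
class CachingConf implements Serializable {
    private final HashMap<String, String> props = new HashMap<>();

    // transient: the cached objects are skipped by Java serialization,
    // so only the plain properties would travel to another node.
    private transient Map<String, Object> objectCache;

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }

    /** Returns the cached instance for this conf, creating it on first use. */
    synchronized <T> T getOrCreate(Class<T> type, Function<CachingConf, T> factory) {
        if (objectCache == null) {
            objectCache = new HashMap<>();
        }
        Object cached = objectCache.get(type.getName());
        if (cached == null) {
            cached = factory.apply(this);
            objectCache.put(type.getName(), cached);
        }
        return type.cast(cached);
    }
}

/** Stand-in for an expensive-to-build repository; counts simulated folder scans. */
class DemoRepository {
    static int scans = 0;
    final String folders;

    DemoRepository(CachingConf conf) {
        scans++;                                  // pretend we scanned the plugin folder
        this.folders = conf.get("plugin.folders");
    }
}

class RepositoryCacheDemo {
    public static void main(String[] args) {
        CachingConf intranet = new CachingConf();
        intranet.set("plugin.folders", "plugins");
        CachingConf web = new CachingConf();
        web.set("plugin.folders", "my.plugins, plugins");

        DemoRepository first = intranet.getOrCreate(DemoRepository.class, DemoRepository::new);
        DemoRepository again = intranet.getOrCreate(DemoRepository.class, DemoRepository::new);
        DemoRepository other = web.getOrCreate(DemoRepository.class, DemoRepository::new);

        System.out.println("same instance for same conf: " + (first == again));
        System.out.println("different instance for other conf: " + (first != other));
        System.out.println("simulated scans: " + DemoRepository.scans);
    }
}
```

A deserialized copy would start with an empty cache and rebuild the repository on first use, which matches the remark that the repository itself cannot be serialized.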
Re: no static NutchConf
> (2) What I'd REALLY like to see is if NutchConf were an interface,

As mentioned, give us some time to get the first step done, and then I'm sure such community contributions are always welcome. Maybe people can work together on this.

Stefan
Re: no static NutchConf
> I have two more ideas:
> 1) create NutchConf as interface (not class)
> 2) make it work as plugin

I like the idea to make the conf as a singleton and understand the need to be able to integrate nutch.
However I would love to do one first step and later on we can make this second step. My experience is that if you change too much, people do not accept your patch. This is painful, since you invest some days of work and in the end your time was wasted. So let's add this to the jira as an improvement suggestion and do this step after the current change.

Stefan
Re: no static NutchConf
> But I like the direction and will not oppose against passing the whole NutchConf in this case.

OK, then we will pass the NutchConf in the constructor. It is a lot of work and may take some time.
Re: no static NutchConf
Hey Steve,

> Eclipse has a very good pattern for handling configuration for each of the components. Basically each component is responsible for its own configuration, and the tool just provides the framework to allow the configuration to be displayed, updated, and stored.

I know the eclipse configuration mechanism; we have a different case with nutch. The eclipse mechanism does not allow running two eclipse instances in the same JVM. Sure, for eclipse that makes no sense, but for nutch it very much does (e.g. having a search engine for different parts of a corporate intranet on one box).
The eclipse mechanism is a kind of singleton configuration where each component (eclipse plugin) loads what it is interested in. For nutch we need to pass the configuration properties down the call stack, to be able to run 2 fetchers with different configurations and to have 2 instances of the same parser plugin but with different configuration values.

Stefan
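The requirement above, two fetchers with different configurations in one JVM, is easy to see in a small sketch. Everything here is hypothetical (`SiteConf`, `DemoFetcher`, and the `demo.*` keys are invented for illustration); the only point is that the configuration is an instance passed to the constructor rather than a static singleton.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical instance-based configuration; nothing in it is static. */
class SiteConf {
    private final Map<String, String> values = new HashMap<>();

    SiteConf set(String key, String value) {
        values.put(key, value);
        return this;
    }

    String get(String key, String defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : v;
    }

    int getInt(String key, int defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}

/** Stand-in for a higher level tool that receives its conf via the constructor. */
class DemoFetcher {
    private final String agentName;
    private final int threads;

    DemoFetcher(SiteConf conf) {
        this.agentName = conf.get("demo.agent.name", "demo-bot");
        this.threads = conf.getInt("demo.fetcher.threads", 1);
    }

    String describe() {
        return agentName + " x" + threads;
    }
}

class TwoFetchersDemo {
    public static void main(String[] args) {
        SiteConf intranet = new SiteConf()
                .set("demo.agent.name", "intranet-bot")
                .set("demo.fetcher.threads", "2");
        SiteConf publicWeb = new SiteConf()
                .set("demo.agent.name", "public-bot")
                .set("demo.fetcher.threads", "10");

        // Both fetchers live in the same JVM, each with its own settings.
        System.out.println(new DemoFetcher(intranet).describe());
        System.out.println(new DemoFetcher(publicWeb).describe());
    }
}
```

With a static accessor, the second set of values would have nowhere to live; with instances, the choice of configuration is simply whichever object the caller passes down.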
Re: no static NutchConf
> Another use case for eliminating the static uses of NutchConf is to
> simplify the construction of a configuration gui. It would be nice to
> have a web-based interface which permits one to configure parameters and
> then have it run the system.

Yes, it is a really needed feature.

> This should be able to run multiple Nutch
> instances in a single JVM. For example, a single Nutch-based "search
> appliance" daemon should be able to crawl and search both your intranet
> and your public websites, each configured separately.

OK, but why not use two JVMs in such a case?

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
Re: no static NutchConf
Stefan,

I would like to help you with your project on the Nutch-based search appliance daemon. The reason is that I want to gain experience and learn. I have started playing around with Nutch. I wrote a scraper in perl and now I am trying to run one of the sample plugins too.

ilango

Stefan Groschupf wrote:
>> Another use case for eliminating the static uses of NutchConf is to
>> simplify the construction of a configuration gui. It would be nice
>> to have a web-based interface which permits one to configure
>> parameters and then have it run the system. This should be able to
>> run multiple Nutch instances in a single JVM. For example, a
>> single Nutch-based "search appliance" daemon should be able to
>> crawl and search both your intranet and your public websites, each
>> configured separately.
>
> Well this is my long term goal, I have to do that for my project in any case. :-)
>
> Stefan
Re: no static NutchConf
> Another use case for eliminating the static uses of NutchConf is to simplify the construction of a configuration gui. It would be nice to have a web-based interface which permits one to configure parameters and then have it run the system. This should be able to run multiple Nutch instances in a single JVM. For example, a single Nutch-based "search appliance" daemon should be able to crawl and search both your intranet and your public websites, each configured separately.

Well this is my long term goal, I have to do that for my project in any case. :-)

Stefan
Re: no static NutchConf
Hi Stefan,

I think these are fine things to be doing. Just two points:

(1) Why not just always pass the NutchConf to the constructor of any class that needs it, instead of distinguishing between whether the class will use 1 or 2 configuration parameters or more than that? Just for consistency. Also, it's possible that a class that CURRENTLY only uses 2 configuration parameters will use 3 or 4 at some point in the future, and it would be a shame to have to rewrite its constructor when that happens.

(2) What I'd REALLY like to see is if NutchConf were an interface, with methods that allow the retrieval of properties from any source. There could be a class NutchXmlConf which implements the NutchConf interface and works the current way (with nutch-default.xml, nutch-site.xml and so on). Where we need to create a NutchConf, we actually create a NutchXmlConf, but pass it to class constructors whose arguments are of type NutchConf. That way, if I want to use a non-standard mechanism for storing my Nutch parameters (e.g. a properties file, a relational database, the Windows Registry, whatever), I can write my own class that implements the NutchConf interface, then instantiate it and pass it around, without having to rewrite every Nutch class that uses it.

The benefits of (2) are legion, in particular for people who want to use a Nutch search engine as part of an existing web application, where that existing application uses a specific (non-XML) mechanism for storing configuration parameters. It would also give extra flexibility for people working on Nutch installations that sit in multiple environments (Development, System Test, UAT, Production etc.) and get deployed from one environment to the next.

Regards,
David.

From: Stefan Groschupf
Date: Wed, 4 Jan 2006 15:39:38 +0100
Subject: [Nutch-dev] no static NutchConf

> Hi,
>
> to move forward in the direction of having a nutch gui, I would love to start removing the static access to NutchConf. Based on experience, I would first love to get a kind of general agreement and a 'go' before wasting too much time on an unaccepted solution.
>
> I suggest:
> + removing NutchConf.get().
> + in case a lower level object uses only one or two, but not more than 3, parameters from the nutch configuration, we add these parameters to the constructor of this object (e.g. MapFile.Reader needs only the parameter INDEX_SKIP).
> + for higher level objects like the fetcher tool, which need more than 3 parameters for the lower level objects, we add an instance of NutchConf to the constructor.
> + for all dynamically used objects that implement a specific interface (interface > no control over the object constructor), we use the Configurable interface to set the NutchConf in an inversion-of-control-like style (e.g. plugin extension implementations like Parser or Protocols).
> + PluginRegistry will no longer be a singleton but will get a constructor with a NutchConf instance.
> + Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements a Configurable interface.
>
> Any comments, improvement suggestions, more use-cases?
> I would love to do this job, can I get a go from the other developers?
> From my point of view NutchConf is actually a showstopper, since a lot of people run into trouble integrating nutch in other projects; my suggestions are also required to write a nutch gui.
>
> Stefan
Re: no static NutchConf
Andrzej Bialecki wrote:
> Example: what happens now if you try to run more than one fetcher at the same time, where the fetcher parameters differ (or a set of activated plugins differs)? You can't - the local tasks on each tasktracker will use whatever local config is there.

That's true when mapred.job.tracker=local, but when things are distributed the config can vary, since each task is spawned in a separate JVM with a separate classpath. The nutch-site.xml on each node can never be overridden. For example, so long as plugin.includes is not specified in nutch-site.xml on each node, each task can override plugin.includes to use different plugins. Also note that plugin implementations can be submitted in a jar file with the job, and plugin.folders can be overridden in the job to find the new plugins. So a job jar might include a folder named "my.plugins" and set plugin.folders to "my.plugins, plugins", then alter plugin.includes to include job-specific plugins.

> What happens if you change the config on a node that submits the job? The changes won't be propagated to the tasktracker nodes, because tasktrackers use local configuration (through a singleton NutchConf.get()), instead of supplying a serialized/deserialized instance of the config from the originating node... etc.

Again, I'm not sure this is a problem. Properties which tasks should be able to override should not be specified in nutch-site.xml, but rather in mapred-default.xml. Lots of job-specific properties are currently passed this way.

Another use case for eliminating the static uses of NutchConf is to simplify the construction of a configuration gui. It would be nice to have a web-based interface which permits one to configure parameters and then have it run the system. This should be able to run multiple Nutch instances in a single JVM. For example, a single Nutch-based "search appliance" daemon should be able to crawl and search both your intranet and your public websites, each configured separately.

Doug
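The precedence Doug describes (values in the node's nutch-site.xml always win, everything else may be overridden per job) can be modelled in a few lines. This is a deliberately simplified sketch of the lookup order only, not the real loading code: `LayeredConf` and its three layers are hypothetical, and the plugin name `parse-custom` is made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model: site values beat job values, job values beat defaults. */
class LayeredConf {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, String> job = new HashMap<>();
    private final Map<String, String> site = new HashMap<>();

    void setDefault(String key, String value) { defaults.put(key, value); }

    void setJob(String key, String value) { job.put(key, value); }

    void setSite(String key, String value) { site.put(key, value); }

    String get(String key) {
        if (site.containsKey(key)) {
            return site.get(key);       // node-local site config cannot be overridden
        }
        if (job.containsKey(key)) {
            return job.get(key);        // job-specific override
        }
        return defaults.get(key);       // shipped default
    }
}

class LayeredConfDemo {
    public static void main(String[] args) {
        LayeredConf conf = new LayeredConf();
        conf.setDefault("plugin.folders", "plugins");
        conf.setDefault("plugin.includes", "protocol-http|parse-html");

        // The job ships its own plugin folder and activates an extra plugin.
        conf.setJob("plugin.folders", "my.plugins, plugins");
        conf.setJob("plugin.includes", "protocol-http|parse-html|parse-custom");
        System.out.println(conf.get("plugin.folders"));
        System.out.println(conf.get("plugin.includes"));

        // Once the node's site config pins the value, the job override is ignored.
        conf.setSite("plugin.includes", "protocol-http");
        System.out.println(conf.get("plugin.includes"));
    }
}
```

In this model, a property that tasks should be able to change must simply be absent from the site layer, which is the rule of thumb given in the message above.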
Re: no static NutchConf
Hi,

Stefan Groschupf wrote:
[...]
> Any comments, improvement suggestions, more use-cases?

I completely agree with you. I have two more ideas:

1) create NutchConf as interface (not class)
2) make it work as plugin

1) If NutchConf is an interface, the NutchConf implementation can be written with a hashmap in mind (like now) or with JMX or commons-configuration.

2) There are only 4 required configuration options (plugin.excludes, plugin.includes, plugin.folders, plugin.auto-activation) that the plugin registry needs to start up. If these options are provided by a bootstrap configuration, configuration plugins will be possible.

If help is needed, I would like to implement a JMX implementation of NutchConf (since I will need it myself ;).

Regards,

Thomas
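The two ideas above, configuration as an interface and a small bootstrap set of plugin options, might look like the following. It is a sketch under assumptions: `ConfSource`, `MapConfSource`, `PropertiesConfSource` and `Bootstrap` are hypothetical names, and only the four option names are taken from the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical interface: callers depend on this, not on a storage mechanism. */
interface ConfSource {
    String get(String key);

    default int getInt(String key, int defaultValue) {
        String v = get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}

/** Hashmap-backed implementation, similar in spirit to the current class. */
class MapConfSource implements ConfSource {
    private final Map<String, String> values = new HashMap<>();

    MapConfSource set(String key, String value) {
        values.put(key, value);
        return this;
    }

    @Override
    public String get(String key) { return values.get(key); }
}

/** Alternative implementation backed by java.util.Properties. */
class PropertiesConfSource implements ConfSource {
    private final Properties props;

    PropertiesConfSource(Properties props) { this.props = props; }

    @Override
    public String get(String key) { return props.getProperty(key); }
}

/** Checks the four options the plugin registry is said to need at start-up. */
class Bootstrap {
    static final List<String> REQUIRED = Arrays.asList(
            "plugin.excludes", "plugin.includes", "plugin.folders", "plugin.auto-activation");

    static List<String> missingKeys(ConfSource conf) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (conf.get(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }
}

class ConfSourceDemo {
    public static void main(String[] args) {
        ConfSource fromMap = new MapConfSource()
                .set("plugin.folders", "plugins")
                .set("plugin.includes", "protocol-http|parse-html");
        System.out.println("missing in map conf: " + Bootstrap.missingKeys(fromMap));

        Properties p = new Properties();
        for (String key : Bootstrap.REQUIRED) {
            p.setProperty(key, "");
        }
        System.out.println("missing in properties conf: "
                + Bootstrap.missingKeys(new PropertiesConfSource(p)));
    }
}
```

A JMX-backed or database-backed source would be one more class implementing the same interface, with no change to the code that consumes it.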
Re: no static NutchConf
+1 in general.

In fact I like the approach presented by Stefan of passing only the required parameters, instead of NutchConf, to objects that have a small number of configurable params. It makes it obvious which parameters such basic objects require to run, and as they are usually building blocks for something bigger, it makes it easier to reuse them with different params in different parts of the code. But I like the direction and will not oppose passing the whole NutchConf in this case.

Regards
Piotr
Re: no static NutchConf
Jérôme Charron wrote:
>>> Excuse me in advance, I probably missed something, but what are the use cases for having many NutchConf instances with different values?
>>
>> Running many different tasks in parallel, each using different config, inside the same JVM.
>
> Ok, I understand this Andrzej, but it is not really what I call a use case.
> It is more a feature that you describe here.
> In fact, what I mean is that I don't understand in which cases it will be useful.
> And I don't understand how a particular NutchConfig will be selected for a particular task...

Use case: executing multiple tasks on any single tasktracker node, but with drastically different configurations for each task.

Example: what happens now if you try to run more than one fetcher at the same time, where the fetcher parameters differ (or a set of activated plugins differs)? You can't - the local tasks on each tasktracker will use whatever local config is there. What happens if you change the config on a node that submits the job? The changes won't be propagated to the tasktracker nodes, because tasktrackers use local configuration (through a singleton NutchConf.get()), instead of supplying a serialized/deserialized instance of the config from the originating node... etc.

NutchConf instances will be created when you create a JobConf. Then they will have to be serialized/deserialized when job descriptors are sent by the jobtracker to tasktrackers on mapred nodes, and used locally by tasktrackers to instantiate local tasks using copies of the original NutchConf instance.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
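The serialize-then-instantiate flow described above can be illustrated with the standard `java.util.Properties` text format. This is only a sketch of the idea: `ShippableConf`, `toBytes` and `fromBytes` are hypothetical, the byte array stands in for whatever the jobtracker would actually send, and the `demo.*` key is invented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical configuration that can be copied to another node as bytes. */
class ShippableConf {
    private final Properties props = new Properties();

    void set(String key, String value) { props.setProperty(key, value); }

    String get(String key) { return props.getProperty(key); }

    /** Serializes all properties; in this sketch the result is what gets "sent". */
    byte[] toBytes() {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            props.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds an independent copy on the receiving side. */
    static ShippableConf fromBytes(byte[] data) {
        ShippableConf copy = new ShippableConf();
        try {
            copy.props.load(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copy;
    }
}

class ShipConfDemo {
    public static void main(String[] args) {
        // Submitting node: build the job-specific configuration.
        ShippableConf original = new ShippableConf();
        original.set("plugin.includes", "protocol-http|parse-html");
        original.set("demo.fetcher.threads", "10");

        byte[] wire = original.toBytes();

        // Tasktracker side: the task is instantiated from the copy,
        // not from whatever configuration happens to be local.
        ShippableConf taskConf = ShippableConf.fromBytes(wire);
        System.out.println(taskConf.get("plugin.includes"));

        // Later changes on the submitting node do not affect the running task.
        original.set("demo.fetcher.threads", "1");
        System.out.println(taskConf.get("demo.fetcher.threads"));
    }
}
```

Each task then answers the question of which configuration applies to it by construction: it uses the copy it was started with.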
Re: no static NutchConf
>> Excuse me in advance, I probably missed something, but what are the use
>> cases for having many NutchConf instances with different values?
>
> Running many different tasks in parallel, each using different config,
> inside the same JVM.

Ok, I understand this Andrzej, but it is not really what I call a use case.
It is more a feature that you describe here.
In fact, what I mean is that I don't understand in which cases it will be useful.
And I don't understand how a particular NutchConfig will be selected for a particular task...

Regards

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
RE: no static NutchConf
If you are going to be able to reconfigure a nutch component at runtime, you need to remove any configuration from the constructor and have a method that allows you to get/set the configuration for the component.

The problem with keeping the entire configuration in a single component is displaying/filtering the configuration information for the user, so that the user knows which component is being configured. Eclipse has a very good pattern for handling configuration for each of the components. Basically each component is responsible for its own configuration, and the tool just provides the framework to allow the configuration to be displayed, updated, and stored. The drawback of that approach is that you really don't have a GUI, or at least have to be able to run without one.

I think that, at the very least, removing the configuration information from the constructor is the first step. You can still have a properties object set the configuration. Then we can discuss the relative merits of displaying, changing, and storing the configuration. (Like, how a user is supposed to know what component is affected by which property.)

Thanks,

Steve Betts

-----Original Message-----
From: Stefan Groschupf
Sent: Wednesday, January 04, 2006 12:22 PM
To: nutch-dev@lucene.apache.org
Subject: Re: no static NutchConf

> I don't fully agree with this. In most such cases, you already have
> a NutchConf instance in the method or class context, so it makes
> sense to use it in the constructor. You could add these constructors
> with all parameters iterated, but I'd expect that the constructors
> using NutchConf would be used most frequently.

My idea is to be able to use low level things outside of nutch as well. It may be a philosophical question: in the case of the map file writer you pass a complete hashmap with a bunch of properties to the object, but the object only reads one int from this hashmap. I personally don't like to use a hashmap to 'transport' just one value.

So my suggestion looks like:

    new MapFile.Reader(parameterA, nutchConf.getInt("parameterKey", 0));

If I understand you correctly, you prefer:

    new MapFile.Reader(parameterA, nutchConf);
    ...
    public MapFile(...){
      this.parameter = nutchConf.getInt("parameterKey",0);
    }

As mentioned, this is more a code philosophy question and it is not important for me; my only idea was to decouple things as much as possible if we touch it anyway.

>> + Getting a Extension, require also a NutchConf that is injected
>> in case the Extension Object (e.g. a Parser) implements a
>> Configurable interface.
>
> Yes. If you remember our discussion, I'd like also to follow a
> pattern where such instances are cached inside this NutchConf
> instance, if appropriate (i.e. if they are reusable and
> multi-threaded).

I'm afraid I still do not clearly understand your idea here. As discussed, from my point of view it makes no sense to cache any objects in a nutchConf. Extension implementations like parsers in particular are multithreaded and exist as many times as we have threads. Caching would make more sense behind the scenes of the plugin registry, but it may be difficult, since you can run into trouble with resource life cycle management. PluginClass instances are already cached and work like a kind of singleton for each existing plugin registry.
I also see some trouble with this caching mechanism since NutchConf can be serialized. Actually I have no idea where serialization is used, but I guess distributed map reduce will use it heavily. So the cached objects need to be Serializable as well.

Stefan
Re: no static NutchConf
Jérôme Charron wrote:
> Excuse me in advance, I probably missed something, but what are the use cases for having many NutchConf instances with different values?

Running many different tasks in parallel, each using different config, inside the same JVM.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: no static NutchConf
> My idea is to be able using low level things outside of nutch also.
> It is may a philosophically question in case of the map file writer
> you pass a complete hashmap with a bunch of properties to the object,
> but the objects only reads one int from this hashmap. I personal
> don't like to use a hashmap to 'transport' just one value.

Yes Stefan, but passing only the NutchConf in the constructor:
1. avoids breaking compatibility if a new parameter is used in a future version of the constructor.
2. gives control of default values to the class itself instead of the calling object.

I think that we can accept the general convention that all NutchConfigurable objects must provide a constructor with a single NutchConf parameter.

Excuse me in advance, I probably missed something, but what are the use cases for having many NutchConf instances with different values?

Regards

Jérôme
Re: no static NutchConf
> I don't fully agree with this. In most such cases, you already have a NutchConf instance in the method or class context, so it makes sense to use it in the constructor. You could add these constructors with all parameters iterated, but I'd expect that the constructors using NutchConf would be used most frequently.

My idea is to be able to use low level things outside of nutch as well. It may be a philosophical question: in the case of the map file writer you pass a complete hashmap with a bunch of properties to the object, but the object only reads one int from this hashmap. I personally don't like to use a hashmap to 'transport' just one value.

So my suggestion looks like:

    new MapFile.Reader(parameterA, nutchConf.getInt("parameterKey", 0));

If I understand you correctly, you prefer:

    new MapFile.Reader(parameterA, nutchConf);
    ...
    public MapFile(...){
      this.parameter = nutchConf.getInt("parameterKey",0);
    }

As mentioned, this is more a code philosophy question and it is not important for me; my only idea was to decouple things as much as possible if we touch it anyway.

>> + Getting a Extension, require also a NutchConf that is injected in case the Extension Object (e.g. a Parser) implements a Configurable interface.
>
> Yes. If you remember our discussion, I'd like also to follow a pattern where such instances are cached inside this NutchConf instance, if appropriate (i.e. if they are reusable and multi-threaded).

I'm afraid I still do not clearly understand your idea here. As discussed, from my point of view it makes no sense to cache any objects in a nutchConf. Extension implementations like parsers in particular are multithreaded and exist as many times as we have threads. Caching would make more sense behind the scenes of the plugin registry, but it may be difficult, since you can run into trouble with resource life cycle management. PluginClass instances are already cached and work like a kind of singleton for each existing plugin registry.
I also see some trouble with this caching mechanism since NutchConf can be serialized. Actually I have no idea where serialization is used, but I guess distributed map reduce will use it heavily. So the cached objects need to be Serializable as well.

Stefan
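The two constructor styles debated above do not exclude each other. The sketch below shows both on one hypothetical class: `ReaderConf`, `SkipReader` and the key `demo.index.skip` are illustrative stand-ins, not the real MapFile.Reader or its actual property name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for NutchConf. */
class ReaderConf {
    private final Map<String, String> values = new HashMap<>();

    ReaderConf set(String key, String value) {
        values.put(key, value);
        return this;
    }

    int getInt(String key, int defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}

/** Stand-in for a low level object that needs exactly one setting. */
class SkipReader {
    private final String dir;
    private final int indexSkip;

    /** Explicit style: usable without any configuration object. */
    SkipReader(String dir, int indexSkip) {
        this.dir = dir;
        this.indexSkip = indexSkip;
    }

    /** Conf style: the class owns the key and its default value. */
    SkipReader(String dir, ReaderConf conf) {
        this(dir, conf.getInt("demo.index.skip", 0));
    }

    String describe() {
        return dir + " skip=" + indexSkip;
    }
}

class ConstructorStylesDemo {
    public static void main(String[] args) {
        ReaderConf conf = new ReaderConf().set("demo.index.skip", "4");

        // Caller extracts the value itself (decoupled from the conf class).
        SkipReader explicit = new SkipReader("segments/a", conf.getInt("demo.index.skip", 0));

        // Caller hands over the whole conf (stable signature if more keys are added later).
        SkipReader viaConf = new SkipReader("segments/a", conf);

        System.out.println(explicit.describe());
        System.out.println(viaConf.describe());
    }
}
```

Delegating the conf constructor to the explicit one keeps the default value in a single place while still letting code outside of nutch construct the object from plain values.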
Re: no static NutchConf
Stefan Groschupf wrote:
> Hi,
>
> to move forward in the direction of having a nutch gui, I would love to start removing the static access to NutchConf. Based on experience, I would first love to get a kind of general agreement and a 'go' before wasting too much time on an unaccepted solution.

I agree with the general direction. Some comments below:

> I suggest:
> + removing NutchConf.get().

I'm not sure about this... Somewhere you need to instantiate the default config, and this looks like a good place.

> + in case a lower level object uses only one or two, but not more than 3, parameters from the nutch configuration, we add these parameters to the constructor of this object (e.g. MapFile.Reader needs only the parameter INDEX_SKIP).

I don't fully agree with this. In most such cases, you already have a NutchConf instance in the method or class context, so it makes sense to use it in the constructor. You could add these constructors with all parameters iterated, but I'd expect that the constructors using NutchConf would be used most frequently.

> + for higher level objects like the fetcher tool, which need more than 3 parameters for the lower level objects, we add an instance of NutchConf to the constructor.

Ok.

> + for all dynamically used objects that implement a specific interface (interface > no control over the object constructor), we use the Configurable interface to set the NutchConf in an inversion-of-control-like style (e.g. plugin extension implementations like Parser or Protocols).

Ok.

> + PluginRegistry will no longer be a singleton but will get a constructor with a NutchConf instance.

Definitely yes.

> + Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements a Configurable interface.

Yes. If you remember our discussion, I'd like also to follow a pattern where such instances are cached inside this NutchConf instance, if appropriate (i.e. if they are reusable and multi-threaded).

> Any comments, improvement suggestions, more use-cases?
> I would love to do this job, can I get a go from the other developers?

+1 from me.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
no static NutchConf
Hi,

to move forward in the direction of having a nutch gui, I would love to start removing the static access to NutchConf. Based on experience, I would first love to get a kind of general agreement and a 'go' before wasting too much time on an unaccepted solution.

I suggest:
+ removing NutchConf.get().
+ in case a lower level object uses only one or two, but not more than 3, parameters from the nutch configuration, we add these parameters to the constructor of this object (e.g. MapFile.Reader needs only the parameter INDEX_SKIP).
+ for higher level objects like the fetcher tool, which need more than 3 parameters for the lower level objects, we add an instance of NutchConf to the constructor.
+ for all dynamically used objects that implement a specific interface (interface > no control over the object constructor), we use the Configurable interface to set the NutchConf in an inversion-of-control-like style (e.g. plugin extension implementations like Parser or Protocols).
+ PluginRegistry will no longer be a singleton but will get a constructor with a NutchConf instance.
+ Getting an Extension also requires a NutchConf, which is injected in case the Extension object (e.g. a Parser) implements a Configurable interface.

Any comments, improvement suggestions, more use-cases?
I would love to do this job, can I get a go from the other developers?
From my point of view NutchConf is actually a showstopper, since a lot of people run into trouble integrating nutch in other projects; my suggestions are also required to write a nutch gui.

Stefan