Re: Trunk is broken

2005-12-30 Thread Thomas Jaeger
Hi Andrzej,

Gal Nitzan wrote:

>> It seems that Trunk is now broken...
>>


DmozParser seems to be broken, too. Its package declaration is still
org.apache.nutch.crawl instead of org.apache.nutch.tools.


TJ



Re: Trunk is broken

2006-01-02 Thread Thomas Jaeger
Hi Andrzej,

Gal Nitzan wrote:
> It seems that Trunk is now broken...
> 

DmozParser seems to be broken, too. Its package declaration is still
org.apache.nutch.crawl instead of org.apache.nutch.tools.


TJ


Re: no static NutchConf

2006-01-04 Thread Thomas Jaeger
Hi,

Stefan Groschupf wrote:
[...]
> Any comments, improvement suggestions, more use-cases?

I completely agree with you.

I have two more ideas:
1) create NutchConf as interface (not class)
2) make it work as plugin

1) If NutchConf is an interface, the NutchConf implementation can be
written with a hashmap in mind (like now) or with JMX or
commons-configuration.
2) There are only 4 required configuration options (plugin.excludes,
plugin.includes, plugin.folders, plugin.auto-activation) the plugin
registry needs to start up. If these options are provided by a bootstrap
configuration, configuration plugins will be possible.
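
To illustrate both ideas, here is a minimal sketch. It assumes a
hypothetical interface called Conf with two get methods; these names are
made up for illustration and are not the existing NutchConf API. MapConf
shows a hashmap-backed implementation, and Bootstrap checks that the four
options named above are present before the plugin registry would start.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; method names are illustrative, not the real NutchConf API.
interface Conf {
    String get(String name);
    String get(String name, String defaultValue);
}

// Hashmap-backed implementation. A JMX or commons-configuration
// implementation would implement the same interface.
class MapConf implements Conf {
    private final Map<String, String> values = new HashMap<>();

    MapConf(Map<String, String> initial) {
        values.putAll(initial);
    }

    @Override
    public String get(String name) {
        return values.get(name);
    }

    @Override
    public String get(String name, String defaultValue) {
        String value = values.get(name);
        return value != null ? value : defaultValue;
    }
}

// Hypothetical bootstrap check for the four options the plugin registry needs.
class Bootstrap {
    static final String[] REQUIRED = {
        "plugin.excludes", "plugin.includes",
        "plugin.folders", "plugin.auto-activation"
    };

    // Throws if any required bootstrap option is missing.
    static void requireBootstrapOptions(Conf conf) {
        for (String key : REQUIRED) {
            if (conf.get(key) == null) {
                throw new IllegalStateException("Missing bootstrap option: " + key);
            }
        }
    }
}
```

Code that only depends on Conf would not need to know which
implementation is in use.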

If help is needed, I would like to write a JMX implementation of
NutchConf (since I will need it myself ;).


Regards,

Thomas


Re: no static NutchConf

2006-01-05 Thread Thomas Jaeger
Doug Cutting wrote:
> Stefan Groschupf wrote:
> 
>>> I have two more ideas:
>>> 1) create NutchConf as interface (not class)
>>> 2) make it work as plugin
>>
>>
>> I like the idea to make the conf as a singleton and understand the 
>> need to be able to integrate nutch.
>> However I would love to do one first step and later on we can make 
>> this second step. In my experience, if you change too much, 
>> people do not accept your patch.
> 
> 
> +1
> 
> I don't see a big advantage in trying to make both of these changes at
> the same time.  And, when possible, small incremental changes are easier
> for the community to process.

I never intended to make these changes all at once. They were just some
thoughts on how to improve the nutch configuration. I agree with Stefan
on this point.


Thomas




Suggestions on plugin repository

2006-01-14 Thread Thomas Jaeger
Hi all,

Some thoughts about the plugin package:

Currently the method Extension.getExtensionInstance returns a new
instance of an extension each time it is called. If an extension should
be a singleton, it has to implement the singleton pattern itself.

My suggestion:
Add an instance cache to the Extension class. If an extension is marked
as a singleton, getExtensionInstance would always return the same instance.
Whether an extension is a singleton or not is determined by the
plugin.xml, for example:

To be compatible with the current implementation, the singleton
attribute would not be required and the default value is false.
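
A minimal sketch of the instance cache, under stated assumptions:
CachingExtension is a hypothetical stand-in for the Extension class, the
Supplier stands in for whatever the real class uses to instantiate an
extension, and the singleton flag is assumed to have been read from
plugin.xml already.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for Extension; not the actual Nutch class.
class CachingExtension<T> {
    private final Supplier<T> factory;  // creates a new extension instance
    private final boolean singleton;    // assumed to come from plugin.xml, default false
    private T cached;                   // filled lazily, only used when singleton is true

    CachingExtension(Supplier<T> factory, boolean singleton) {
        this.factory = factory;
        this.singleton = singleton;
    }

    // Synchronized so that concurrent callers cannot create two singleton instances.
    synchronized T getExtensionInstance() {
        if (!singleton) {
            return factory.get();       // current behaviour: new instance per call
        }
        if (cached == null) {
            cached = factory.get();
        }
        return cached;
    }
}
```

With singleton set to false the behaviour stays as it is today, which is
why the attribute can be optional.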

Main advantages I see:
- singleton code in extensions would disappear, making the code smaller
and more readable
- static variable access would disappear (think of concurrency and
classloader issues)
- instance control is handed over to the plugin "container"
- it's possible to have two instances of a singleton for testing purposes


What do you think?
 Thomas


Re: Suggestions on plugin repository

2006-01-15 Thread Thomas Jaeger
Hi Stefan,

Stefan Groschupf wrote:
> to realize singletons you have to use a plugin class implementation
> that hosts all resources you want to share within your extension objects.
> Extension objects are not singletons since they are many times used in
> a multithreaded environment. So just move all the fields you would have
> in a singleton to your custom plugin class implementation and verify
> that you are able to handle multithreaded resource access and you are
> done.

Ok, I'm currently learning by reading the source, so sorry for asking
dumb questions ;)
I first looked into httpclient.Http and there is a singleton pattern for
HttpClient in it. So I thought it would be a good idea to have extensions
as singletons, and I didn't look at the Plugin class (which is already
cached)...
But my next question now is: Why isn't there any class inheriting from
Plugin (as far as I can see)? And why isn't there a custom Plugin for
httpclient.Http, which hosts access to HttpClient and
MultiThreadedHttpConnectionManager?
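
To make the question concrete, here is a rough sketch of what I have in
mind. All class names are hypothetical: SharedConnectionPool is only a
stand-in for a thread-safe shared resource such as a connection manager,
and HttpPluginSketch does not extend the real Plugin class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for a thread-safe shared resource (hypothetical).
class SharedConnectionPool {
    private final AtomicInteger leased = new AtomicInteger();

    // Returns the total number of leases handed out so far.
    int lease() {
        return leased.incrementAndGet();
    }
}

// Hypothetical plugin class: one instance per plugin, owning the shared resource.
class HttpPluginSketch {
    private final SharedConnectionPool pool = new SharedConnectionPool();

    SharedConnectionPool getPool() {
        return pool;
    }
}

// Hypothetical extension: many instances may exist, all using the plugin's pool.
class HttpExtensionSketch {
    private final HttpPluginSketch plugin;

    HttpExtensionSketch(HttpPluginSketch plugin) {
        this.plugin = plugin;
    }

    int fetch() {
        return plugin.getPool().lease();
    }
}
```

Here the extension holds no static state; everything shared lives in the
plugin object.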


Thomas