Re: Plugins initialized all the time!

2007-05-31 Thread Doğacan Güney
On 5/30/07, Doğacan Güney [EMAIL PROTECTED] wrote: On 5/30/07, Andrzej Bialecki [EMAIL PROTECTED] wrote: Doğacan Güney wrote: My patch is just a draft to see if we can create a better caching mechanism. There are definitely some rough edges there :) One important point: in future

Re: Plugins initialized all the time!

2007-05-31 Thread Nicolás Lichtmaier
Actually thinking a bit further into this, I kind of agree with you. I initially thought that the best approach would be to change PluginRepository.get(Configuration) to PluginRepository.get() where get() just creates a configuration internally and initializes itself with it. But then we

Re: Plugins initialized all the time!

2007-05-30 Thread Doğacan Güney
Hi, On 5/29/07, Nicolás Lichtmaier [EMAIL PROTECTED] wrote: Which job causes the problem? Perhaps, we can find out what keeps creating a conf object over and over. Also, I have tried what you have suggested (better caching for plugin repository) and it really seems to make a difference.
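One way to picture "better caching for plugin repository" is a cache keyed on the settings that matter rather than on the Configuration object itself. The sketch below is only an illustration under that assumption, not the patch discussed in this thread: `Conf`, `Repo`, `cacheKey`, and the two setting fields are hypothetical stand-ins, not Nutch or Hadoop classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ValueKeyedRepoCache {
    // Hypothetical stand-in for a configuration object (NOT Hadoop's Configuration).
    static final class Conf {
        final String pluginFolders;
        final String pluginIncludes;
        Conf(String pluginFolders, String pluginIncludes) {
            this.pluginFolders = pluginFolders;
            this.pluginIncludes = pluginIncludes;
        }
    }

    // Hypothetical stand-in for an expensive-to-build plugin repository.
    static final class Repo {
        final String builtFor;
        Repo(String builtFor) { this.builtFor = builtFor; }
    }

    // Cache keyed by a String derived from the settings, so two distinct
    // Conf objects with the same values map to the same entry.
    private static final Map<String, Repo> CACHE = new ConcurrentHashMap<>();

    // Assumption: these two fields are the only ones that affect the repository.
    static String cacheKey(Conf conf) {
        return conf.pluginFolders + "|" + conf.pluginIncludes;
    }

    static Repo get(Conf conf) {
        return CACHE.computeIfAbsent(cacheKey(conf), Repo::new);
    }

    static int cachedRepos() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        Repo a = get(new Conf("plugins", "parse-html"));
        Repo b = get(new Conf("plugins", "parse-html"));
        System.out.println(a == b);        // same settings, same repository
        System.out.println(cachedRepos()); // number of repositories built so far
    }
}
```

The trade-off in a design like this is that the key must cover every setting that influences plugin loading; a setting left out of the key would silently share a repository between configurations that should differ.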

Re: Plugins initialized all the time!

2007-05-30 Thread Andrzej Bialecki
Doğacan Güney wrote: My patch is just a draft to see if we can create a better caching mechanism. There are definitely some rough edges there :) One important point: in future versions of Hadoop the method Configuration.setObject() is deprecated and will then be removed, so we have to

Re: Plugins initialized all the time!

2007-05-30 Thread Doğacan Güney
On 5/30/07, Andrzej Bialecki [EMAIL PROTECTED] wrote: Doğacan Güney wrote: My patch is just a draft to see if we can create a better caching mechanism. There are definitely some rough edges there :) One important point: in future versions of Hadoop the method Configuration.setObject()

Re: Plugins initialized all the time!

2007-05-29 Thread Doğacan Güney
Hi, On 5/28/07, Nicolás Lichtmaier [EMAIL PROTECTED] wrote: I'm having big trouble with Nutch 0.9 that I didn't have with 0.8. It seems that the plugin repository initializes itself all the time until I get an out-of-memory exception. I've been looking at the code... the plugin repository maintains a

Re: Plugins initialized all the time!

2007-05-29 Thread Briggs
I have also noticed this. The code explicitly loads an instance of the plugins for every fetch (well, or parse etc., depending on what you are doing). This causes OutOfMemoryErrors. So, if you dump the heap, you can see the filter classes get loaded and they never get unloaded (they are loaded

Re: Plugins initialized all the time!

2007-05-29 Thread Doğacan Güney
On 5/29/07, Briggs [EMAIL PROTECTED] wrote: I have also noticed this. The code explicitly loads an instance of the plugins for every fetch (well, or parse etc., depending on what you are doing). This causes OutOfMemoryErrors. So, if you dump the heap, you can see the filter classes get loaded

Re: Plugins initialized all the time!

2007-05-29 Thread Nicolás Lichtmaier
I'm having big trouble with Nutch 0.9 that I didn't have with 0.8. It seems that the plugin repository initializes itself all the time until I get an out-of-memory exception. I've been looking at the code... the plugin repository maintains a map from Configuration to plugin repositories, but the

Plugins initialized all the time!

2007-05-28 Thread Nicolás Lichtmaier
I'm having big trouble with Nutch 0.9 that I didn't have with 0.8. It seems that the plugin repository initializes itself all the time until I get an out-of-memory exception. I've been looking at the code... the plugin repository maintains a map from Configuration to plugin repositories, but the

Re: Plugins initialized all the time!

2007-05-28 Thread Nicolás Lichtmaier
More info... I see map progress going from 0% to 100%. It seems to reload the plugins when it reaches 100%. Besides, I've realized that each NutchJob is a Configuration, so (as there's no equals) a plugin repo would be created for each NutchJob...
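The effect described above can be reproduced in isolation. The sketch below is a minimal illustration, not Nutch code: `Conf` and `Repo` are hypothetical stand-ins, and the only assumption carried over from the message is that the key class does not override `equals()`/`hashCode()`, so a `HashMap` falls back to object identity.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyedCacheDemo {
    // Hypothetical stand-in for a configuration class that does NOT
    // override equals()/hashCode(), so map lookups compare by identity.
    static final class Conf {
        final String pluginFolders;
        Conf(String pluginFolders) { this.pluginFolders = pluginFolders; }
    }

    // Hypothetical stand-in for an expensive-to-build plugin repository.
    static final class Repo { }

    // Each simulated job creates its own Conf with identical settings.
    // Returns how many repositories end up in the cache.
    static int reposBuiltWithPerJobConf(int jobs) {
        Map<Conf, Repo> cache = new HashMap<>();
        for (int i = 0; i < jobs; i++) {
            Conf perJobConf = new Conf("plugins");
            cache.computeIfAbsent(perJobConf, c -> new Repo());
        }
        return cache.size();
    }

    // All simulated jobs reuse one Conf instance.
    static int reposBuiltWithSharedConf(int jobs) {
        Map<Conf, Repo> cache = new HashMap<>();
        Conf shared = new Conf("plugins");
        for (int i = 0; i < jobs; i++) {
            cache.computeIfAbsent(shared, c -> new Repo());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(reposBuiltWithPerJobConf(5)); // prints 5: one repo per Conf instance
        System.out.println(reposBuiltWithSharedConf(5)); // prints 1: the cache entry is reused
    }
}
```

With per-job keys the cache never gets a hit and only grows, which is consistent with the memory growth reported in this thread.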