Hi Stefan,
> after a short time I already had this line 1602 times in my
> tasktracker log files:
> 060307 022707 task_m_2bu9o4 found resource parse-plugins.xml at
> file:/home/joa/nutch/conf/parse-plugins.xml
>
> It sounds like this file is loaded 1602 times (after, let's say, 3 minutes). I
> guess that wasn't the goal, or am I overlooking something?
It certainly wasn't the goal at all. After NUTCH-88, Jerome and I had the
following line in the ParserFactory.java class:
    /** List of parser plugins. */
    private static final ParsePluginList PARSE_PLUGIN_LIST =
        new ParsePluginsReader().parse();
(see revision 326889)
Looking at the revision history for the ParserFactory file, after the
application of NUTCH-169, the above changed to:
    private ParsePluginList parsePluginList;
    //... code here
    public ParserFactory(NutchConf nutchConf) {
        this.nutchConf = nutchConf;
        this.extensionPoint = nutchConf.getPluginRepository().getExtensionPoint(
            Parser.X_POINT_ID);
        this.parsePluginList = new ParsePluginsReader().parse(nutchConf);
        if (this.extensionPoint == null) {
            throw new RuntimeException(
                "x point " + Parser.X_POINT_ID + " not found.");
        }
        if (this.parsePluginList == null) {
            throw new RuntimeException(
                "Parse Plugins preferences could not be loaded.");
        }
    }
Thus, every time a ParserFactory is constructed, the parse-plugins.xml
file is read again (that is the effect of the call to
new ParsePluginsReader().parse(nutchConf)). So, if the file is loaded 1602
times, I'd guess that a ParserFactory is being constructed 1602 times.
Additionally, I'm wondering why the parsed parse-plugins.xml preferences are
no longer declared as final static?
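To make the difference concrete, here is a minimal sketch (not Nutch code) that counts reads under both designs. Everything in it is a hypothetical stand-in: ReloadCountSketch, readPluginFile(), PerInstanceFactory and StaticFactory only mimic the two field declarations shown above, and the "file read" is just a counter increment.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReloadCountSketch {

    /** Counts how often the (simulated) parse-plugins.xml read happens. */
    static final AtomicInteger READS = new AtomicInteger();

    /** Hypothetical stand-in for new ParsePluginsReader().parse(...). */
    static String readPluginFile() {
        READS.incrementAndGet();
        return "parsed parse-plugins.xml";
    }

    /** Mimics the post-NUTCH-169 shape: the list is read in the constructor. */
    static final class PerInstanceFactory {
        private final String pluginList;

        PerInstanceFactory() {
            this.pluginList = readPluginFile();
        }
    }

    /** Mimics the NUTCH-88 shape: the list is read once, at class initialization. */
    static final class StaticFactory {
        private static final String PLUGIN_LIST = readPluginFile();

        StaticFactory() {
        }
    }

    /** Runs the given construction `times` times and returns how many reads it caused. */
    static int readsFor(Runnable construct, int times) {
        int before = READS.get();
        for (int i = 0; i < times; i++) {
            construct.run();
        }
        return READS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("per-instance reads: "
            + readsFor(() -> new PerInstanceFactory(), 1602));
        System.out.println("static reads: "
            + readsFor(() -> new StaticFactory(), 1602));
    }
}
```

Run on its own, this should report 1602 reads for the per-instance variant and 1 for the static one, which matches the pattern Stefan is seeing in the tasktracker logs.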
> That could be a serious performance improvement to just load this
> file once.
Yup, I think that's the reason we made it final static in the first place.
If there is no reason not to have it final static, I would suggest putting it
back. There may be a problem, however: since NUTCH-169, the loading requires
an existing Configuration object, I believe. So, we may need a static
Configuration object as well. Thoughts?
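One possible alternative to a static Configuration object would be to cache the parsed list per configuration instance, so each configuration triggers exactly one read. The following is only a sketch under that assumption, not a patch: Conf, PluginList, Factory and loadFromDisk() are hypothetical stand-ins for NutchConf, ParsePluginList, ParserFactory and the ParsePluginsReader call, and the WeakHashMap is my choice for illustration.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ParsePluginCacheSketch {

    /** Hypothetical stand-in for the configuration object (NutchConf above). */
    static final class Conf {
    }

    /** Hypothetical stand-in for the parsed contents of parse-plugins.xml. */
    static final class PluginList {
    }

    /** Counts simulated file loads so the effect of the cache is visible. */
    static final AtomicInteger LOADS = new AtomicInteger();

    /** Hypothetical stand-in for new ParsePluginsReader().parse(conf). */
    static PluginList loadFromDisk(Conf conf) {
        LOADS.incrementAndGet();
        return new PluginList();
    }

    /** One parsed list per configuration; weak keys let unused configs be collected. */
    private static final Map<Conf, PluginList> CACHE = new WeakHashMap<>();

    /** Returns the cached list for this configuration, loading it on first use. */
    static synchronized PluginList get(Conf conf) {
        return CACHE.computeIfAbsent(conf, ParsePluginCacheSketch::loadFromDisk);
    }

    /** Mimics a factory that still takes the configuration in its constructor. */
    static final class Factory {
        private final PluginList pluginList;

        Factory(Conf conf) {
            this.pluginList = get(conf);
            if (this.pluginList == null) {
                throw new RuntimeException(
                    "Parse Plugins preferences could not be loaded.");
            }
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        for (int i = 0; i < 1602; i++) {
            new Factory(conf);
        }
        System.out.println("parse-plugins.xml loads: " + LOADS.get());
    }
}
```

With a single configuration shared by all 1602 factories, this reports one load. The trade-off compared with a plain final static field is a synchronized lookup per construction, in exchange for not needing a globally shared configuration.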
> I was not able to find the code that is logging this statement. Does
> anyone have an idea where this happens?
The statement gets logged within the ParsePluginsReader.java class, line 98:
    ppInputStream = conf.getConfResourceAsInputStream(
        conf.get(PP_FILE_PROP));
HTH,
Chris
>
> Thanks.
> Stefan
> ---------------------------------------------
> blog: http://www.find23.org
> company: http://www.media-style.com