Stefan Groschupf wrote:
Hi,
I'm wondering why the plugins are in the job file, since it looks like
the plugins are never loaded from the job file but from the outside
(plugin folder).
Should they?
If you run your job jar on a pure Hadoop platform, there are no plugins
on local disk, so the job jar needs to carry everything it needs to run.
If you have Nutch installed everywhere on your cluster, there will be
plugins on disk and plugins in your job jar. Which gets favored should
just be a matter of the CLASSPATH when the child runs: the first plugin
found wins (going by the TaskRunner classpath, it looks like those on
disk will be found first).
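
To illustrate the "first found wins" point, here is a minimal Java sketch
of how search-path order decides which copy of a resource a class loader
returns. It does not reproduce Nutch's plugin loading or Hadoop's
TaskRunner; the class name, the 'parse-demo' plugin id and the temporary
directory layout are all made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FirstPluginWins {

    // Hypothetical descriptor path, used only for this illustration.
    static final String DESCRIPTOR = "plugins/parse-demo/plugin.xml";

    // Create a temp root containing plugins/parse-demo/plugin.xml with the given body.
    static Path makeRoot(String prefix, String body) {
        try {
            Path root = Files.createTempDirectory(prefix);
            Path descriptor = root.resolve(DESCRIPTOR);
            Files.createDirectories(descriptor.getParent());
            Files.write(descriptor, body.getBytes(StandardCharsets.UTF_8));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Look the descriptor up through a loader that searches `first`, then `second`.
    static String resolve(Path first, Path second) {
        try {
            URL[] searchPath = { first.toUri().toURL(), second.toUri().toURL() };
            // Null parent: only the two roots above are searched.
            try (URLClassLoader loader = new URLClassLoader(searchPath, null);
                 InputStream in = loader.getResourceAsStream(DESCRIPTOR)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path onDisk = makeRoot("on-disk", "disk");
        Path inJar = makeRoot("job-jar", "job-jar");
        // Same two roots, opposite order: the earlier entry supplies the descriptor.
        System.out.println(resolve(onDisk, inJar));
        System.out.println(resolve(inJar, onDisk));
    }
}
```

Both roots contain the same descriptor path, so only the order of the
search path decides which copy is read. That is the same reason the
on-disk plugins would shadow the ones in the job jar if they come first
on the child's classpath.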
In the past, I've had some trouble trying to load extra plugins and
overrides of plugins already present in the Nutch default 'plugins'
directory. At the time, naming the plugins directory in my job jar
something other than 'plugins' -- e.g. 'xtra-plugins' -- and then
listing it in the plugin.folders property, in configuration loaded into
my job jar, AHEAD of the default 'plugins' directory got me further.
Nowadays, I build a job jar that picks and chooses the plugins I need
from multiple plugin sources, aggregating them under a plugin dir in
the job jar. The resultant job jar is run on a pure Hadoop rather than
Nutch platform.
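
For what it's worth, the aggregation step can be sketched roughly as
below. This is not my actual build, just a minimal Java illustration
under assumed names: the source directories, the plugin ids and the
'staging' directory are hypothetical, and each plugin is taken from the
first source that has it. The staging directory would then be packed
into the job jar by whatever build tool you use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class PluginAggregator {

    // Copy each wanted plugin from the first source that has it into <staging>/plugins/<id>.
    static void aggregate(List<Path> sources, List<String> wanted, Path staging) {
        try {
            Path target = Files.createDirectories(staging.resolve("plugins"));
            for (String id : wanted) {
                for (Path source : sources) {
                    Path candidate = source.resolve(id);
                    if (Files.isDirectory(candidate)) {
                        copyTree(candidate, target.resolve(id));
                        break; // earlier sources take precedence
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Recursive copy; Files.walk visits a directory before its contents.
    static void copyTree(Path from, Path to) throws IOException {
        try (Stream<Path> paths = Files.walk(from)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = to.resolve(from.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    // Test fixture helper: write a one-file plugin directory.
    static void writePlugin(Path sourceDir, String id, String body) throws IOException {
        Path dir = Files.createDirectories(sourceDir.resolve(id));
        Files.write(dir.resolve("plugin.xml"), body.getBytes(StandardCharsets.UTF_8));
    }

    // Build two made-up plugin sources, aggregate, and report what landed in staging.
    static List<String> demo() {
        try {
            Path base = Files.createTempDirectory("jobjar-demo");
            Path custom = base.resolve("custom-plugins");
            Path stock = base.resolve("stock-plugins");
            writePlugin(custom, "parse-demo", "custom");
            writePlugin(stock, "parse-demo", "stock");
            writePlugin(stock, "index-demo", "stock");
            writePlugin(stock, "unused-demo", "stock");
            Path staging = base.resolve("staging");
            aggregate(Arrays.asList(custom, stock),
                      Arrays.asList("parse-demo", "index-demo"), staging);
            List<String> result = new ArrayList<>();
            try (Stream<Path> dirs = Files.list(staging.resolve("plugins"))) {
                for (Path dir : (Iterable<Path>) dirs.sorted()::iterator) {
                    byte[] body = Files.readAllBytes(dir.resolve("plugin.xml"));
                    result.add(dir.getFileName() + "=" + new String(body, StandardCharsets.UTF_8));
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because each plugin id ends up in the staging directory exactly once,
the override question is settled at build time rather than by classpath
order at run time.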
St.Ack
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers