Mike Cannon-Brookes wrote:
Hey guys,

Hi, Mike!  Welcome.

- Classloading - I have had many problems with NutchConf due to the
way it loads its resources. In a J2EE scenario, it's simply evil :)
Would there be any great problem with switching its classloader to
Thread.currentThread().getContextClassLoader() instead of the current
static classloader? It's a lot 'friendlier' to do it this way. I can
submit a patch to do this very quickly if others are keen (or anyone
can do it - I've done it locally, takes about 30 keystrokes!)

That's not a problem. Please submit a patch. Attach it to a bug report (if you know how to use Jira!).
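
For reference, a minimal sketch of the kind of lookup being discussed: resolve resources through the thread's context class loader and fall back to the defining class's loader when none is set. This is not the actual NutchConf code or the proposed patch; the class name ContextResourceLoader and its methods are made up for illustration.

```java
import java.io.InputStream;

// Hypothetical helper, not part of Nutch: shows context-class-loader lookup.
public final class ContextResourceLoader {
    private ContextResourceLoader() {}

    /** Returns the thread's context class loader, or this class's own loader if none is set. */
    static ClassLoader resolveClassLoader() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return (loader != null) ? loader : ContextResourceLoader.class.getClassLoader();
    }

    /** Opens the named resource as a stream; returns null if it cannot be found. */
    static InputStream open(String name) {
        return resolveClassLoader().getResourceAsStream(name);
    }
}
```

In a J2EE container the context class loader is normally the web application's loader, so a lookup like this sees resources packaged with the application instead of only those visible to the loader that defined the library class.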

- Statics - On that issue, there are an awful lot of static classes
and methods around. This makes configuring and using Nutch in 'non
standard' ways difficult as things are hard coded together (for
example I can't easily swap out NutchConf to do my own configuration
mechanism as it's all static accesses!). Is there any interest in
removing / refactoring these statics out to make Nutch more flexible?

Yes, that's a goal. I'd like to seriously attack it after we merge the mapred branch to trunk, probably next month.

I made a proposal in this vein almost a year ago:

http://www.mail-archive.com/[email protected]/msg00196.html

Note also that mapred's JobConf is always used dynamically, so all of the new mapred-based code can be dynamically configured. The biggest thing left to fix is the plugins. I think perhaps each plugin factory method should take a configuration.
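
To make the idea concrete, here is a rough sketch of a factory method that takes its configuration as an argument instead of reading static state. Everything in it is hypothetical: UrlFilter, UrlFilterFactory and the property name urlfilter.max.length are invented for the example, and java.util.Properties stands in for whatever configuration class would actually be passed.

```java
import java.util.Properties;

public class ConfiguredFactoryExample {
    /** Hypothetical plugin interface, invented for this sketch. */
    interface UrlFilter {
        boolean accept(String url);
    }

    /** Hypothetical factory: the caller supplies the configuration explicitly. */
    static final class UrlFilterFactory {
        static UrlFilter create(Properties conf) {
            // "urlfilter.max.length" is a made-up property name with a made-up default.
            final int maxLength =
                Integer.parseInt(conf.getProperty("urlfilter.max.length", "256"));
            return url -> url.length() <= maxLength;
        }
    }

    public static void main(String[] args) {
        Properties strict = new Properties();
        strict.setProperty("urlfilter.max.length", "10");

        // Two differently configured instances can coexist in one JVM.
        UrlFilter shortOnly = UrlFilterFactory.create(strict);
        UrlFilter defaults = UrlFilterFactory.create(new Properties());

        System.out.println(shortOnly.accept("http://example.com/"));
        System.out.println(defaults.accept("http://example.com/"));
    }
}
```

Because nothing is read from a static, a caller can swap in its own configuration mechanism without touching the plugin code.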

- Plugins / physical files - Quite a lot of stuff in Nutch seems to
rely on physical files (for example plugins are loaded by looking for
the "/plugins" directory on disk IIRC). In a J2EE environment, this
means you can't deploy the WAR as a non-expanded WAR for example. Can
we switch from loading files directly to loading resources as streams?
This means you can load a file from the classloader regardless of
whether or not it exists as a physical file.

The problem is that we sometimes need to list directories, e.g., to find out what resources are available. Is there a J2EE-safe way to do that?
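
One possible workaround, offered only as a sketch and not something decided in this thread: since a class loader can open a named resource but cannot in general list a directory, the list itself could be shipped as a resource and read as a stream. The index file name and its one-name-per-line format are assumptions made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch: reads a list of names from an index resource.
public final class ResourceIndex {
    private ResourceIndex() {}

    /**
     * Reads entry names from an index resource, one per line.
     * Blank lines and lines starting with '#' are skipped.
     * Returns an empty list if the index resource does not exist.
     */
    static List<String> list(ClassLoader loader, String indexName) throws IOException {
        List<String> names = new ArrayList<>();
        InputStream in = loader.getResourceAsStream(indexName);
        if (in == null) {
            return names;
        }
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty() && !line.startsWith("#")) {
                    names.add(line);
                }
            }
        }
        return names;
    }
}
```

The trade-off is that the index has to be generated or maintained at build time, but the lookup then works the same way whether the resources sit in an expanded directory or inside an archive.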

Cheers,

Doug
