Hey guys,

I've been playing with Nutch quite a bit lately. Here's a random grab-bag of queries / questions / problems I've encountered.
- Classloading - I have had many problems with NutchConf due to the way it loads its resources. In a J2EE scenario, it's simply evil :) Would there be any great problem with switching its classloader to Thread.currentThread().getContextClassLoader() instead of the current static classloader? It's a lot 'friendlier' to do it this way, since in a container the context classloader is normally the webapp's own loader and can see the resources packaged with the app. I can submit a patch to do this very quickly if others are keen (or anyone can do it - I've done it locally, takes about 30 keystrokes!)

- Statics - On that issue, there are an awful lot of static classes and methods around. This makes configuring and using Nutch in 'non-standard' ways difficult, as things are hard-coded together (for example, I can't easily swap out NutchConf for my own configuration mechanism, as it's all static accesses!). Is there any interest in removing / refactoring these statics to make Nutch more flexible?

- Plugins / physical files - Quite a lot of stuff in Nutch seems to rely on physical files (for example, plugins are loaded by looking for the "/plugins" directory on disk, IIRC). In a J2EE environment this means, for example, that you can't deploy the application as a non-expanded WAR. Can we switch from loading files directly to loading resources as streams? That way you can load a resource through the classloader regardless of whether or not it exists as a physical file.

More as I play more tomorrow - great work so far though, I love what I see. I know I'm using things as they're "not meant to be used", but I'm a big fan of flexible, simple systems and I think Nutch could get there with only a little work.

Any time / answers most appreciated.

Cheers,
Mike

-- ATLASSIAN - http://www.atlassian.com
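
P.S. To make the classloading and resource-stream points a bit more concrete, here is a rough sketch of the general idea. It is not the actual patch and not Nutch code: the ResourceLoading class, its method names, the in-memory loader and the "example.conf" resource are all made up for illustration. The only real APIs used are the standard JDK calls Thread.currentThread().getContextClassLoader() and ClassLoader.getResourceAsStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of Nutch).
class ResourceLoading {

    // Prefer the thread's context classloader (in a servlet container this is
    // normally the web application's loader); fall back to the system loader.
    static ClassLoader contextLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return (cl != null) ? cl : ClassLoader.getSystemClassLoader();
    }

    // Read a named resource fully as UTF-8 text, or return null if the
    // classloader cannot find it. No java.io.File is involved, so the
    // resource does not need to exist as a physical file on disk.
    static String readResource(String name) {
        try (InputStream in = contextLoader().getResourceAsStream(name)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Made-up stand-in for a container classloader: serves exactly one
    // resource from memory and nothing else.
    static ClassLoader inMemoryLoader(final String name, final String content) {
        return new ClassLoader((ClassLoader) null) {
            @Override
            public InputStream getResourceAsStream(String requested) {
                return name.equals(requested)
                        ? new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8))
                        : null;
            }
        };
    }

    public static void main(String[] args) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        // Simulate a container installing its own loader on the request thread.
        current.setContextClassLoader(inMemoryLoader("example.conf", "k=v"));
        try {
            System.out.println(readResource("example.conf"));
        } finally {
            current.setContextClassLoader(previous);
        }
    }
}
```

The point of the in-memory loader is just to show that the calling code never cares where the bytes come from: an expanded directory, a packed WAR or something else entirely would all look the same to readResource().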
