> Which brings me to three. I think it is time to get rid of the plugin
> framework. I want to keep the functionality of the various plugins but
> I think a dependency injection framework, such as spring, creating the
> components needed for logic inside of tools is a much cleaner way to
> proceed. This would allow much better unit and mock testing of tool and
> logic functionality. It would allow Nutch to run on a non "nutchified"
> Hadoop cluster, meaning just a plain old hadoop cluster. We could have
> core jars and contrib jars and a contrib directory which is pulled from
> by shell scripts when submitting jobs to Hadoop. With the
> multiple-resources functionality in Hadoop it would be a simple matter
> of creating the correct command lines for the job to run.
>
> And that brings me to separation of data and presentation. Currently
> the Nutch website is one monolithic jsp application with plugins. I
> think the next generation should segment that out into xml / json feeds
> and a separate front end that uses those feeds. Again this would make
> it much easier to create web applications using nutch.

I have not been using Nutch as long as most people on this list (only
since the middle of last year), but I have written a handful of plugins,
and the current system seems to work well.

However, I am a strong proponent of the Unix approach to systems. I
believe a system is more flexible, usable by more people, and more
customizable when each subsystem can be run independently of the others.
As long as the "feeds" are atomic enough that other modifying or
filtering tools can be inserted between the main Nutch tools, I find
this to be the better solution in the long run.

Assuming this is the way Nutch moves forward, do we leave the current
Nutch as-is, plugins and all, and create a new project? Or do we not
worry about abandoning the current setup and change it en masse?

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia internet services