+1
-- Sami Siren
Doug Cutting wrote:
I propose we clean up Nutch's tools as follows.
First, some definitions:
1. An "action" is an operation on Nutch data. For example, GenerateSegmentFromDB, FetchSegment, UpdateDB, IndexSegment, MergeIndexes, SearchServer, etc. are all actions.
2. A "tool" invokes an action from the command line.
The proposal:
1. Actions and tools should be separate classes, in separate files.
2. A tool class should define no methods other than a main() and perhaps those required to parse the command line. All application logic should be in the action class.
3. All actions must implement the following interface:
public interface NutchConfigurable {
  void setConf(NutchConf conf);
  NutchConf getConf();
}
4. Most actions should implement this by extending:
public class NutchConfigured implements NutchConfigurable {
  private NutchConf conf;

  public NutchConfigured(NutchConf conf) { setConf(conf); }

  public void setConf(NutchConf conf) { this.conf = conf; }

  public NutchConf getConf() { return conf; }
}
5. All plugins must implement NutchConfigurable.
6. Plugin factory methods must accept a NutchConf.
For example:
public static Protocol ProtocolFactory.getProtocol(String url);
will become:
public static Protocol ProtocolFactory.getProtocol(NutchConf conf, String url);
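To make the action/tool split concrete, here is a minimal sketch of how the pieces might fit together. It is only an illustration under assumptions: NutchConf is reduced to a stub with made-up set() and get() methods, and CountUrlsAction, CountUrlsTool, and the "count.max" property are hypothetical names, not existing Nutch code. Only NutchConfigurable and NutchConfigured follow the definitions above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for NutchConf; the real class is not shown in this
// message, so set() and get() here are assumptions for illustration only.
class NutchConf {
  private final Map<String, String> props = new HashMap<>();

  void set(String name, String value) { props.put(name, value); }

  String get(String name, String defaultValue) {
    return props.getOrDefault(name, defaultValue);
  }
}

// The interface from item 3 of the proposal.
interface NutchConfigurable {
  void setConf(NutchConf conf);
  NutchConf getConf();
}

// The base class from item 4 of the proposal.
class NutchConfigured implements NutchConfigurable {
  private NutchConf conf;

  NutchConfigured(NutchConf conf) { setConf(conf); }

  public void setConf(NutchConf conf) { this.conf = conf; }

  public NutchConf getConf() { return conf; }
}

// Hypothetical action: all application logic lives here and reads its
// settings from the NutchConf it was constructed with.
class CountUrlsAction extends NutchConfigured {
  CountUrlsAction(NutchConf conf) { super(conf); }

  // Returns the number of URLs, capped by the made-up "count.max" property.
  int run(String[] urls) {
    int max = Integer.parseInt(getConf().get("count.max", "100"));
    return Math.min(urls.length, max);
  }
}

// Hypothetical tool: nothing but main(), which builds a configuration and
// hands the command-line arguments to the action.
class CountUrlsTool {
  public static void main(String[] args) {
    NutchConf conf = new NutchConf();
    System.out.println(new CountUrlsAction(conf).run(args));
  }
}
```

Because the action never touches the command line, it can be driven from other code (a test, another action, a server) with whatever NutchConf the caller supplies. A plugin factory in the style of item 6 would pass its NutchConf argument on to the plugin it creates in the same way.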
Comments?
Doug
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
