On Tue, Feb 01, 2005 at 10:38:06AM +0100, Andrzej Bialecki wrote:
> John X wrote:
> >Stefan, 
> >
> >On Tue, Feb 01, 2005 at 01:55:03AM +0100, Stefan Groschupf wrote:
> >
> >>John,
> >>
> >>by the way, is the URL filter multithreaded?
> >>Do you think it is possible to make the URL filter extension 
> >>point multithreaded?
> >
> >
> >As far as I know, none of the tools that currently use the URLFilter
> >service is multithreaded (WebDBInjector.java, UpdateDatabaseTool.java,
> >etc.), though it would be nice to make sure URLFilter plugins are
> >thread-safe.
> 
> I was involved in implementing a Nutch-based "multi-crawler". We 
> wanted to run several intranet crawls inside a single JVM, each crawl 
> with its own set of parameters, filters and configuration. This proved 
> to be rather difficult to implement, because in many places Nutch 
> assumes there is only one processing task (i.e. one or more threads 
> running e.g. updatedb, generate, or fetch) per JVM.
> 
> There is no concept of a processing context that would tie together 
> plugins, filters, configuration parameters etc. Instead, these are 
> currently reached through static methods on a couple of classes, the 
> worst example being the use of LOG.severe to terminate processing.
> 
> An alternative would be to pass instances of "NutchContext" to all 
> processing tasks, so that they could read the necessary parameters, or 
> even retrieve instances of plugins, filters etc. Such a context could 
> also provide a data container to pass messages (like LOG.severe) to 
> other parts of the processing chain.

Do you have template code?

John
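
For readers following the thread, the "NutchContext" idea quoted above
might look roughly like the sketch below. This is only an illustration
under stated assumptions, not code from Nutch or from either poster: the
class layout, the nested UrlFilter interface, and the method names get,
filterUrl, reportSevere and hasSevere are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical per-crawl context; an illustration only, not part of Nutch. */
final class NutchContext {

    /** Stand-in for a URL filter plugin: return the URL to keep it, null to drop it. */
    interface UrlFilter {
        String filter(String url);
    }

    // Each context owns its own parameters and filters, so nothing is static.
    private final Map<String, String> params;
    private final List<UrlFilter> filters;

    // Messages raised by one stage and read by another, instead of a
    // static logger call deciding the fate of every task in the JVM.
    private final List<String> severeMessages = new CopyOnWriteArrayList<>();

    NutchContext(Map<String, String> params, List<UrlFilter> filters) {
        this.params = new ConcurrentHashMap<>(params);
        this.filters = Collections.unmodifiableList(new ArrayList<>(filters));
    }

    /** Reads a configuration parameter of this crawl, or the default if unset. */
    String get(String key, String defaultValue) {
        return params.getOrDefault(key, defaultValue);
    }

    /** Runs the URL through every filter of this context; null means rejected. */
    String filterUrl(String url) {
        String current = url;
        for (UrlFilter f : filters) {
            current = f.filter(current);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** Records a severe problem for this crawl only. */
    void reportSevere(String message) {
        severeMessages.add(message);
    }

    /** True if any stage of this crawl has reported a severe problem. */
    boolean hasSevere() {
        return !severeMessages.isEmpty();
    }

    /** Snapshot of the severe messages reported so far. */
    List<String> severeMessages() {
        return new ArrayList<>(severeMessages);
    }
}
```

Each crawl would build its own NutchContext and hand it to its tasks, so
two crawls in one JVM never share filters or parameters, and a severe
message in one crawl does not stop the other. Whether the filters
themselves are thread-safe remains up to each plugin.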


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers