RE: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread Chirag Chaman
John: Right now, if protocol-http comes across a 300-series redirect header, it gets the forwarding URL and its content. It would be nice to take this URL and have it go through the URL processing. For example, if one has www.oldjunksite.com being rejected, one could potentially create a www.newjunksite.com and ha
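
To illustrate the request above, here is a minimal sketch in which every redirect target is passed through the URL filter before it is followed. It is not the protocol-http code: the class RedirectFilterSketch, the method resolve, the in-memory redirect map, and the example hosts are all assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from protocol-http itself.
class RedirectFilterSketch {

    /** Returns the URL to accept it, or null to reject it. */
    interface URLFilter {
        String filter(String url);
    }

    /**
     * Follows redirects (simulated here by a map from URL to target) and
     * runs every URL, including each redirect target, through the filter.
     * Returns the final URL, or null if any hop is rejected or the hop
     * limit is exceeded.
     */
    static String resolve(String start, Map<String, String> redirects,
                          URLFilter filter, int maxHops) {
        String url = start;
        for (int hops = 0; hops <= maxHops; hops++) {
            url = filter.filter(url);
            if (url == null) {
                return null;              // rejected by the filter
            }
            String target = redirects.get(url);
            if (target == null) {
                return url;               // no further redirect
            }
            url = target;                 // filter the target on the next pass
        }
        return null;                      // too many redirects
    }

    public static void main(String[] args) {
        Map<String, String> redirects = new HashMap<>();
        redirects.put("http://a.example/", "http://blocked.example/");
        URLFilter filter = u -> u.contains("blocked") ? null : u;

        System.out.println(resolve("http://a.example/", redirects, filter, 5));
        System.out.println(resolve("http://a.example/page", redirects, filter, 5));
    }
}
```

With this arrangement a rejected host cannot be reached through a redirect from an accepted one, because the target is checked by the same filter as the original URL.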

Re: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread John X
Currently I would just create a bare-bones infrastructure to make existing URL-related filtering/processing pluggable, i.e., (1) define an interface URLFilter, and (2) convert RegexURLFilter.java and PrefixURLFilter.java into plugins. After that, people can write plugins with more sophistication as you suggest
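
The two steps above could look roughly like the following sketch. It assumes a single-method interface that returns the URL to accept it and null to reject it; that contract, the constructors, and the wrapper class UrlFilterSketch are assumptions for illustration, not the code that was eventually committed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: the method contract and constructors are assumed.
class UrlFilterSketch {

    /** Returns the URL (possibly rewritten) to accept it, or null to reject it. */
    interface URLFilter {
        String filter(String url);
    }

    /** Accepts only URLs that start with one of the configured prefixes. */
    static final class PrefixURLFilter implements URLFilter {
        private final List<String> prefixes;

        PrefixURLFilter(List<String> prefixes) {
            this.prefixes = prefixes;
        }

        @Override
        public String filter(String url) {
            for (String prefix : prefixes) {
                if (url.startsWith(prefix)) {
                    return url;
                }
            }
            return null;
        }
    }

    /** Rejects any URL in which the given regular expression is found. */
    static final class RegexURLFilter implements URLFilter {
        private final Pattern reject;

        RegexURLFilter(String rejectRegex) {
            this.reject = Pattern.compile(rejectRegex);
        }

        @Override
        public String filter(String url) {
            return reject.matcher(url).find() ? null : url;
        }
    }

    public static void main(String[] args) {
        URLFilter prefix = new PrefixURLFilter(Arrays.asList("http://"));
        URLFilter regex = new RegexURLFilter("\\.(gif|jpg)$");

        System.out.println(prefix.filter("ftp://example.org/"));
        System.out.println(regex.filter("http://example.org/index.html"));
    }
}
```

Because both classes share one interface, callers only need a reference to URLFilter, which is what makes the implementations swappable as plugins.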

Re: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread John X
Stefan,

On Tue, Feb 01, 2005 at 01:55:03AM +0100, Stefan Groschupf wrote:
> John,
>
> by the way, is the url filter multithreaded?
> Do you think it is possible to implement the url filter extension
> point multithreaded?

As far as I know, none of the tools that currently use URLFilter servic

RE: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread Chirag Chaman
John: This is a very good idea -- and one that we currently use as a "hack" (i.e. very slow). Here are a few things that we faced:

1. At times we need to reprocess rules. Example:
   - Run URL filter and remove URL
   - Run RegexURL filter to transform passed url to another URL

Re: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread Stefan Groschupf
John, by the way, is the url filter multithreaded? Do you think it is possible to implement the url filter extension point multithreaded? Stefan

On 01.02.2005 at 01:53, John X wrote:
> Hi, All, I propose to define plugin extension point for URLFilter, and convert current RegexURLFilter.java, Pref

Re: [Nutch-dev] make URLFilter as plugin

2005-01-31 Thread Stefan Groschupf
John, that would be a very good improvement.

> I have not checked, but I assume, by default, we can always name plugins in alphabetical order. Stefan: any better way to do this?

Maybe an attribute in the plugin XML is another way. Ordering the plugins alphabetically makes sense and is very

[Nutch-dev] make URLFilter as plugin

2005-01-31 Thread John X
Hi, All, I propose to define a plugin extension point for URLFilter, and convert the current RegexURLFilter.java, PrefixURLFilter.java, etc., into plugins. However, there is one requirement that differs from other plugin extensions: we should be able to specify the order by which plugins are loaded and app
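
The ordering requirement can be sketched as follows, assuming filters are registered by name, an explicitly configured order takes precedence, and any remaining filters fall back to alphabetical order. The class OrderedFilterSketch, the methods resolveOrder and applyAll, and the filter names used below are hypothetical, not part of the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and the ordering policy are assumptions.
class OrderedFilterSketch {

    /** Returns the URL (possibly rewritten) to accept it, or null to reject it. */
    interface URLFilter {
        String filter(String url);
    }

    /** Configured names come first; all other names follow alphabetically. */
    static List<String> resolveOrder(Collection<String> names, List<String> configured) {
        List<String> ordered = new ArrayList<>();
        for (String name : configured) {
            if (names.contains(name) && !ordered.contains(name)) {
                ordered.add(name);
            }
        }
        List<String> rest = new ArrayList<>(names);
        rest.removeAll(ordered);
        Collections.sort(rest);
        ordered.addAll(rest);
        return ordered;
    }

    /** Applies the filters in the given order, stopping at the first rejection. */
    static String applyAll(Map<String, URLFilter> filters, List<String> order, String url) {
        for (String name : order) {
            if (url == null) {
                return null;
            }
            url = filters.get(name).filter(url);
        }
        return url;
    }

    public static void main(String[] args) {
        Map<String, URLFilter> filters = new HashMap<>();
        filters.put("rewrite", u -> u.replace("old", "new"));
        filters.put("block", u -> u.contains("new") ? null : u);

        String url = "http://old.example/";
        System.out.println(applyAll(filters, Arrays.asList("rewrite", "block"), url));
        System.out.println(applyAll(filters, Arrays.asList("block", "rewrite"), url));
    }
}
```

The two calls in main give different results for the same URL, which is why the order of application has to be controllable when one filter rewrites URLs and another rejects them.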