Hi, Chirag,

Since the Nutch URLFilter has been converted into a plugin, I am going to
take on the idea of rule-based filtering that you suggested before, perhaps
as a new urlfilter plugin.
Which commercial RETE engine did you use?
Do you know of any open source ones?

Thanks,

John

On Mon, Jan 31, 2005 at 08:03:03PM -0500, Chirag Chaman wrote:
> 
> 3. As rules grow, filtering becomes slow -- prior to using Nutch we were
> using a commercial RETE rules engine into which we had loaded the REs as
> rules. This improved speed immensely. Maybe overkill for now. Below is a
> simpler way to do this.
> 
> Here's what we're planning on building -- is this helpful? How would this
> fit in with plugins?
> 
> <GROUP> Rule Group Name
> <RULE>
>       <MATCH> RE to match </MATCH>
>       <ACTION> Discard/Substitution/GoTo </ACTION>
>       <SUBSTITUTION> Substitution </SUBSTITUTION>
>       <GOTO>RuleGroupToSendProcess</GOTO>
>       <STOP> 0 or 1 - 0 would mean keep processing more rules </STOP>
> </RULE>
> </GROUP>
> 
> Here's how this would work.
> 
> - Each file has a "Default" group, under which all rules are kept.
> - For more advanced rules, one could send control to another RuleGroup on
> match (helpful when you want specific groups of rules for a certain domain,
> extension, etc.) -- this will cut down the number of rules to look at.
> - The Stop flag either exits upon a match or keeps processing more rules in
> the same group.
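
A minimal sketch of how the rule-group processing described above might look
in Java. Everything here is hypothetical: the class name RuleGroupFilter, the
Rule fields, and the depth guard are made up for illustration and are not
Nutch API. It also assumes that GoTo hands control to the target group without
returning, which the proposal does not spell out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; these class and method names are not Nutch API.
class RuleGroupFilter {
    enum Action { DISCARD, SUBSTITUTION, GOTO }

    static final class Rule {
        final Pattern match;
        final Action action;
        final String argument; // replacement text (SUBSTITUTION) or group name (GOTO)
        final boolean stop;    // true = stop processing the group after this rule matches

        Rule(String regex, Action action, String argument, boolean stop) {
            this.match = Pattern.compile(regex);
            this.action = action;
            this.argument = argument;
            this.stop = stop;
        }
    }

    private static final int MAX_GOTO_DEPTH = 16; // assumed guard against GOTO cycles
    private final Map<String, List<Rule>> groups = new HashMap<>();

    RuleGroupFilter addRule(String group, Rule rule) {
        groups.computeIfAbsent(group, k -> new ArrayList<>()).add(rule);
        return this;
    }

    /** Returns the (possibly rewritten) URL, or null if a rule discarded it. */
    String filter(String url) {
        return apply("Default", url, 0);
    }

    private String apply(String groupName, String url, int depth) {
        if (depth > MAX_GOTO_DEPTH) {
            throw new IllegalStateException("GOTO cycle suspected at group " + groupName);
        }
        List<Rule> rules = groups.getOrDefault(groupName, Collections.<Rule>emptyList());
        for (Rule rule : rules) {
            Matcher m = rule.match.matcher(url);
            if (!m.find()) {
                continue; // rule does not apply, try the next one
            }
            switch (rule.action) {
                case DISCARD:
                    return null;
                case GOTO:
                    // Assumption: control moves to the target group and does not return.
                    return apply(rule.argument, url, depth + 1);
                case SUBSTITUTION:
                    url = m.replaceAll(rule.argument);
                    break;
            }
            if (rule.stop) {
                return url;
            }
        }
        return url;
    }

    /** Example rule set: a Default group plus one domain-specific group. */
    static RuleGroupFilter demo() {
        return new RuleGroupFilter()
            .addRule("Default", new Rule("\\.(gif|jpg)$", Action.DISCARD, null, true))
            .addRule("Default", new Rule("^http://example\\.com/", Action.GOTO, "example", true))
            .addRule("example", new Rule(";jsessionid=[^?]*", Action.SUBSTITUTION, "", false))
            .addRule("example", new Rule("/private/", Action.DISCARD, null, true));
    }

    public static void main(String[] args) {
        RuleGroupFilter f = demo();
        System.out.println(f.filter("http://example.com/a;jsessionid=123?x=1"));
        System.out.println(f.filter("http://other.org/pic.gif"));
    }
}
```

Only URLs on example.com ever reach the "example" group, so the number of
regular expressions tried per URL stays small even as domain-specific rules
accumulate.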
>  
> 
> 
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of John X
> Sent: Monday, January 31, 2005 7:53 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [Nutch-dev] make URLFilter as plugin
> 
> Hi, All,
> 
> I propose to define a plugin extension point for URLFilter, and convert the
> current RegexURLFilter.java, PrefixURLFilter.java, etc., into plugins.
> However, there is one requirement that differs from other plugin extensions:
> we should be able to specify the order in which plugins are loaded and applied.
> I have not checked, but I assume that, by default, we can always name plugins
> in alphabetical order.
> Stefan: any better way to do this?
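
To make the ordering requirement concrete, here is a rough sketch of a filter
chain that applies filters in alphabetical order of plugin id. It is only an
illustration: OrderedFilterChain, register, and the plugin ids are made-up
names, and it assumes a filter returns null to reject a URL. It does not use
the actual Nutch plugin system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not the Nutch plugin API.
class OrderedFilterChain {
    // TreeMap iterates its keys in natural (alphabetical) order,
    // so the plugin id alone determines the order of application.
    private final Map<String, UnaryOperator<String>> filters = new TreeMap<>();

    OrderedFilterChain register(String pluginId, UnaryOperator<String> filter) {
        filters.put(pluginId, filter);
        return this;
    }

    /** Applies every filter in id order; null means the URL was rejected. */
    String filter(String url) {
        for (UnaryOperator<String> f : filters.values()) {
            if (url == null) {
                return null; // an earlier filter rejected it, skip the rest
            }
            url = f.apply(url);
        }
        return url;
    }

    public static void main(String[] args) {
        List<String> applied = new ArrayList<>();
        OrderedFilterChain chain = new OrderedFilterChain()
            // Registered out of order on purpose; "urlfilter-prefix" still runs first.
            .register("urlfilter-regex", u -> { applied.add("regex"); return u.contains("?") ? null : u; })
            .register("urlfilter-prefix", u -> { applied.add("prefix"); return u.startsWith("http://") ? u : null; });
        System.out.println(chain.filter("http://example.com/index.html"));
        System.out.println(applied);
    }
}
```

Relying on names for ordering is simple, but renaming a plugin then changes
behavior; an explicit ordered list in configuration would be an alternative.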
> 
> If no one thinks this is a bad idea, I am going to start work on it right
> away.
> 
> John
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting Tool
> for open source databases. Create drag-&-drop reports. Save time by over
> 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
> Download a FREE copy at http://www.intelliview.com/go/osdn_nl
> _______________________________________________
> Nutch-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nutch-developers
> 
> 
> 
> 
__________________________________________
http://www.neasys.com - A Good Place to Be
Come to visit us today!


