Giampaolo Tomassoni wrote:
> From: Matt Kettler [mailto:[EMAIL PROTECTED]
>   
>> 1) perl has a substantial base of text parsing and utility libraries
>> that no other language can match.. Java does have native regex support,
>> so it has a leg up over the others,
>>     
>
> Right, but neither language is that well suited to scoring a message: they 
> apply all the rules to the very same piece of text.
>
> It would be interesting, instead, to "invert" this approach by designing a 
> finite state machine which is basically a pre-compiled version of the whole 
> rule body. You feed the message in once, and you get the results (i.e. fired 
> rules and/or message score).
>
> I believe that this approach would reduce memory consumption as well as 
> execution time a lot.
>
> It would not be suitable for custom plugins, however. But all the standard 
> rules (even the "expensive" ones in terms of computational power and memory 
> footprint) would probably perform better this way.
>
> The basic idea in the FSM model is that the pre-compiler would run only 
> occasionally, for instance when a rule is changed, added to or deleted from 
> the rule body. The pre-compiler could possibly even optimize the resulting 
> FSM, perhaps by "merging" together paths shared by different rules. The .cf 
> file syntax would not even need to be changed, and this method could even 
> allow for injecting a new, pre-compiled rule body version into a running 
> spamassassin.
>
> Optionally, the FSM approach could be implemented in the well-appreciated, 
> current perl by means of an external perl module.
>
> Has anybody heard or thought of something like this?
>   
Nope..
> Do you believe that an FSM would really improve SA performance?
>   
Maybe, maybe not. It could definitely lead to some cross-regex
optimizations, but I don't know that there are enough of them
that it would make a substantial (>10%) difference.
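
To make the idea concrete, here is a minimal sketch of a pre-compiled,
single-pass matcher. It is written in Python for brevity rather than perl,
and it assumes the rules are plain literal substrings, which real SA rules
(full regexes) are not. The rule names, patterns and scores are made up for
illustration, and compile_rules/scan are hypothetical helpers, not anything
in SpamAssassin. The technique is an Aho-Corasick automaton: rules that
start the same way share one path, and the message is read only once.

```python
from collections import deque

# Hypothetical literal rules: name -> (pattern, score). Not real SA rules.
RULES = {
    "FREE_MONEY": ("free money", 2.5),
    "FREE_OFFER": ("free offer", 1.5),
    "MONEY_BACK": ("money back", 1.0),
}

def compile_rules(rules):
    """Pre-compile all literal rules into one automaton (run rarely)."""
    goto = [{}]     # goto[state][char] -> next state (trie edges)
    fail = [0]      # fallback state to try when no edge matches
    out = [set()]   # names of the rules that fire on entering a state
    for name, (pattern, _score) in rules.items():
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            # Shared prefixes ("free ") reuse the same states.
            state = goto[state][ch]
        out[state].add(name)
    # Breadth-first pass to compute the fallback links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def scan(message, automaton, rules):
    """Feed the message in once; return (fired rule names, total score)."""
    goto, fail, out = automaton
    state, fired = 0, set()
    for ch in message:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        fired |= out[state]
    return fired, sum(rules[name][1] for name in fired)

automaton = compile_rules(RULES)
fired, score = scan("get free money back now", automaton, RULES)
```

Whether the same saving carries over to the real regex rule set is exactly
the open question: the merging only pays off where rules actually share
paths.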
> What's your point?
>   
I am pointless :)
> giampaolo
>
>
>   
