Justin Mason wrote:
Fidelis Assis writes:
Cool! Both OSBF and Winnow use OSB for tokenization, but other than that
they're quite different filters :). For more details on OSBF, please check
http://osbf-lua.luaforge.net/papers/OSBF-Lua_VBFeb07.pdf and
http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf.
I hadn't seen the VB article, it's a good overview, very helpful.
so OSBF-Lua isn't using Winnow anymore, right?
Right, and in fact it never did. OSBF-Lua is a Bayesian filter with some
techniques for improved accuracy (EDDC, TONE-HR), while Winnow is
based on Littlestone's Winnow algorithm.
I was puzzling about how
EDDC fit into Winnow for a while, so this is good to know... ;)
EDDC is specific to OSBF, present in both OSBF-Lua and OSBF-CRM114.
TONE-HR is only available in OSBF-Lua.
If you need any help with pluginizing OSBF-Lua, I'll be glad to assist :).
thanks, but I'm afraid I'm going to try to avoid that, and do a pure-perl
reimplementation instead.
Performance might be a problem, but I'm not skilled enough in Perl programming
to say for sure. Perhaps a C module would be the way to go: OSBF-Perl, like OSBF-Lua.
We already have enough of a memory footprint in
SpamAssassin, without adding another language's runtime as well!
OK, I see. Anyway, I've heard of Inline::Lua, Lua in Perl :). It might be an
option for quick testing.
--
Fidelis Assis