Re: [gentoo-user] Re: [lists] [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Luke Ravitch
On 2003-06-05 07:15, Anthony Ventimiglia <[EMAIL PROTECTED]> wrote:
> Bogofilter should be more *nixy since it was written by esr, but I
> don't know if it supports more than two buckets.

Not sure what you mean by buckets.  Bogofilter doesn't really care how
many mailboxes you use.  It operates on one message at a time.  Then
you filter on the X-Bogosity header.

If you mean classification buckets, Bogofilter can be set to classify
mail as spam, not spam (ham), and not sure.  That's three.
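
A minimal sketch of filtering on that header, written in Python rather
than as a procmail rule. The header tokens (Spam, Ham, Unsure, plus the
Yes/No form) and the folder names are assumptions for illustration; check
what your bogofilter version actually writes into X-Bogosity.

```python
from email import message_from_string

# Assumed mapping from the first token of X-Bogosity to a folder name.
# Both the tokens and the folder names are illustrative, not bogofilter's spec.
FOLDERS = {
    "spam": "spam",
    "yes": "spam",
    "ham": "inbox",
    "no": "inbox",
    "unsure": "unsure",
}


def folder_for(raw_message, default="inbox"):
    """Return the folder a message should go to, based on X-Bogosity."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Bogosity", "")
    # Keep only the text before the first comma, e.g. "Spam" in "Spam, tests=...".
    token = header.split(",", 1)[0].strip().lower()
    return FOLDERS.get(token, default)
```

Messages without the header, or with a token the table does not know,
fall through to the default folder, so nothing is dropped by accident.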

-- 
Luke


--
[EMAIL PROTECTED] mailing list



Re: [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Lincoln A. Baxter
If you are using qmail, consider qconfirm.  It requires you to expose
your qmail to the net, but as long as you have an always-on connection,
that is easy to do with dyndns.org.  It virtually eliminates spam, because
only senders who have explicitly responded to a confirmation request, or
whom you have explicitly placed on a whitelist, are allowed to deliver to
your inbox.

You can find an ebuild compressed tarball here:

http://bugs.gentoo.org/attachment.cgi?id=12442&action=view

Put it in your PORTDIR_OVERLAY directory and just emerge it.

Lincoln


On Thu, 2003-06-05 at 09:31, Larry Wright wrote:
> I am currently running qmail + fetchmail + procmail + spamassassin. Everything 
> works pretty well, but I'm a little disappointed in spamassassin's accuracy.
> I'd ideally like something similar to spambayes, which is trainable. 
> Unfortunately spambayes does not appear to work with maildir. I'd really like 
> to keep qmail, does anyone know of a bayesian filter that works with 
> qmail/maildir?
> 
> Thanks,
> Larry
-- 
Lincoln A. Baxter <[EMAIL PROTECTED]>





Re: [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Chris I
On 2003.06.05 09:31, Larry Wright wrote:
> I am currently running qmail + fetchmail + procmail + spamassassin.
> Everything works pretty well, but I'm a little disappointed in
> spamassassin's accuracy. I'd ideally like something similar to
> spambayes, which is trainable. Unfortunately spambayes does not appear
> to work with maildir. I'd really like to keep qmail, does anyone know
> of a bayesian filter that works with qmail/maildir?

Just to tag a "me too" on here.

I'm using SpamAssassin and rarely get any spam in my inbox. I am very
impressed with this product.

As for maildir support, the way I set up SA with procmail means SA never
needs to touch maildirs at all. I'd assume a similar setup is possible for
any spam filter.

-Chris I

I bought some used paint. It was in the shape of a house.
-- Steven Wright



Re: [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Steven Knight
On Thu, 2003-06-05 at 09:40, Ryan wrote: 
> I've been using bogofilter (http://bogofilter.sourceforge.net/) with
> procmail with great success. A few false negatives now and then, but I
> haven't seen a false positive yet. The only thing I don't like about the
> setup is that I have to go to a command line to mark the spam that slipped
> through bogofilter as spam. I wish I could just throw something in a cron
> job that would recompute its wordlists every couple of hours based on the
> contents of my spam folders and my other folders. Does anyone have a
> script that does this? I hear that spamprobe
> (http://spamprobe.sourceforge.net/) (not used it yet) has this capability
> built in. Maybe I'll try switching to see how it works.


I've been using spamprobe for months now.  It is wonderful.  Move spam
to spam mbox and spamprobe does the rest.  You can find an ebuild here:

http://bugs.gentoo.org/show_bug.cgi?id=19192

Once installed, I use the following cron entries:

spamprobe good /home/skk/mail/inbox
spamprobe spam /home/skk/mail/spam

and the procmail entry:

:0
SCORE=| /usr/bin/spamprobe receive
:0 wf
| formail -I "X-SpamProbe: $SCORE"
:0 a:
*^X-SpamProbe: SPAM
spam

I'm using mbox; I have not tried this with maildir or other mail
storage methods.

--
Steven Knight  [EMAIL PROTECTED]   IM : skkataim

and tho' We are not now that strength which in old days
Moved earth and heaven, that which we are, we are,--
One equal temper of heroic hearts,
Made weak by time and fate, but strong in will
To strive, to seek, to find, and not to yield.

-- Ulysses by Alfred Lord Tennyson





Re: [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Kurt Lieber
On Thu, Jun 05, 2003 at 08:31:48AM -0500 or thereabouts, Larry Wright wrote:
> I am currently running qmail + fetchmail + procmail + spamassassin. Everything 
> works pretty well, but I'm a little disappointed in spamassassin's accuracy.

As Jens said, the latest version of SpamAssassin supports bayesian
filtering.  I've been using it for almost 2 weeks now and, after taking the
time to train it properly, it has been remarkably accurate.

--kurt




Re: [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Jens Mayer
* On Thu, Jun 05, 2003 at 08:31:48 -0500, Larry Wright wrote:

> I am currently running qmail + fetchmail + procmail + spamassassin. Everything 
> works pretty well, but I'm a little disappointed in spamassassin's accuracy.
> I'd ideally like something similar to spambayes, which is trainable. 
> Unfortunately spambayes does not appear to work with maildir.

Try the latest SpamAssassin (2.55), which supports Maildirs. You
can train your Bayesian filter with 'sa-learn --dir --ham
Maildir/.foobar/cur' or 'sa-learn --dir --spam...'.

Read http://au.spamassassin.org/doc/sa-learn.html before 
starting to train your filters.

With SpamAssassin, new mail scoring beyond a specified threshold in
either direction (defaults: +15 and -2, IIRC) automatically trains
the Bayesian filter as spam or ham. Once enough mail has been
learned, the Bayes filter starts to work.

I'm using this setup on a Debian box, installing and updating
SpamAssassin via CPAN. 

Regards,
Jens

-- 
BOFH Excuse #420:
Feature was not beta tested




[gentoo-user] Re: [lists] [gentoo-user] Bayesian spam filtering

2003-06-06 Thread Anthony Ventimiglia
On Thursday 05 June 2003 09:31 am, Larry Wright wrote:
> I am currently running qmail + fetchmail + procmail + spamassassin.
> Everything works pretty well, but I'm a little disappointed in
> spamassassin's accuracy. I'd ideally like something similar to spambayes,
> which is trainable. Unfortunately spambayes does not appear to work with
> maildir. I'd really like to keep qmail, does anyone know of a bayesian
> filter that works with qmail/maildir?
>

Drop SpamAssassin; it's tacking Bayesian filters on in the hope of staying
alive, because Bayesian is the only way to do it.

I just installed Gentoo a couple of weeks ago. On my last box I'd been using
my own Bayesian filter, which basically worked via procmail. Mine was kind of
crude, so I installed popfile (popfile.sf.net) when I set up Gentoo. Popfile
has a really nice HTTP-based control interface, but unfortunately it does
not do things the Linux way.

Bogofilter should be more *nixy since it was written by esr, but I don't know 
if it supports more than two buckets.



Re: [gentoo-user] Bayesian spam filtering

2003-06-05 Thread MrPaulAR
Try the current spamassassin (2.54).

ACCEPT_KEYWORDS=~x86 emerge dev-perl/Mail-SpamAssassin

If you are looking for Bayesian filtering only, bogofilter may be a better
choice (I prefer spamassassin on the server & popfile on the client).

emerge net-mail/bogofilter

Paul

At 08:31 AM 06/05/2003 -0500, you wrote:
> I am currently running qmail + fetchmail + procmail + spamassassin.
> Everything works pretty well, but I'm a little disappointed in
> spamassassin's accuracy. I'd ideally like something similar to spambayes,
> which is trainable. Unfortunately spambayes does not appear to work with
> maildir. I'd really like to keep qmail, does anyone know of a bayesian
> filter that works with qmail/maildir?
>
> Thanks,
> Larry


Re: [gentoo-user] Bayesian spam filtering

2003-06-05 Thread Ryan
I've been using bogofilter (http://bogofilter.sourceforge.net/) with
procmail with great success. A few false negatives now and then, but I
haven't seen a false positive yet. The only thing I don't like about the
setup is that I have to go to a command line to mark the spam that slipped
through bogofilter as spam. I wish I could just throw something in a cron
job that would recompute its wordlists every couple of hours based on the
contents of my spam folders and my other folders. Does anyone have a
script that does this? I hear that spamprobe
(http://spamprobe.sourceforge.net/) (not used it yet) has this capability
built in. Maybe I'll try switching to see how it works.
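
One possible shape for such a cron job, as a sketch only: walk a Maildir
folder with Python's standard mailbox module and pipe each message that
has not been fed before to a training command. The function name and the
paths in the comments are hypothetical, and the bogofilter flags (-s to
register spam, -n to register non-spam) should be checked against your
version's man page.

```python
import mailbox
import subprocess


def retrain(maildir_path, command, already_fed=frozenset()):
    """Pipe each not-yet-fed message in a Maildir to `command` on stdin.

    Returns the set of Maildir keys that were fed during this run.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    fed = set()
    for key in box.iterkeys():
        if key in already_fed:
            continue  # trained on an earlier run; do not count it twice
        subprocess.run(command, input=box.get_bytes(key), check=True)
        fed.add(key)
    return fed


# Hypothetical usage from a cron-driven script (paths are made up):
#   retrain("/home/user/Maildir/.spam", ["bogofilter", "-s"])
#   retrain("/home/user/Maildir", ["bogofilter", "-n"])
```

The sketch does not persist the set of fed keys between runs; a real job
would need to save it somewhere, otherwise every run trains on the same
messages again and skews the word counts.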

Ryan
[EMAIL PROTECTED]

> I am currently running qmail + fetchmail + procmail + spamassassin.
> Everything works pretty well, but I'm a little disappointed in
> spamassassin's accuracy. I'd ideally like something similar to spambayes,
> which is trainable. Unfortunately spambayes does not appear to work with
> maildir. I'd really like to keep qmail, does anyone know of a bayesian
> filter that works with qmail/maildir?
>
> Thanks,
> Larry





Re: [gentoo-user] Bayesian spam filtering

2003-06-05 Thread wes chow

bogofilter might be what you're looking for.  I'm using it on my 
postfix+mailfilter system, but it should work fine with procmail.

Wes


On Thu, 5 Jun 2003, Larry Wright wrote:

> I am currently running qmail + fetchmail + procmail + spamassassin. Everything 
> works pretty well, but I'm a little disappointed in spamassassin's accuracy.
> I'd ideally like something similar to spambayes, which is trainable. 
> Unfortunately spambayes does not appear to work with maildir. I'd really like 
> to keep qmail, does anyone know of a bayesian filter that works with 
> qmail/maildir?
> 
> Thanks,
> Larry



[gentoo-user] Bayesian spam filtering

2003-06-05 Thread Larry Wright
I am currently running qmail + fetchmail + procmail + spamassassin. Everything
works pretty well, but I'm a little disappointed in spamassassin's accuracy.
I'd ideally like something similar to spambayes, which is trainable.
Unfortunately spambayes does not appear to work with maildir. I'd really like
to keep qmail. Does anyone know of a Bayesian filter that works with
qmail/maildir?

Thanks,
Larry





Re: [gentoo-user] Bayesian spam filtering

2003-06-05 Thread Anthony Ventimiglia
On Friday 06 June 2003 01:09 am, Luke Ravitch wrote:
> On 2003-06-05 20:45, Anthony Ventimiglia <[EMAIL PROTECTED]> wrote:
> > That's what I mean by buckets, Popfile lets you have as many as you want,
> > which I like so I can sort normal mail, work related, mailing lists,
> > spam, etc.. as many categories as I want.
>
> Well, you were right about bogofilter being "more *nixy" ;-)
>
> Normally I would use procmail rules for sorting based on mailing
> lists, work related, etc.  On the other hand, doing all that sorting
> based on Bayesian filtering sounds pretty cool.  And, for that matter,
> such a capability probably wouldn't make bogofilter any less *nixy.
> It would still be doing one thing (Bayesian filtering) and (hopefully)
> doing it well.
>

It's fairly simple to make it work with more "buckets"; the Bayesian
formula allows for any number. When I first read about Bayesian
filters, I wrote my own fairly quickly, and I actually have an almost
complete filter that allows for any number of buckets. I only recently
set up this Gentoo system, so I installed Popfile rather than install
my own library and filter. I really like the Popfile interface, so
I'll probably end up doing something similar with my own filter, once
I get back to finishing it.
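
To illustrate why the number of buckets is not a limit, here is a minimal
sketch of a naive Bayes classifier that takes any number of buckets. It is
not Popfile's or bogofilter's code, and not the filter described above;
the class name, the tokenizer, and the add-one smoothing are assumptions
chosen to keep the example short.

```python
import math
import re
from collections import Counter, defaultdict


class BucketClassifier:
    """Naive Bayes over word counts, with one model per named bucket."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> messages trained

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9$'-]+", text.lower())

    def train(self, bucket, text):
        self.word_counts[bucket].update(self.tokenize(text))
        self.message_counts[bucket] += 1

    def classify(self, text):
        if not self.message_counts:
            raise ValueError("train at least one bucket first")
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_messages = sum(self.message_counts.values())
        words = self.tokenize(text)
        scores = {}
        for bucket, seen in self.message_counts.items():
            counts = self.word_counts[bucket]
            # Add-one smoothing; the extra 1 reserves a slot for words
            # that never appeared in any training message.
            denom = sum(counts.values()) + len(vocab) + 1
            score = math.log(seen / total_messages)  # prior for this bucket
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            scores[bucket] = score
        # The bucket with the highest log-probability wins.
        return max(scores, key=scores.get)
```

Adding a bucket is just another call to train() with a new name; nothing
in the scoring loop depends on there being exactly two.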


> I guess simple things like To and From headers work well enough for
> mailing lists (and I would normally keep a separate address for work
> vs. home) that I hadn't even thought of using Bayesian filtering for
> anything but spam.  Only with spammers can I not trust the simple
> things!
>
> Spammers should [fate far worse than anything I can think of]!

Anyone who hasn't looked at Bayesian filters should. They give
excellent results with a very simple algorithm, and all you have to do
is train the filter when you start out. Using more categories makes it
learn a little slower, but usually after seeing just a handful of
messages, a Bayesian filter already predicts with about 80% accuracy.
The one I have set up now has 5 categories and is at 94% accuracy after
about 1200 messages.




Re: [gentoo-user] Bayesian spam filtering

2003-06-05 Thread Luke Ravitch
On 2003-06-05 20:45, Anthony Ventimiglia <[EMAIL PROTECTED]> wrote:
> That's what I mean by buckets, Popfile lets you have as many as you want, 
> which I like so I can sort normal mail, work related, mailing lists, spam, 
> etc.. as many categories as I want. 

Well, you were right about bogofilter being "more *nixy" ;-)

Normally I would use procmail rules for sorting based on mailing
lists, work related, etc.  On the other hand, doing all that sorting
based on Bayesian filtering sounds pretty cool.  And, for that matter,
such a capability probably wouldn't make bogofilter any less *nixy.
It would still be doing one thing (Bayesian filtering) and (hopefully)
doing it well.

I guess simple things like To and From headers work well enough for
mailing lists (and I would normally keep a separate address for work
vs. home) that I hadn't even thought of using Bayesian filtering for
anything but spam.  Only with spammers can I not trust the simple
things!

Spammers should [fate far worse than anything I can think of]!

-- 
Luke




Re: [gentoo-user] Re: [lists] [gentoo-user] Bayesian spam filtering

2003-06-05 Thread Anthony Ventimiglia
On Thursday 05 June 2003 09:43 pm, Luke Ravitch wrote:
> On 2003-06-05 07:15, Anthony Ventimiglia <[EMAIL PROTECTED]> wrote:
>
> If you mean classification buckets, Bogofilter can be set to classify
> mail as spam, not spam (ham), and not sure.  That's three.

That's what I mean by buckets. Popfile lets you have as many as you want,
which I like because I can sort normal mail, work-related mail, mailing
lists, spam, and so on into as many categories as I want.