Re: Bayes db size....
----- Original Message -----
From: "Dave Koontz" <[EMAIL PROTECTED]>
To: "'spam mailling list'"
Sent: Saturday, February 17, 2007 9:30 AM
Subject: Re: Bayes db size

> Is there a consensus on this need? I deal with the seen db issue by
> scheduled deletion of that file. That said, with SA becoming more and
> more prominent all the time, I suspect the Average Joe will miss this
> oddity until they wind up with a sluggish system, out of drive space or
> other related issues.
>
> I was mostly curious of the logic on NOT doing maintenance on the Seen
> and AWL db files. If there is a consensus this needs to occur, then
> perhaps I can take the time to create a proper patch. I just want to
> make sure I am not missing something fundamental here
>
> Michael Parker wrote:
>> Dave Koontz wrote:

I use the SQL interface and expire bayes_seen like this. I believe 6 months to be overly conservative. I added a lastupdate column as a timestamp. With the Perl DBM backend, I would recommend a similar technique, updating the timestamp from Perl. It converts nicely to SQL.

Here is my query for cleaning bayes_seen:

    mysql -u$USER -p$PW -h$SERVER -e \
      "DELETE FROM bayes_seen WHERE lastupdate <= DATE_SUB(SYSDATE(), INTERVAL 6 MONTH);" \
      $DB

Hope this helps,
Ken
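For readers who want to try the timestamp-based expiry without a MySQL server, here is a minimal sketch of the same idea using Python's built-in sqlite3 module. It is an illustration only, not SpamAssassin's own code: the three-column table is a simplified, hypothetical stand-in for bayes_seen, the lastupdate column is the addition Ken describes (it is not part of the stock schema), the function name expire_seen is made up, and 180 days is used as an approximation of "6 months".

```python
import sqlite3
from datetime import datetime, timedelta


def expire_seen(conn, now, max_age_days=180):
    """Delete rows whose lastupdate is at or before (now - max_age_days).

    Assumes a hypothetical bayes_seen table with a lastupdate column that
    stores 'YYYY-MM-DD HH:MM:SS' text, which sorts correctly as a string.
    Returns the number of rows deleted.
    """
    cutoff = (now - timedelta(days=max_age_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute("DELETE FROM bayes_seen WHERE lastupdate <= ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    """Build an in-memory table with one stale and one recent row, then expire."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bayes_seen (msgid TEXT PRIMARY KEY, flag TEXT, lastupdate TEXT)"
    )
    rows = [
        ("old@example.com", "s", "2006-06-01 00:00:00"),     # older than the cutoff
        ("recent@example.com", "h", "2007-01-15 00:00:00"),  # newer than the cutoff
    ]
    conn.executemany("INSERT INTO bayes_seen VALUES (?, ?, ?)", rows)
    now = datetime(2007, 2, 17, 9, 30, 0)  # fixed "now" so the demo is repeatable
    deleted = expire_seen(conn, now)
    remaining = [r[0] for r in conn.execute("SELECT msgid FROM bayes_seen ORDER BY msgid")]
    return deleted, remaining
```

Passing the current time in as an argument, rather than calling the clock inside the function, keeps the cutoff easy to test. With the rows above, only the stale entry is removed.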
Re: Bayes db size....
Is there a consensus on this need? I deal with the seen db issue by scheduled deletion of that file. That said, with SA becoming more and more prominent all the time, I suspect the Average Joe will miss this oddity until they wind up with a sluggish system, out of drive space, or other related issues.

I was mostly curious about the logic of NOT doing maintenance on the Seen and AWL db files. If there is a consensus that this needs to occur, then perhaps I can take the time to create a proper patch. I just want to make sure I am not missing something fundamental here.

Michael Parker wrote:
> Dave Koontz wrote:
>
>> I am sure this has been asked numerous times before, but what is the logic
>> in having auto expiry on the bayes DB, and not seen? Seems that once tokens
>> have been removed from the DB there is little to no use for 'unlearning' any
>> associated messages. Besides on a busy system, this seen file gets large
>> very fast. I'd vote for auto expiry and maintenance on seen as well as AWL.
>
> Patches welcome.
>
> Michael
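A minimal sketch of the "scheduled deletion" approach mentioned above, written as a small Python helper that a cron job could call. Everything specific here is an assumption rather than something stated in the thread: the function name, the idea of deleting only above a size threshold (Dave describes a plain scheduled delete), and the path you would pass in. The location of the seen file depends on your bayes_path setting and storage backend.

```python
import os


def remove_seen_if_large(path, max_bytes):
    """Delete the file at `path` if it exists and is larger than max_bytes.

    `path` is whatever your installation uses for the Bayes seen file
    (an assumption you must check locally). Returns True if the file was
    removed, False if it was missing or still under the threshold.
    """
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return False  # nothing to clean up
    if size <= max_bytes:
        return False  # still small enough, leave it alone
    os.remove(path)
    return True
```

As noted elsewhere in the thread, deleting the seen file means SpamAssassin no longer remembers which messages it has already learned, so weigh that before scheduling this.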
Re: Bayes db size....
Dave Koontz wrote:
> I am sure this has been asked numerous times before, but what is the logic
> in having auto expiry on the bayes DB, and not seen? Seems that once tokens
> have been removed from the DB there is little to no use for 'unlearning' any
> associated messages. Besides on a busy system, this seen file gets large
> very fast. I'd vote for auto expiry and maintenance on seen as well as AWL.

Patches welcome.

Michael

> -----Original Message-----
> From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 16, 2007 7:19 PM
> To: spam mailling list
> Subject: Re: Bayes db size
>
> On Fri, Feb 16, 2007 at 06:17:36PM -0600, Robert Nicholson wrote:
>> So you're saying that right now seen isn't capped like tokens right?
>
> seen has no max size nor expiry features.
>
> --
> Randomly Selected Tagline:
> "Like any French restaurant in America, it was overpriced, noisy, moody,
> and would put you in mortal danger if you had an accident with anything
> larger than a croissant." - Unknown about the Renault LeCar
RE: Bayes db size....
I am sure this has been asked numerous times before, but what is the logic in having auto expiry on the bayes DB, and not seen? It seems that once tokens have been removed from the DB, there is little to no use for 'unlearning' any associated messages. Besides, on a busy system, this seen file gets large very fast. I'd vote for auto expiry and maintenance on seen as well as AWL.

-----Original Message-----
From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
Sent: Friday, February 16, 2007 7:19 PM
To: spam mailling list
Subject: Re: Bayes db size

On Fri, Feb 16, 2007 at 06:17:36PM -0600, Robert Nicholson wrote:
> So you're saying that right now seen isn't capped like tokens right?

seen has no max size nor expiry features.
Re: Bayes db size....
On Fri, Feb 16, 2007 at 06:45:51PM -0600, Robert Nicholson wrote:
> Well then I only care about tokens and not repeated emails can I
> disable seen?

You can't disable it, but you can delete it, as previously stated.

--
Randomly Selected Tagline:
54% of all statistics are made up. No, make that 82%...
Re: Bayes db size....
Well then, I only care about tokens and not repeated emails. Can I disable seen?

On Feb 16, 2007, at 6:19 PM, Theo Van Dinter wrote:

> On Fri, Feb 16, 2007 at 06:17:36PM -0600, Robert Nicholson wrote:
>> So you're saying that right now seen isn't capped like tokens right?
>
> seen has no max size nor expiry features.
Re: Bayes db size....
On Fri, Feb 16, 2007 at 06:17:36PM -0600, Robert Nicholson wrote:
> So you're saying that right now seen isn't capped like tokens right?

seen has no max size nor expiry features.
Re: Bayes db size....
So you're saying that right now seen isn't capped like tokens right?

On Feb 16, 2007, at 5:45 PM, Theo Van Dinter wrote:

> On Fri, Feb 16, 2007 at 05:42:13PM -0600, Robert Nicholson wrote:
>> Why then is my Bayes DB 20MEG in size right now if
>> =item bayes_expiry_max_db_size (default: 15)
>
> That's in number of tokens, not physical size in bytes.
>
>> 100,000 tokens, whichever has a larger value. 150,000 tokens is roughly
>> equivalent to a 8Mb database file.
>
> That's an estimate, but depends on your platforms, libraries, etc.
>
>> How do I control the size of the _seen file?
>
> You can delete it if you want to. You'll be able to relearn messages
> again, but that may not be an issue for you.
>
> --
> Randomly Selected Tagline:
> "Truly unencumbered by the engineering process."
> - Unknown about the Renault Dauphine
Re: Bayes db size....
On Fri, Feb 16, 2007 at 05:42:13PM -0600, Robert Nicholson wrote:
> Why then is my Bayes DB 20MEG in size right now if
> =item bayes_expiry_max_db_size (default: 15)

That's in number of tokens, not physical size in bytes.

> 100,000 tokens, whichever has a larger value. 150,000 tokens is roughly
> equivalent to a 8Mb database file.

That's an estimate, but depends on your platforms, libraries, etc.

> How do I control the size of the _seen file?

You can delete it if you want to. You'll be able to relearn messages again, but that may not be an issue for you.
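To make the tokens-versus-bytes point concrete, the quoted documentation figure (150,000 tokens is roughly an 8Mb file) can be turned into a rough per-token estimate. The sketch below is arithmetic only. It assumes "8Mb" means 8 MiB, the constant and function names are made up for illustration, and, as Theo says, the real ratio depends on platform and libraries. It covers the token file only, not the seen file, which the thread notes has no size cap.

```python
# Reference point quoted from the documentation excerpt in this thread.
REF_TOKENS = 150_000
REF_BYTES = 8 * 1024 * 1024  # assumption: "8Mb" read as 8 MiB


def bytes_per_token():
    """Rough bytes of token-file space per token implied by the reference point."""
    return REF_BYTES / REF_TOKENS  # about 56 bytes per token


def estimate_toks_mib(tokens):
    """Estimate token-file size in MiB for a given token count (linear scaling)."""
    return tokens * REF_BYTES / REF_TOKENS / (1024 * 1024)
```

By this estimate, a token file held at the quoted 150,000-token limit would sit near 8 MiB, so a Bayes directory much larger than that probably owes its size to something other than tokens.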