RE: Very big auto-whitelist file

2006-09-01 Thread Stéphane LEPREVOST

Well, here is some more information:

Output of sa-learn --dump magic -D:

[22420] dbg: bayes: found bayes db version 3
[22420] dbg: bayes: DB journal sync: last sync: 1157102359
[22420] dbg: config: score set 3 chosen.
0.000  0  3  0  non-token data: bayes db version
0.000  0 1189366  0  non-token data: nspam
0.000  0 197582  0  non-token data: nham
0.000  0 387408  0  non-token data: ntokens
0.000  0 1157049872  0  non-token data: oldest atime
0.000  0 1157102360  0  non-token data: newest atime
0.000  0 1157102359  0  non-token data: last journal sync atime
0.000  0 1157093142  0  non-token data: last expiry atime
0.000  0  43200  0  non-token data: last expire atime delta
0.000  0 295143  0  non-token data: last expire reduction count
[22420] dbg: bayes: untie-ing
[22420] dbg: bayes: untie-ing db_toks
[22420] dbg: bayes: untie-ing db_seen

If I read this correctly, there are 387408 tokens in the DB, even though
bayes_expiry_max_db_size isn't specified anywhere and the default value is
150000 (??)

Should I run sa-learn --force-expire?
Is it supposed to work?
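
As a rough illustration of the comparison above, here is a minimal Python
sketch that pulls the ntokens row out of sa-learn --dump magic output and
compares it with bayes_expiry_max_db_size. The helper name parse_magic and the
constant are made up for this example, 150000 is the documented default as far
as I know, and the sample text is an excerpt of the dump posted here. It is a
sketch under those assumptions, not a description of how SpamAssassin itself
decides when to expire.

```python
import re

# Assumption: documented default of bayes_expiry_max_db_size, not overridden locally.
DEFAULT_MAX_DB_SIZE = 150000

# One "non-token data" row: score, count, value, atime, then the label.
ROW = re.compile(r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$")


def parse_magic(dump_text):
    """Return {label: value} for the non-token rows of `sa-learn --dump magic`."""
    stats = {}
    for line in dump_text.splitlines():
        match = ROW.match(line)
        if match:  # debug lines such as "[22420] dbg: ..." are skipped
            stats[match.group(2)] = int(match.group(1))
    return stats


# Excerpt of the output posted in this message.
sample = """\
[22420] dbg: bayes: found bayes db version 3
0.000  0  3  0  non-token data: bayes db version
0.000  0 387408  0  non-token data: ntokens
0.000  0 1157093142  0  non-token data: last expiry atime
"""

stats = parse_magic(sample)
over = stats["ntokens"] - DEFAULT_MAX_DB_SIZE
print(f"ntokens={stats['ntokens']}, over the assumed limit by {over}")
```

In practice the text would come from running sa-learn --dump magic and
capturing its standard output, with any wrapped lines rejoined first.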

Stephane





RE: Very big auto-whitelist file

2006-09-01 Thread Stéphane LEPREVOST

One more question along the same lines: my bayes_seen file is quite huge too
(about 160 MB).

Googling around, I saw there were some bugs in versions prior to 3.1, but even
though I'm using version 3.1.1 (a bit late on upgrading too, I'm afraid :-\ ),
I think there's something wrong here too... Is there a way to fix it or to
trim the file?

Stephane





Re: Very big auto-whitelist file

2006-08-31 Thread Kris Deugau

Stéphane LEPREVOST wrote:
> As you noticed, I got worried very, very, very late... But in fact I wasn't in
> charge of spamassassin when we first saw this growth, that's why I'm back on
> the problem only now... I guess I'll pay more attention to this now ;D


  It became a problem for me with a 10G hard drive in the server 
supporting ~250-300 accounts with 20M "not-the-INBOX" quotas.


My *personal* server, where I've long had much more disk, far fewer 
accounts, and no quotas, has been less of a concern - but even there the 
AWL file has sort of levelled off at ~10M (still on SA2.64).


-kgd


RE: Very big auto-whitelist file

2006-08-31 Thread Stéphane LEPREVOST

Thanks Kris for this useful tool, I'll try it tomorrow (and thanks to Roger
too, who pointed out that your tool exists).

As you noticed, I got worried very, very, very late... But in fact I wasn't in
charge of spamassassin when we first saw this growth, which is why I'm only
coming back to the problem now... I guess I'll pay more attention to this from
now on ;D

Stephane





RE: Very big auto-whitelist file

2006-08-31 Thread Stéphane LEPREVOST
 
Thanks Logan, it was a good idea to check with du -k:

696046  auto-whitelist

It looks like only about half of the file is actually in use, in fact...
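
For reference, the arithmetic behind "about half" can be sketched in a few
lines of Python. It assumes du -k reports 1024-byte units and uses the byte
size from the ls listing posted earlier in the thread; the function name
allocated_fraction is made up for the example.

```python
def allocated_fraction(apparent_bytes, du_kib):
    """Fraction of the file's apparent size that is actually allocated on disk."""
    return (du_kib * 1024) / apparent_bytes


# 1241124864 bytes from ls -l, 696046 KiB from du -k (both posted in this thread).
frac = allocated_fraction(1241124864, 696046)
print(f"{frac:.1%} of the apparent size is allocated")
```

The remainder is what the file claims in its length but does not occupy on
disk, which is consistent with a partly sparse file.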

Regarding the volume, I have about 4 messages per day including spam, and
if I remember correctly, I think this file has never been cleared...

Stephane





Re: Very big auto-whitelist file

2006-08-31 Thread Kris Deugau

Roger Taranto wrote:

> There's an additional tool to run after you run check_whitelist.  It's
> called trim_whitelist, and it compacts the db file.  I can't remember
> where I found it, but you should be able to google for it.  It should
> reduce the size of your db file quite a bit.


That would be the ancient creaky tool I wrote ~2 years ago.  Make
sure to read the notes and caveats regarding DB_File/AnyDBM_File.


Google seems to have lost, or *very* heavily downrated, the direct link 
to the space I posted it (and a few other tools) to, so:


http://www.deepnet.cx/~kdeugau/spamtools/

And I wrote it because of this exact problem of AWL files growing 
indefinitely...  although I got worried around 5M instead of 1.2G.  ;)


-kgd


Re: Very big auto-whitelist file

2006-08-31 Thread Roger Taranto
On Thu, 2006-08-31 at 09:00, Stéphane LEPREVOST wrote:
> A little question about AWL: I have an auto-whitelist file that looks VERY
> HUGE to me:
> -rw-------    1 root root 1241124864 Aug 31 17:51 auto-whitelist
>  
> Do you think a 1.2 GB AWL file is NORMAL?
>  
> I don't think so, and I plan to use the check_whitelist tool to clean it,
> something like:
> check_whitelist --clean --min 2
>  
> Does that look right to you? I'm a bit afraid it might be a very long
> process because of its size...
>  
> Any advice or information from someone who has experienced this is welcome

There's an additional tool to run after you run check_whitelist.  It's
called trim_whitelist, and it compacts the db file.  I can't remember
where I found it, but you should be able to google for it.  It should
reduce the size of your db file quite a bit.
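
To illustrate why a separate compaction step can help, here is a small Python
sketch of the general idea: DBM-style files often do not shrink when entries
are deleted, so live entries are copied into a fresh file. This is only an
illustration using Python's dbm.dumb module with made-up keys and a made-up
helper name (compact_dbm). It is not how trim_whitelist is implemented, and it
does not read a real auto-whitelist file, which uses DB_File/AnyDBM_File.

```python
import dbm.dumb
import os
import tempfile


def compact_dbm(src_path, dst_path):
    """Copy every live key/value pair from src into a freshly created dst database."""
    with dbm.dumb.open(src_path, "r") as src, dbm.dumb.open(dst_path, "n") as dst:
        for key in src.keys():
            dst[key] = src[key]
        return len(dst)


# Build a throwaway database, then delete most entries (as a cleaning pass would).
workdir = tempfile.mkdtemp()
old = os.path.join(workdir, "awl_old")
new = os.path.join(workdir, "awl_new")
with dbm.dumb.open(old, "n") as db:
    for i in range(200):
        db[f"addr{i}"] = b"x" * 64   # hypothetical entries, not the real AWL format
    for i in range(150):
        del db[f"addr{i}"]           # deleting does not shrink the data file

kept = compact_dbm(old, new)
old_size = os.path.getsize(old + ".dat")
new_size = os.path.getsize(new + ".dat")
print(f"kept {kept} entries: {old_size} bytes before, {new_size} bytes after")
```

As with the real tool, it would be wise to work on a copy and keep the
original file until the compacted one has been checked.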

-Roger


Re: Very big auto-whitelist file

2006-08-31 Thread Logan Shaw

On Thu, 31 Aug 2006, Stéphane LEPREVOST wrote:
> A little question about AWL: I have an auto-whitelist file that looks VERY HUGE
> to me:
> -rw-------    1 root root 1241124864 Aug 31 17:51 auto-whitelist
>
> Do you think a 1.2 GB AWL file is NORMAL?


You might try typing "du -k auto-whitelist".  It could be a
sparse file, and the amount of disk it's actually using isn't
as large as what you think.
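
The same check can be scripted. The sketch below, in Python, compares a file's
apparent size with the space actually allocated to it, which is roughly what
comparing ls -l with du -k does. It assumes a POSIX system where st_blocks is
counted in 512-byte units; the function name disk_usage_summary is made up,
and the "maybe sparse" flag is only a heuristic, since some filesystems
compress or inline data.

```python
import os
import tempfile


def disk_usage_summary(path):
    """Return (apparent_bytes, allocated_bytes, maybe_sparse) for a file."""
    st = os.stat(path)
    allocated = st.st_blocks * 512   # st_blocks counts 512-byte units on POSIX
    return st.st_size, allocated, allocated < st.st_size


# Demonstrate on a small temporary file rather than a real auto-whitelist.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 8192)
    demo_path = tmp.name

apparent, allocated, maybe_sparse = disk_usage_summary(demo_path)
print(f"apparent={apparent} allocated={allocated} maybe_sparse={maybe_sparse}")
os.remove(demo_path)
```

Pointing disk_usage_summary at the auto-whitelist path instead of the
temporary file gives the two numbers to compare.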

It does seem a little large, but it's hard to tell.  Mine is
this size:

-rw-------   1 root root 5234688 2006-08-31 12:04 auto-whitelist

but then, I have a fairly low-volume site (less than 1000
messages a day, including spam) with not all that many users.

  - Logan

Very big auto-whitelist file

2006-08-31 Thread Stéphane LEPREVOST



A little question about AWL: I have an auto-whitelist file that looks VERY
HUGE to me:
-rw-------    1 root root 1241124864 Aug 31 17:51 auto-whitelist

Do you think a 1.2 GB AWL file is NORMAL?

I don't think so, and I plan to use the check_whitelist tool to clean it,
something like:
check_whitelist --clean --min 2

Does that look right to you? I'm a bit afraid it might be a very long
process because of its size...

Any advice or information from someone who has experienced this is welcome.

Regards,
Stephane