RE: Huge size of bayes_journal

2006-03-27 Thread MJ
Hi Gary V, 

Sorry for the delayed reply.

> A clue was given to you in a previous post:
> http://marc.theaimsgroup.com/?l=spamassassin-users&m=114269315700923&w=2
>
> But I am not getting the full picture. You said earlier that the bayes
> files were in /var/amavis/.spamassassin yet you say you changed the home
> directory of 'clamav' to /var/amavisd. Are your bayes files in
> /var/amavis/.spamassassin or /var/amavisd/.spamassassin? What user is
> running amavisd-new? Please do this for me as I asked before:

My bayes files are under /var/amavis/.spamassassin (sorry for the
typo).

> What do you have $daemon_user and $daemon_group set to in amavisd.conf?

$daemon_user = 'clamav'
$daemon_group = 'clamav'

> What does this say?: cat /etc/passwd | grep amavis

clamav:x:1005:103::/var/amavis:/bin/sh

> When you ran /usr/local/bin/sa-learn -D --sync, what user were you
> running this as?

I was logged in as user clamav, after changing that user's initially
assigned home directory to /var/amavis.

> Another hint, you also need to run 'sa-learn --sync --force-expire' as
> your amavisd-new user. If it takes some time to complete then it is
> probably working.




RE: Huge size of bayes_journal

2006-03-27 Thread Gary V

> Hi Gary V,
>
> My bayes files are under /var/amavis/.spamassassin (sorry for the
> typo)
>
> $daemon_user = 'clamav'
> $daemon_group = 'clamav'
>
> clamav:x:1005:103::/var/amavis:/bin/sh



Thanks for clarifying; everything should be OK then. You should now
create a cron job to run sa-learn --sync --force-expire each day
(as clamav, or su to clamav). The 3.1 documentation says
a --force-expire will also run --sync.
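
One way to set that up is a small wrapper script that cron calls. The
sketch below rests on a few assumptions: the sa-learn path is the one used
elsewhere in this thread, the account name clamav comes from the
$daemon_user setting quoted above, and SA_LEARN, EXPECTED_USER and
run_expire are hypothetical names chosen for illustration. It also assumes
an id command that accepts -un. The guard is there because sa-learn run as
the wrong user works on that user's own ~/.spamassassin instead.

```shell
#!/bin/sh
# Hypothetical wrapper for a daily Bayes sync/expire job.
# Both values are assumptions taken from this thread; adjust as needed.
SA_LEARN=${SA_LEARN:-/usr/local/bin/sa-learn}
EXPECTED_USER=${EXPECTED_USER:-clamav}

# Run the sync/expire only when invoked as the expected user.
run_expire() {
    current_user=$(id -un)
    if [ "$current_user" != "$EXPECTED_USER" ]; then
        echo "refusing to run as $current_user (expected $EXPECTED_USER)" >&2
        return 1
    fi
    "$SA_LEARN" --sync --force-expire
}
```

To use it from cron, add a final line calling run_expire and point a daily
entry in that user's crontab at the script; the schedule and the script
location are left to the administrator.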

Gary V





RE: Huge size of bayes_journal

2006-03-21 Thread Gary V

> Hi,
> Many thanks Theo Van Dinter, Gary V and others who helped. As suggested
> by Theo Van Dinter, I changed the home directory of the clamav user to
> /var/amavisd and then executed /usr/local/bin/sa-learn -D --sync. It
> took around 6 hours, but now bayes_journal has been reduced from 3.5 GB
> to 42 KB. How can I prevent this problem from recurring?
>
> ___
> Mohammad Junaid

A clue was given to you in a previous post:
http://marc.theaimsgroup.com/?l=spamassassin-users&m=114269315700923&w=2

But I am not getting the full picture. You said earlier that the bayes files 
were in /var/amavis/.spamassassin yet you say you changed the home directory 
of 'clamav' to /var/amavisd. Are your bayes files in 
/var/amavis/.spamassassin or /var/amavisd/.spamassassin? What user is 
running amavisd-new? Please do this for me as I asked before:


What do you have $daemon_user and $daemon_group set to in amavisd.conf?
What does this say?:
cat /etc/passwd | grep amavis

When you ran /usr/local/bin/sa-learn -D --sync, what user were you running 
this as?


Another hint, you also need to run 'sa-learn --sync --force-expire' as your 
amavisd-new user. If it takes some time to complete then it is probably 
working.


Gary V





RE: Huge size of bayes_journal

2006-03-20 Thread MJ
Hi,
Many thanks Theo Van Dinter, Gary V and others who helped. As suggested
by Theo Van Dinter, I changed the home directory of the clamav user to
/var/amavisd and then executed /usr/local/bin/sa-learn -D --sync. It
took around 6 hours, but now bayes_journal has been reduced from 3.5 GB
to 42 KB. How can I prevent this problem from recurring?

___
Mohammad Junaid





Huge size of bayes_journal

2006-03-18 Thread MJ

Hi,
I am running postfix 2.2.4 on Solaris 8 with amavisd-new 2.3.2,
SpamAssassin 3.1.0 and Clamav 0.8.7.1 as an AV/AS gateway to my main
email system. The problem is that in our /var/amavis/.spamassassin
directory most of the files keep growing; bayes_journal especially is
reaching 3.4 GB. I have read that this file should not be more than a
few KB. Can anyone help with what could be the reason?

Here is the ls -l output for this directory.

==
bash-2.03# ls -l /var/amavis/.spamassassin

-rw---   1 clamav   clamav    335904768 Mar 18 12:01 auto-whitelist
-rw---   1 clamav   clamav            6 Mar 18 12:01 auto-whitelist.mutex
-rw---   1 clamav   clamav         2196 Mar 18 12:01 bayes.mutex
-rw---   1 clamav   clamav   3441876576 Mar 18 12:01 bayes_journal
-rw---   1 clamav   clamav    167813120 Mar 18 12:01 bayes_seen
-rw---   1 clamav   clamav    336117760 Mar 18 12:01 bayes_toks
==

Thanks,
Mohammad Junaid.




Re: Huge size of bayes_journal

2006-03-18 Thread Theo Van Dinter
On Sat, Mar 18, 2006 at 12:07:35PM +0300, MJ wrote:
> I am running postfix 2.2.4 on Solaris 8 with amavisd-new 2.3.2,

I don't know if amavisd does something special wrt bayes,

> reaching 3.4 GB. I have read that this file should not be more than a
> few KB. Can anyone help with what could be the reason?

As the appropriate user, run sa-learn -D --sync and see what happens.

> -rw---   1 clamav   clamav   3441876576 Mar 18 12:01 bayes_journal
> -rw---   1 clamav   clamav   167813120 Mar 18 12:01 bayes_seen
> -rw---   1 clamav   clamav   336117760 Mar 18 12:01 bayes_toks

These are all extremely large.  It looks like auto-expire and/or auto-sync may
be disabled.
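
A quick way to check that suspicion is to list the Bayes-related settings
in the site configuration. The sketch below assumes the option names
bayes_auto_expire, bayes_journal_max_size, bayes_learn_to_journal and
bayes_path from the SpamAssassin configuration documentation;
show_bayes_settings is a hypothetical helper name, not part of
SpamAssassin.

```shell
#!/bin/sh
# Print any uncommented Bayes sync/expire settings found in a config file.
# Lines starting with '#' do not match, because the pattern requires the
# option name to be the first non-blank text on the line.
show_bayes_settings() {
    grep -E '^[[:space:]]*bayes_(auto_expire|journal_max_size|learn_to_journal|path)[[:space:]]' "$1"
}
```

For example: show_bayes_settings /etc/mail/spamassassin/local.cf
No output means none of these options is set explicitly in that file, so
the defaults of the installed version apply.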

-- 
Randomly Generated Tagline:
Cold Boot: What a programmer puts on feet in winter.




RE: Huge size of bayes_journal

2006-03-18 Thread MJ
Hi Theo Van Dinter,

> I don't know if amavisd does something special wrt bayes,

Do you need me to send amavisd.conf? Usually they (the amavisd-new
mailing list) suggest posting SA-related issues to this list and not to
the amavisd-new list.

> As the appropriate user, run sa-learn -D --sync and see what happens.

I did, but the size is still the same. The output follows.
=
bash-2.03# /usr/local/bin/sa-learn -D --sync
[17329] dbg: logger: adding facilities: all
[17329] dbg: logger: logging level is DBG
[17329] dbg: generic: SpamAssassin version 3.1.0
[17329] dbg: config: score set 0 chosen.
[17329] dbg: util: running in taint mode? yes
[17329] dbg: util: taint mode: deleting unsafe environment variables, resetting PATH
[17329] dbg: util: PATH included 'PATH', which is not absolute, dropping
[17329] dbg: util: PATH included '/usr/sbin', keeping
[17329] dbg: util: PATH included '/usr/bin', keeping
[17329] dbg: util: PATH included '/export/home/mg1', keeping
[17329] dbg: util: final PATH set to: /usr/sbin:/usr/bin:/export/home/mg1
[17329] dbg: dns: is Net::DNS::Resolver available? yes
[17329] dbg: dns: Net::DNS version: 0.52
[17329] dbg: dns: name server: 212.119.64.2, family: 2, ipv6: 0
[17329] dbg: config: using /etc/mail/spamassassin for site rules pre files
[17329] dbg: config: read file /etc/mail/spamassassin/init.pre
[17329] dbg: config: read file /etc/mail/spamassassin/v310.pre
[17329] dbg: config: using /usr/local/share/spamassassin for sys rules pre files
[17329] dbg: config: using /usr/local/share/spamassassin for default rules dir
[17329] dbg: config: read file /usr/local/share/spamassassin/10_misc.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_advance_fee.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_anti_ratware.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_body_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_compensate.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_dnsbl_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_drugs.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_fake_helo_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_head_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_html_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_meta_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_net_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_phrases.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_porn.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_ratware.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/20_uri_tests.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/23_bayes.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_accessdb.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_antivirus.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_body_tests_es.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_body_tests_pl.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_dcc.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_domainkeys.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_hashcash.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_pyzor.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_razor2.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_replace.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_spf.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_textcat.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/25_uribl.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_de.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_fr.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_it.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_nl.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_pl.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/30_text_pt_br.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/50_scores.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/60_awl.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/60_whitelist.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/60_whitelist_spf.cf
[17329] dbg: config: read file /usr/local/share/spamassassin/60_whitelist_subject.cf
[17329] dbg: config: using /etc/mail/spamassassin for site rules dir
[17329] dbg: config: read file /etc/mail/spamassassin/cyberia.cf
[17329] dbg: config: read file /etc/mail/spamassassin/local.cf
[17329] dbg: config: using //.spamassassin/user_prefs for user prefs file
[17329] 

Re: Huge size of bayes_journal

2006-03-18 Thread Theo Van Dinter
On Sat, Mar 18, 2006 at 01:07:30PM +0300, MJ wrote:
> I did, but the size is still the same. The output follows.
> bash-2.03# /usr/local/bin/sa-learn -D --sync

The # implies you're running as root.  Is that the same user as amavis
runs as?

[...]
> [17329] dbg: bayes: tie-ing to DB file R/O //.spamassassin/bayes_toks
> [17329] dbg: bayes: tie-ing to DB file R/O //.spamassassin/bayes_seen

This isn't the same path as you posted before, so I'm not surprised those
files didn't change.

> Do you need me to send amavisd.conf? Usually they (the amavisd-new
> mailing list) suggest posting SA-related issues to this list and not to
> the amavisd-new list.

Not specifically related to this thread, but just as a FWIW, the general
policy here is that unless the problem is reproducible with the standard
SpamAssassin tools (spamassassin, spamc/spamd, etc.), you'd need to talk to
the third parties involved (amavis, qmail-scanner, spamass-milter, etc.).

-- 
Randomly Generated Tagline:
Bit - The increment by which programmers slowly go mad.




RE: Huge size of bayes_journal

2006-03-18 Thread MJ
Hi Theo Van Dinter,

Thanks for your quick response.

> The # implies you're running as root.  Is that the same user as
> amavis runs as?

No, there is another user for the daemon with a false shell; it can't
be used to log in as a normal user.

> This isn't the same path as you posted before, so I'm not surprised
> those files didn't change.

Which path do you mean? My bayes_* files are under
/var/amavis/.spamassassin.

Many thanks for your time.
___
Mohammad Junaid





Re: Huge size of bayes_journal

2006-03-18 Thread Theo Van Dinter
On Sat, Mar 18, 2006 at 01:51:45PM +0300, MJ wrote:
> Thanks for your quick response.

:)

> > The # implies you're running as root.  Is that the same user as
> > amavis runs as?
>
> No, there is another user for the daemon with a false shell; it can't
> be used to log in as a normal user.

You need to somehow access that user's database files.  The usual method
would be to switch to the appropriate user and run the previously stated
sa-learn command, then look at the debug output.

Another possibility is to use another user and set bayes_path to
access those files, but that may lead to ownership/permission issues,
generally if an expire occurs.

If amavis has a spamassassin debug option, you could enable that,
and then look at the logs to see what the problem is, but the output
may be very large before you see the problem.

> > This isn't the same path as you posted before, so I'm not surprised
> > those files didn't change.
>
> Which path do you mean? My bayes_* files are under
> /var/amavis/.spamassassin.

Exactly.  As shown in the debug output you sent previously, by running
as root, sa-learn was using the files in /.spamassassin which isn't the
same as /var/amavis/.spamassassin.
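
To see why the path differs, it helps to remember that the default Bayes
location is derived from the home directory of whichever user runs the
command. A minimal sketch, assuming the default per-user location
~/.spamassassin; both function names are hypothetical and only
illustrate the lookup.

```shell
#!/bin/sh
# Print the home directory (6th colon-separated field) of one passwd line.
home_from_passwd_line() {
    printf '%s\n' "$1" | cut -d: -f6
}

# Print the default per-user SpamAssassin directory for a home directory.
# A trailing slash is stripped first, so a home of "/" gives "/.spamassassin".
bayes_dir_for_home() {
    printf '%s/.spamassassin\n' "${1%/}"
}
```

For example, feeding it the passwd entry of the daemon user:
bayes_dir_for_home "$(home_from_passwd_line "$(grep '^clamav:' /etc/passwd)")"
If the result is not the directory that holds the large files, sa-learn
run as that user will not touch them either.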

-- 
Randomly Generated Tagline:
Dying is the leading cause of death in the world.




Re: Huge size of bayes_journal

2006-03-18 Thread Michael Monnerie
On Saturday, 18 March 2006 11:51 MJ wrote:
> No, there is another user for the daemon with a false shell; it can't
> be used to log in as a normal user.

su -l $USER_AMAVIS_RUNS_AS -s /bin/bash

That way you can run as the user with bash.

Regards, zmi
-- 
// Michael Monnerie, Ing.BSc  ---   it-management Michael Monnerie
// http://zmi.at   Tel: 0660/4156531  Linux 2.6.11
// PGP Key:   lynx -source http://zmi.at/zmi2.asc | gpg --import
// Fingerprint: EB93 ED8A 1DCD BB6C F952  F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net Key-ID: 0x70545879




RE: Huge size of bayes_journal

2006-03-18 Thread MJ






Hi Theo,


I managed to switch to that user and executed the sa-learn command, but
since that user has its own home directory, sa-learn created a new
.spamassassin directory under that home directory. /var/amavisd/.spamassassin,
which holds these files, is not the home directory of any user. So how do I
tell the sa-learn command to read from this location?

I am afraid that my filesystem will soon be full.


Mohammad Junaid.






Re: Huge size of bayes_journal

2006-03-18 Thread Loren Wilton



You can and probably should remove the journal file. These are
unlearned tokens, so they aren't affecting the classification of mail.
The journal is so huge that it might take days to learn, and it also
indicates that you are accumulating new material fairly quickly. So
losing the current journal file shouldn't hurt anything.

  Loren



RE: Huge size of bayes_journal

2006-03-18 Thread MJ
> You can and probably should remove the journal file.  These are
> unlearned tokens, so they aren't affecting the classification of mail.
> The journal is so huge that it might take days to learn, and it also
> indicates that you are accumulating new material fairly quickly.  So
> losing the current journal file shouldn't hurt anything.

Hi Loren Wilton,

Are you sure that it will not have any adverse effect on my system? I am
not in a position to take any chances.

Many thanks.

Regards,
___
Mohammad Junaid






Re: Huge size of bayes_journal

2006-03-18 Thread Gary V

> On Saturday, 18 March 2006 11:51 MJ wrote:
> > No, there is another user for the daemon with a false shell; it can't
> > be used to log in as a normal user.

The user 'clamav' should have a home dir of /var/amavis; otherwise I
wouldn't think the SpamAssassin files would end up in
/var/amavis/.spamassassin.

What does this say?
cat /etc/passwd | grep clamav

To run sa-learn as this user (who does not have a shell), I would run:
sudo -H -u clamav sa-learn --sync --force-expire

I would set up a cron job to run this daily, as it seems you have disabled
auto expire and sync, as noted. You can't disable them without manually
cleaning up on a regular basis; if you do clean up regularly, disabling
them is fine.

But as noted, I think you are in a bit of a pickle now. The files are so
huge that I'm not sure how your system will handle it when the sync and
expire are performed. One thing I'm reasonably confident of: the procedure
will slow your system down for a considerable period of time.


Gary V





RE: Huge size of bayes_journal

2006-03-18 Thread MJ
Hi Gary,

> The user 'clamav' should have a home dir of /var/amavis; otherwise I
> wouldn't think the SpamAssassin files would end up in
> /var/amavis/.spamassassin.
>
> What does this say?
> cat /etc/passwd | grep clamav

clamav:x:1005:103::/home/clamav:/bin/false

> To run sa-learn as this user (who does not have a shell), I would run:
> sudo -H -u clamav sa-learn --sync --force-expire

Do you want me to try the above command?

> I would set up a cron job to run this daily, as it seems you have disabled
> auto expire and sync, as noted. You can't disable them without manually
> cleaning up on a regular basis; if you do clean up regularly, disabling
> them is fine.

I didn't change anything related to auto expire or sync in amavisd.conf;
in fact, another machine with the same configuration doesn't have such
huge bayes_* files.

Any idea how to resolve this issue?

Many thanks,

Mohammad Junaid.




RE: Huge size of bayes_journal

2006-03-18 Thread Gary V

> Hi Gary,
>
> > The user 'clamav' should have a home dir of /var/amavis; otherwise I
> > wouldn't think the SpamAssassin files would end up in
> > /var/amavis/.spamassassin.
> >
> > What does this say?
> > cat /etc/passwd | grep clamav
>
> clamav:x:1005:103::/home/clamav:/bin/false
>
> > To run sa-learn as this user (who does not have a shell), I would run:
> > sudo -H -u clamav sa-learn --sync --force-expire
>
> Do you want me to try the above command?

Hmm, I don't know at this point. It is strange that the files are owned by
'clamav'. Because they are, I assumed you were running amavisd-new as user
'clamav'. What do you have $daemon_user and $daemon_group set to in
amavisd.conf? What does this say?:

cat /etc/passwd | grep amavis

> > I would set up a cron job to run this daily, as it seems you have disabled
> > auto expire and sync, as noted. You can't disable them without manually
> > cleaning up on a regular basis; if you do clean up regularly, disabling
> > them is fine.
>
> I didn't change anything related to auto expire or sync in amavisd.conf;
> in fact, another machine with the same configuration doesn't have such
> huge bayes_* files.

The settings would be in local.cf, not amavisd.conf.
If the other machine has the same configuration, are the same files on that
machine also owned by 'clamav'?

> Any idea how to resolve this issue?
>
> Many thanks,
>
> Mohammad Junaid.



Gary V





Re: Huge size of bayes_journal

2006-03-18 Thread mouss
MJ wrote:
> Hi Gary,
>
> > The user 'clamav' should have a home dir of /var/amavis; otherwise I
> > wouldn't think the SpamAssassin files would end up in
> > /var/amavis/.spamassassin.
> >
> > What does this say?
> > cat /etc/passwd | grep clamav
>
> clamav:x:1005:103::/home/clamav:/bin/false

I guess amavisd-new is running as clamav.

> > To run sa-learn as this user (who does not have a shell), I would run:
> > sudo -H -u clamav sa-learn --sync --force-expire
>
> Do you want me to try the above command?

Yes, if you have sudo. Otherwise, use your imagination:

1- change the login shell to a valid one, purge the file, and then
change the login shell back to /bin/false.

2- cp the file to $user/.spamassassin/ for some user with a valid shell,
purge as this user, then rename the resulting file to the original one.







Re: Huge size of bayes_journal

2006-03-18 Thread jdow

From: MJ [EMAIL PROTECTED]


> > You can and probably should remove the journal file.  These are
> > unlearned tokens, so they aren't affecting the classification of mail.
> > The journal is so huge that it might take days to learn, and it also
> > indicates that you are accumulating new material fairly quickly.  So
> > losing the current journal file shouldn't hurt anything.
>
> Hi Loren Wilton,
>
> Are you sure that it will not have any adverse effect on my system? I am
> not in a position to take any chances.


Rename it and see if the system runs OK. If it does, delete it.
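
A cautious way to do the rename is sketched below. It assumes the file is
moved while nothing is writing to it (an assumption added here, not
something stated in this thread), and set_aside_journal is a hypothetical
helper name. The old file stays next to the others as bayes_journal.old
until you decide to delete it.

```shell
#!/bin/sh
# Rename bayes_journal to bayes_journal.old inside the given directory.
# Returns non-zero and changes nothing if there is no journal file.
set_aside_journal() {
    dir=$1
    if [ -f "$dir/bayes_journal" ]; then
        mv "$dir/bayes_journal" "$dir/bayes_journal.old"
    else
        echo "no bayes_journal in $dir" >&2
        return 1
    fi
}
```

For the setup in this thread the call would be:
set_aside_journal /var/amavis/.spamassassin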

{^_^}