Re: [Clamav-users] high clamd CPU load on Solaris

2006-08-30 Thread David Blank-Edelman

Hi Nigel-

On Aug 28, 2006, at 6:42 AM, Nigel Horne wrote:

Repeating advice already given here: the engine in 0.88 is *old*.
If performance is an issue upgrade to the code in CVS.


Thanks for your response. We'll definitely look at doing that.

Just to confirm: you would expect clamd processes to spike to as much
as 85% CPU even on a machine with reasonably ample CPU and memory under
load? If that's normal or expected behavior (which hasn't seemed to be
the case for as long as we've been running this great software), we can
deal.


 -- dNb

___
http://lurker.clamav.net/list/clamav-users.html


[Clamav-users] high clamd CPU load on Solaris

2006-08-27 Thread David Blank-Edelman

Howdy-

We've recently been seeing our clamd processes run very hot (spiking
up to 85% of the CPU as reported by prstat and top) on two different
Solaris 9 boxes. For example, here are a few lines from prstat -L
(showing the two clamav threads that together are eating 66% of the
CPU).


  PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/LWPID
18447 root       61M   60M run      0    0   0:27:51  34% clamd/14
18447 root       61M   60M cpu0    20    0   0:25:40  32% clamd/22

Both machines are running 0.88.2 clamds being fed by exim. We'll
upgrade to 0.88.4 shortly, but I haven't seen anything in the release
notes that mentions this issue. I've looked at all of the obvious
things I can think of (we are not swapping due to lack of memory as  
far as I can tell, not out of any resources like file descriptors,  
and so on) so I was hoping someone here could offer some advice on  
any other avenues to pursue. I've even done things like moving the  
temp directory to a tmpfs filesystem, but that hasn't helped the load  
any.


I've included gdb backtraces for all threads below and also a copy of  
the truss counts for the process so you can see which system calls  
are being made (and which are giving errors, probably normally). I've  
included our non-default config options below. I've also included an  
ldd of the binary so you can see that we have it linked against  
almost all standard Solaris libs (all of which I think are patched  
properly, let me know if there is something I should check there).


If there is anything else I should be checking, please let me know.  
Thanks for any help you can offer.


 -- dNb

GDB:

(gdb) info thread
  5 LWP 63  0xff0858f4 in __lwp_park () from /usr/lib/libthread.so.1
  4 LWP 14  0xff330d98 in getline_from_mbox ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
  3 LWP 62  0xff0858f4 in __lwp_park () from /usr/lib/libthread.so.1
  2 LWP 22  0xff309c98 in cli_ac_scanbuff ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
* 1 LWP 1  0xff01c934 in _so_accept () from /usr/lib/libc.so.1
(gdb) bt
#0  0xff01c934 in _so_accept () from /usr/lib/libc.so.1
#1  0x0001a95c in acceptloop_th ()
#2  0x00018514 in localserver ()
#3  0x00017be0 in clamd ()
#4  0x00016758 in main ()
(gdb) thread 2
[Switching to thread 2 (LWP 22)]#0  0xff309c98 in cli_ac_scanbuff ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
(gdb) bt
#0  0xff309c98 in cli_ac_scanbuff ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#1  0xff30bd50 in cli_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#2  0xff31cca8 in cli_scanraw ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#3  0xff31d864 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#4  0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#5  0xff31b134 in cli_scandir ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#6  0xff31cb90 in cli_scanmail ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#7  0xff31d4d8 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#8  0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#9  0xff31b134 in cli_scandir ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#10 0xff31cb90 in cli_scanmail ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#11 0xff31d4d8 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#12 0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#13 0xff31b134 in cli_scandir ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#14 0xff31cb90 in cli_scanmail ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#15 0xff31d4d8 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#16 0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#17 0xff31b134 in cli_scandir ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#18 0xff31cb90 in cli_scanmail ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#19 0xff31d4d8 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#20 0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#21 0xff31b134 in cli_scandir ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#22 0xff31cb90 in cli_scanmail ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#23 0xff31d4d8 in cli_magic_scandesc ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#24 0xff31daa4 in cli_scanfile ()
   from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#25 0xff31b134 in cli_scandir ()
   from 

Re: [Clamav-users] clamd on Solaris ceases functioning after a while

2005-03-24 Thread David Blank-Edelman
On Mar 24, 2005, at 1:12 PM, Elizabeth Schwartz wrote:

This sounds like exactly what I was experiencing. Did the latest build
fix it for you? Turning off clamd and running clamav-milter without
the --external flag seems to have fixed it for me.

Hi Betsy-
  See my followup message on March 17th, but the short answer is yes, 
the CVS version did fix things for us. No issues since that message, 
either.

I would note that we're not using clamav in a milter configuration (we 
are using exim/exiscan+clamd) so I'm not sure if we've seen precisely 
the same issue you have experienced.

  -- dNb


Re: [Clamav-users] submission of phishing emails

2005-03-24 Thread David Blank-Edelman
Hi Jeremy-
   If you run the message back through SA with the -d or 
--remove-markup switch, it will undo its encapsulation of the spam 
message.

-- dNb


Re: [Clamav-users] clamd on Solaris ceases functioning after a while (FIXED)

2005-03-17 Thread David Blank-Edelman
Howdy-
   Now that a week has gone by with absolutely no problems with our
clamd hanging, I thought I would write in to provide the good news that
I think we have this problem licked. Though we also rev'd exim on Wed,
I think it was the upgrade from 0.83 to devel-20050308 that solved our
problems. Many thanks to all the people who helped out with our issues.

 -- dNb
P.S. If this message doesn't tempt fate enough to cause our entire mail 
server to burst into flames, I don't know what will.



Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while

2005-02-18 Thread David Blank-Edelman
On Feb 18, 2005, at 3:04 AM, Fajar A. Nugraha wrote:

4.34? That's old. If I remember correctly, I had some problem with
that version as well.
Use (at least) exim 4.41. That's what I use here, and it runs fine.
Both Solaris 8 and 9.

Yes, the version of exim is a little behind (they rev'd through to the
current number pretty fast), but I'm not convinced the issue is the 
MTA. My rationale here is that when my clamd gets into its unhappy 
state, other clients besides my MTA can't talk to it either (e.g. 
clamdmon, posted on this list a little while back). If other clients 
start to fail against the daemon, that makes me want to look hard at 
the daemon and why it started to fail.
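
The "can other clients still talk to the daemon?" check described above can be sketched as a small probe. This is a minimal illustration, not the clamdmon tool mentioned in the message: it assumes clamd is listening on the TCP socket from the posted config (127.0.0.1, port 3310) and that the daemon answers the protocol's PING command with PONG. The function name `clamd_ping` and the timeout value are made up for the example.

```python
import socket


def clamd_ping(host="127.0.0.1", port=3310, timeout=5.0):
    """Return True if the daemon answers PING with PONG within `timeout` seconds.

    Host and port default to the TCPAddr/TCPSocket values from the posted
    config; the function name and timeout are illustrative assumptions.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\n")
            reply = sock.recv(16)
    except OSError:
        # Refused, reset, or timed out: treat the daemon as unresponsive.
        return False
    return reply.strip() == b"PONG"
```

Running such a probe from a client other than the MTA helps separate "exim cannot reach clamd" from "nothing can reach clamd", which is the distinction the paragraph above relies on.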

I'm willing to believe that an older version of the MTA, even though it 
works the vast majority of the time, could find the edge condition to 
expose a bug (in somebody's code, I'm not pointing fingers at clamav by 
any means). I don't want to be argumentative, but I'd really like to 
know where that bug is and preferably squash it without having to 
change my entire MTA in the hopes it will not trigger the bug. I'd be a 
bit nervous that whatever is leaving clamd in an unresponsive state 
could surface again later.

I've got to sleep now, but I will work hard tomorrow to try and rebuild 
clamav to bring it closer to the working configurations already 
mentioned here. I'm very grateful to everyone who has tried to help so 
far.

   -- dNb
P.S. If you'd like to pass on the build details of your working setup 
off-list, I'd appreciate it.

___
http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users


Re: [Clamav-users] clamd on Solaris ceases functioning after a while

2005-02-18 Thread David Blank-Edelman
On Feb 18, 2005, at 2:49 PM, Andy Fiddaman wrote:

The accept debug will at least tell us if you're running out of file
descriptors..

Roger.
# ndd /dev/tcp tcp_time_wait_interval
6
# pfiles `pgrep clamd` | grep rlimit
  Current rlimit: 256 file descriptors
  Current rlimit: 256 file descriptors
# netstat -an | grep 3310 | awk '{print $7}' | sort | uniq -c
  10 ESTABLISHED
   1 LISTEN
  33 TIME_WAIT
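
The same per-state tally can be computed from saved netstat output, which makes it easier to log over time. The sketch below is only an illustration: it assumes Solaris-style `netstat -an` lines where addresses are printed as `addr.port` and the state is the last column. The sample text is hypothetical, not captured from the machine discussed here, and `count_tcp_states` is a made-up name.

```python
from collections import Counter

# Hypothetical sample in Solaris `netstat -an` layout (not real data).
SAMPLE = """\
      *.3310               *.*                0      0 49152      0 LISTEN
127.0.0.1.3310       127.0.0.1.40112      49152      0 49152      0 ESTABLISHED
127.0.0.1.40112      127.0.0.1.3310       49152      0 49152      0 ESTABLISHED
127.0.0.1.40100      127.0.0.1.3310       49152      0 49152      0 TIME_WAIT
127.0.0.1.25         10.0.0.5.51000       49152      0 49152      0 ESTABLISHED
"""


def count_tcp_states(netstat_output, port):
    """Count connection states for sockets whose local or remote end uses `port`."""
    suffix = "." + str(port)  # Solaris prints the port after a dot
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or section header
        local, remote = fields[0], fields[1]
        if local.endswith(suffix) or remote.endswith(suffix):
            states[fields[-1]] += 1
    return states


counts = count_tcp_states(SAMPLE, 3310)
```

When reading the result, TIME_WAIT entries are held by the kernel after the socket is closed and no longer occupy a descriptor in the process, so the ESTABLISHED count is the one to compare against the 256-descriptor limit reported by pfiles.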
Also any lines in the /etc/system file which contain 'rlim'

# grep rlim /etc/system
#
(i.e. none)

Here's another interesting thing: right now the process is running a
little hot:

load averages:  2.12,  1.89,  1.71    15:41:36
157 processes: 153 sleeping, 4 on cpu
CPU states:  1.0% idle, 55.5% user, 24.9% kernel, 18.7% iowait,  0.0% swap
Memory: 2048M real, 243M free, 1662M swap in use, 7863M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 29368 root       9  59    0   30M   28M sleep   47:18 24.33% clamd
(i.e. the top process)
Important note: it seems to be working just fine at the moment so this 
is an example of it working well under load. You can also see that the 
machine is ok for memory and swap.

Here's what it is doing:
# truss -f -c -p 29368
^C
syscall               seconds   calls  errors
read                     .140    6427
write                    .075     740
open                     .152    1618
close                    .069    1818
link                     .000       2       2
unlink                   .081     395       2
time                     .000      53
chmod                    .007     238
stat                     .042     694     237
lseek                    .056    5336
getpid                   .004     420
fstat                    .001      55
access                   .000      26
dup                      .003     184
times                    .069    7616
ioctl                    .006     604     604
fcntl                    .005     501
lwp_park                 .004     163
lwp_unpark               .003     163
rmdir                    .079     508     218
mkdir                    .145     293
poll                     .002     145
lstat                    .029     849
sigprocmask              .000      52
mmap                     .003      55
munmap                   .008      53
yield                    .000      13
lwp_create               .000       4
lwp_continue             .000       4
lwp_kill                 .000       4       4
llseek                   .009     851
lwp_schedctl             .000       4
getdents64               .026     961
lstat64                  .026     420     420
fstat64                  .029    1709
accept                   .005      26
recvmsg                  .000      26
                     --------  ------    ----
sys totals:             1.096   33030    1487
usr time:               5.360
elapsed:               44.900
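
One way to read a `truss -c` summary like the one above is to rank the syscalls by time and pick out the ones where every call fails. The sketch below is only illustrative: it assumes whitespace-separated columns (syscall, seconds, calls, optional errors), uses a handful of rows from the summary above as sample input, and `parse_truss_summary` is a made-up helper name.

```python
# A few rows from the summary above, in "name seconds calls [errors]" form.
SAMPLE = """\
syscall      seconds   calls  errors
open            .152    1618
stat            .042     694    237
ioctl           .006     604    604
rmdir           .079     508    218
mkdir           .145     293
lstat64         .026     420    420
sys totals:    1.096   33030   1487
"""


def parse_truss_summary(text):
    """Parse data rows into (syscall, seconds, calls, errors) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (3, 4):
            continue  # totals and other non-data lines
        errors = fields[3] if len(fields) == 4 else "0"
        try:
            rows.append((fields[0], float(fields[1]), int(fields[2]), int(errors)))
        except ValueError:
            continue  # header or rule line
    return rows


rows = parse_truss_summary(SAMPLE)
# Syscalls ordered by time spent, most expensive first.
by_time = sorted(rows, key=lambda r: r[1], reverse=True)
# Syscalls where every single call returned an error.
always_failing = [name for name, _, calls, errors in rows if calls == errors]
```

In the full summary the system-call time (1.096 s) is small next to the user time (5.360 s), so a ranking like this mostly rules the kernel side out rather than pointing at a culprit.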
Thanks for taking the time to look into this with me.
   -- dNb


[Clamav-users] clamd on Solaris ceases functioning after a while

2005-02-17 Thread David Blank-Edelman
Hi-
  Thanks for such a great program and all of the work being put into
it. We're having a nasty problem with clamd 0.8x (even with 0.83 which
we just installed yesterday). After running for a while, it will decide
to just stop functioning and return failures or refuse connections from
the MTA. Here are some specifics:

Solaris 9, gcc built, Solaris 9 stock zlib (1.1.4)
Here's a sample part of our clamd.log:
Tue Feb 15 10:16:43 2005 - SelfCheck: Database status OK.
Tue Feb 15 10:19:53 2005 - /var/spool/exim/scan/1D14UQ-0005c9-DG/1D14UQ-0005c9-DG.eml: Unable to open file or directory ERROR
Tue Feb 15 10:19:53 2005 - Client disconnected
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:25:18 2005 - /var/spool/exim/scan/1D14ZC-0006sJ-9Q/1D14ZC-0006sJ-9Q.eml: Worm.Lovgate.T FOUND

(which eventually turns into nothing but "accept() failed" lines, though
it doesn't always say this. Sometimes the last line before clamd is
autorestarted by my monitoring cronjob is just "Thu Feb 17 13:41:11 2005 -
No stats for Database check - forcing reload")

Our exim logs have shown:
2005-02-17 08:24:23 1D1kTd-0005w0-32 malware acl condition: clamd: unable to read from socket (No such file or directory)

or
2005-02-17 08:24:28 1D1ldr-0004nH-Qm malware acl condition: clamd: connection to 127.0.0.1, port 3310 failed (Bad file number)

When it is in this state, a truss of the process shows several threads 
apparently continuing to run but it won't accept new connections as 
seen above. I haven't seen any indication in the log that I've reached 
a threading limit, but I don't know if I should expect one.

I haven't been able to determine a specific pattern to when this 
happens and I can't seem to get it to repeat at will. The closest thing 
I've seen to a pattern is I've seen it happen several times when: a) 
the server has been started, b) it hasn't performed a successful 
SelfCheck yet. It doesn't always happen in this state (i.e. the first 
check doesn't always fail). I wish I could tell what the difference was 
between when it will work and when it will fail. I wonder if it happens 
to center around load, but I have no data to back that supposition up. 
I think I've seen this situation also happen (but I'm not sure) after a 
freshclam update that actually touched the database.

Here are the non-default settings in our config file:
LogFile /priv/log/clamd/clamd.log
LogFileMaxSize 100M
LogTime
LogSyslog
LogFacility LOG_MAIL
LogVerbose
DatabaseDirectory /priv/daemons/packages/clamav-0.83/share/clamav
TCPSocket 3310
TCPAddr 127.0.0.1
MaxConnectionQueueLength 30
StreamMaxLength 20M
MaxThreads 20
# this is set to 600 in the hopes I could cause the problem to surface
# faster; it was set to the default
SelfCheck 600
Debug
ScanRAR
ArchiveBlockMax

Any suggestions on where to look? Any other information I should gather 
for you? Should I try the current snapshot? Thanks for any help you can 
offer.

 -- dNb


Re: [Clamav-users] clamd on Solaris ceases functioning after a while

2005-02-17 Thread David Blank-Edelman
From: Igor Brezac [EMAIL PROTECTED]:

  How much memory does your clamd process consume when it stops
running?

Hi Igor-
  I haven't checked (the machine it is running on has plenty of memory 
and swap), but I will check next time this happens. Would you be 
willing to share your build configuration with me off list so I can 
compare?

-- dNb


Re: [Clamav-users] clamd on Solaris ceases functioning after a while

2005-02-17 Thread David Blank-Edelman
(sorry for the weird quoting format and the breaking of the threading, 
I just switched myself back off digest mode so I don't have an easy way 
to respond to single messages for another message or two)

From: James Lick [EMAIL PROTECTED]:
I'm running clamd 0.83 on Solaris 9 compiled with gcc 3.4.2 and zlib 
1.2.2.  The older zlib releases have been known to cause clamd to 
crash, so you might want to try that first.  I use clamd over a named 
socket instead of TCP, dunno if that would be a difference.  You might 
also want to check if your kernel and thread library patches are up to 
date.

Hi James-
  I'm certainly willing to try a zlib rebuild. We're also building 
using an earlier version of gcc. I'm reading the mail archive in 
earnest, but I'm curious if the problems with the zlib version 
presented symptoms that resembled mine (e.g. did they crash vs. hang 
the server?).

The other thing that is peculiar to me is that our problems are happening
even though nothing on the machine, in the basic clamd config (i.e.
network socket), or in our build steps has changed as we've upgraded from
one version of clamav to the next. For example, we've built past 
versions of the server against the exact same zlib with no prior 
problems like this one.

I'll also check our patch levels.  I thought we were at the latest 
recommended set as of just a little while ago.

It is at least comforting to hear that others are having success with a 
similar configuration.

-- dNb


Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while

2005-02-17 Thread David Blank-Edelman
On Feb 18, 2005, at 1:40 AM, René Berber wrote:

Nobody is pointing the obvious, the Debug option is not for production
use, it could hang your clamd daemon under load.

Thanks for looking carefully at the config. This was turned on only
after I started having problems in the hopes it would provide something 
more to investigate. I'll turn it off again.

   -- dNb


Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while

2005-02-17 Thread David Blank-Edelman
On Feb 18, 2005, at 2:24 AM, René Berber wrote:

Another idea: exiscan has some known problems, depending on the
version, for instance see the following thread:

http://www.gossamer-threads.com/lists/clamav/users/12973?nohighlight=1

Hmm, that's an interesting thought, though I'm running
exiscan-acl-4.34-21.patch which reportedly fixed that problem.

I'm wondering if I'm seeing a real live instance of this problem:
http://www.gossamer-threads.com/lists/clamav/devel/17069

I haven't tried the patch (waiting to see how the developers respond to
it).

It is a little late here, I'll be doing some more debugging (and some 
rebuilding) when I'm more conscious.

Thanks for your help!
  -- dNb