Re: [Clamav-users] high clamd CPU load on Solaris
Hi Nigel-

On Aug 28, 2006, at 6:42 AM, Nigel Horne wrote:

> Repeating advice already given here: the engine in 0.88 is *old*. If
> performance is an issue upgrade to the code in CVS.

Thanks for your response. We'll definitely look at doing that. Just to confirm: you would expect clamd processes to spike to as much as 85% load under load, even on a machine reasonably well provisioned with CPU and memory? If that's normal or expected behavior (which hasn't seemed to be the case for as long as we've been running this great software), we can deal.

-- dNb
___ http://lurker.clamav.net/list/clamav-users.html
[Clamav-users] high clamd CPU load on Solaris
Howdy-

We've recently been seeing our clamd processes run very hot (spiking up to 85% of the CPU as reported by prstat and top) on two different Solaris 9 boxes. For example, here are a few lines from prstat -L (showing the two clamav threads that are together eating 66% of the CPU):

  PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/LWPID
  18447 root 61M 60M run 00 0:27:51 34% clamd/14
  18447 root 61M 60M cpu0200 0:25:40 32% clamd/22

Both machines are running 0.88.2 clamds being fed by exim. We'll upgrade to 0.88.4 shortly, but I haven't seen anything in the release notes that mentions this issue.

I've looked at all of the obvious things I can think of (we are not swapping due to lack of memory as far as I can tell, not out of any resources like file descriptors, and so on), so I was hoping someone here could offer some advice on any other avenues to pursue. I've even done things like moving the temp directory to a tmpfs filesystem, but that hasn't helped the load any.

I've included gdb backtraces for all threads below, along with a copy of the truss counts for the process so you can see which system calls are being made (and which are giving errors, probably normally). I've also included our non-default config options and an ldd of the binary so you can see that we have it linked against almost all standard Solaris libs (all of which I think are patched properly; let me know if there is something I should check there).

If there is anything else I should be checking, please let me know. Thanks for any help you can offer.
-- dNb

GDB:

(gdb) info thread
  5 LWP 63 0xff0858f4 in __lwp_park () from /usr/lib/libthread.so.1
  4 LWP 14 0xff330d98 in getline_from_mbox () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
  3 LWP 62 0xff0858f4 in __lwp_park () from /usr/lib/libthread.so.1
  2 LWP 22 0xff309c98 in cli_ac_scanbuff () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
* 1 LWP 1 0xff01c934 in _so_accept () from /usr/lib/libc.so.1
(gdb) bt
#0 0xff01c934 in _so_accept () from /usr/lib/libc.so.1
#1 0x0001a95c in acceptloop_th ()
#2 0x00018514 in localserver ()
#3 0x00017be0 in clamd ()
#4 0x00016758 in main ()
(gdb) thread 2
[Switching to thread 2 (LWP 22)]#0 0xff309c98 in cli_ac_scanbuff () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
(gdb) bt
#0 0xff309c98 in cli_ac_scanbuff () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#1 0xff30bd50 in cli_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#2 0xff31cca8 in cli_scanraw () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#3 0xff31d864 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#4 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#5 0xff31b134 in cli_scandir () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#6 0xff31cb90 in cli_scanmail () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#7 0xff31d4d8 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#8 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#9 0xff31b134 in cli_scandir () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#10 0xff31cb90 in cli_scanmail () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#11 0xff31d4d8 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#12 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#13 0xff31b134 in cli_scandir () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#14 0xff31cb90 in cli_scanmail () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#15 0xff31d4d8 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#16 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#17 0xff31b134 in cli_scandir () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#18 0xff31cb90 in cli_scanmail () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#19 0xff31d4d8 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#20 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#21 0xff31b134 in cli_scandir () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#22 0xff31cb90 in cli_scanmail () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#23 0xff31d4d8 in cli_magic_scandesc () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#24 0xff31daa4 in cli_scanfile () from /arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1
#25 0xff31b134 in cli_scandir () from
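
The thread 2 backtrace in this message cycles through the same four frames (cli_scandir, cli_scanfile, cli_magic_scandesc, cli_scanmail) several times before it is cut off. A small tally can make that repetition easier to see. The following is only a sketch: the function name `count_frames` and the regular expression are illustrative assumptions, not part of gdb or ClamAV, and the sample is the first ten frames copied from the message. The thread does not say what the repetition means, so the code counts frames and draws no conclusion.

```python
import re
from collections import Counter

# Matches gdb frames of the form "#N 0xADDR in function_name ()".
# Illustrative pattern; it ignores the trailing "from <library>" part.
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\w+) \(\)")


def count_frames(backtrace):
    """Return a Counter mapping function name -> number of frames."""
    return Counter(name for _, name in FRAME_RE.findall(backtrace))


LIB = "/arch/daemons/packages/clamav-0.88.2/lib/libclamav.so.1"
SAMPLE = "\n".join([
    "#0 0xff309c98 in cli_ac_scanbuff () from " + LIB,
    "#1 0xff30bd50 in cli_scandesc () from " + LIB,
    "#2 0xff31cca8 in cli_scanraw () from " + LIB,
    "#3 0xff31d864 in cli_magic_scandesc () from " + LIB,
    "#4 0xff31daa4 in cli_scanfile () from " + LIB,
    "#5 0xff31b134 in cli_scandir () from " + LIB,
    "#6 0xff31cb90 in cli_scanmail () from " + LIB,
    "#7 0xff31d4d8 in cli_magic_scandesc () from " + LIB,
    "#8 0xff31daa4 in cli_scanfile () from " + LIB,
    "#9 0xff31b134 in cli_scandir () from " + LIB,
])

counts = count_frames(SAMPLE)
for name, n in counts.most_common():
    print(f"{n:3d}  {name}")
```

The pattern does not depend on line breaks, so the same function can be pointed at the backtrace whether it is pasted one frame per line or as a single run of text.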
Re: [Clamav-users] clamd on Solaris ceases functioning after a while
On Mar 24, 2005, at 1:12 PM, Elizabeth Schwartz wrote:

> This sounds like exactly what I was experiencing. Did the latest build
> fix it for you? Turning off clamd and running clamav-milter without
> the --external flag seems to have fixed it for me.

Hi Betsy-

See my followup message on March 17th, but the short answer is yes, the CVS version did fix things for us. No issues since that message, either. I would note that we're not using clamav in a milter configuration (we are using exim/exiscan+clamd), so I'm not sure if we've seen precisely the same issue you have experienced.

-- dNb
Re: [Clamav-users] submission of phishing emails
Hi Jeremy-

If you run the message back through SA with the -d or --remove-markup switch, it will undo its encapsulation of the spam message.

-- dNb
Re: [Clamav-users] clamd on Solaris ceases functioning after a while (FIXED)
Howdy-

Now that a week has gone by with absolutely no problems with our clamd hanging, I thought I would write in to provide the good news that I think we have this problem licked. Though we also rev'd exim on Wed, I think it was the upgrade from 0.83 to devel-20050308 that solved our problems.

Many thanks to all the people who helped out with our issues.

-- dNb

P.S. If this message doesn't tempt fate enough to cause our entire mail server to burst into flames, I don't know what will.
Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while
On Feb 18, 2005, at 3:04 AM, Fajar A. Nugraha wrote:

> 4.34? That's old. If I remember correctly, I had some problem with that
> version as well. Use (at least) exim 4.41. That's what I use here, and
> it runs fine. Both Solaris 8 and 9.

Yes, the version of exim is a little behind (they rev'd through to the current number pretty fast), but I'm not convinced the issue is the MTA. My rationale here is that when my clamd gets into its unhappy state, other clients besides my MTA can't talk to it either (e.g. clamdmon, posted on this list a little while back). If other clients start to fail against the daemon, that makes me want to look hard at the daemon and why it started to fail.

I'm willing to believe that an older version of the MTA, even though it works the vast majority of the time, could find the edge condition that exposes a bug (in somebody's code; I'm not pointing a finger at clamav by any means). I don't want to be argumentative, but I'd really like to know where that bug is and preferably squash it without having to change my entire MTA in the hope that it will no longer trigger the bug. I'd be a bit nervous that whatever is leaving clamd in an unresponsive state could surface again later.

I've got to sleep now, but I will work hard tomorrow to try and rebuild clamav to bring it closer to the working configurations already mentioned here. I'm very grateful to everyone who has tried to help so far.

-- dNb

P.S. If you'd like to pass on the build details of your working setup off-list, I'd appreciate it.
___ http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users
Re: [Clamav-users] clamd on Solaris ceases functioning after a while
On Feb 18, 2005, at 2:49 PM, Andy Fiddaman wrote:

> The accept debug will at least tell us if you're running out of file
> descriptors..

Roger.

# ndd /dev/tcp tcp_time_wait_interval
6
# pfiles `pgrep clamd` | grep rlimit
  Current rlimit: 256 file descriptors
  Current rlimit: 256 file descriptors
# netstat -an | grep 3310 | awk '{print $7}' | sort | uniq -c
  10 ESTABLISHED
   1 LISTEN
  33 TIME_WAIT

Also any lines in the /etc/system file which contain 'rlim'

# grep rlim /etc/system
#

(i.e. none)

Here's another interesting thing: right now the process is running a little hot:

load averages: 2.12, 1.89, 1.71    15:41:36
157 processes: 153 sleeping, 4 on cpu
CPU states: 1.0% idle, 55.5% user, 24.9% kernel, 18.7% iowait, 0.0% swap
Memory: 2048M real, 243M free, 1662M swap in use, 7863M swap free

  PID USERNAME THR PRI NICE SIZE  RES STATE  TIME    CPU COMMAND
29368 root       9  59    0  30M  28M sleep 47:18 24.33% clamd

(i.e. the top process)

Important note: it seems to be working just fine at the moment, so this is an example of it working well under load. You can also see that the machine is ok for memory and swap. Here's what it is doing:

# truss -f -c -p 29368
^C
syscall        seconds   calls  errors
read              .140    6427
write             .075     740
open              .152    1618
close             .069    1818
link              .000       2       2
unlink            .081     395       2
time              .000      53
chmod             .007     238
stat              .042     694     237
lseek             .056    5336
getpid            .004     420
fstat             .001      55
access            .000      26
dup               .003     184
times             .069    7616
ioctl             .006     604     604
fcntl             .005     501
lwp_park          .004     163
lwp_unpark        .003     163
rmdir             .079     508     218
mkdir             .145     293
poll              .002     145
lstat             .029     849
sigprocmask       .000      52
mmap              .003      55
munmap            .008      53
yield             .000      13
lwp_create        .000       4
lwp_continue      .000       4
lwp_kill          .000       4       4
llseek            .009     851
lwp_schedctl      .000       4
getdents64        .026     961
lstat64           .026     420     420
fstat64           .029    1709
accept            .005      26
recvmsg           .000      26
               --------  ------  ------
sys totals:      1.096   33030    1487
usr time:        5.360
elapsed:        44.900

Thanks for taking the time to look into this with me.

-- dNb
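
The truss summary in this message lists, per system call, the time spent, the number of calls, and the number of failed calls. One way to pick out which calls fail most often is sketched below. This is an illustration only: the function names `parse_truss_counts` and `error_rates` are hypothetical, and the sample rows are copied from the message under the assumption that fused columns such as `stat .042 694237` read as 694 calls and 237 errors (that reading agrees with the reported totals of 33030 calls and 1487 errors, but it is still a reconstruction).

```python
def parse_truss_counts(text):
    """Parse `truss -c` style rows into {syscall: (seconds, calls, errors)}.

    Rows are expected as: name seconds calls [errors].
    Header, separator, and totals lines are skipped.
    """
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (3, 4):
            continue  # not a data row (e.g. "sys totals: ..." has 5 fields)
        try:
            seconds = float(fields[1])
            calls = int(fields[2])
            errors = int(fields[3]) if len(fields) == 4 else 0
        except ValueError:
            continue  # header line or other non-numeric row
        rows[fields[0]] = (seconds, calls, errors)
    return rows


def error_rates(rows):
    """Return (syscall, errors, calls, rate) for failing calls, worst first."""
    failing = [
        (name, errors, calls, errors / calls)
        for name, (_seconds, calls, errors) in rows.items()
        if errors and calls
    ]
    return sorted(failing, key=lambda row: row[3], reverse=True)


SAMPLE = """\
syscall seconds calls errors
read .140 6427
stat .042 694 237
ioctl .006 604 604
rmdir .079 508 218
sys totals: 1.096 33030 1487
"""

rows = parse_truss_counts(SAMPLE)
for name, errors, calls, rate in error_rates(rows):
    print(f"{name:8s} {errors:5d}/{calls:<5d} {rate:6.1%}")
```

A high error count is not necessarily a problem on its own (the original message from the 2006 thread describes such errors as "probably normally" occurring); the ranking is only a starting point for deciding which calls to look at with a full truss trace.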
[Clamav-users] clamd on Solaris ceases functioning after a while
Hi-

Thanks for such a great program and all of the work being put into it. We're having a nasty problem with clamd 0.8x (even with 0.83, which we just installed yesterday). After running for a while, it will decide to just stop functioning and return failures or refuse connections from the MTA. Here are some specifics:

Solaris 9, gcc built, Solaris 9 stock zlib (1.1.4)

Here's a sample part of our clamd.log:

Tue Feb 15 10:16:43 2005 - SelfCheck: Database status OK.
Tue Feb 15 10:19:53 2005 - /var/spool/exim/scan/1D14UQ-0005c9-DG/1D14UQ-0005c9-DG.eml: Unable to open file or directory ERROR
Tue Feb 15 10:19:53 2005 - Client disconnected
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:19:53 2005 - ERROR: accept() failed
Tue Feb 15 10:25:18 2005 - /var/spool/exim/scan/1D14ZC-0006sJ-9Q/1D14ZC-0006sJ-9Q.eml: Worm.Lovgate.T FOUND

(which eventually turns into all accept() failed, though it doesn't always say this. Sometimes it just reports

Thu Feb 17 13:41:11 2005 - No stats for Database check - forcing reload

as the last line before being autorestarted by my monitoring cronjob)

Our exim logs have shown:

2005-02-17 08:24:23 1D1kTd-0005w0-32 malware acl condition: clamd: unable to read from socket (No such file or directory)

or

2005-02-17 08:24:28 1D1ldr-0004nH-Qm malware acl condition: clamd: connection to 127.0.0.1, port 3310 failed (Bad file number)

When it is in this state, a truss of the process shows several threads apparently continuing to run, but it won't accept new connections, as seen above. I haven't seen any indication in the log that I've reached a threading limit, but I don't know if I should expect one.

I haven't been able to determine a specific pattern to when this happens and I can't seem to get it to repeat at will. The closest thing to a pattern is that I've seen it happen several times when:

a) the server has been started,
b) it hasn't performed a successful SelfCheck yet.

It doesn't always happen in this state (i.e. the first check doesn't always fail). I wish I could tell what the difference was between when it will work and when it will fail. I wonder if it happens to center around load, but I have no data to back that supposition up. I think I've also seen this situation happen (but I'm not sure) after a freshclam update that actually touched the database.

Here are the non-default settings in our config file:

LogFile /priv/log/clamd/clamd.log
LogFileMaxSize 100M
LogTime
LogSyslog
LogFacility LOG_MAIL
LogVerbose
DatabaseDirectory /priv/daemons/packages/clamav-0.83/share/clamav
TCPSocket 3310
TCPAddr 127.0.0.1
MaxConnectionQueueLength 30
StreamMaxLength 20M
MaxThreads 20
# this set to 600 in the hopes I could cause the problem to surface faster, was set to default
SelfCheck 600
Debug
ScanRAR
ArchiveBlockMax

Any suggestions on where to look? Any other information I should gather for you? Should I try the current snapshot?

Thanks for any help you can offer.

-- dNb
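
The message mentions a monitoring cron job that restarts clamd when it stops answering, but does not show it. A liveness probe of the kind such a job might use could look like the sketch below. It relies on clamd's PING command, to which a healthy daemon replies PONG; the function name `clamd_is_alive` and the timeout value are assumptions for illustration, and the default address and port are simply the TCPAddr and TCPSocket values from the config above. This is not the author's script.

```python
import socket


def clamd_is_alive(host="127.0.0.1", port=3310, timeout=5.0):
    """Return True if the daemon at host:port answers PING with PONG.

    Any connection failure, timeout, or unexpected reply counts as "not
    alive", which mirrors the symptom in the thread: the process is still
    running but no longer accepts or answers connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\n")
            reply = sock.recv(64)
    except OSError:  # refused, unreachable, or timed out
        return False
    return reply.strip() == b"PONG"
```

A cron job could call `clamd_is_alive()` and restart the daemon only when it returns False. Because the probe talks to the daemon the same way any client does, it also helps separate "clamd is unresponsive to everyone" from "only the MTA cannot reach it".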
Re: [Clamav-users] clamd on Solaris ceases functioning after a while
From: Igor Brezac [EMAIL PROTECTED]:

> How much memory does your clamd process consume when it stops running?

Hi Igor-

I haven't checked (the machine it is running on has plenty of memory and swap), but I will check next time this happens. Would you be willing to share your build configuration with me off list so I can compare?

-- dNb
Re: [Clamav-users] clamd on Solaris ceases functioning after a while
(sorry for the weird quoting format and the breaking of the threading; I just switched myself back off digest mode, so I don't have an easy way to respond to single messages for another message or two)

From: James Lick [EMAIL PROTECTED]:

> I'm running clamd 0.83 on Solaris 9 compiled with gcc 3.4.2 and zlib
> 1.2.2. The older zlib releases have been known to cause clamd to crash,
> so you might want to try that first. I use clamd over a named socket
> instead of TCP, dunno if that would be a difference. You might also
> want to check if your kernel and thread library patches are up to date.

Hi James-

I'm certainly willing to try a zlib rebuild. We're also building using an earlier version of gcc. I'm reading the mail archive in earnest, but I'm curious whether the problems with the zlib version presented symptoms that resembled mine (e.g. did they crash vs. hang the server?).

The other thing that is peculiar to me is that our problems are happening even though nothing on the machine, in the basic clamd config (i.e. network socket), or in our build steps has changed as we've upgraded from one version of clamav to the next. For example, we've built past versions of the server against the exact same zlib with no prior problems like this one.

I'll also check our patch levels. I thought we were at the latest recommended set as of just a little while ago. It is at least comforting to hear that others are having success with a similar configuration.

-- dNb
Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while
On Feb 18, 2005, at 1:40 AM, René Berber wrote:

> Nobody is pointing the obvious, the Debug option is not for production
> use, it could hang your clamd daemon under load.

Thanks for looking carefully at the config. This was turned on only after I started having problems, in the hopes it would provide something more to investigate. I'll turn it off again.

-- dNb
Re: [Clamav-users] Re: clamd on Solaris ceases functioning after a while
On Feb 18, 2005, at 2:24 AM, René Berber wrote:

> Another idea: exiscan has some known problems, depending on the
> version, for instance see the following thread:
> http://www.gossamer-threads.com/lists/clamav/users/12973?nohighlight=1

Hmm, that's an interesting thought, though I'm running exiscan-acl-4.34-21.patch, which reportedly fixed that problem. I'm wondering if I'm seeing a real live instance of this problem:

http://www.gossamer-threads.com/lists/clamav/devel/17069

I haven't tried the patch (waiting to see how the developers respond to it). It is a little late here; I'll be doing some more debugging (and some rebuilding) when I'm more conscious.

Thanks for your help!

-- dNb