Re: [clamav-users] clamd taking too long to restart?
On 8/13/2013 9:46 PM, Matt Olney wrote:
> OK...I'll do some testing tomorrow and see if we can't come up with some information for you.

Mainly I want MX pool heavy on signatures. I tested shorter list on SMTP pool:

ss_dbs=
    blurl.ndb
    bofhland_malware_URL.ndb
    bofhland_phishing_URL.ndb
    junk.ndb
    jurlbl.ndb
    jurlbla.ndb
    lott.ndb
    phish.ndb
    phishtank.ndb
    rogue.hdb
    sanesecurity.ftm
    scam.ndb
    sigwhitelist.ign2
    spam.ldb
    spamimg.hdb
    winnow_malware.hdb
    winnow_phish_complete_url.ndb

Which got it back down to about 30 seconds.

There are 3 signatures that seem a large drag on startup: bofhland cracked, scamnailer, and securiteinfo

Thanks!
___
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
Re: [clamav-users] clamd taking too long to restart?
> OK...I'll do some testing tomorrow and see if we can't come up with some information for you.
> Matt

in the last few days a lot of spam is (ab)using t.co shortened URLs in the payload, so these are ending up in bofhland_cracked_URL.ndb (~7K distinct URLs atm)

Sorry for the cross post...

Hi,

In doing a very small single-file test using bofhland_cracked_URL.ndb, it took ** 66 seconds ** to scan the file.

Having a quick look at repeating patterns in the file, 77 (www) was common, so just for testing I tried this...

    sed 's/(B)772E/2E/g' bofhland_cracked_URL.ndb > bofhland_cracked_URL_test.ndb

This removes the beginning boundary check and the www. bit, and replaces them with a single ., which hopefully will act as a simple boundary separator.

If I now scan the same file, but using the bofhland_cracked_URL_test.ndb database, it only takes ** 5 seconds ** :O

Not sure if this is the workaround... but certainly food for thought.

Cheers,
Steve
Sanesecurity
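For anyone wanting to repeat this kind of before/after comparison, here is a minimal Python sketch. It is not something from the thread. It assumes clamscan is on the PATH and relies on its -d/--database option to load a single database file. The helper names (build_clamscan_cmd, time_command) are made up for illustration.

```python
import subprocess
import time


def build_clamscan_cmd(database, target):
    """Build a clamscan command line that loads only the given database."""
    # -d/--database points clamscan at one signature file (or directory)
    # instead of the default database directory.
    return ["clamscan", "-d", database, target]


def time_command(cmd):
    """Run cmd and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode
```

Calling time_command(build_clamscan_cmd("bofhland_cracked_URL.ndb", "sample.eml")) and then the same with the _test database should show the gap described above (sample.eml is a hypothetical test file). Note that clamscan exits with 1 when it finds a match, so a non-zero exit code is not necessarily a failure.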
Re: [clamav-users] clamd taking too long to restart?
> OK...I'll do some testing tomorrow and see if we can't come up with some information for you.

Hi Matt

In additional testing:

a) Replacing (B)772E with (B)772E also brings the speed down... (6.5 secs)
b) Replacing (B)772E with (B)77??772E also brings the speed down...(10.2 secs)
c) Replacing (B)772E with 772E (w.) also brings the speed down... (10.5 secs)

very odd.. but maybe option a) could be used, instead of (B)772E which slows down db loading times.

Cheers,
Steve
Sanesecurity
Re: [clamav-users] clamd taking too long to restart?
Hi there,

On Wed, 14 Aug 2013, Vincent Fox wrote:
> Re: clamd taking too long to restart?
>
> Previously I was using a short list of signatures and startup time of 30 seconds which was acceptable. Well it didn't get noticed much. However recently I added a kitchen sink of extra databases like winnow etc. Now startup time is 2.5 minutes, which becomes noticeable. The kitchen sink of databases is very useful, I see more trash being caught by them than I see viruses being caught by main and daily.
>
> Any way to ameliorate this?

Are you using separate processes on each VM? If so you might want to consider using only one of them to run a clamd daemon, and have the others contact it for the service. You could conceivably arrange the clamd daemon to be able to run on any one of the VMs, and then one of them could be providing the service while another was restarted when necessary. When the newly started clamd is ready, switching from one network connection to another will be very quick.

You could instead do something similar, but set up another two VMs to provide the clamd service. Then you could stop the whole VM when it isn't being used to provide the clamd service, saving resources. The VMs which provide clamd could be stripped down so that they're small and use minimal resources. I would guess that 200M-300M of RAM and a gigabyte of disc space would be plenty for one of the VMs, all it will ever really do is run a few regex matches.

--
73,
Ged.
Re: [clamav-users] clamd taking too long to restart?
On 8/14/2013 7:58 AM, G.W. Haywood wrote:
> Hi there,
>
> On Wed, 14 Aug 2013, Vincent Fox wrote:
>> Re: clamd taking too long to restart?
>>
>> Previously I was using a short list of signatures and startup time of 30 seconds which was acceptable. Well it didn't get noticed much. However recently I added a kitchen sink of extra databases like winnow etc. Now startup time is 2.5 minutes, which becomes noticeable. The kitchen sink of databases is very useful, I see more trash being caught by them than I see viruses being caught by main and daily.

Actually the vast bulk of the problem seems to come from bofhland Cracked URL. Removing that database on my SMTP servers cut restart time to 34 seconds.

>> Any way to ameliorate this?
>
> Are you using separate processes on each VM? If so you might want to consider using only one of them to run a clamd daemon, and have the others contact it for the service. You could conceivably arrange the clamd daemon to be able to run on any one of the VMs, and then one of them could be providing the service while another was restarted when necessary. When the newly started clamd is ready, switching from one network connection to another will be very quick.
>
> You could instead do something similar, but set up another two VMs to provide the clamd service. Then you could stop the whole VM when it isn't being used to provide the clamd service, saving resources. The VMs which provide clamd could be stripped down so that they're small and use minimal resources. I would guess that 200M-300M of RAM and a gigabyte of disc space would be plenty for one of the VMs, all it will ever really do is run a few regex matches.

Hmmm yes. We originally had a pool of mail routers talking to a pool of ClamAV machines. A hardware load balancer made things resilient. However, for simplicity of management we collapsed things down so each mail router talked to its localhost copy of ClamAV. It also allows differentiation: you can easily have differing ClamAV databases for MX, SMTP, MSA hosts.

I see now how this led to this particular problem, as the moment sendmail can't contact its one and only ClamAV it starts throwing errors. Stupid of me to overlook this deficiency before.

With LDAP clients I can define a failover list on a host, so if it can't contact its primary server it goes to the next one. Perhaps something like that here?

Thanks for pointing this out.
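The failover-list idea could look roughly like the following Python sketch. Treat it as an illustration under stated assumptions, not as how sendmail or any milter actually behaves: it assumes each clamd is reachable over TCP, and it uses clamd's PING command (sent here in the NUL-terminated "z" form) as the health check. The function names and the server list are hypothetical.

```python
import socket


def clamd_ping(host, port, timeout=2.0):
    """Return True if a clamd at host:port answers PING with PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # clamd accepts NUL-terminated commands prefixed with 'z'.
            conn.sendall(b"zPING\0")
            reply = conn.recv(16)
    except OSError:
        # Refused, unreachable, or timed out: treat as unavailable.
        return False
    return reply.rstrip(b"\0\n") == b"PONG"


def first_available(servers, probe=clamd_ping):
    """Return the first (host, port) for which probe succeeds, else None."""
    for host, port in servers:
        if probe(host, port):
            return (host, port)
    return None
```

A caller would pass an ordered list such as [("127.0.0.1", 3310), ("clamd2.example.org", 3310)] (hypothetical hosts, 3310 being the usual clamd TCP port) and scan against whichever entry comes back. A clamd that is still loading signatures would simply fail the probe and be skipped.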
Re: [clamav-users] clamd taking too long to restart?
OK, we've been able to reproduce the problem and it is, as you all suspected, revolving around the www. matching. I've asked one of the developers to look at it, and we should be able to provide some best-practice guidelines on how to construct rules to avoid this situation.

We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.

Matt
Re: [clamav-users] clamd taking too long to restart?
> OK, we've been able to reproduce the problem and it is, as you all suspected revolving around the www. matching. I've asked one of the developers to look at it, and we should be able to provide some best-practice guidelines on how to construct rules to avoid this situation.

Thanks Matt, glad you'd spotted an issue too.

> We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.

Out of interest are there any roadmaps/future improvements for ClamAV that are being discussed, as the last changelog update was May (before the takeover)?

Cheers,
Steve
Sanesecurity
Re: [clamav-users] clamd taking too long to restart?
On Aug 14, 2013, at 2:34 PM, Steve Basford steveb_cla...@sanesecurity.com wrote:
>> We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.
>
> Out of interest are there any roadmaps/future improvements for ClamAV that are being discussed, as the last changelog update was May (before the takeover)?

Steve,

Just to clarify, at this time we’ve just announced Cisco acquiring Sourcefire. It takes time for the deal to be approved and go through. I’ll let Matt speak to the specifics of the roadmap.

--
Joel Esler
Senior Research Engineer, VRT
OpenSource Community Manager
Sourcefire
Re: [clamav-users] clamd taking too long to restart?
On Aug 14, 2013, at 1:54 PM, Joel Esler jes...@sourcefire.com wrote:
> On Aug 14, 2013, at 2:34 PM, Steve Basford steveb_cla...@sanesecurity.com wrote:
>>> We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.
>>
>> Out of interest are there any roadmaps/future improvements for ClamAV that are being discussed, as the last changelog update was May (before the takeover)?
>
> Steve,
>
> Just to clarify, at this time we’ve just announced Cisco acquiring Sourcefire. It takes time for the deal to be approved and go through. I’ll let Matt speak to the specifics of the roadmap.

So I gather the 0.98 release that was announced back in February is in a holding pattern pending final approval once the Cisco acquisition has been approved and their processes put into place?

-Al-
--
Al Varnell
Mountain View, CA
Re: [clamav-users] clamd taking too long to restart?
I've done some analysis of ClamAV with just this signature set, and the loading is simply slowing down as it runs through the list. This is mainly because of the significant amounts of overlap at the beginnings of these strings and the length thereafter. The slowdown is occurring even before the tries are created (after all signatures are loaded). It has to do with the way the signatures are getting sorted and managed in an intermediate state between loading the signature file and the final scanning trie.

I will try to strike a balance between technical and TL;DR but here are some details.

First, some qualities within the dataset of the current bofhland_cracked_URL.ndb file:

    89 thousand signatures (really ~44,500 written twice to get them loaded into the HTML and MAIL targets)
    53 thousand start with www. (772E)
    14 thousand start with t.co (742E636F)
    Each signature is a single unique pattern, within the 44,500. No subpatterns, no wildcards, nothing for ClamAV to break it into pieces.

Why some wildcard replacement works to gain load speed:

The reason that replacing certain bytes with wildcards works is that ClamAV can treat that wildcard as a subpattern breakpoint. Even better, when the initial subpattern is an exact repeat, 100%-matching overlaps can be handled differently by the code.

Best loadtime bang-for-the-buck I got was replacing this:

    772E

with this:

    77{1}

End-to-end clamscan runtime in a VM before the simple replacement:

    Time: 62.540 sec (1 m 2 s)

End-to-end clamscan runtime in a VM after the simple replacement:

    Time: 2.965 sec (0 m 2 s)

All I did was a replace command in vim.

The trade-off, because everything has a cost:

(1) Mildly less accurate, since the dot has been replaced with 1 of any character. But with strings that are all this long it should still be very specific.

(2) More subpatterns equals more memory:

    Original report   --- LibClamAV debug: pool memory used: 36.734 MB
    After replacement --- LibClamAV debug: pool memory used: 48.855 MB

With more subpatterns to track, the extra tracking comes with a price, and that price is in memory.

I'll look a bit more at how we are loading the interim signature state and see what else we could do with the sorting. Meanwhile, this is a change you could put into practice now and get faster startup times. Before making any change on a server directly, you can test a modified DB with clamscan to see the difference. My testing VM is 64-bit Debian, if it matters.

Hope this helps,

Dave R.

On Wed, Aug 14, 2013 at 12:40 PM, Matt Olney mol...@sourcefire.com wrote:
> OK, we've been able to reproduce the problem and it is, as you all suspected revolving around the www. matching. I've asked one of the developers to look at it, and we should be able to provide some best-practice guidelines on how to construct rules to avoid this situation.
>
> We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.
>
> Matt

--
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com
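The prefix counts and the 772E to 77{1} substitution can be sketched in Python as below. This is a rough illustration, not the vim command Dave ran. It assumes the usual .ndb layout of Name:Target:Offset:HexSignature, and it assumes only a leading 772E (optionally preceded by the (B) boundary marker mentioned earlier in the thread) should change, which a plain global replace would not guarantee. Both function names are made up.

```python
from collections import Counter


def hex_body(line):
    """Return (fields, marker, body) for one .ndb line, or None if malformed."""
    fields = line.rstrip("\n").split(":")
    if len(fields) < 4:
        return None  # comment, blank line, or anything else we do not parse
    hexsig = fields[3]
    marker = "(B)" if hexsig.startswith("(B)") else ""
    return fields, marker, hexsig[len(marker):]


def prefix_counts(lines, prefixes=("772E", "742E636F")):
    """Count how many signatures start with each hex prefix."""
    counts = Counter()
    for line in lines:
        parsed = hex_body(line)
        if parsed is None:
            continue
        for prefix in prefixes:
            if parsed[2].startswith(prefix):
                counts[prefix] += 1
    return counts


def rewrite_signature(line, old="772E", new="77{1}"):
    """Replace a leading `old` in the hex field with `new`; else return as-is."""
    parsed = hex_body(line)
    if parsed is None:
        return line.rstrip("\n")
    fields, marker, body = parsed
    if body.startswith(old):
        fields[3] = marker + new + body[len(old):]
    return ":".join(fields)
```

Going by the figures above, the substitution buys roughly a 21x faster end-to-end run (62.540 s down to 2.965 s) for about 12 MB, or 33%, more pool memory (36.734 MB up to 48.855 MB). As Dave says, try the rewritten file with clamscan before touching a production server.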
Re: [clamav-users] clamd taking too long to restart?
On 8/14/13 2:23:28PM, David Raynor wrote:
> I'll look a bit more at how we are loading the interim signature state and see what else we could do with the sorting. Meanwhile, this is a change you could put into practice now and get faster startup times. Before making any change on a server directly, you can test a modified DB with clamscan to see the difference. My testing VM is 64-bit Debian, if it matters.
>
> Hope this helps,
>
> Dave R.

I presume reloading signatures into an existing daemon instance is a blocking event. Is it possible to instantiate a new instance and migrate the socket to that instance once the update is completed, and then kill the stale daemon?

I really don't care much about load times to be honest ( 5 minutes on older SPARC systems) but do care about memory as I have it running on some pared down but very reliable hardware here and there.

dp
Re: [clamav-users] clamd taking too long to restart?
Nope. 0.98 is getting patches applied to it and will then move to QA regression and finally to release engineering. There is a lot going on in 0.98, and we'll have more information once we finalize a build.

Matt

On Wed, Aug 14, 2013 at 5:03 PM, A K Varnell alvarn...@mac.com wrote:
> On Aug 14, 2013, at 1:54 PM, Joel Esler jes...@sourcefire.com wrote:
>> On Aug 14, 2013, at 2:34 PM, Steve Basford steveb_cla...@sanesecurity.com wrote:
>>>> We'll also review if code changes are appropriate, but given how the tree operates, I don't immediately expect that to be the case.
>>>
>>> Out of interest are there any roadmaps/future improvements for ClamAV that are being discussed, as the last changelog update was May (before the takeover)?
>>
>> Steve,
>>
>> Just to clarify, at this time we’ve just announced Cisco acquiring Sourcefire. It takes time for the deal to be approved and go through. I’ll let Matt speak to the specifics of the roadmap.
>
> So I gather the 0.98 release that was announced back in February is in a holding pattern pending final approval once the Cisco acquisition has been approved and their processes put into place?
>
> -Al-
> --
> Al Varnell
> Mountain View, CA