Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Vincent Fox

On 8/13/2013 9:46 PM, Matt Olney wrote:

OK...I'll do some testing tomorrow and see if we can't come up with some
information for you.

Mainly I want the MX pool heavy on signatures.  I tested a shorter list on
the SMTP pool:


ss_dbs=
   blurl.ndb
   bofhland_malware_URL.ndb
   bofhland_phishing_URL.ndb
   junk.ndb
   jurlbl.ndb
   jurlbla.ndb
   lott.ndb
   phish.ndb
   phishtank.ndb
   rogue.hdb
   sanesecurity.ftm
   scam.ndb
   sigwhitelist.ign2
   spam.ldb
   spamimg.hdb
   winnow_malware.hdb
   winnow_phish_complete_url.ndb


Which got it back down to about 30 seconds.  There are 3 signature
databases that seem to be a large drag on startup: bofhland cracked,
scamnailer, and securiteinfo.
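
A minimal sketch of how one might find which databases are a drag, assuming clamscan is on the PATH and that timing a scan of one tiny file with a single database loaded is a fair proxy for that database's load time. The function names, the sample file, and the injectable `run` and `clock` parameters are hypothetical, chosen only so the logic can be exercised without ClamAV installed.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List, Tuple


def clamscan_command(database: str, sample: str) -> List[str]:
    """Build a clamscan invocation that loads only one database file."""
    return ["clamscan", f"--database={database}", "--no-summary", sample]


def time_databases(
    databases: Iterable[str],
    sample: str,
    run: Callable = subprocess.run,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Return wall-clock seconds per database for scanning `sample`."""
    timings: Dict[str, float] = {}
    for db in databases:
        start = clock()
        # check=False: clamscan exits non-zero when it finds something,
        # which is not an error for timing purposes.
        run(
            clamscan_command(db, sample),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings[db] = clock() - start
    return timings


def slowest_first(timings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (database, seconds) pairs with the slowest database first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)
```

Calling `slowest_first(time_databases(list_of_db_paths, "tiny_sample.eml"))` would then list the heaviest databases at the top; the paths and the sample file name are placeholders.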


Thanks!


___
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Steve Basford

 OK...I'll do some testing tomorrow and see if we can't come up with some
 information for you.

 Matt



 in the last few days a lot of spam is (ab)using t.co shortened URLs in
 the payload, so these are ending up in bofhland_cracked_URL.ndb (~7K
 distinct URLs atm)


Sorry for the cross post...

Hi,

In doing a very small single-file test using bofhland_cracked_URL.ndb,
it took ** 66 seconds ** to scan the file.

Having a quick look at repeating patterns in the file, 77 (www) was
common, so just for testing I tried this...

sed 's/(B)772E/2E/g' bofhland_cracked_URL.ndb > bofhland_cracked_URL_test.ndb

This will remove the beginning boundary check and the www. bit... and
replace it with a single ., which hopefully will be a simple boundary
separator.

If I now scan the same file, but using the bofhland_cracked_URL_test.ndb
database, it only takes ** 5 seconds ** :O

Not sure if this is the workaround... but certainly food for thought.
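
The same substitution can be sketched in Python so that it only touches the hex-signature field of each line. This assumes the usual .ndb layout of Name:TargetType:Offset:HexSignature with optional trailing fields; the function names and the sample signature name are made up for illustration.

```python
from typing import Iterable, Iterator


def rewrite_ndb_line(line: str, old: str, new: str) -> str:
    """Replace `old` with `new` in the fourth (hex signature) field only.

    Lines with fewer than four colon-separated fields are returned
    unchanged, so anything that is not a signature passes through.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split(":")
    if len(fields) < 4:
        return line
    fields[3] = fields[3].replace(old, new)
    # Re-attach whatever newline the original line ended with.
    return ":".join(fields) + line[len(stripped):]


def rewrite_ndb(
    lines: Iterable[str], old: str = "(B)772E", new: str = "2E"
) -> Iterator[str]:
    """Apply the substitution to every line of an .ndb file."""
    for line in lines:
        yield rewrite_ndb_line(line, old, new)


if __name__ == "__main__":
    demo = ["Example.Sig.1:3:*:(B)772E6578616D706C652E636F6D\n"]
    print("".join(rewrite_ndb(demo)), end="")
```

Like the sed command, this replaces every occurrence within the field. Keeping the literal (B) in the search string anchors the match; a bare hex pair search could in principle also hit a match that straddles two bytes, so a test run with clamscan on the rewritten file is still worthwhile.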

Cheers,

Steve
Sanesecurity



Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Steve Basford

 OK...I'll do some testing tomorrow and see if we can't come up with some
 information for you.

Hi Matt

In additional testing:

a) Replacing (B)772E with (B)772E also brings the speed
down... (6.5 secs)

b) Replacing (B)772E with (B)77??772E also brings the speed
down...(10.2 secs)

c) Replacing (B)772E with 772E (w.) also brings the speed down...
(10.5 secs)

very odd.. but maybe option a) could be used, instead of (B)772E
which slows down db loading times.

Cheers,

Steve
Sanesecurity




Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread G.W. Haywood

Hi there,

On Wed, 14 Aug 2013, Vincent Fox wrote:

Re: clamd taking too long to restart?


Previously I was using a short list of signatures and startup time of 30
seconds which was acceptable.  Well it didn't get noticed much.

However recently I added a kitchen sink of extra databases like winnow etc.
Now startup time is 2.5 minutes, which becomes noticeable.


The kitchen sink of databases is very useful, I see more trash being
caught by them than I see viruses being caught by main and daily.


Any way to ameliorate this?


Are you using separate processes on each VM?  If so you might want to
consider using only one of them to run a clamd daemon, and have the
others contact it for the service.  You could conceivably arrange the
clamd daemon to be able to run on any one of the VMs, and then one of
them could be providing the service while another was restarted when
necessary.  When the newly started clamd is ready, switching from one
network connection to another will be very quick.

You could instead do something similar, but set up another two VMs to
provide the clamd service.  Then you could stop the whole VM when it
isn't being used to provide the clamd service, saving resources.  The
VMs which provide clamd could be stripped down so that they're small
and use minimal resources.  I would guess that 200M-300M of RAM and a
gigabyte of disc space would be plenty for one of the VMs, all it will
ever really do is run a few regex matches.

--

73,
Ged.


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Vincent Fox

On 8/14/2013 7:58 AM, G.W. Haywood wrote:

Hi there,

On Wed, 14 Aug 2013, Vincent Fox wrote:

Re: clamd taking too long to restart?


Previously I was using a short list of signatures and startup time of 30
seconds which was acceptable.  Well it didn't get noticed much.

However recently I added a kitchen sink of extra databases like winnow etc.
Now startup time is 2.5 minutes, which becomes noticeable.


The kitchen sink of databases is very useful, I see more trash being
caught by them than I see viruses being caught by main and daily.


Actually, the vast bulk of the problem seems to come from bofhland
Cracked URL.

Removing that database on my SMTP servers cut restart time to 34 seconds.




Any way to ameliorate this?


Are you using separate processes on each VM?  If so you might want to
consider using only one of them to run a clamd daemon, and have the
others contact it for the service.  You could conceivably arrange the
clamd daemon to be able to run on any one of the VMs, and then one of
them could be providing the service while another was restarted when
necessary.  When the newly started clamd is ready, switching from one
network connection to another will be very quick.

You could instead do something similar, but set up another two VMs to
provide the clamd service.  Then you could stop the whole VM when it
isn't being used to provide the clamd service, saving resources. The
VMs which provide clamd could be stripped down so that they're small
and use minimal resources.  I would guess that 200M-300M of RAM and a
gigabyte of disc space would be plenty for one of the VMs, all it will
ever really do is run a few regex matches.


Hmmm yes.

We originally had a pool of mail routers talking to a pool of ClamAV
machines.  A hardware load balancer made things resilient.

However, for simplicity of management we collapsed things down so each mail
router talked to its localhost copy of ClamAV.  It also allows
differentiation: you can easily have differing ClamAV databases for MX,
SMTP, and MSA hosts.  I see now how this led to this particular problem, as
the moment sendmail can't contact its one and only ClamAV it starts
throwing errors.  Stupid of me to overlook this deficiency before.

With LDAP clients I can define a failover list on a host, so if it can't
contact its primary server it goes to the next one.  Perhaps something like
that here?

Thanks for pointing this out.
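
A rough sketch of the failover-list idea, assuming each clamd is reachable over TCP (i.e. TCPSocket is enabled in clamd.conf) and using clamd's PING command as the health check. The host names and helper names are hypothetical, and this is not how sendmail or any milter actually selects a scanner; it only illustrates "try the primary, then the next one".

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Server = Tuple[str, int]


def clamd_ping(server: Server, timeout: float = 2.0) -> bool:
    """Return True if the clamd at (host, port) answers PING with PONG."""
    try:
        with socket.create_connection(server, timeout=timeout) as sock:
            sock.sendall(b"nPING\n")  # newline-delimited clamd command
            return sock.recv(16).strip() == b"PONG"
    except OSError:
        # Refused, unreachable, or timed out: treat as unavailable.
        return False


def first_available(
    servers: Iterable[Server],
    probe: Callable[[Server], bool] = clamd_ping,
) -> Optional[Server]:
    """Walk the list in priority order and return the first live server."""
    for server in servers:
        if probe(server):
            return server
    return None


if __name__ == "__main__":
    # Hypothetical list: local clamd first, then a shared fallback.
    candidates = [("127.0.0.1", 3310), ("clamav-fallback.example", 3310)]
    print(first_available(candidates))
```

With the local clamd listed first, a restart that takes a couple of minutes would only push scans to the fallback for that window.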




Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Matt Olney
OK, we've been able to reproduce the problem and it does, as you all
suspected, revolve around the www. matching.  I've asked one of the
developers to look at it, and we should be able to provide some
best-practice guidelines on how to construct rules to avoid this situation.
We'll also review whether code changes are appropriate, but given how the
tree operates, I don't immediately expect that to be the case.

Matt


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Steve Basford

 OK, we've been able to reproduce the problem and it is, as you all
 suspected revolving around the www. matching.  I've asked one of the
 developers to look at it, and we should be able to provide some
 best-practice guidelines on how to construct rules to avoid this
 situation.

Thanks Matt, glad you'd spotted an issue too.

  We'll also review if code changes are appropriate, but given how the tree
 operates, I don't immediately expect that to be the case.

Out of interest are there any roadmaps/future improvements for ClamAV
that are being discussed, as the last changelog update was May (before the
takeover)?

Cheers,

Steve
Sanesecurity



Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Joel Esler
On Aug 14, 2013, at 2:34 PM, Steve Basford steveb_cla...@sanesecurity.com 
wrote:

 We'll also review if code changes are appropriate, but given how the tree
 operates, I don't immediately expect that to be the case.
 
 Out of interest are there any roadmaps/future improvements for ClamAV
 that are being discussed, as the last changelog update was May (before the
 takeover)?

Steve,

Just to clarify, at this time we’ve just announced Cisco acquiring Sourcefire.  
It takes time for the deal to be approved and go through.

I’ll let Matt speak to the specifics of the roadmap.

--
Joel Esler
Senior Research Engineer, VRT
OpenSource Community Manager
Sourcefire


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread A K Varnell
On Aug 14, 2013, at 1:54 PM, Joel Esler jes...@sourcefire.com wrote:
 On Aug 14, 2013, at 2:34 PM, Steve Basford steveb_cla...@sanesecurity.com 
 wrote:
 
 We'll also review if code changes are appropriate, but given how the tree
 operates, I don't immediately expect that to be the case.
 
 Out of interest are there any roadmaps/future improvements for ClamAV
 that are being discussed, as the last changelog update was May (before the
 takeover)?
 
 Steve,
 
 Just to clarify, at this time we’ve just announced Cisco acquiring 
 Sourcefire.  It takes time for the deal to be approved and go through.
 
 I’ll let Matt speak to the specifics of the roadmap.

So I gather the 0.98 release that was announced back in February is in a 
holding pattern pending final approval once the Cisco acquisition has been 
approved and their processes put into place?


-Al-
-- 
Al Varnell
Mountain View, CA






Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread David Raynor
I've done some analysis of ClamAV with just this signature set, and the
loading is simply slowing down as it runs through the list. This is mainly
because of the significant amounts of overlap at the beginnings of these
strings and the length thereafter. The slowdown is occurring even before
the tries are created (after all signatures are loaded). It has to do with
the way the signatures are getting sorted and managed in an intermediate
state between loading the signature file and the final scanning trie. I
will try to strike a balance between technical and TL;DR but here are some
details.

First, some qualities within the dataset of the current
bofhland_cracked_URL.ndb file:
89 thousand signatures, (really ~44,500 written twice to get them loaded
into the HTML and MAIL targets)
53 thousand start with www. (772E)
14 thousand start with t.co (742E636F)
Each signature is a single unique pattern, within the 44,500. No
subpatterns, no wildcards, nothing for ClamAV to break it into pieces.
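
Counts like the ones above can be gathered with a few lines of Python. This is only a sketch under assumptions: it takes the fourth colon-separated field as the signature body, skips a leading (B) boundary marker if present, and tallies the first few hex characters. The function name and sample lines are invented for illustration.

```python
from collections import Counter
from typing import Iterable


def signature_prefixes(lines: Iterable[str], prefix_len: int = 4) -> Counter:
    """Count the leading hex characters of each signature body."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) < 4:
            continue  # not a Name:Target:Offset:Hex line
        body = fields[3]
        if body.startswith("(B)"):
            body = body[3:]  # ignore the boundary marker when comparing
        counts[body[:prefix_len]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Example.A:3:*:(B)772E6161",
        "Example.B:4:*:(B)772E6262",
        "Example.C:3:*:742E636F2F78",
    ]
    for prefix, count in signature_prefixes(sample).most_common():
        print(prefix, count)
    # Hex bodies decode to plain bytes, e.g. the t.co prefix:
    print(bytes.fromhex("742E636F"))
```

A heavily skewed tally, with one or two prefixes covering most of the file, is the kind of overlap described here.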

Why some wildcard replacement works to gain load speed:
The reason that replacing certain bytes with wildcards works is that
ClamAV can treat that wildcard as a subpattern breakpoint. Even better,
when the initial subpattern is an exact repeat, 100%-matching overlaps
can be handled differently by the code. The best load-time bang for the buck
I got was replacing this: 772E with this: 77{1}
End-to-end clamscan runtime in a VM before the simple replacement:
Time: 62.540 sec (1 m 2 s)
End-to-end clamscan runtime in a VM after the simple replacement:
Time: 2.965 sec (0 m 2 s)
All I did was a replace command in vim.

The trade-off, because everything has a cost:
(1) Mildly less accurate, since the dot has been replaced with 1 of any
character. But with strings that are all this long it should still be very
specific.
(2) More subpatterns equals more memory:
Original report --- LibClamAV debug: pool memory used: 36.734 MB
After replacement --- LibClamAV debug: pool memory used: 48.855 MB
With more subpatterns to track, the extra tracking comes with a price, and
that price is in memory.
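
To put the trade-off in proportion, the figures quoted above can be run through a short calculation. The variable names are just labels for those four reported numbers; nothing here is measured anew.

```python
# Figures as reported above for the 772E -> 77{1} replacement.
run_before_s, run_after_s = 62.540, 2.965      # end-to-end clamscan time
pool_before_mb, pool_after_mb = 36.734, 48.855  # LibClamAV pool memory

speedup = run_before_s / run_after_s
extra_mb = pool_after_mb - pool_before_mb
extra_pct = 100 * extra_mb / pool_before_mb

print(f"speedup: {speedup:.1f}x")                                   # speedup: 21.1x
print(f"extra pool memory: {extra_mb:.3f} MB ({extra_pct:.0f}%)")   # 12.121 MB (33%)
```

So the replacement trades roughly a third more pool memory for a runtime about 21 times shorter on this signature set.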

I'll look a bit more at how we are loading the interim signature state and
see what else we could do with the sorting. Meanwhile, this is a change you
could put into practice now and get faster startup times. Before making any
change on a server directly, you can test a modified DB with clamscan to
see the difference.

My testing VM is 64-bit Debian, if it matters.

Hope this helps,

Dave R.







-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Dennis Peterson

On 8/14/13 2:23:28PM, David Raynor wrote:


I'll look a bit more at how we are loading the interim signature state and
see what else we could do with the sorting. Meanwhile, this is a change you
could put into practice now and get faster startup times. Before making any
change on a server directly, you can test a modified DB with clamscan to
see the difference.

My testing VM is 64-bit Debian, if it matters.

Hope this helps,

Dave R.


I presume reloading signatures into an existing daemon instance is a 
blocking event. Is it possible to instantiate a new instance and migrate 
the socket to that instance once the update is completed, and then kill 
the stale daemon? I really don't care much about load times to be honest 
( 5 minutes on older SPARC systems) but do care about memory as I have 
it running on some pared down but very reliable hardware here and there.


dp


Re: [clamav-users] clamd taking too long to restart?

2013-08-14 Thread Matt Olney
Nope.  0.98 is getting patches applied to it and will then move to QA 
regression and finally to release engineering.  There is a lot going on in
0.98, and we'll have more information once we finalize a build.

Matt


On Wed, Aug 14, 2013 at 5:03 PM, A K Varnell alvarn...@mac.com wrote:


 So I gather the 0.98 release that was announced back in February is in a
 holding pattern pending final approval once the Cisco acquisition has been
 approved and their processes put into place?

