Re: [Clamav-devel] Virus DB Repo

2010-05-18 Thread David F. Skoll
> From: Mohammed Al-Saleh moealsa...@gmail.com
>
> Is the virus database updated through a repository (for example svn or cvs)?
> I would like to see how the virus database changes over time.

Interesting.  I guess there's nothing stopping you from putting the
database in a git repo or similar.  You could do a commit each time just
before calling freshclam.

Edwin's response:

> No.

I found surprising, I must say... wouldn't it make sense to use git or
similar so you could easily revert bad signature additions? :)

Regards,

David.
___
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net
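
David's suggestion above (commit the database to git just before each
freshclam run) could be sketched roughly as follows. This is only an
illustration under assumptions: the helper names and the commit message are
made up, the database directory is assumed to already be a git repository,
and the .cvd files are packed containers, so readable diffs would probably
need the signatures unpacked first.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_commands(db_dir, stamp):
    """Return the git commands that record the current state of db_dir."""
    return [
        ["git", "-C", db_dir, "add", "-A"],
        # --allow-empty keeps the commit from failing when nothing changed
        ["git", "-C", db_dir, "commit", "--allow-empty",
         "-m", f"Snapshot before freshclam {stamp}"],
    ]


def snapshot_then_update(db_dir, run=subprocess.run):
    """Commit the database directory, then let freshclam update it.

    `run` is injectable so the sequence can be exercised without
    git or freshclam being installed.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    for cmd in snapshot_commands(db_dir, stamp):
        run(cmd, check=True)
    run(["freshclam", f"--datadir={db_dir}"], check=True)
```

Run from cron in place of a bare freshclam call, this would leave one commit
per update attempt, which is enough to see how the database changes over time.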


Re: [Clamav-devel] Virus DB Repo

2010-05-18 Thread Török Edwin
On 05/18/2010 03:13 PM, David F. Skoll wrote:
>> From: Mohammed Al-Saleh moealsa...@gmail.com
>>
>> Is the virus database updated through a repository (for example svn or cvs)?
>> I would like to see how the virus database changes over time.
>
> Interesting.  I guess there's nothing stopping you from putting the
> database in a git repo or similar.  You could do a commit each time just
> before calling freshclam.
>
> Edwin's response:
>
>> No.
>
> I found surprising, I must say... wouldn't it make sense to use git or
> similar

We have a database of signatures and logs of what each update did.
Using git (or another VCS) here would just complicate things, since we'd
need to synchronize that with our internal DB both ways (in case of
reverts).

> so you could easily revert bad signature additions? :)

We can already drop the signature in the next update easily.

Best regards,
--Edwin


Re: [Clamav-devel] Virus DB Repo

2010-05-18 Thread Brandon Perry
I agree that this would be neat as long as the current way is still
available. I don't want to have to install git or svn on the servers just to
be able to update my clam sigs.

-- 
http://volatile-minds.blogspot.com -- blog
http://www.volatileminds.net -- website


Re: [Clamav-devel] Virus DB Repo

2010-05-18 Thread Török Edwin
On 05/18/2010 03:18 PM, Brandon Perry wrote:
> I agree that this would be neat as long as the current way is still
> available. I don't want to have to install git or svn on the servers just to
> be able to update my clam sigs.

Distributing the virus DB via git/svn wouldn't scale; I don't think
David was suggesting that.

 


Re: [Clamav-devel] Adding targets for the bytecode interpreter

2010-05-18 Thread Mark Allan


On 17 May 2010, at 7:32 pm, Török Edwin wrote:
> A real fix would be to detect the Apple-style universal build (configure
> does this already), and build both ppc and x86 then.
> If you open a bug report I'll try to do that for 0.96.2.


OK, filed as bug 2030.  https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2030

Thanks very much for the explanation and the --enable-all-jit-targets
fix in the meantime.


Mark


Re: [Clamav-devel] Question

2010-05-18 Thread Mohammed Al-Saleh
Hi Edwin,

On Apr 27, 2010, at 7:19 AM, Török Edwin wrote:

> On 04/26/2010 10:20 PM, Mohammed Al-Saleh wrote:
>> Hi Edwin,
>>
>> Thanks for your reply.
>> I need to know the cases where ClamAV has performance bottlenecks or issues.
>
> The best way to do that is by measuring it.
> Read the last part of this reply:
> http://lurker.clamav.net/message/20081204.212941.c9fa45c2.en.html
>
>> What kind of texts could make ClamAV take more time than usual?
>
> That question is hard to answer, since the signatures change each day,
> thus the AC trie changes, the prefiltering patterns change ...
>
>> Aho-Corasick and Boyer-Moore might have some situations that cause
>> performance issues.
>
> There is also a prefiltering step now.
> You can search bugzilla on why it was introduced.
>
>> I might consider doing improvements or study performance impact.
>
> Don't expect it to be easy to make improvements.
>
> I spent quite a lot of time on the prefiltering step, and the problem is
> that some signatures falsely match a lot of times (like 'PE' from the PE
> signature), but the entire signature usually doesn't.
> So ClamAV has to stop the trie lookup, test the match, continue the trie
> lookup lots of times.

My understanding (please correct me if I am wrong) is that the first step in
matching (let's ignore the filetype recognition and such) is the prefiltering
step.
If the filter matches, then further matching (using either AC or BM) is needed
to make sure that it is not a false positive, because the filter could contain
more patterns than it should (and the filter matches at most 8 characters of
the original signature, so the other parts might not match).
I am not sure I understand your point here, and I really want to understand
it:

> So ClamAV has to stop the trie lookup, test the match, continue the trie
> lookup lots of times.

Can you please explain this to me more?
If the filter matches but AC or BM does not, would we go back to the filter
and continue from the point where it matched?


> Although the actual test is fast enough, if it happens a million times
> it does slow things down.
>
> Also the AC and BM are not textbook versions, they contain extensions
> (like wildcards).
> It is important that you study the performance with the actual
> signatures from main/daily.cvd, and on real files (both clean and infected).
>
>> Do you think that this could be a realistic problem to study?
>
> That depends if you have some specific ideas on how to improve AC/BM, or
> you just want to try improving it, and give up if it's not possible.
>
> Best regards,
> --Edwin

Thanks much,

~Moe



Re: [Clamav-devel] Question

2010-05-18 Thread Török Edwin
On 05/18/2010 09:09 PM, Mohammed Al-Saleh wrote:
 
> My understanding (please correct me if I am wrong) is that the first step in
> matching (let's ignore the filetype recognition and such) is the prefiltering
> step.
> If the filter matches then further matching (using either AC or BM) is needed
> to make sure that it is not a false positive because the filter could contain
> more patterns than it should (and the filter matches at most 8 characters of
> the original signature so the other parts might not match).

Yes.
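
The filter-then-verify flow confirmed here can be illustrated with a toy
sketch. It is not ClamAV's filter, which is a different data structure: a
plain set of leading fragments stands in for the filter, and Python's
substring search stands in for the AC/BM matchers. All names are
hypothetical; only the 8-character limit comes from the description above.

```python
FRAGMENT_LEN = 8  # the filter covers at most 8 characters of a signature


def build_filter(signatures, fragment_len=FRAGMENT_LEN):
    """Keep only a short leading fragment of every signature."""
    return {pattern[:fragment_len] for pattern in signatures.values()}


def filter_may_match(data, fragments):
    """Cheap test: True if any fragment occurs, so a full match is still possible."""
    return any(fragment in data for fragment in fragments)


def scan(data, signatures, fragments):
    """Run the exact (expensive) matcher only when the filter cannot rule data out."""
    if not filter_may_match(data, fragments):
        return []  # no fragment present, so no full signature can be present
    # The filter saw only a fragment; the rest of the signature may still differ.
    return sorted(name for name, pattern in signatures.items() if pattern in data)
```

With a signature b"ABCDEFGHIJKL", the input b"xxABCDEFGHxx" passes the filter
but fails the full check, which is exactly the false-positive case described.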

> I am not sure if I understand your point here and I really want to understand
> it:
>> So ClamAV has to stop the trie lookup, test the match, continue the trie
>> lookup lots of times.
> Can you please explain this to me more?
> If the filter matches but AC or BM does not, would we return back to the
> filter to continue from the point it matches?

No, I was referring to how AC works.

After the AC trie detects a match, it needs to check it: the AC trie
contains only a tiny part of the entire signature (up to ac_max_depth),
and the trie itself doesn't contain wildcards etc.
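
To make the cost of that check concrete, here is a simplified sketch. It is
not a real Aho-Corasick trie: a dictionary keyed by the first MAX_DEPTH bytes
of each signature stands in for it, and wildcards are left out entirely.
MAX_DEPTH is a hypothetical stand-in for ac_max_depth, and signatures are
assumed to be at least that long.

```python
from collections import defaultdict

MAX_DEPTH = 2  # hypothetical value, standing in for ac_max_depth


def build_index(signatures, max_depth=MAX_DEPTH):
    """Map each signature's first max_depth bytes to the signatures sharing them."""
    index = defaultdict(list)
    for name, pattern in signatures.items():
        index[pattern[:max_depth]].append((name, pattern))
    return index


def scan(data, index, max_depth=MAX_DEPTH):
    """Return (confirmed matches, number of candidate hits that had to be verified)."""
    confirmed, candidates = [], 0
    for pos in range(len(data) - max_depth + 1):
        for name, pattern in index.get(data[pos:pos + max_depth], ()):
            candidates += 1  # short prefix matched; the lookup pauses to verify
            if data.startswith(pattern, pos):
                confirmed.append((name, pos))
    return confirmed, candidates
```

A signature starting with 'PE' makes every 'PE' in the file a candidate, so
the verify step runs many times even though the full signature rarely matches.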

Best regards,
--Edwin