Re: [Clamav-devel] Virus DB Repo
> From: Mohammed Al-Saleh moealsa...@gmail.com
> Is the virus database updated through a repository (for example svn or cvs)?
> I would like to see how the virus database changes over time.

Interesting. I guess there's nothing stopping you from putting the database in a git repo or similar. You could do a commit each time just before calling freshclam.

Edwin's response:
> No.

I found that surprising, I must say... wouldn't it make sense to use git or similar so you could easily revert bad signature additions? :)

Regards,
David.
___
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net
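The "commit just before calling freshclam" idea could be sketched roughly as below. This is only an illustration under stated assumptions: the database directory is already a git repository (`git init` was run there once), `git` and `freshclam` are on the PATH, and the function names `snapshot_commands`, `update_commands`, and `snapshot_then_update` are made up for this example.

```python
import subprocess


def snapshot_commands(db_dir, message):
    """Build the git commands that record the current state of db_dir."""
    return [
        ["git", "-C", db_dir, "add", "--all"],
        # --allow-empty gives one commit per run even if nothing changed.
        ["git", "-C", db_dir, "commit", "--allow-empty", "-m", message],
    ]


def update_commands(db_dir):
    """Build the freshclam call that downloads into the same directory."""
    return [["freshclam", f"--datadir={db_dir}"]]


def snapshot_then_update(db_dir, run=subprocess.run):
    """Commit the database as it is now, then let freshclam update it.

    `run` is injectable so the sequence can be exercised without
    actually invoking git or freshclam.
    """
    commands = snapshot_commands(db_dir, "snapshot before freshclam")
    commands += update_commands(db_dir)
    for cmd in commands:
        run(cmd, check=True)
```

Note that the database files are packed containers, so a repository like this would mostly show that a file changed rather than which signatures changed; seeing per-signature history would likely require unpacking the files before committing.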
Re: [Clamav-devel] Virus DB Repo
On 05/18/2010 03:13 PM, David F. Skoll wrote:
> Edwin's response:
>> No.
> I found that surprising, I must say... wouldn't it make sense to use git or similar

We have a database of signatures and logs of what each update did. Using git (or another VCS) here would just complicate things, since we'd need to synchronize that with our internal DB both ways (in case of reverts).

> so you could easily revert bad signature additions? :)

We can already drop the signature in the next update easily.

Best regards,
--Edwin
Re: [Clamav-devel] Virus DB Repo
I agree that this would be neat as long as the current way is still available. I don't want to have to install git or svn on the servers just to be able to update my clam sigs.

On Tue, May 18, 2010 at 7:13 AM, David F. Skoll d...@roaringpenguin.com wrote:
> wouldn't it make sense to use git or similar so you could easily revert bad signature additions? :)

--
http://volatile-minds.blogspot.com -- blog
http://www.volatileminds.net -- website
Re: [Clamav-devel] Virus DB Repo
On 05/18/2010 03:18 PM, Brandon Perry wrote:
> I agree that this would be neat as long as the current way is still available. I don't want to have to install git or svn on the servers just to be able to update my clam sigs.

Distributing the virus DB via git/svn wouldn't scale; I don't think David was suggesting that.
Re: [Clamav-devel] Adding targets for the bytecode interpreter
On 17 May 2010, at 7:32 pm, Török Edwin wrote:
> A real fix would be to detect the Apple-style universal build (configure does this already), and build both ppc and x86 then. If you open a bug report I'll try to do that for 0.96.2.

OK, filed as bug 2030.
https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2030

Thanks very much for the explanation and the --enable-all-jit-targets fix in the meantime.

Mark
Re: [Clamav-devel] Question
Hi Edwin,

On Apr 27, 2010, at 7:19 AM, Török Edwin wrote:
> On 04/26/2010 10:20 PM, Mohammed Al-Saleh wrote:
>> Hi Edwin, Thanks for your reply. I need to know the cases where ClamAV has performance bottlenecks or issues.
>
> The best way to do that is by measuring it. Read the last part of this reply: http://lurker.clamav.net/message/20081204.212941.c9fa45c2.en.html
>
>> What kind of texts could make ClamAV take more time than usual?
>
> That question is hard to answer, since the signatures change each day, thus the AC trie changes, the prefiltering patterns change ...
>
>> Aho-Corasick and Boyer-Moore might have some situations that cause performance issues.
>
> There is also a prefiltering step now. You can search bugzilla on why it was introduced.
>
>> I might consider doing improvements or study performance impact.
>
> Don't expect it to be easy to make improvements. I spent quite a lot of time on the prefiltering step, and the problem is that some signatures falsely match a lot of times (like 'PE' from the PE signature), but the entire signature usually doesn't. So ClamAV has to stop the trie lookup, test the match, continue the trie lookup lots of times.

My understanding (please correct me if I am wrong) is that the first step in matching (let's ignore the filetype recognition and such) is the prefiltering step. If the filter matches then further matching (using either AC or BM) is needed to make sure that it is not a false positive because the filter could contain more patterns than it should (and the filter matches at most 8 characters of the original signature so the other parts might not match).

I am not sure if I understand your point here and I really want to understand it:

> So ClamAV has to stop the trie lookup, test the match, continue the trie lookup lots of times.

Can you please explain this to me more? If the filter matches but AC or BM does not, would we return back to the filter to continue from the point it matches?

> Although the actual test is fast enough, if it happens a million times it does slow things down. Also the AC and BM are not textbook versions, they contain extensions (like wildcards). It is important that you study the performance with the actual signatures from main/daily.cvd, and on real files (both clean and infected).
>
>> Do you think that this could be a realistic problem to study?
>
> That depends if you have some specific ideas on how to improve AC/BM, or you just want to try improving it, and give up if it's not possible.
>
> Best regards,
> --Edwin

Thanks much,
~Moe
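The two-stage idea described in this message (a cheap filter first, then a full match only when the filter fires) might be pictured with the toy sketch below. It is not ClamAV's filter: it uses a plain substring test on the first 8 bytes of each signature as a stand-in, and `FILTER_LEN`, `build_filter`, `filter_passes`, `full_match`, and `scan_with_filter` are names invented for the example. The only property it is meant to show is that the filter may pass data that no full signature matches, but never rejects data that a full signature does match.

```python
FILTER_LEN = 8  # the thread says the filter covers at most 8 characters


def build_filter(signatures):
    """Keep only a short piece (here: the prefix) of every signature."""
    return {sig[:FILTER_LEN] for sig in signatures}


def filter_passes(prefixes, data):
    """Cheap first stage: does any short piece occur in the data?"""
    return any(prefix in data for prefix in prefixes)


def full_match(signatures, data):
    """Expensive second stage: which complete signatures occur?"""
    return [sig for sig in signatures if sig in data]


def scan_with_filter(signatures, prefixes, data):
    """Run the full matcher only when the filter fires."""
    if not filter_passes(prefixes, data):
        return []  # rejected cheaply, full matcher never runs
    return full_match(signatures, data)
```

With a signature such as `b"EVIL_PAYLOAD_0001"`, the data `b"xxEVIL_PAYxx"` passes the filter but produces no full match, which is the false-positive case the second stage exists to weed out.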
Re: [Clamav-devel] Question
On 05/18/2010 09:09 PM, Mohammed Al-Saleh wrote:
> My understanding (please correct me if I am wrong) is that the first step in matching (let's ignore the filetype recognition and such) is the prefiltering step. If the filter matches then further matching (using either AC or BM) is needed to make sure that it is not a false positive because the filter could contain more patterns than it should (and the filter matches at most 8 characters of the original signature so the other parts might not match).

Yes.

> I am not sure if I understand your point here and I really want to understand it:
>> So ClamAV has to stop the trie lookup, test the match, continue the trie lookup lots of times.
> Can you please explain this to me more? If the filter matches but AC or BM does not, would we return back to the filter to continue from the point it matches?

No, I was referring to how AC works. After the AC trie detects a match it needs to check it: the AC trie contains only a tiny part of the entire signature (up to ac_max_depth), and the trie itself doesn't contain wildcards etc.

Best regards,
--Edwin
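To make the "stop the trie lookup, test the match, continue" cost concrete, here is a small sketch. It is not a real Aho-Corasick automaton (there are no failure links, and the trie is simply walked from every offset), and it ignores wildcards entirely. `AC_MAX_DEPTH = 2` is an illustrative value standing in for the `ac_max_depth` mentioned above, and `build_trie` and `scan_with_trie` are hypothetical names. The trie stores only the first few bytes of each signature, so every trie hit has to be verified against the whole signature, and the sketch counts how often that happens.

```python
AC_MAX_DEPTH = 2  # illustrative only; stands in for ac_max_depth


def build_trie(signatures, max_depth=AC_MAX_DEPTH):
    """Insert only the first max_depth bytes of each signature."""
    root = {}
    for sig in signatures:
        node = root
        for byte in sig[:max_depth]:
            node = node.setdefault(byte, {})
        # Byte keys are ints, so the string key "sigs" cannot collide.
        node.setdefault("sigs", []).append(sig)
    return root


def scan_with_trie(data, root):
    """Return (matches, checks): real matches and verification count."""
    matches, checks = [], 0
    for start in range(len(data)):
        node = root
        for offset in range(start, len(data)):
            node = node.get(data[offset])
            if node is None:
                break  # no signature prefix continues this way
            for sig in node.get("sigs", ()):
                # Trie hit: pause the walk and test the full signature.
                checks += 1
                if data.startswith(sig, start):
                    matches.append((start, sig))
    return matches, checks
```

Scanning `b"PE" * 1000 + b"PE\x00\x00MALWARE"` for the single signature `b"PE\x00\x00MALWARE"` performs 1001 verifications to find one real match, which is the "falsely match a lot of times" situation described earlier in the thread.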