Re: [Clamav-devel] clamav daily.cvd definition file size history

2015-08-25 Thread David Raynor
We do not have a publicly accessible history of all daily.cvd files. I
think the mirrors focus on serving the current versions of files. Given
what you said about downloading cvd files and not cdiff files, this should
give you a reasonable approximation for your purposes.

Based on my notes for 2015, daily.cvd was approximately:
- 32M Jan 1 to Mar 5 [version 19864 to 20152]
- 33M Mar 5 to May 4 [version 20153 to 20415]
- 34M May 4 to Jun 9 [version 20416 to 20576]
- 35M Jun 17 to Jul 10 [version 20577 to 20674]
- 36M Jul 10 to Aug 10 [version 20651 to 20773]
- 37M Aug 10 to current day [version 20774+]

Hope this helps,

Dave R.

On Thu, Aug 20, 2015 at 12:09 PM, Darrell Dwelley wrote:

> I am involved in a project in which I need to calculate the bandwidth usage
> of a device over the past six months. The device includes clamav, and it
> pulls the daily.cvd definition file from a local repo. That local repo has no
> activity history from which I can deduce the daily.cvd file size history over
> the period of interest. I cannot find a public mirror that simply lists
> daily.cvd files. Where can I get a simple list of the daily.cvd release
> history that includes the daily file size from mid Feb to the present?
>
> ___
> http://lurker.clamav.net/list/clamav-devel.html
> Please submit your patches to our Bugzilla: http://bugs.clamav.net
>
> http://www.clamav.net/contact.html#ml
>



-- 
---
Dave Raynor
Talos Security Intelligence and Research Group
dray...@sourcefire.com
___
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net

http://www.clamav.net/contact.html#ml


Re: [Clamav-devel] enabling DMG and XAR support

2014-03-19 Thread David Raynor
On Wed, Mar 19, 2014 at 11:34 AM, Rafael Ferreira wrote:

> Interesting... let me run some tests and get back to you.
>
> On Mar 19, 2014, at 8:33 AM, Mark Allan  wrote:
>
> > Just out of interest, did you test to see if it *actually* worked?
> >
> > My configure output shows that dmg and xar are supported, but it doesn't
> > actually detect the Eicar test file within a disk image.
> >
> > configure: Summary of engine detection features
> >  autoit_ea06 : yes
> >  bzip2   : ok
> >  zlib: /usr
> >  unrar   : yes
> >  dmg and xar : yes, from /usr
> >
> > When I create a new disk image, copy the Eicar test file in, and scan
> > the dmg, it shows up as being clean.
> >
> >> clamscan test.dmg
> >> test.dmg: OK
> >>
> >> --- SCAN SUMMARY ---
> >> Known viruses: 3259558
> >> Engine version: 0.98.1
> >> Scanned directories: 0
> >> Scanned files: 1
> >> Infected files: 0
> >> Data scanned: 10.07 MB
> >> Data read: 10.02 MB (ratio 1.01:1)
> >> Time: 4.845 sec (0 m 4 s)
> >
> > Does this work as expected for anyone else?
> >
> > Mark
> >
> > On 10 Feb 2014, at 23:38, Rafael Ferreira  wrote:
> >
> >> That worked, thanks!
> >>
> >> On February 10, 2014 at 4:29:41 PM, Steven Morgan (smor...@sourcefire.com) wrote:
> >>
> >> Rafael,
> >>
> >> Probably all you need to do is install libxml & libxml2-dev, which are used by
> >> dmg and xar, then do your configure/make.
> >>
> >> Steve
> >>
> >>
> >> On Mon, Feb 10, 2014 at 6:05 PM, Rafael Ferreira wrote:
> >>
> >>>
> >>> Folks,
> >>>
> >>> I'm compiling clamav 0.98.1 on Linux (Ubuntu 12.04 LTS) and I'm not
> >>> getting the new super awesome DMG and XAR file support:
> >>>
> >>> configure: Summary of detected features follows
> >>> OS : linux-gnu
> >>> pthreads : yes (-lpthread)
> >>> configure: Summary of miscellaneous features
> >>> check : no (auto)
> >>> fanotify : yes
> >>> fdpassing : 1
> >>> IPv6 : yes
> >>> configure: Summary of optional tools
> >>> clamdtop : (auto)
> >>> milter : yes (disabled)
> >>> configure: Summary of engine performance features)
> >>> release mode: yes
> >>> jit : yes (auto)
> >>> mempool : yes
> >>> configure: Summary of engine detection features
> >>> autoit_ea06 : yes
> >>> bzip2 : ok
> >>> zlib : /usr
> >>> unrar : yes
> >>> dmg and xar : no
> >>>
> >>> Am I missing a configure flag or third party library?
> >>>
> >>> Thanks in advance,
> >>>
> >>> - Rafael
> >>>
> >>> 
> >>> scanii.com - the web friendly malware scanner!

DMG is an odd filetype, since there are really 2 or 3 different filetypes
lumped into that category.

What we have included in ClamAV 0.98.1 is scanning of UDIF format DMG
files, which have a definitive trailer block and may have compressed
sections.
We have not yet included support for scanning raw disk format DMG files,
which are nearly indistinguishable from disk dumps. No separate compression
is allowed.

So let me ask you this question. How did you create your DMG? Most software
packagers create UDIF format to reduce the file size for downloads. Disk
Utility and the hdiutil command can create a raw disk unless another format
is checked.

To find out what format your testfile is really in, you can use the
imageinfo sub-command of hdiutil (e.g. hdiutil imageinfo yourfile.dmg).
Then you can use the convert sub-command of hdiutil to switch the format.

Hope this helps,

Dave R.

-- 
---
Dave Raynor
Vulnerability Research Team


Re: [Clamav-devel] 0.98.1 not compiling on OS/X Mavericks

2014-03-18 Thread David Raynor
On Tue, Mar 18, 2014 at 10:36 AM, David Raynor wrote:

>
> On Tue, Mar 18, 2014 at 4:39 AM, Remi Mommsen wrote:
>
>> Hi,
>>
>> On 18 Mar 2014, at 07:12, Brian Reiter  wrote:
>>
>> >
>> > On Mar 18, 2014, at 6:05 AM, Zack  wrote:
>> >
>> >> It used to compile on OSX just fine as recently as a month ago.
>> >
>> > I haven't built from source manually in a while, but 0.98.1 did
>> > build for me using MacPorts on Mavericks back in January. That was Xcode 5
>> > but not 5.1. MacPorts builds with CFLAGS -O0.
>>
>> I can confirm that the update from xcode 5 to 5.1 broke the compilation
>> on Mac OS X.
>>
>> Remi
>>
>
> Thanks for the research on this problem. I added this to our Bugzilla as
> bug # 10757 so we can track it.
>
> Dave R.
>
> --
> ---
> Dave Raynor
> Vulnerability Research Team
>

This error is reported from upstream LLVM code. The LLVM team has made
bigger changes to this header since this file was included in ClamAV, so I
cannot simply apply changes from them or the patch impact increases beyond
this file.

The root problem is the handling of the iterators and templates in this one
spot. In short, clang is checking something earlier than gcc, even though
the type should be available in the end.

Please try this candidate patch to the LoopInfo.h header.

--- a/libclamav/c++/llvm/include/llvm/Analysis/LoopInfo.h
+++ b/libclamav/c++/llvm/include/llvm/Analysis/LoopInfo.h
@@ -814,8 +814,12 @@ public:
 typedef GraphTraits<Inverse<BlockT*> > InvBlockTraits;

 // Add all of the predecessors of X to the end of the work stack...
-TodoStack.insert(TodoStack.end(), InvBlockTraits::child_begin(X),
- InvBlockTraits::child_end(X));
+for (typename InvBlockTraits::ChildIteratorType PI =
+  InvBlockTraits::child_begin(X), PE = InvBlockTraits::child_end(X);
+  PI != PE; ++PI) {
+  typename InvBlockTraits::NodeType *N = *PI;
+  TodoStack.push_back(N);
+}
   }
 }

Dealing with a larger LLVM upgrade is a task for a future release, but this
should let you move forward under Xcode 5.1 in the near term.

Let us know how it goes,

Dave R.

-- 
---
Dave Raynor
Vulnerability Research Team


Re: [Clamav-devel] 0.98.1 not compiling on OS/X Mavericks

2014-03-18 Thread David Raynor
On Tue, Mar 18, 2014 at 4:39 AM, Remi Mommsen wrote:

> Hi,
>
> On 18 Mar 2014, at 07:12, Brian Reiter  wrote:
>
> >
> > On Mar 18, 2014, at 6:05 AM, Zack  wrote:
> >
> >> It used to compile on OSX just fine as recently as a month ago.
> >
> > I haven't built from source manually in a while, but 0.98.1 did
> > build for me using MacPorts on Mavericks back in January. That was Xcode 5
> > but not 5.1. MacPorts builds with CFLAGS -O0.
>
> I can confirm that the update from xcode 5 to 5.1 broke the compilation on
> Mac OS X.
>
> Remi
>

Thanks for the research on this problem. I added this to our Bugzilla as
bug # 10757 so we can track it.

Dave R.

-- 
---
Dave Raynor
Vulnerability Research Team


Re: [Clamav-devel] stub/example cvd files

2014-01-15 Thread David Raynor
On Wed, Jan 15, 2014 at 12:22 PM, Jeff Minelli  wrote:

> Hi, my group currently uses clamav in our rails apps. We are trying to
> find ways to speed up testing on TravisCI. One of our issues is that
> freshclam needs to run prior to our testing. This seems to add a lot of
> build/testing time.
>
> Is there a test or stub main.cvd (and daily.cvd?) that we can use or
> create for our needs (to reduce download times of the current cvd files)?
> We don’t need to test for specific viruses, just test that the clamav call
> is working.
>
> Does anyone have suggestions or stub files we can have?
>
> Thanks,
>
> -jeff
>


If this is for testing the non-freshclam components only, you could go with
a non-CVD virus database. As long as ClamAV loads at least one signature,
it will run.

For example, you could have a "test.ndb" file that contains the following
signature. Then the only thing it will detect is EICAR, but it will load
much faster.
Eicar-Test-Signature:0:0:58354f2150254041505b345c505a58353428505e2937434329377d2445494341522d5354414e444152442d414e544956495255532d544553542d46494c452124482b482a

If you have other test files you choose to use, you could write signatures
to alert on those and use those for test cases.

Hope this helps,

Dave R.

-- 
---
Dave Raynor
Vulnerability Research Team


Re: [Clamav-devel] libclamav and INSTREAM

2013-11-20 Thread David Raynor
On Wed, Nov 20, 2013 at 2:45 PM, Erik Aigner  wrote:

> "No difference between a file handle to something in memory and to a file
> on the FS.”
>
> Apparently there is, otherwise it would work with a memory file handle.
>
> --
> Erik Aigner
>
>
> On Wednesday 20 November 2013 at 20:15, Brandon Perry wrote:
>
> > Why would you *have* to write to the disk? No difference between a file
> > handle to something in memory and to a file on the FS.
> >
> > That being said, I actually used a ramdisk when building my clamav
> > bindings (https://github.com/brandonprry/clam-sharp/).
> >
> > Sent from a computer
> >
> > > On Nov 20, 2013, at 12:42, Erik Aigner (aigner.e...@gmail.com) wrote:
> > >
> > > Hello!
> > >
> > > The clamav daemon has an INSTREAM feature for scanning a stream of data.
> > > I’m developing Go bindings for libclamav (https://github.com/eaigner/clam)
> > > and was wondering why there isn’t such a feature in libclamav?
> > >
> > > I searched the libclamav headers for an equivalent but didn’t find
> > > anything similar. It seems I can only scan by file handle. If I use a
> > > pipe handle, it will fail.
> > >
> > > Is that correct? Do I really have to write (potentially huge) files to
> > > disk to scan them with clamav?
> > >
> > > Cheers,
> > >
> > > --
> > > Erik Aigner
> > >
> > >

Based on the operations ClamAV performs and the way it does them, ClamAV
needs to be able to seek and rewind. Not every "stream" supports that.
Sockets cannot. There are also certain features, like some of the
callbacks, that were designed for file-based access and pass descriptors.

So right now libclamav does not expose functions for scanning blocks of
memory. ClamAV only uses maps that it created and makes sure it releases
them. Under the hood, a lot of the code has been switched over to using
maps ... so perhaps with the right setup call and symbols ... you might be
able to write code to do what you are looking to do. Just be aware that you
may want to avoid or turn off certain features. Things like filetyping will
give much different results when dealing with memory blocks instead of
discrete files.

Good luck,

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] build for s390x system

2013-09-26 Thread David Raynor
On Thu, Sep 26, 2013 at 7:24 AM, Tsutomu Oyamada wrote:

> Hi, Dave
>
> I've rebuilt ClamAV.
> As a result, I confirmed that it works just fine.
> Your advice was valid.
> Thank you very much.
>
> T.Oyamada
>
> On Thu, 26 Sep 2013 09:07:53 +0900
> Tsutomu Oyamada  wrote:
>
> > Hi, Dave.
> >
> > Thanks for your advice.
> >
> > I changed the type of fp_digit
> > and ran the test program.
> >
> > fp_digit: 4
> > unsigned long: 8
> > FP_64BIT not defined
> > DIGIT_BIT: 32
> > CHAR_BIT: 8
> > FP_MAX_SIZE: 8448
> > FP_SIZE: 264
> > TFM_ASM not defined
> > fp_word: 8
> > ulong64: 8
> > unsigned long long: 8
> > CRYPT not defined
> >
> > It works well.
> > I will try to recompile clamav
> > and report the results later.
> >
> > T.Oyamada
> >
> > On Wed, 25 Sep 2013 17:13:58 -0400
> > David Raynor  wrote:
> >
> > > On Wed, Sep 25, 2013 at 2:55 PM, David Raynor wrote:
> > >
> > > >
> > > > On Tue, Sep 24, 2013 at 9:47 PM, Tsutomu Oyamada <
> oyam...@promark-inc.com>wrote:
> > > >
> > > >> Hi, Dave.
> > > >>
> > > >> Thanks for your advice and quick response.
> > > >> I tried it.
> > > >>
> > > >> 
> > > >> fp_digit: 8
> > > >> unsigned long: 8
> > > >> FP_64BIT not defined
> > > >> DIGIT_BIT: 64
> > > >> CHAR_BIT: 8
> > > >> FP_MAX_SIZE: 8704
> > > >> FP_SIZE: 136
> > > >> TFM_ASM not defined
> > > >> fp_word: 8
> > > >> ulong64: 8
> > > >> unsigned long long: 8
> > > >> CRYPT not defined
> > > >> 
> > > >>
> > > >> What should I just do in order to fix this problem?
> > > >> Please teach me how to set size of fp_word to 16.
> > > >>
> > > >> T.Oyamada
> > > >>
> > > >> On Tue, 24 Sep 2013 18:16:26 -0400
> > > >> David Raynor  wrote:
> > > >>
> > > >> > On Tue, Sep 24, 2013 at 4:37 PM, David Raynor <
> dray...@sourcefire.com
> > > >> >wrote:
> > > >> >
> > > >> > > On Tue, Sep 24, 2013 at 2:05 AM, Tsutomu Oyamada <
> > > >> oyam...@promark-inc.com>wrote:
> > > >> > >
> > > >> > >> Hi,
> > > >> > >>
> > > >> > >> I investigated the value using the following programs.
> > > >> > >>
> > > >> > >> 
> > > >> > >> #include <stdio.h>
> > > >> > >> #include <stdlib.h>
> > > >> > >> #include "bignum_fast.h"
> > > >> > >>
> > > >> > >> int main(int argc, char **argv) {
> > > >> > >>
> > > >> > >> /* sizeof yields a size_t, so print it with %zu */
> > > >> > >> printf("fp_digit: %zu\n",sizeof(fp_digit));
> > > >> > >> printf("unsigned long: %zu\n",sizeof(unsigned long));
> > > >> > >>
> > > >> > >> #ifdef FP_64BIT
> > > >> > >> printf("FP_64BIT: %d\n",FP_64BIT);
> > > >> > >> #else
> > > >> > >> printf("FP_64BIT not defined\n");
> > > >> > >> #endif
> > > >> > >>
> > > >> > >> #ifdef DIGIT_BIT
> > > >> > >> printf("DIGIT_BIT: %d\n",DIGIT_BIT);
> > > >> > >> #else
> > > >> > >> printf("DIGIT_BIT not defined\n");
> > > >> > >> #endif
> > > >> > >>
> > > >> > >> #ifdef CHAR_BIT
> > > >> > >> printf("CHAR_BIT: %d\n",CHAR_BIT);
> > > >> > >> #else
> > > >> > >> printf("CHAR_BIT not defined\n");
> > > >> > >> #endif
> > > >> > >>
> > > >> > >> #ifdef FP_MAX_SIZE
> > > >> > >> printf("FP_MAX_SIZE: %d\n",FP_MAX_SIZE);
> > > >> > >> #else
> > > >> > >> printf("FP_MAX_SIZE not defined\n");
> > > >> > >> #endif
> > > >> > >>

Re: [Clamav-devel] build for s390x system

2013-09-25 Thread David Raynor
On Wed, Sep 25, 2013 at 2:55 PM, David Raynor wrote:

>
> On Tue, Sep 24, 2013 at 9:47 PM, Tsutomu Oyamada 
> wrote:
>
>> Hi, Dave.
>>
>> Thanks for your advice and quick response.
>> I tried it.
>>
>> 
>> fp_digit: 8
>> unsigned long: 8
>> FP_64BIT not defined
>> DIGIT_BIT: 64
>> CHAR_BIT: 8
>> FP_MAX_SIZE: 8704
>> FP_SIZE: 136
>> TFM_ASM not defined
>> fp_word: 8
>> ulong64: 8
>> unsigned long long: 8
>> CRYPT not defined
>> 
>>
>> What should I just do in order to fix this problem?
>> Please teach me how to set size of fp_word to 16.
>>
>> T.Oyamada
>>
>> On Tue, 24 Sep 2013 18:16:26 -0400
>> David Raynor  wrote:
>>
>> > On Tue, Sep 24, 2013 at 4:37 PM, David Raynor wrote:
>> >
>> > > On Tue, Sep 24, 2013 at 2:05 AM, Tsutomu Oyamada <
>> oyam...@promark-inc.com>wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I investigated the value using the following programs.
>> > >>
>> > >> 
>> > >> #include <stdio.h>
>> > >> #include <stdlib.h>
>> > >> #include "bignum_fast.h"
>> > >>
>> > >> int main(int argc, char **argv) {
>> > >>
>> > >> /* sizeof yields a size_t, so print it with %zu */
>> > >> printf("fp_digit: %zu\n",sizeof(fp_digit));
>> > >> printf("unsigned long: %zu\n",sizeof(unsigned long));
>> > >>
>> > >> #ifdef FP_64BIT
>> > >> printf("FP_64BIT: %d\n",FP_64BIT);
>> > >> #else
>> > >> printf("FP_64BIT not defined\n");
>> > >> #endif
>> > >>
>> > >> #ifdef DIGIT_BIT
>> > >> printf("DIGIT_BIT: %d\n",DIGIT_BIT);
>> > >> #else
>> > >> printf("DIGIT_BIT not defined\n");
>> > >> #endif
>> > >>
>> > >> #ifdef CHAR_BIT
>> > >> printf("CHAR_BIT: %d\n",CHAR_BIT);
>> > >> #else
>> > >> printf("CHAR_BIT not defined\n");
>> > >> #endif
>> > >>
>> > >> #ifdef FP_MAX_SIZE
>> > >> printf("FP_MAX_SIZE: %d\n",FP_MAX_SIZE);
>> > >> #else
>> > >> printf("FP_MAX_SIZE not defined\n");
>> > >> #endif
>> > >>
>> > >> #ifdef FP_SIZE
>> > >> printf("FP_SIZE: %d\n",FP_SIZE);
>> > >> #else
>> > >> printf("FP_SIZE not defined\n");
>> > >> #endif
>> > >>
>> > >> #ifdef TFM_ASM
>> > >> printf("TFM_ASM: defined\n");
>> > >> #else
>> > >> printf("TFM_ASM not defined\n");
>> > >> #endif
>> > >>
>> > >> exit(0);
>> > >> }
>> > >> 
>> > >>
>> > >> The result was as follows.
>> > >>
>> > >> fp_digit: 8
>> > >> unsigned long: 8
>> > >> FP_64BIT not defined
>> > >> DIGIT_BIT: 64
>> > >> CHAR_BIT: 8
>> > >> FP_MAX_SIZE: 8704
>> > >> FP_SIZE: 136
>> > >> TFM_ASM not defined
>> > >>
>> > >> Can you find a problem by this result?
>> > >>
>> > >> Thanks.
>> > >>
>> > >> --
>> > >> T.Oyamada
>> > >>
>> > >>
>> > >
>> > > I do not have an immediate fix, but that information does give me some
>> > > leads.
>> > >
>> > > Basic issue: the tomsfastmath code must be falling through to the code
>> > > block on lines 280-302 of fp_mul_comba.c.
>> > > 1) The right shift causing the warning is DIGIT_BIT (64).
>> > > 2) The datatype being shifted is fp_word.
>> > > 3) fp_word defined as "typedef ulong64 fp_word;" from bignum_fast.h
>> line
>> > > 253
>> > > 4) ulong64 defined as "typedef unsigned long long ulong64;" from
>> > > bignum_fast.h line 248
>> >

Re: [Clamav-devel] build for s390x system

2013-09-25 Thread David Raynor
On Tue, Sep 24, 2013 at 9:47 PM, Tsutomu Oyamada wrote:

> Hi, Dave.
>
> Thanks for your advice and quick response.
> I tried it.
>
> 
> fp_digit: 8
> unsigned long: 8
> FP_64BIT not defined
> DIGIT_BIT: 64
> CHAR_BIT: 8
> FP_MAX_SIZE: 8704
> FP_SIZE: 136
> TFM_ASM not defined
> fp_word: 8
> ulong64: 8
> unsigned long long: 8
> CRYPT not defined
> 
>
> What should I just do in order to fix this problem?
> Please teach me how to set size of fp_word to 16.
>
> T.Oyamada
>
> On Tue, 24 Sep 2013 18:16:26 -0400
> David Raynor  wrote:
>
> > On Tue, Sep 24, 2013 at 4:37 PM, David Raynor wrote:
> >
> > > On Tue, Sep 24, 2013 at 2:05 AM, Tsutomu Oyamada <
> oyam...@promark-inc.com>wrote:
> > >
> > >> Hi,
> > >>
> > >> I investigated the value using the following programs.
> > >>
> > >> 
> > >> #include <stdio.h>
> > >> #include <stdlib.h>
> > >> #include "bignum_fast.h"
> > >>
> > >> int main(int argc, char **argv) {
> > >>
> > >> /* sizeof yields a size_t, so print it with %zu */
> > >> printf("fp_digit: %zu\n",sizeof(fp_digit));
> > >> printf("unsigned long: %zu\n",sizeof(unsigned long));
> > >>
> > >> #ifdef FP_64BIT
> > >> printf("FP_64BIT: %d\n",FP_64BIT);
> > >> #else
> > >> printf("FP_64BIT not defined\n");
> > >> #endif
> > >>
> > >> #ifdef DIGIT_BIT
> > >> printf("DIGIT_BIT: %d\n",DIGIT_BIT);
> > >> #else
> > >> printf("DIGIT_BIT not defined\n");
> > >> #endif
> > >>
> > >> #ifdef CHAR_BIT
> > >> printf("CHAR_BIT: %d\n",CHAR_BIT);
> > >> #else
> > >> printf("CHAR_BIT not defined\n");
> > >> #endif
> > >>
> > >> #ifdef FP_MAX_SIZE
> > >> printf("FP_MAX_SIZE: %d\n",FP_MAX_SIZE);
> > >> #else
> > >> printf("FP_MAX_SIZE not defined\n");
> > >> #endif
> > >>
> > >> #ifdef FP_SIZE
> > >> printf("FP_SIZE: %d\n",FP_SIZE);
> > >> #else
> > >> printf("FP_SIZE not defined\n");
> > >> #endif
> > >>
> > >> #ifdef TFM_ASM
> > >> printf("TFM_ASM: defined\n");
> > >> #else
> > >> printf("TFM_ASM not defined\n");
> > >> #endif
> > >>
> > >> exit(0);
> > >> }
> > >> 
> > >>
> > >> The result was as follows.
> > >>
> > >> fp_digit: 8
> > >> unsigned long: 8
> > >> FP_64BIT not defined
> > >> DIGIT_BIT: 64
> > >> CHAR_BIT: 8
> > >> FP_MAX_SIZE: 8704
> > >> FP_SIZE: 136
> > >> TFM_ASM not defined
> > >>
> > >> Can you find a problem by this result?
> > >>
> > >> Thanks.
> > >>
> > >> --
> > >> T.Oyamada
> > >>
> > >>
> > >
> > > I do not have an immediate fix, but that information does give me some
> > > leads.
> > >
> > > Basic issue: the tomsfastmath code must be falling through to the code
> > > block on lines 280-302 of fp_mul_comba.c.
> > > 1) The right shift causing the warning is DIGIT_BIT (64).
> > > 2) The datatype being shifted is fp_word.
> > > 3) fp_word defined as "typedef ulong64 fp_word;" from bignum_fast.h
> line
> > > 253
> > > 4) ulong64 defined as "typedef unsigned long long ulong64;" from
> > > bignum_fast.h line 248
> > >
> > > I think the problem is one of three issues:
> > > A) fp_word is not defined as a 64-bit datatype.
> > > B) line 301 of tomsfastmath/mul/fp_mul_comba.c is mistakenly
> downcasting t
> > > from fp_word to fp_digit before shifting.
> > > C) s390 is not allowing use of all 64 bits of fp_word.
> > >
> > > Problems A or B are easier to fix than C.
> > >
> > > Please add these lines to your test and re-run:
> > >
> > > printf("fp_word: %d\n"

Re: [Clamav-devel] build for s390x system

2013-09-24 Thread David Raynor
On Tue, Sep 24, 2013 at 4:37 PM, David Raynor wrote:

> On Tue, Sep 24, 2013 at 2:05 AM, Tsutomu Oyamada 
> wrote:
>
>> Hi,
>>
>> I investigated the value using the following programs.
>>
>> 
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include "bignum_fast.h"
>>
>> int main(int argc, char **argv) {
>>
>> /* sizeof yields a size_t, so print it with %zu */
>> printf("fp_digit: %zu\n",sizeof(fp_digit));
>> printf("unsigned long: %zu\n",sizeof(unsigned long));
>>
>> #ifdef FP_64BIT
>> printf("FP_64BIT: %d\n",FP_64BIT);
>> #else
>> printf("FP_64BIT not defined\n");
>> #endif
>>
>> #ifdef DIGIT_BIT
>> printf("DIGIT_BIT: %d\n",DIGIT_BIT);
>> #else
>> printf("DIGIT_BIT not defined\n");
>> #endif
>>
>> #ifdef CHAR_BIT
>> printf("CHAR_BIT: %d\n",CHAR_BIT);
>> #else
>> printf("CHAR_BIT not defined\n");
>> #endif
>>
>> #ifdef FP_MAX_SIZE
>> printf("FP_MAX_SIZE: %d\n",FP_MAX_SIZE);
>> #else
>> printf("FP_MAX_SIZE not defined\n");
>> #endif
>>
>> #ifdef FP_SIZE
>> printf("FP_SIZE: %d\n",FP_SIZE);
>> #else
>> printf("FP_SIZE not defined\n");
>> #endif
>>
>> #ifdef TFM_ASM
>> printf("TFM_ASM: defined\n");
>> #else
>> printf("TFM_ASM not defined\n");
>> #endif
>>
>> exit(0);
>> }
>> 
>>
>> The result was as follows.
>>
>> fp_digit: 8
>> unsigned long: 8
>> FP_64BIT not defined
>> DIGIT_BIT: 64
>> CHAR_BIT: 8
>> FP_MAX_SIZE: 8704
>> FP_SIZE: 136
>> TFM_ASM not defined
>>
>> Can you find a problem by this result?
>>
>> Thanks.
>>
>> --
>> T.Oyamada
>>
>>
>
> I do not have an immediate fix, but that information does give me some
> leads.
>
> Basic issue: the tomsfastmath code must be falling through to the code
> block on lines 280-302 of fp_mul_comba.c.
> 1) The right shift causing the warning is DIGIT_BIT (64).
> 2) The datatype being shifted is fp_word.
> 3) fp_word defined as "typedef ulong64 fp_word;" from bignum_fast.h line
> 253
> 4) ulong64 defined as "typedef unsigned long long ulong64;" from
> bignum_fast.h line 248
>
> I think the problem is one of three issues:
> A) fp_word is not defined as a 64-bit datatype.
> B) line 301 of tomsfastmath/mul/fp_mul_comba.c is mistakenly downcasting t
> from fp_word to fp_digit before shifting.
> C) s390 is not allowing use of all 64 bits of fp_word.
>
> Problems A or B are easier to fix than C.
>
> Please add these lines to your test and re-run:
>
> printf("fp_word: %zu\n",sizeof(fp_word));
> printf("ulong64: %zu\n",sizeof(ulong64));
> printf("unsigned long long: %zu\n",sizeof(unsigned long long));
> #ifdef CRYPT
> printf("CRYPT: defined\n");
> #else
> printf("CRYPT not defined\n");
> #endif
>
> I would like to see the config.log file generated by running configure. It
> would also be useful to have the full output from running make. The log
> snip shows line 91, but I expect that it first warned about line 15.
>
> The easiest way to continue and share logfiles is via Bugzilla. Please
> open a bug report on bugzilla.clamav.net on this issue. You can then
> attach the files to that bug.
>
> Hope this helps,
>
>
> Dave R.
>
> --
> ---
> Dave Raynor
> Sourcefire Vulnerability Research Team
> dray...@sourcefire.com
>

I think I read something wrong, and I think I have an idea. It still needs
to be confirmed by the additional test lines above. Re-reading the output,
I think that the code is depending on fp_word being twice the size of
fp_digit. Based on fp_digit size of 8 that means it would expect it to be
16. If sizeof(fp_word) is resolving to 8 and sizeof(fp_digit) is 8, that
could be the problem. Then the fix would depend on whether fp_word can be
made size 16 or must be constrained to size 8.

Let me know what you find,

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] build for s390x system

2013-09-24 Thread David Raynor
On Tue, Sep 24, 2013 at 2:05 AM, Tsutomu Oyamada wrote:

> Hi,
>
> I investigated the value using the following programs.
>
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include "bignum_fast.h"
>
> int main(int argc, char **argv) {
>
> /* sizeof yields a size_t, so print it with %zu */
> printf("fp_digit: %zu\n",sizeof(fp_digit));
> printf("unsigned long: %zu\n",sizeof(unsigned long));
>
> #ifdef FP_64BIT
> printf("FP_64BIT: %d\n",FP_64BIT);
> #else
> printf("FP_64BIT not defined\n");
> #endif
>
> #ifdef DIGIT_BIT
> printf("DIGIT_BIT: %d\n",DIGIT_BIT);
> #else
> printf("DIGIT_BIT not defined\n");
> #endif
>
> #ifdef CHAR_BIT
> printf("CHAR_BIT: %d\n",CHAR_BIT);
> #else
> printf("CHAR_BIT not defined\n");
> #endif
>
> #ifdef FP_MAX_SIZE
> printf("FP_MAX_SIZE: %d\n",FP_MAX_SIZE);
> #else
> printf("FP_MAX_SIZE not defined\n");
> #endif
>
> #ifdef FP_SIZE
> printf("FP_SIZE: %d\n",FP_SIZE);
> #else
> printf("FP_SIZE not defined\n");
> #endif
>
> #ifdef TFM_ASM
> printf("TFM_ASM: defined\n");
> #else
> printf("TFM_ASM not defined\n");
> #endif
>
> exit(0);
> }
> 
>
> The result was as follows.
>
> fp_digit: 8
> unsigned long: 8
> FP_64BIT not defined
> DIGIT_BIT: 64
> CHAR_BIT: 8
> FP_MAX_SIZE: 8704
> FP_SIZE: 136
> TFM_ASM not defined
>
> Can you find a problem by this result?
>
> Thanks.
>
> --
> T.Oyamada
>
>

I do not have an immediate fix, but that information does give me some
leads.

Basic issue: the tomsfastmath code must be falling through to the code
block on lines 280-302 of fp_mul_comba.c.
1) The right shift causing the warning is DIGIT_BIT (64).
2) The datatype being shifted is fp_word.
3) fp_word defined as "typedef ulong64 fp_word;" from bignum_fast.h line 253
4) ulong64 defined as "typedef unsigned long long ulong64;" from
bignum_fast.h line 248

I think the problem is one of three issues:
A) fp_word is not defined as a 64-bit datatype.
B) line 301 of tomsfastmath/mul/fp_mul_comba.c is mistakenly downcasting t
from fp_word to fp_digit before shifting.
C) s390 is not allowing use of all 64 bits of fp_word.

Problems A or B are easier to fix than C.

Please add these lines to your test and re-run:

printf("fp_word: %zu\n",sizeof(fp_word));
printf("ulong64: %zu\n",sizeof(ulong64));
printf("unsigned long long: %zu\n",sizeof(unsigned long long));
#ifdef CRYPT
printf("CRYPT: defined\n");
#else
printf("CRYPT not defined\n");
#endif

I would like to see the config.log file generated by running configure. It
would also be useful to have the full output from running make. The log
snip shows line 91, but I expect that it first warned about line 15.

The easiest way to continue and share logfiles is via Bugzilla. Please open
a bug report on bugzilla.clamav.net on this issue. You can then attach the
files to that bug.

Hope this helps,

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] build for s390x system

2013-09-23 Thread David Raynor
On Mon, Sep 23, 2013 at 11:00 AM, Tsutomu Oyamada
wrote:

> Hi, all.
>
> We are using ClamAV on IBM zLinux (s390x architecture).
> Many warnings like the following were output when we made a new build
> using the new release 0.98.
>
> # ./configure --prefix=/usr/lib/clamav --exec-prefix=/usr/lib/clamav
> --bindir=/usr/lib/clamav --sbindir=/usr/lib/clamav --sysconfdir=/etc/clamav
> --libdir=/usr/lib/clamav --datarootdir=/usr/lib/clamav
> --with-dbdir=/usr/lib/clamav --disable-clamav --with-zlib=/usr/local
> --with-libbz2-prefix=/usr/local
> (snip)
> # make
> (snip)
> tomsfastmath/mul/fp_mul_comba_20.c:91: warning: right shift count >= width
> of type
> tomsfastmath/mul/fp_mul_comba_20.c:91: warning: right shift count >= width
> of type
> tomsfastmath/mul/fp_mul_comba_20.c:91: warning: right shift count >= width
> of type
>
> And, the binary did not work properly as a result.
>
> Are there any special settings needed when running make on s390?
> Can you give me any advice or suggestions?
>
> Best Regards,
> Oyamada
>

That line (and others like it in fp_mul_comba_20.c) does calculations
using an array of elements of type "fp_digit". "fp_digit" is
defined in libclamav/bignum_fast.h.

I don't have access to an s390 to test with and I don't see any notes on
tomsfastmath and s390, so I will need your help to investigate. Can you get
me some values as calculated by bignum_fast.h?

1) Sizes of two types: sizeof(fp_digit), sizeof(unsigned long)
2) Values of these defined macros, if defined: FP_64BIT, DIGIT_BIT,
CHAR_BIT, FP_MAX_SIZE, FP_SIZE, TFM_ASM
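
One possible way to collect these values is sketched below. It is only an
assumption about a convenient approach: report_macro, REPORT and
report_build are hypothetical names, and the stringification trick reports a
macro as "(not defined)" when its expansion is identical to its own name. As
written it does not include bignum_fast.h, so only CHAR_BIT will show a
value; include that header (and add a sizeof(fp_digit) line) to get the real
numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define STR(x)  #x
#define XSTR(x) STR(x)   /* expands x first, then turns it into a string */

/* Print one macro. Returns 1 if it looks defined, 0 otherwise.
 * An undefined macro "expands" to its own name, so name == expansion. */
static int report_macro(const char *name, const char *expansion)
{
    int defined;

    assert(name != NULL && expansion != NULL);
    defined = strcmp(name, expansion) != 0;
    printf("%s: %s\n", name, defined ? expansion : "(not defined)");
    return defined;
}

#define REPORT(m) report_macro(#m, XSTR(m))

/* Print the requested sizes and macros; returns how many macros were defined. */
int report_build(void)
{
    int n = 0;

    printf("sizeof(unsigned long): %zu\n", sizeof(unsigned long));
    /* sizeof(fp_digit) needs bignum_fast.h; add it once that header is included. */
    n += REPORT(FP_64BIT);
    n += REPORT(DIGIT_BIT);
    n += REPORT(CHAR_BIT);
    n += REPORT(FP_MAX_SIZE);
    n += REPORT(FP_SIZE);
    n += REPORT(TFM_ASM);
    return n;
}
```

Calling report_build() from a small main() and sending the output would
cover both items in the list above.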

Thanks,

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] Does clamav work with hex or characters?

2013-03-25 Thread David Raynor
On Sat, Mar 23, 2013 at 3:34 PM, Kaushik Vaidyanathan <
kvaid...@andrew.cmu.edu> wrote:

> Hi Matt
>
> Thanks for your detailed explanation on how signature gets stored and
> interpreted.
>
> I was looking through the code in libclamav to see what data formats are
> used for string comparison. Some backtracking from cli_bm_scanbuff took me
> to str.c, where I see there is a function "cli_hex2str", which, if I
> understand correctly, maps two hex digits to one character (unsigned char).
> Would it be fair to speculate that this function is used by the clamav
> engine to map two hex digits read from a signature or scanned file into one
> char for string matching purposes?
>
> Thank you..
>
>
> On Sat, Mar 23, 2013 at 11:02 AM, Matt Olney 
> wrote:
>
> > Well... data is data.  There is no difference (from a storage
> perspective)
> > from an executable with an "inc ecx" instruction or a text document with
> an
> > "A".  Both are represented by the value 0x41.  So from Clam's
> perspective,
> > a signature matching a single A would be identical to a signature that
> > detected a single "inc ecx" instruction.  Both would look for 41.
> >
> > In short your statement "some files are hex and some are character-based"
> > isn't really accurate.  At the risk of painting with a broad brush, I
> would
> > say that all files are stored as a series of values, a series of bytes.
> >  How you display them is different.  When I used 010 Editor to view a
> file
> > as hex, I get a set of ascii-hex representations.  When I look at a file
> > with a web-browser I get ascii text.  But underlying all of that is the
> > same idea, a set of bytes.  And that is how ClamAV treats all files.
> >
> > A signature with a 41 in it would be converted in memory to look for
> 0x41,
> > a single byte of value 0x41.  A signature written like that would detect
> an
> > executable or pdf or a flash or anything that has 0x41 in the data.
> >
> > Hope that answers your question.
> >
> > Matt
> >
> >
> > On Fri, Mar 22, 2013 at 8:46 PM, Kaushik Vaidyanathan <
> > kvaid...@andrew.cmu.edu> wrote:
> >
> > > Hi
> > >
> > > I have a basic question. Most body-based signatures are hex based(lets
> > > focus on fixed string signatures alone for simplicity), whereas some of
> > the
> > > files are hex(EXE) or character-based(HTML).
> > >
> > > In the code I see unsigned chars used predominantly to represent
> patterns
> > > and file contents. At the very core, do the string matching algorithms,
> > > mainly extended Boyer Moore, I would like to understand how the
> datatypes
> > > gets manipulated.
> > >
> > > 1) Do the character based files get translated to hex to compare with
> > body
> > > based signatures?
> > >
> > > 2) Does the signature get treated as a string of chars?
> > > If yes,
> > > Does a toy signature "fe" gets treated as two chars(8 bits each) for
> "f"
> > > and "e" (or)
> > > Does the code read the signature "fe" and maps into one character based
> > on
> > > the ASCII table (for example)?
> > >
> > > Thank you..

Read from signature, yes. Read from file, no. To quickly compare bytes it
is better to do it using the in-file binary representation. It is more
direct to say that cli_hex2str() is converting human-readable
representation of a hexadecimal number into the binary equivalent. For any
byte pattern to match, the signature-format equivalent will take twice as
many bytes as the raw binary value.

Example: "Hex" in ASCII
Actual data is 3 bytes long. 1st byte: 0x48. 2nd byte: 0x65. 3rd byte: 0x78
Signature-format equivalent is 6 bytes long, one for each hex digit.

This is where the name of the function came from. Input and output are both
char arrays (i.e. strings). The function takes in the "hex"-format version
of the content [486578], and returns the content in a usable string format
[Hex]. Hence, from "hex" to string.
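
The conversion described above can be sketched as follows. This is an
illustration only, not the source of cli_hex2str(): hex_val and hex_to_bytes
are hypothetical names, and the sketch handles plain hex digits only (no
signature wildcards or other special syntax).

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: convert hex_len hex digits into hex_len / 2 raw
 * bytes. The caller must provide at least hex_len / 2 bytes in out.
 * Returns the number of bytes written, or -1 on odd length or bad digit. */
static long hex_to_bytes(const char *hex, size_t hex_len, unsigned char *out)
{
    size_t i;

    assert(hex != NULL && out != NULL);
    if (hex_len % 2 != 0)
        return -1;

    for (i = 0; i < hex_len; i += 2) {
        int hi = hex_val(hex[i]);
        int lo = hex_val(hex[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        /* Two hex digits (4 bits each) become one 8-bit byte. */
        out[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    return (long)(hex_len / 2);
}
```

With the 6-character input 486578 this writes the 3 bytes 0x48, 0x65, 0x78,
which is the halving in size described above.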

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] New Contributor - Loïc Maury

2013-01-07 Thread David Raynor
On Mon, Jan 7, 2013 at 7:57 AM, Loïc Maury  wrote:

> Hello,
>
> Happy New Year.
>
> I recently downloaded the project's source code with git.
>
> But I have an issue when running the tests.
>
> I ran ./configure --enable-check, make, and make check,
>
> but some of the tests are just skipped:
>
> PASS: check_clamav
> PASS: check_freshclam.sh
> PASS: check_sigtool.sh
> SKIP: check_unit_vg.sh
> PASS: check1_clamscan.sh
> PASS: check2_clamd.sh
> PASS: check3_clamd.sh
> PASS: check4_clamd.sh
> SKIP: check5_clamd_vg.sh
> SKIP: check6_clamd_vg.sh
> SKIP: check7_clamd_hg.sh
> SKIP: check8_clamd_hg.sh
> SKIP: check9_clamscan_vg.sh
> ==
> All 7 tests passed
> (6 tests were not run)
> ==
>
> Is this normal?
>
> Thank you
>
> Loïc Maury
>

Yes, this is normal. The valgrind-related tests are not run by default;
they are only run when explicitly requested with the command-line argument
VG=1. Example:
make check VG=1 VERBOSE=1

Also, you must have valgrind installed on your machine to execute the tests.

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] Libclamav - filetype detection functionality

2012-09-12 Thread David Raynor
On Tue, Sep 11, 2012 at 9:47 AM, Alexandre Dias  wrote:

> Hello,
>
> I need to detect the type of a given file (so that in my own scanner, which
> uses ClamAV's signature set, I can cross-reference it with any signature
> that matches).
>
> However, if I'm not mistaken, such functionality is not currently exposed
> in libclamav.
>
> Is there any way to access it besides directly modifying libclamav (and
> thus losing the access on any update to ClamAV) ?
>
> Thanks.
>
> -Alexandre Dias

Alexandre,

Short answer: There is no function that will take a file and return
ClamAV's filetype decision.

All the filetyping functions are marked internal and expect access to
ClamAV's memory objects. There are ways to use them without modifying
libclamav, but they would all have varying levels of hacks involved. I
don't know which path would be best for you to take, but I wish you luck.

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] Question about matcher-bm.c

2012-08-15 Thread David Raynor
On Wed, Aug 15, 2012 at 6:58 AM, Chatsiri Ratana wrote:

> Hello Dave R,
>
>    1) How does ClamAV categorize virus signatures into SHA1, SHA256, MD5,
> and hexdump types?
>    2) What is the estimated breakdown of signature types loaded into A-C
> and B-M in ClamAV? I see the --ac-only flag for loading signature files
> into A-C tries, but I am not sure how signature types are selected for
> loading into the A-C and B-M algorithms when scanning in common mode.
>

1) Details on signature formats are in the signatures.pdf included in the
docs folder of the source.

2) This question is a little confusing. If you are asking about numbers of
signatures, the numbers change daily. If you run clamscan in debug mode, it
will report the size and contents of the tries with signature counts
grouped by the filetypes they will scan. There are counts for both BM and
AC.

Hope this helps,

Dave R.

-- 
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com


Re: [Clamav-devel] Question about matcher-bm.c

2012-07-03 Thread David Raynor
On Mon, Jul 2, 2012 at 5:07 PM, Alexandre Dias  wrote:

> Hello,
>
> I'm studying multi-pattern matching and I was browsing the source code for
> ClamAV's implementation of a multi-pattern matching (Wu-Manber based)
> algorithm.
>
> I've got a question regarding the block and minimum size values.
>
> At the moment, both the block size and the minimum pattern length are set
> to 3 bytes.
>
> If I understood the algorithm correctly, this means that the only possible
> shift values are either 0 (at which point a match is possible), or 1
> (minimum pattern size - block size + 1).
>
> If this is the case, given that the algorithm can only move at most one
> byte at a time, what is the advantage of using this algorithm instead of
> Aho-Corasick (besides space efficiency) ?
>
> Thank you for your time.
>
> Best regards,
>
> -Alexandre Dias

Space efficiency is important. We do need to care about memory usage. But
ruling that out, consider that ClamAV has different places and different
ways it uses pattern matching. For the sake of consistency with how it is
named in the code, I'll refer to the two modified styles of matching as B-M
(for Boyer-Moore/Wu-Manber style) and A-C (for Aho-Corasick).

ClamAV has over 113,000 signatures right now and they are split between the
A-C and B-M categories. ClamAV is not using pure pattern matching of either
style and has pre-filtering steps. Some signatures are scanning direct file
content. Other signatures are matching hashes [or in some cases, hashes of
file segments]. Files can have wildly varying lengths, while the hashes
have predetermined lengths. There are logical signatures that require
certain combinations of matches. ClamAV even uses pattern matching when
checking signatures at load time to filter out those that have been added
to the ignore lists. Any optimization would be impacted daily with each new
signature that is added. To sum up, there are quite a variety of needles
and haystacks involved in the searching.

Back to your question. You are correct that the shift values will be 0 or
1. While I cannot give you an analytical defense of the choice of minimum
pattern size & block size, there is a natural tension between the two. From
what I read, Wu & Manber used a block size of 3 in their May 1994 paper.
Also, any efficiency gained from longer shifts (which would be based on
values that never appear in any signature) could be targeted by malware
writers, who could eliminate it by forcing the creation of signatures that
fill that gap. I also don't know the difference in the effective cost of
frequent partial matches between A-C and B-M. These things could be
measured, but I do not have statistics at hand.
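
The 0-or-1 result follows from how a Wu-Manber style shift table is built:
every entry starts at (minimum pattern size - block size + 1) and is only
ever lowered. The sketch below is a textbook-style illustration, not the
code in matcher-bm.c; the hash function, TABLE_SIZE and the function names
are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK      3      /* block size B */
#define TABLE_SIZE 4096   /* hypothetical table size */

/* Hypothetical hash of one BLOCK-byte block. */
static unsigned hash_block(const unsigned char *p)
{
    return (((unsigned)p[0] << 8) ^ ((unsigned)p[1] << 4) ^ (unsigned)p[2])
           % TABLE_SIZE;
}

/* Build the shift table from the first m bytes of each pattern.
 * Every pattern must be at least m bytes long, and m >= BLOCK.
 * Returns the largest shift value left in the table. */
static unsigned build_shift(const char *const *pats, size_t npats,
                            unsigned m, unsigned *shift)
{
    size_t i;
    unsigned j, max = 0;

    assert(pats != NULL && shift != NULL && m >= BLOCK);

    /* Blocks that occur in no pattern get the maximum safe shift. */
    for (i = 0; i < TABLE_SIZE; i++)
        shift[i] = m - BLOCK + 1;

    /* A block ending at position j of a pattern allows a shift of m - j. */
    for (i = 0; i < npats; i++) {
        for (j = BLOCK; j <= m; j++) {
            unsigned h = hash_block((const unsigned char *)pats[i] + j - BLOCK);
            unsigned s = m - j;

            if (s < shift[h])
                shift[h] = s;
        }
    }

    for (i = 0; i < TABLE_SIZE; i++)
        if (shift[i] > max)
            max = shift[i];
    return max;
}
```

With m equal to the block size, the default shift is 1 and the only block
taken from each pattern gets shift 0. Raising m raises the possible shift,
but also excludes every signature shorter than m from this matcher, which is
the tension mentioned above.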

There is more history on the topic of algorithms and their use in ClamAV in
the mailing list archives, with discussions of everything from extended
Boyer-Moore to Bloom filters.

Hope this helps,

Dave R.
---
Dave Raynor
Sourcefire Vulnerability Research Team
dray...@sourcefire.com