> Well I can understand that features which are unique to ClamAV might
> demand something more flexible than the Yara specification, although I
> don't profess to have great insight into that.  I wonder if this means
> there's a case for "ClamAV *extensions* to the Yara language" or some
> variation on that theme.  I guess it wouldn't be too difficult to make
> the extensions sufficiently non-Yara like to avoid clashes with future
> developments of Yara itself.  In case it isn't obvious we already have
> a "ClamAV *version* of the Yara language" so this suggestion might not
> be as outrageous as it seems.

The status quo is a subset of what's possible with Yara.  I think that's
vastly different from adding features that won't make any sense to real Yara.
That said, ClamAV extensions to the Yara language isn't a terrible idea.  It's
an idea my boss kicked around a bit when we were chatting last week.  Some
context: he started the Yara signature support effort in ClamAV, and he has
friends in the Yara community.

I don't personally think that Yara + ClamAV extensions will be sufficient for
all the different features we'll need.  But I don't really have a vision in
mind for how that would look.  I'd be happy to be proven wrong by some
proof-of-concept work demonstrating each of the different features we have now.

> (1) a plea for a way to test rules before they go live;

If you mean "for personal use" then I'd say, "What Maarten said."  But if you
mean so that Cisco-Talos malware analysts can do more extensive testing with,
say, "hunting signatures" before publishing them as "malware signatures", then
the answer is different.  I'm probably not the best person to discuss what's in
the works there.  I'll leave that question open to my colleagues on the malware
research side.

> (2) another plea for a parser which is good at its job;

I'm not sure what you mean here.  Can you elaborate?  If you simply want ClamAV
to ignore garbage rules on load and continue with the rest of the file (see
point #4), that's something we can easily improve regardless of what we do.  And
that's how our Yara rule loading logic works right now.
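
To make "ignore garbage rules and continue" concrete, here is a rough sketch
of a tolerant loader.  The one-line "name:hexpattern" format and the
load_rules() function are made up for illustration; this is not ClamAV's
parser or any real signature format.

```python
# Toy illustration only: the "name:hexpattern" line format is invented for
# this sketch and is NOT a real ClamAV or Yara signature format.
import binascii


def load_rules(lines):
    """Return (rules, errors); a bad line is recorded and skipped, never fatal."""
    rules, errors = {}, []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are not rules
        name, sep, hexpat = line.partition(":")
        if not sep or not name:
            errors.append((lineno, "expected name:hexpattern"))
            continue
        try:
            pattern = binascii.unhexlify(hexpat)
        except ValueError as exc:  # binascii.Error is a ValueError subclass
            errors.append((lineno, f"bad hex pattern: {exc}"))
            continue
        if not pattern:
            errors.append((lineno, "empty pattern"))
            continue
        rules[name] = pattern
    return rules, errors


rules, errors = load_rules(["Good.A:4d5a", "broken line", "Bad.Hex:zz", "Good.B:cafe"])
for lineno, reason in errors:
    print(f"skipping rule on line {lineno}: {reason}")
```

The point of returning the errors instead of raising is that one broken rule
costs you that rule only, and the caller still gets a line number to report.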

> (3) a way to specify that a rule is to match in
>     (a) mail headers only or
>     (b) mail body only or
>     (c) both;

This is a neat idea.  It is a new signature language feature request and is a
great example of something that would be hard to implement in the current
ClamAV signature language(s).  If you have any ideas on how this might be
expressed, either in the "ClamAV Yara extensions" idea, in the proposed
"KDL-based signature language", or in some other proposed format, I'd love some
examples.
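
As a starting point for discussion, the semantics of such a scope could be
sketched like this.  The "scope" values and both function names are
hypothetical; nothing like this exists in ClamAV or Yara today.  The sketch
works on raw bytes only (no MIME decoding) and reads "both" as "either part
may match", which is an assumption about what was meant.

```python
# Hypothetical sketch: "scope" is an invented illustration, not an existing
# ClamAV or Yara feature. Patterns are matched against raw, undecoded bytes.

def split_mail(raw):
    """Split a raw message at the first blank line into (headers, body)."""
    hits = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    hits = [(pos, sep) for pos, sep in hits if pos != -1]
    if not hits:
        return raw, b""  # no blank line: treat everything as headers
    pos, sep = min(hits)  # earliest separator wins
    return raw[:pos], raw[pos + len(sep):]


def rule_matches(pattern, scope, raw):
    """Apply a byte pattern to the part of the mail selected by scope."""
    headers, body = split_mail(raw)
    if scope == "headers":
        return pattern in headers
    if scope == "body":
        return pattern in body
    if scope == "both":
        return pattern in headers or pattern in body
    raise ValueError(f"unknown scope: {scope!r}")


msg = (b"Subject: invoice\r\nFrom: a@example.com\r\n\r\n"
       b"Please open the attached invoice.\r\n")
header_hit = rule_matches(b"Subject: invoice", "headers", msg)
body_miss = rule_matches(b"Subject: invoice", "body", msg)
body_hit = rule_matches(b"attached", "body", msg)
```

How that scope would be spelled in a rule file (a Yara meta field, a new
keyword, a KDL property) is the open question.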

> (4) it would be great to have a way to reload rulesets separately so
> it isn't necessary to reload ten million signatures when you've only
> added one Yara rule, only then to find clamd crashes the first time it
> tries to scan anything because you broke that rule.  I understand this
> might be asking a lot, and a decent parser which prevents attempts to
> load garbage rules (point 2) would do a lot to alleviate this pain.

Asking Clam to load additional rules to an existing engine while scans are
ongoing is tricky, but potentially doable.  It throws a wrench into the
works for some hardening ideas I'm proposing for scan process sandboxing.  Sort
of.  It's one more thing to think about.

Asking Clam to unload specific rules from an active scanning engine has the 
same problem plus considerations about how to drop stuff from the trie 
structures without breaking anything.  It's also potentially doable.
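
For a feel of what "dropping stuff from a trie without breaking anything"
involves, here is a toy reference-counted trie.  It is only a sketch: the
Node and Trie classes are invented for this example, and the matcher
structures in a real engine are considerably more involved than a plain
prefix trie.

```python
# Toy prefix trie with per-node reference counts. Invented for illustration;
# it does not model ClamAV's actual matcher structures.

class Node:
    def __init__(self):
        self.children = {}   # byte value -> Node
        self.refs = 0        # stored patterns that pass through this node
        self.names = set()   # signatures whose pattern ends exactly here


class Trie:
    def __init__(self):
        self.root = Node()

    def _find(self, pattern):
        node = self.root
        for byte in pattern:
            node = node.children.get(byte)
            if node is None:
                return None
        return node

    def add(self, name, pattern):
        end = self._find(pattern)
        if end is not None and name in end.names:
            return  # already stored; do not count it twice
        node = self.root
        for byte in pattern:
            node = node.children.setdefault(byte, Node())
            node.refs += 1
        node.names.add(name)

    def remove(self, name, pattern):
        # Walk first, so a wrong name or pattern leaves the trie untouched.
        path, node = [], self.root
        for byte in pattern:
            child = node.children.get(byte)
            if child is None:
                return False
            path.append((node, byte, child))
            node = child
        if name not in node.names:
            return False
        node.names.discard(name)
        for parent, byte, child in path:
            child.refs -= 1
            if child.refs == 0:
                del parent.children[byte]  # no other pattern uses this branch
                break
        return True

    def match_at(self, data, start=0):
        """Names of stored patterns that match data beginning at start."""
        found, node = set(), self.root
        for byte in data[start:]:
            node = node.children.get(byte)
            if node is None:
                break
            found |= node.names
        return found


trie = Trie()
trie.add("Sig.A", b"abcd")
trie.add("Sig.B", b"abxy")
removed = trie.remove("Sig.A", b"abcd")
```

Removing Sig.A prunes only the "cd" branch; the shared "ab" prefix stays
because Sig.B still references it.  Doing the same safely while other threads
are scanning is the part this sketch does not attempt.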

Asking Clam to reload a modified signature database in-place is a different 
story.  Let's say you have a ClamD running that had database version A.  You 
modify the database file so now it's version B and version A is gone.  And you 
want ClamD to look at version B and figure out which signatures have been 
added/removed/changed and update accordingly.  I don't think that's something 
we can do.  When signatures are loaded, we store bits of patterns in a variety 
of structures like tries, lists, hashmaps, etc.  Figuring out which bits to 
remove and which to keep would be a bit of a nightmare.  I would imagine that 
you would have to build a reference as big as the loaded engine while doing it 
just to sort it all out.
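
To illustrate the bookkeeping involved, here is a rough sketch of diffing two
database versions.  The "name:body" line format and both function names are
hypothetical.  Note that the diff is only possible because an index of
version A was kept from load time, and that the sketch stops at naming what
changed; mapping each change back to the pattern fragments spread across the
engine's structures is the hard part and is not shown.

```python
# Hypothetical sketch: "name:body" lines are an invented stand-in for a
# signature database, not a real ClamAV format.
import hashlib


def index_signatures(lines):
    """Map each signature name to a digest of its full text."""
    index = {}
    for line in lines:
        name, _, _body = line.partition(":")
        index[name] = hashlib.sha256(line.encode("utf-8")).hexdigest()
    return index


def diff_indexes(old, new):
    """Return (added, removed, changed) signature names, each sorted."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return added, removed, changed


# The index for version A has to be kept from load time, because the file
# itself has been overwritten by version B.
version_a = index_signatures(["Sig.A:aa", "Sig.B:bb", "Sig.C:cc"])
version_b = index_signatures(["Sig.A:aa", "Sig.B:b0", "Sig.D:dd"])
added, removed, changed = diff_indexes(version_a, version_b)
```

Even in this toy form the index grows with the number of signatures, which is
the "reference as big as the loaded engine" concern in miniature.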

Regards,
Micah


Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.

________________________________
From: clamav-users <clamav-users-boun...@lists.clamav.net> on behalf of G.W. 
Haywood via clamav-users <clamav-users@lists.clamav.net>
Sent: Tuesday, March 15, 2022 10:51 AM
To: ClamAV users ML <clamav-users@lists.clamav.net>
Cc: G.W. Haywood <cla...@jubileegroup.co.uk>
Subject: Re: [clamav-users] human friendly signatures

Hi there,

On Tue, 15 Mar 2022, Laurent S. via clamav-users wrote:
> On Tuesday, March 15th, 2022 at 00:36, Micah Snyder wrote:
>
>> Starting with our own new language would let us maintain do that
>> but make it easier for new analysts to train up on ClamAV.
>
> I don't see at all the advantage of using a different, less used
> language. I don't know many people looking forward to learn a new
> language that is quite specific to one software and used more or
> less nowhere else.

Well I can understand that features which are unique to ClamAV might
demand something more flexible than the Yara specification, although I
don't profess to have great insight into that.  I wonder if this means
there's a case for "ClamAV *extensions* to the Yara language" or some
variation on that theme.  I guess it wouldn't be too difficult to make
the extensions sufficiently non-Yara like to avoid clashes with future
developments of Yara itself.  In case it isn't obvious we already have
a "ClamAV *version* of the Yara language" so this suggestion might not
be as outrageous as it seems.

>> using Yara's engine in clamav directly is something that has been
>> brought up time and again. It is possible. My understanding is that
>> the reason ClamAV's yara support isn't done this way is that it
>> would require a second pass over the file with Yara's pattern
>> matcher, after ClamAV's pattern matcher, and that the performance
>> concern made it make more sense to try and load yara rules into
>> ClamAV's matcher instead.

Speaking selfishly I wouldn't be greatly inconvenienced by an increase
in the scan times (even if it doubles) caused by separating the Yara
engine from the ClamAV engine.  That's because I only scan mail, and
the clamd server is well on top of it.  I can understand that people
who scan filesystems might have a different point of view; maybe both
could be accommodated with a config option.

>> I honestly don't have any numbers to back up this argument. It
>> sounds reasonable, but I'd love to see the numbers.

I occasionally run more than one clamd instance and I've seriously
considered running a separate one purely so that the Yara rules are
kept separate from the rest.  I always log scan times.  It will be a
bit fiddly, but when I get a minute I'll set something up to try to
give you some numbers.

> One big reason I like to use ClamAV is that it's possible to add
> other sources of signatures. Lots of people use the sanesecurity
> ones. I add a lot of my own.

+1

Finally, unashamed repetition:

(1) a plea for a way to test rules before they go live;

(2) another plea for a parser which is good at its job;

(3) a way to specify that a rule is to match in
    (a) mail headers only or
    (b) mail body only or
    (c) both;

and lastly

(4) it would be great to have a way to reload rulesets separately so
it isn't necessary to reload ten million signatures when you've only
added one Yara rule, only then to find clamd crashes the first time it
tries to scan anything because you broke that rule.  I understand this
might be asking a lot, and a decent parser which prevents attempts to
load garbage rules (point 2) would do a lot to alleviate this pain.

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml