Re: [clamav-users] human friendly signatures

2022-06-20 Thread G.W. Haywood via clamav-users

Hi there,

This is a more or less random data point.

On Mon, 14 Mar 2022, Micah Snyder (micasnyd) via clamav-users wrote:


Sorry that this response comes so late that it is nearly a necro-thread. ...


Er, ditto.


... If anyone has any other ideas about it, I'd love to hear them. ...


One thing has become much more obvious lately here and I felt the need
to get it written down somewhere.

We're seeing a lot more spam than ever we used to which is written in
CJKV (Chinese, Japanese, Korean, Vietnamese) using UTF-8 encoding.
It's mostly phishing of some sort.

We use UTF-8 text strings in Yara rules to catch a lot of this spam
for our automatic abuse reporting system.

Obviously to make things human friendly it helps a lot if the terminal
emulators, editors and other tools can render the text as appropriate,
but my point is that, however you manipulate Yara rules for ClamAV, as
things are they work fine for this purpose and I'd really hate to lose
that capability.
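
A small Python sketch of why this works, assuming (as the report above suggests) that a text string in a rule is matched byte-for-byte against the UTF-8 bytes in the mail. The sample phrase and both function names are made up for illustration and are not taken from any real ruleset. It shows the bytes behind a CJK string, which is also what you would need if a tool could not render the text and you had to fall back to a hex string.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as lowercase hex, two digits per byte."""
    return text.encode("utf-8").hex()


def yara_hex_string(text: str) -> str:
    """Format the UTF-8 bytes of text as a hex string body, e.g. { e4 bd a0 }."""
    raw = text.encode("utf-8")
    return "{ " + " ".join(f"{b:02x}" for b in raw) + " }"


if __name__ == "__main__":
    sample = "你好"  # hypothetical sample phrase
    print(utf8_hex(sample))
    print(yara_hex_string(sample))
```

The readable form and the hex form describe the same bytes; the point made above is that the readable form is much easier for a human to maintain.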

--

73,
Ged.
___

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/Cisco-Talos/clamav-documentation

https://docs.clamav.net/#mailing-lists-and-chat


Re: [clamav-users] human friendly signatures

2022-03-21 Thread Kris Deugau

G.W. Haywood via clamav-users wrote:

Hi there,

On Mon, 21 Mar 2022, Kris Deugau wrote:


TBH I'd prefer if Clam *did* continue, just skipping malformed rules
(and also whinging loudly in the log).


I could live with that if it didn't *also* crash.


Either would be better than just exiting (it's not a hard *crash*,
it's "just" refusing to load a file with a malformed signature -
including things like entirely blank lines).


No, Kris.  It *is* a hard crash - and it doesn't happen when it loads
the rules, it happens when it tries to scan something *after* loading
a Yara file which contains a bad rule.  Not necessarily any bad rule,
just one with any of a number of different kinds of badness which I've
found to be problematic.  But as I said in my mail things may well be
different as a result of Micah's August PR.  TBH I really haven't been
inclined for quite some time to crash clamd on purpose. :)


Sorry, didn't see that, figured you were talking about the joy of 
finding all those subtle little rules defining a well-formed signature.
To date I haven't managed to trip whatever bug(s) bit you, although I 
*have* found relatively simple signatures that should have matched but 
didn't.


I *have* pushed out "malformed" "signatures" (AKA "signature files with 
a blank line or two at the end") that caused the production clamd 
instances to shut down...  after which I spent some time adding 
validation to the SVN commit hook, and writing a local editing wrapper 
to help make sure signatures were valid before committing.
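
A minimal sketch of the kind of pre-commit check described above, covering only the blank-line case mentioned in this message. The function names are hypothetical and this is not the actual commit hook; a real hook would need format-specific checks as well.

```python
def find_blank_lines(text: str) -> list:
    """Return 1-based line numbers of lines that are empty or whitespace-only."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.strip()
    ]


def check_signature_text(text: str) -> bool:
    """True if the text has no blank lines (the only check sketched here)."""
    return not find_blank_lines(text)


if __name__ == "__main__":
    candidate = "Sig.One;Target:0;0;aabbcc\nSig.Two;Target:0;0;ddeeff\n\n"
    print(find_blank_lines(candidate))
```

A hook could refuse the commit whenever check_signature_text() returns False and print the offending line numbers.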


-kgd



Re: [clamav-users] human friendly signatures

2022-03-21 Thread G.W. Haywood via clamav-users

Hi there,

On Mon, 21 Mar 2022, Kris Deugau wrote:


TBH I'd prefer if Clam *did* continue, just skipping malformed rules
(and also whinging loudly in the log).


I could live with that if it didn't *also* crash.


Either would be better than just exiting (it's not a hard *crash*,
it's "just" refusing to load a file with a malformed signature -
including things like entirely blank lines).


No, Kris.  It *is* a hard crash - and it doesn't happen when it loads
the rules, it happens when it tries to scan something *after* loading
a Yara file which contains a bad rule.  Not necessarily any bad rule,
just one with any of a number of different kinds of badness which I've
found to be problematic.  But as I said in my mail things may well be
different as a result of Micah's August PR.  TBH I really haven't been
inclined for quite some time to crash clamd on purpose. :)


Strictly speaking, four characters (the {} delimiters for hex
strings). To my reading this is part of the upstream Yara spec, and
I'd be wary of extending this particular bit without at least
requiring some blatant, obvious flag in any such rule to clearly
indicate that it's not stock Yara syntax.


Agreed it needs some thought.  Maybe a different filename extension?
Not that I'm a great fan of systems which rely on filename extensions
to control the behaviour of executables.  Or maybe persuade the folks
upstream to make some enhancements?  That would be best, I think, but
it presupposes that the ClamAV Yara engine catches up - which IMHO is
a necessity in any case.

--

73,
Ged.



Re: [clamav-users] human friendly signatures

2022-03-21 Thread Kris Deugau

G.W. Haywood via clamav-users wrote:

Hi Micah,

On Wed, 16 Mar 2022, Micah Snyder (micasnyd) wrote:

I'm not sure what you mean here.  Can you elaborate?  If you simply
want ClamAV to ignore garbage rules on load and continue with the rest
of the file (see point #4) - that's something we can easily improve
regardless of what we do. And that's how our yara rule loading logic
works right now.


I strongly feel that if it finds a problem, rather than silently load
some sub-optimal ruleset the parser should abandon the reload of the
entire ruleset.  Obviously it should warn when it does that.  I guess
this might be an issue if it's running on a machine with too little
RAM to reload while simultaneously scanning with the previous ruleset,
but something like a --test-ruleset option could probably handle that.


TBH I'd prefer if Clam *did* continue, just skipping malformed rules 
(and also whinging loudly in the log).


Either would be better than just exiting (it's not a hard *crash*, it's 
"just" refusing to load a file with a malformed signature - including 
things like entirely blank lines).




While I was looking at this I also came upon another quirk that can be
a bit of a nuisance.  AFAICT Yara strings can only be delimited by one
of two characters, either a double-quote (for a literal string) or a
forward-slash (for a regex).  It would help to be able to choose the
quote character like in Perl; if not, at least having more available
to choose from could make many expressions more readable, especially
those which target e.g. HTML and links in mail (both of which tend to
have many occurrences of double-quote or forward-slash characters).


Strictly speaking, four characters (the {} delimiters for hex strings). 
To my reading this is part of the upstream Yara spec, and I'd be wary of 
extending this particular bit without at least requiring some blatant, 
obvious flag in any such rule to clearly indicate that it's not stock 
Yara syntax.


-kgd



Re: [clamav-users] human friendly signatures

2022-03-19 Thread G.W. Haywood via clamav-users

Hi Micah,

On Wed, 16 Mar 2022, Micah Snyder (micasnyd) wrote:


(1) a plea for a way to test rules before they go live;


If you mean "for personal use" then I'd say, "What Maarten said."


Er, no.  Not "scan to make sure it detects things".  What I meant was
"do something to make sure it won't e.g. crash clamd when it tries to
scan something after this rule has been loaded" - but see (2) below.


(2) another plea for a parser which is good at its job;


I'm not sure what you mean here.  Can you elaborate?  If you simply
want ClamAV to ignore garbage rules on load and continue with the rest
of the file (see point #4) - that's something we can easily improve
regardless of what we do. And that's how our yara rule loading logic
works right now.


I strongly feel that if it finds a problem, rather than silently load
some sub-optimal ruleset the parser should abandon the reload of the
entire ruleset.  Obviously it should warn when it does that.  I guess
this might be an issue if it's running on a machine with too little
RAM to reload while simultaneously scanning with the previous ruleset,
but something like a --test-ruleset option could probably handle that.
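
The two policies argued over in this thread (abandon the whole reload, or skip the bad rules and complain) can be sketched in Python as follows. This is only an illustration: is_valid() is a hypothetical stand-in check, not ClamAV's parser, and none of these names exist in ClamAV.

```python
import logging

log = logging.getLogger("ruleset")


def is_valid(rule: str) -> bool:
    # Hypothetical stand-in check: non-blank text with balanced curly braces.
    return bool(rule.strip()) and rule.count("{") == rule.count("}")


def load_strict(current: list, candidate: list) -> list:
    """All-or-nothing: any bad rule means the previous ruleset stays active."""
    bad = [rule for rule in candidate if not is_valid(rule)]
    if bad:
        log.warning("reload abandoned: %d malformed rule(s)", len(bad))
        return current
    return candidate


def load_lenient(candidate: list) -> list:
    """Skip-and-warn: load every valid rule and log each one that is dropped."""
    good = []
    for rule in candidate:
        if is_valid(rule):
            good.append(rule)
        else:
            log.warning("skipping malformed rule: %r", rule)
    return good
```

Either way the failure is logged and the scanner keeps running, which is the property both sides of the discussion seem to want.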

The following is from something I was doing back in June 2021, so it's
before your pull request of 2021.08.21:

https://github.com/Cisco-Talos/clamav/pull/261

and I haven't retested, but these are the sorts of things that were
driving me crazy around the middle of last year:

8<--
$ diff -U3 Garbage_Rules.yar.~140~.NBG Garbage_Rules.yar.~141~.OK 
--- Garbage_Rules.yar.~140~.NBG 2021-06-13 08:05:54.218256634 +0100
+++ Garbage_Rules.yar.~141~.OK  2021-06-13 08:08:11.025783287 +0100
@@ -30,7 +30,7 @@
  strings:
$ = /update from GOV.{1,10}UK/ ascii nocase
  condition:
-   any of them and not Blacklist_1
+   not Blacklist_1 and any of them
 }
8<--
$ cat does_not_notice_missing_curly_brace 
private rule Email_marketing
{
  strings:
$ = "email marketing" ascii nocase  // Testing
  condition:
any of them
}

// Test private rule
rule Garbage_spam_testing_Rule
  strings:
$TLD_4_to_20_chars = /htt(p|ps):\/\/[-a-z0-9]{3,50}\.[a-z]{4,20}\/./ ascii nocase
$ = "email marketing" ascii nocase
  condition:
all of them
}
8<--
$ cat does_not_notice_missing_dollar_symbol 
--- Garbage_Rules.yar.~297~ 2021-07-30 14:43:26.540758502 +0100
+++ Garbage_Rules.yar   2021-07-30 14:46:30.277470587 +0100
@@ -29,7 +29,7 @@
rule test_single_string
{
  strings:
-   = /cc.{1,3}ab...@jubileegroup.co.uk/ ascii nocase
+ $ = /cc.{1,3}ab...@jubileegroup.co.uk/ ascii nocase
  condition:
any of them
}
8<--
$ cat five_more_yara_bugs 
See Garbage rules of late June to early July 2021.


1. It doesn't notice if you have more than one string with the same name.

2. If you have a string with a name that isn't referenced in the condition,
   it crashes.

3. It crashes if you mistakenly write (see 199-200) something like
   condition:
     Spam_trap and ( any of ($spammer_*) or any of ($warning_*) or (#publish_* > 4) )

4. It crashes if you mistakenly write something like .*{range} (for example
   see 200-201)
     $ = /we.*{1,50}(sell|sale)/ ascii nocase
   which should be
     $ = /we.{1,50}(sell|sale)/ ascii nocase

5. If you want to match "Alfreton, Derbyshire" the string
   "Alfreton, Derbyshire" *does* match if you use the form

     $ = "Alfreton, Derbyshire" ascii nocase

   but it does *not* match if you use the form

     $ = "Alfreton, Derbyshire" ascii
8<--
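
Two of the mistakes listed above (a duplicated string name, and a quantifier such as * followed directly by a {range}) can be caught with a crude text scan before a rule file goes anywhere near clamd. The Python sketch below is an assumption-laden illustration, not a Yara parser: the function names are hypothetical, it only looks at the text, and it will give a false alarm on an escaped \* followed by a brace.

```python
import re
from collections import Counter

# Named string definitions such as "$foo = ..."; anonymous "$ =" is ignored.
NAMED_STRING = re.compile(r"^\s*(\$\w+)\s*=", re.MULTILINE)
# A "*", "+" or "?" immediately followed by "{n}" or "{n,m}", as in /we.*{1,50}/.
STACKED_QUANTIFIER = re.compile(r"[*+?]\{\d+(?:,\d*)?\}")


def duplicate_string_names(rule_text: str) -> list:
    """Return the string names that are defined more than once."""
    counts = Counter(NAMED_STRING.findall(rule_text))
    return sorted(name for name, n in counts.items() if n > 1)


def stacked_quantifiers(rule_text: str) -> list:
    """Return every quantifier-then-range sequence found in the text."""
    return STACKED_QUANTIFIER.findall(rule_text)
```

A wrapper script could refuse to install a rule file whenever either function returns a non-empty list.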

See also e.g.

https://bugzilla.clamav.net/show_bug.cgi?id=12095

While I was looking at this I also came upon another quirk that can be
a bit of a nuisance.  AFAICT Yara strings can only be delimited by one
of two characters, either a double-quote (for a literal string) or a
forward-slash (for a regex).  It would help to be able to choose the
quote character like in Perl; if not, at least having more available
to choose from could make many expressions more readable, especially
those which target e.g. HTML and links in mail (both of which tend to
have many occurrences of double-quote or forward-slash characters).
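
Until other delimiters are available, one workaround is to generate the escaped form rather than type it. A minimal Python sketch follows, assuming Python 3.7 or later (where re.escape leaves "/" alone) and assuming that Python's idea of a regex metacharacter is close enough to the Yara dialect in use. Both helper names are hypothetical.

```python
import re


def yara_regex_literal(text: str) -> str:
    """Escape text for use between /.../ delimiters in a regex string."""
    # re.escape handles regex metacharacters; "/" is not one of them,
    # so the delimiter itself is escaped in a second step.
    return re.escape(text).replace("/", r"\/")


def yara_text_literal(text: str) -> str:
    """Escape text for use between double quotes in a text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


if __name__ == "__main__":
    print(yara_regex_literal("http://a.b/"))
    print(yara_text_literal('<a href="x">'))
```

The output is no easier to read than hand-escaped text, which is the complaint made above; it only removes the chance of missing a slash or a quote.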


(3) a way to specify that a rule is to match in
(a) mail headers only or
(b) mail body only or
(c) both;


This is a neat idea.  It is a new signature language feature request
and is a great example of something that would be hard to implement
in the current clamav signature language(s).  If you have any ideas
on how this may be expressed either in the "clamav yara extensions"
idea or in the proposed "KDL-based signature language" or some other
proposed format, I'd love some examples.


In Yara, something like

rule only_match_RFQ_i

Re: [clamav-users] human friendly signatures

2022-03-16 Thread Steve Basford

On 16 March 2022 22:16:05 Eric Tykwinski  wrote:

Steve,

I like the idea, but why the hex; hex?


Sorry, should have been clearer... not just hex but

Test;Engine:81-255,Target:0;(b0&h1);0f0f0f*0b0b0b;0/blah*(?:[4-7]|[8003]\d)/
etc...

> Just thinking about my recent issues with direct deposit phishing
> emails from gmail.com and they are written probably by people, so I
> can’t really hash it, and have to regex it.



Cheers,

Steve
Twitter: @sanesecurity



Re: [clamav-users] human friendly signatures

2022-03-16 Thread Steve Basford

On 16 March 2022 22:16:05 Eric Tykwinski  wrote:

Steve,

I like the idea, but why the hex; hex?
Just thinking about my recent issues with direct deposit phishing emails 
from gmail.com and they are written probably by people, so I can’t really 
hash it, and have to regex it.







On Mar 16, 2022, at 5:10 PM, Steve Basford  
wrote:


On 16 March 2022 20:29:19 "Micah Snyder \(micasnyd\) via clamav-users" 
 wrote:

yara rule loading logic works right now.



(3) a way to specify that a rule is to match in
(a) mail headers only or
(b) mail body only or
(c) both;
Just a random early thought... could .ldb be extended... by reading the 
whole message processing  as normal... but if its a header line mark as h, 
body with a b...



So if the ldb could be extended with h/b... you could still use the normal 
ldb logic...



Test;Engine:81-255,Target:0;(h0&b0=0);hex;hex


Test;Engine:81-255,Target:0;(b0);

h=headers only line
b=body only line

So h0 hex will only match if it's a header line
So b0 hex will only match if it's a body line
Sorry for the formatting.. on mobile.


Cheers,

Steve
Twitter: @sanesecurity



Cheers,

Steve
Twitter: @sanesecurity



Re: [clamav-users] human friendly signatures

2022-03-16 Thread Eric Tykwinski
Steve,

I like the idea, but why the hex; hex?
Just thinking about my recent issues with direct deposit phishing emails from 
gmail.com and they are written probably by people, so I can’t really hash it, 
and have to regex it.

> On Mar 16, 2022, at 5:10 PM, Steve Basford  
> wrote:
> 
> On 16 March 2022 20:29:19 "Micah Snyder \(micasnyd\) via clamav-users" 
> <clamav-users@lists.clamav.net> wrote:
> 
>>  yara rule loading logic works right now.
>> 
>> > (3) a way to specify that a rule is to match in
>> > (a) mail headers only or
>> > (b) mail body only or
>> > (c) both;
>> 
>> 
> 
> Just a random early thought... could .ldb be extended... by reading the whole 
> message processing  as normal... but if its a header line mark as h, body 
> with a b... 
> 
> So if the ldb could be extended with h/b... you could still use the normal 
> ldb logic... 
> 
> Test;Engine:81-255,Target:0;(h0&b0=0);hex;hex
> 
> Test;Engine:81-255,Target:0;(b0);
> 
> h=headers only line
> b=body only line
> 
> So h0 hex will only match if it's a header line
> So b0 hex will only match if it's a body line
> Sorry for the formatting.. on mobile.
> 
> Cheers,
> 
> Steve
> Twitter: @sanesecurity
> 



Re: [clamav-users] human friendly signatures

2022-03-16 Thread Steve Basford
On 16 March 2022 20:29:19 "Micah Snyder \(micasnyd\) via clamav-users" 
 wrote:

yara rule loading logic works right now.



(3) a way to specify that a rule is to match in
(a) mail headers only or
(b) mail body only or
(c) both;
Just a random early thought... could .ldb be extended... by reading the 
whole message processing  as normal... but if its a header line mark as h, 
body with a b...



So if the ldb could be extended with h/b... you could still use the normal 
ldb logic...



Test;Engine:81-255,Target:0;(h0&b0=0);hex;hex


Test;Engine:81-255,Target:0;(b0);

h=headers only line
b=body only line

So h0 hex will only match if it's a header line
So b0 hex will only match if it's a body line
Sorry for the formatting.. on mobile.
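
The h/b idea above amounts to tagging each part of the message and letting a subsignature say which part it applies to. A rough Python sketch of that scoping follows. It is not ClamAV code: it simply splits a raw message at the first blank line, and the scope letters and function names are assumptions chosen to mirror the proposal.

```python
def split_message(raw: str) -> tuple:
    """Split a mail message into (headers, body) at the first blank line."""
    normalized = raw.replace("\r\n", "\n")
    headers, _, body = normalized.partition("\n\n")
    return headers, body


def matches(raw: str, needle: str, scope: str) -> bool:
    """scope is 'h' (headers only), 'b' (body only) or 'hb' (either)."""
    headers, body = split_message(raw)
    found_h = needle in headers
    found_b = needle in body
    if scope == "h":
        return found_h
    if scope == "b":
        return found_b
    if scope == "hb":
        return found_h or found_b
    raise ValueError(f"unknown scope: {scope!r}")
```

With something like this, a logical expression such as (h0&b0) would mean "subsignature 0 hit in the headers and subsignature 0 hit in the body", each evaluated against its own part of the message.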


Cheers,

Steve
Twitter: @sanesecurity



Re: [clamav-users] human friendly signatures

2022-03-16 Thread Micah Snyder (micasnyd) via clamav-users
> Well I can understand that features which are unique to ClamAV might
> demand something more flexible than the Yara specification, although I
> don't profess to have great insight into that.  I wonder if this means
> there's a case for "ClamAV *extensions* to the Yara language" or some
> variation on that theme.  I guess it wouldn't be too difficult to make
> the extensions sufficiently non-Yara like to avoid clashes with future
> developments of Yara itself.  In case it isn't obvious we already have
> a "ClamAV *version* of the Yara language" so this suggestion might not
> be as outrageous as it seems.

The status quo is a subset of what's possible with Yara.  I think that's 
vastly different than adding features that won't make any sense to real-Yara. 
That said, ClamAV extensions to the Yara language isn't a terrible idea.  It's 
an idea my boss kicked around a bit when we were chatting last week.  Some 
context: He started the Yara signature support effort in ClamAV. And he has 
friends in the Yara community.
I don't personally think that Yara + ClamAV extensions will be sufficient for 
all the different features we'll need.  But I don't really have a vision in 
mind for how that would look.  I'd be happy to be proven wrong with some 
proof-of-concept work demonstrating each of the different features we have now.

> (1) a plea for a way to test rules before they go live;

If you mean "for personal use" then I'd say, "What Maarten said."  But if you 
mean so Cisco-Talos malware analysts can do more extensive testing with like 
"hunting signatures" before publishing as "malware signatures" then the answer 
is different.  I'm probably not the best person to discuss what's in the works 
there.  I'll leave that question open to my colleagues on the malware research 
side.

> (2) another plea for a parser which is good at its job;

I'm not sure what you mean here.  Can you elaborate?  If you simply want ClamAV 
to ignore garbage rules on load and continue with the rest of the file (see point 
#4) - that's something we can easily improve regardless of what we do. And 
that's how our yara rule loading logic works right now.

> (3) a way to specify that a rule is to match in
> (a) mail headers only or
> (b) mail body only or
> (c) both;

This is a neat idea.  It is a new signature language feature request and is a 
great example of something that would be hard to implement in the current 
clamav signature language(s).  If you have any ideas on how this may be 
expressed either in the "clamav yara extensions" idea or in the proposed 
"KDL-based signature language" or some other proposed format, I'd love some 
examples.

> (4) it would be great to have a way to reload rulesets separately so
> it isn't necessary to reload ten million signatures when you've only
> added one Yara rule, only then to find clamd crashes the first time it
> tries to scan anything because you broke that rule.  I understand this
> might be asking a lot, and a decent parser which prevents attempts to
> load garbage rules (point 2) would do a lot to alleviate this pain.

Asking Clam to load additional rules to an existing engine while scans are 
ongoing is tricky, but potentially doable.  It throws a wrench into the 
works for some hardening ideas I'm proposing for scan process sandboxing.  Sort 
of.  It's more to think about.

Asking Clam to unload specific rules from an active scanning engine has the 
same problem plus considerations about how to drop stuff from the trie 
structures without breaking anything.  It's also potentially doable.

Asking Clam to reload a modified signature database in-place is a different 
story.  Let's say you have a ClamD running that had database version A.  You 
modify the database file so now it's version B and version A is gone.  And you 
want ClamD to look at version B and figure out which signatures have been 
added/removed/changed and update accordingly.  I don't think that's something 
we can do.  When signatures are loaded, we store bits of patterns in a variety 
of structures like tries, lists, hashmaps, etc.  Figuring out which bits to 
remove and which to keep would be a bit of a nightmare.  I would imagine that 
you would have to build a reference as big as the loaded engine while doing it 
just to sort it all out.
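
To illustrate the distinction drawn above: working out which signature lines differ between two database files is the easy half, provided both versions are still available. A minimal Python sketch follows (the function name and sample lines are hypothetical). It says nothing about the hard half, which is undoing what the removed signatures contributed to the loaded tries, lists and hashmaps.

```python
def diff_signatures(old_lines: list, new_lines: list) -> tuple:
    """Return (added, removed) signature lines between two database versions."""
    old, new = set(old_lines), set(new_lines)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    version_a = ["Sig.A;1", "Sig.B;2"]
    version_b = ["Sig.B;2", "Sig.C;3"]
    added, removed = diff_signatures(version_a, version_b)
    print("added:", added)
    print("removed:", removed)
```

A changed signature shows up here as one removal plus one addition. In the scenario described above, version A is already gone from disk, so even this simple comparison would need the engine to keep its own copy of what it loaded.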

Regards,
Micah


Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.


From: clamav-users  on behalf of G.W. 
Haywood via clamav-users 
Sent: Tuesday, March 15, 2022 10:51 AM
To: ClamAV users ML 
Cc: G.W. Haywood 
Subject: Re: [clamav-users] human friendly signatures

Hi there,

On Tue, 15 Mar 2022, Laurent S. via clamav-users wrote

Re: [clamav-users] human friendly signatures

2022-03-16 Thread Micah Snyder (micasnyd) via clamav-users
Augh! Some hot-key combination just sent my email draft! Sorry! I was working 
on a list of the different distinct file formats we currently have, none of 
which are very easy to read.
I'm hoping to illustrate that if we can consolidate this down to something 
user-friendly it will be a big improvement.

Basing the file structure on the KDL language is just my initial proposal.  My 
teammate Scott is brainstorming some other ideas.  We have yet to make any hard 
decisions.

I agree with you about some sort of scoring.  Some signatures might never 
indicate maliciousness and be very-weak indicators.  Some might be very strong 
indicators; e.g. hash-based sigs for ransomware.  I too would like to add some 
different levels.  I don't know if a number-based scoring system makes sense, 
or if just a handful of different categories is sufficient.  More research 
needed.
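
As a strawman for the number-based option, here is a minimal Python sketch in which each signature carries a weight and an alert is raised only when the combined weight of all hits reaches a threshold. The signature names, weights and threshold are invented for illustration; nothing like this exists in ClamAV today.

```python
def total_score(hits: list, weights: dict, default: int = 0) -> int:
    """Sum the weights of the signatures that matched."""
    return sum(weights.get(name, default) for name in hits)


def verdict(hits: list, weights: dict, threshold: int) -> bool:
    """Alert only when the combined score reaches the threshold."""
    return total_score(hits, weights) >= threshold


if __name__ == "__main__":
    weights = {"Weak.Macro": 2, "Weak.Url": 3, "Ransom.Hash": 10}
    print(verdict(["Weak.Macro"], weights, threshold=5))
    print(verdict(["Weak.Macro", "Weak.Url"], weights, threshold=5))
```

A category scheme would be the same idea with a small fixed set of weights, and the threshold is where a site could choose its own tolerance for false positives.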

Regards,
Micah



Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.

From: Micah Snyder (micasnyd) 
Sent: Wednesday, March 16, 2022 12:10 PM
To: ClamAV users ML ; Laurent S. 
<110ef9e3086d8405c2929e34be5b4...@protonmail.ch>
Subject: Re: [clamav-users] human friendly signatures

The goal for the new sig format would be to include all the existing signature 
features currently spread across the existing ClamAV-specific signature file 
formats.
Right now we have different file formats for:

  *   NDB
  *   LDB
  *   CDB
  *   FTM
  *   CRB
  *   CFG
  *   PDB, WDB, HDB, HSB, MDB, MSB, FP, SFP, IGN2, and PWDB

We are starting from multiple file formats that are hard to read, hard to write, and hard to 
extend. We would like to consolidate these down into one new format that is easier both for 
the signature authors and the developers.
We want to make a sigtool feature that can transcode from the old to the new, 
though we have no plans to remove support for the old signature formats. We 
might say they're deprecated to encourage folks to develop new content in the 
new format, but they would continue to work for the foreseeable future.

New signature features would only be added to the new signature format.

The goal is not to do away with Yara rule support.  We will continue to try to 
maintain the existing (limited) Yara rule support, and are still open to 
improving it.



Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.


From: clamav-users  on behalf of Laurent 
S. via clamav-users 
Sent: Tuesday, March 15, 2022 3:42 AM
To: ClamAV users ML 
Cc: Laurent S. <110ef9e3086d8405c2929e34be5b4...@protonmail.ch>
Subject: Re: [clamav-users] human friendly signatures

On Tuesday, March 15th, 2022 at 00:36, Micah Snyder (micasnyd) 
 wrote:

> Starting with our own new language would let us maintain do that but make it 
> easier for new analysts to train up on ClamAV.

I don't see at all the advantage of using a different, less used language. I 
don't know many people looking forward to learning a new language that is quite 
specific to one software and used more or less nowhere else.

One big reason I like to use ClamAV is that it's possible to add other sources 
of signatures. Lots of people use the sanesecurity ones. I add a lot of my own. 
I suppose there are a lot of people who would love to add more (i.e. YARA) 
sources.

Is the goal for KDL to replace all of the existing ClamAV formats? I guess the 
transition would be a whole lot of effort from a LOT of people.

> What would be even more cool would be to be able to have an archive alert 
> because we found weak indicators in several of the contained files.


I love the idea of weak indicators. But then, I'd like to have a more fine 
grained result in case of a hit. Something less binary but more something like 
a score. So that the amount of false positives could be more chosen. This would 
mean my paranoid customers could be as happy as the ones jumping to the roof at 
the first FP.

Best regards,
Laurent S.



Re: [clamav-users] human friendly signatures

2022-03-16 Thread Micah Snyder (micasnyd) via clamav-users
The goal for the new sig format would be to include all the existing signature 
features currently spread across the existing ClamAV-specific signature file 
formats.
Right now we have different file formats for:

  *   NDB
  *   LDB
  *   CDB
  *   FTM
  *   CRB
  *   CFG
  *   PDB, WDB, HDB, HSB, MDB, MSB, FP, SFP, IGN2, and PWDB

We are starting from multiple file formats that are hard to read, hard to write, and hard to 
extend. We would like to consolidate these down into one new format that is easier both for 
the signature authors and the developers.
We want to make a sigtool feature that can transcode from the old to the new, 
though we have no plans to remove support for the old signature formats. We 
might say they're deprecated to encourage folks to develop new content in the 
new format, but they would continue to work for the foreseeable future.

New signature features would only be added to the new signature format.

The goal is not to do away with Yara rule support.  We will continue to try to 
maintain the existing (limited) Yara rule support, and are still open to 
improving it.



Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.


From: clamav-users  on behalf of Laurent 
S. via clamav-users 
Sent: Tuesday, March 15, 2022 3:42 AM
To: ClamAV users ML 
Cc: Laurent S. <110ef9e3086d8405c2929e34be5b4...@protonmail.ch>
Subject: Re: [clamav-users] human friendly signatures

On Tuesday, March 15th, 2022 at 00:36, Micah Snyder (micasnyd) 
 wrote:

> Starting with our own new language would let us maintain do that but make it 
> easier for new analysts to train up on ClamAV.

I don't see at all the advantage of using a different, less used language. I 
don't know many people looking forward to learning a new language that is quite 
specific to one software and used more or less nowhere else.

One big reason I like to use ClamAV is that it's possible to add other sources 
of signatures. Lots of people use the sanesecurity ones. I add a lot of my own. 
I suppose there are a lot of people who would love to add more (i.e. YARA) 
sources.

Is the goal for KDL to replace all of the existing ClamAV formats? I guess the 
transition would be a whole lot of effort from a LOT of people.

> What would be even more cool would be to be able to have an archive alert 
> because we found weak indicators in several of the contained files.


I love the idea of weak indicators. But then, I'd like to have a more fine 
grained result in case of a hit. Something less binary but more something like 
a score. So that the amount of false positives could be more chosen. This would 
mean my paranoid customers could be as happy as the ones jumping to the roof at 
the first FP.

Best regards,
Laurent S.



Re: [clamav-users] human friendly signatures

2022-03-15 Thread Maarten Broekman via clamav-users
On Tue, Mar 15, 2022 at 1:53 PM G.W. Haywood via clamav-users <
clamav-users@lists.clamav.net> wrote:

> Hi there,
>
> On Tue, 15 Mar 2022, Laurent S. via clamav-users wrote:
> >> using Yara's engine in clamav directly is something that has been
> >> brought up time and again. It is possible. My understanding is that
> >> the reason ClamAV's yara support isn't done this way is that it
> >> would require a second pass over the file with a Yara's pattern
> >> matcher, after ClamAV's pattern matcher, and that the performance
> >> concern made it make more sense to try and load yara rules into
> >> ClamAV's matcher instead.
>
> Speaking selfishly I wouldn't be greatly inconvenienced by an increase
> in the scan times (even if it doubles) caused by separating the Yara
> engine from the ClamAV engine.  That's because I only scan mail, and
> the clamd server is well on top of it.  I can understand that people
> who scan filesystems might have a different point of view; maybe both
> could be accommodated with a config option.


Anything that increases scan times would be prohibitive for me. We use
ClamAV to scan around a billion files per day and the primary thing
stopping us from using Yara is the increase in scan times.


> >> I honestly don't have any numbers to back up this argument. It
> >> sounds reasonable, but I'd love to see the numbers.
>
> I occasionally run more than one clamd instance and I've seriously
> considered running a separate one purely so that Yara rules are
> kept separate from the rest.  I always log scan times.  It will be a
> bit fiddly, but when I get a minute I'll set something up to try to
> give you some numbers.
>
>
We run multiple clamd instances specifically to load different sets of
signatures for different purposes.

For example, if we have instance 1 with very specific signatures and
instance 2 with more general signatures and instance 3 with ClamAV / 3rd
party signatures, we would first scan against instance 1 and, if we don't
get a match, we then scan against instance 2 and, if still no match,
against instance 3.
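
The cascade described above can be sketched in Python as a list of scanners tried in order. The scanners here are plain callables standing in for the three clamd instances; how each one actually talks to its clamd is deliberately left out, and all names are hypothetical.

```python
from typing import Callable, Optional, Sequence

# A scanner takes the file content and returns a signature name or None.
Scanner = Callable[[bytes], Optional[str]]


def cascade_scan(data: bytes, scanners: Sequence[Scanner]) -> Optional[str]:
    """Try each scanner in order and stop at the first match."""
    for scan in scanners:
        result = scan(data)
        if result is not None:
            return result
    return None
```

Ordering the instances from most specific to most general means that a file matching a specific signature never reaches the larger, slower rulesets.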


> > One big reason I like to use ClamAV is that it's possible to add
> > other sources of signatures. Lots of people use the sanesecurity
> > ones. I add a lot of my own.
>
> +1
>
>
For us, the attraction is the ease of creating our own signatures more than
the 3rd-party signatures, though 3rd-party signatures are a definite plus.


> Finally, unashamed repetition:
>
> (1) a plea for a way to test rules before they go live;
>

This is relatively straightforward to do on your own (save the signatures
in a temp location, create a file with something that you know will match,
and scan to make sure it is detected), so the fact that it's not built-in
is a bit confusing.
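
The steps above can be sketched in Python roughly as follows. This is an
illustration under assumptions: it relies on clamscan being on the PATH,
and the function names and any paths passed in are hypothetical. The
`--database` and `--no-summary` options and the 0/1/2 exit codes are the
ones described in the clamscan manual.

```python
import subprocess
from typing import List


def clamscan_command(db_path: str, sample_path: str) -> List[str]:
    """Build a clamscan call that loads only the candidate signature file."""
    return ["clamscan", "--no-summary", f"--database={db_path}", sample_path]


def signature_detects(db_path: str, sample_path: str) -> bool:
    """Return True if the candidate signatures flag the known-bad sample."""
    result = subprocess.run(
        clamscan_command(db_path, sample_path),
        capture_output=True,
        text=True,
    )
    # clamscan exits 0 for clean, 1 for a detection and 2 for an error,
    # for example a signature file it refuses to load. A process killed by
    # a signal shows up as a negative return code and is also rejected here.
    if result.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {result.stderr.strip()}")
    return result.returncode == 1
```

Because the sample is actually scanned, a rule that loads but breaks at
scan time is caught here rather than on the live server.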


> (2) another plea for a parser which is good at its job;
>
> (3) a way to specify that a rule is to match in
>    (a) mail headers only or
>    (b) mail body only or
>    (c) both;
>

This would be awesome for mail, but also for any file that has
differentiated parts. It would be great to have a better macro style that
would allow you to combine multiple signatures to produce a different
classification (sort of like logical signatures, but with the ability for
each sub-signature to hit different filetypes).
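
To make the headers-only / body-only / both request concrete, here is a
rough Python sketch of the behaviour being asked for. It is not how
ClamAV works today; `match_in_scope` and its `scope` values are invented
for illustration, and it looks only at the raw, undecoded message bytes.

```python
import re


def match_in_scope(raw: bytes, needle: bytes, scope: str = "both") -> bool:
    """Search for needle in the headers, the body, or either part of a message."""
    # The header block ends at the first empty line; a message without one
    # is treated as having no body.
    parts = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    headers = parts[0]
    body = parts[1] if len(parts) > 1 else b""
    if scope == "headers":
        return needle in headers
    if scope == "body":
        return needle in body
    if scope == "both":
        return needle in headers or needle in body
    raise ValueError(f"unknown scope: {scope!r}")
```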

> and lastly
>
> (4) it would be great to have a way to reload rulesets separately so
> it isn't necessary to reload ten million signatures when you've only
> added one Yara rule, only then to find clamd crashes the first time it
> tries to scan anything because you broke that rule.  I understand this
> might be asking a lot, and a decent parser which prevents attempts to
> load garbage rules (point 2) would do a lot to alleviate this pain.
>

100% this. Having the ability to load a diff rather than the complete
database would be an enormous boon.

--Maarten



Re: [clamav-users] human friendly signatures

2022-03-15 Thread G.W. Haywood via clamav-users

Hi there,

On Tue, 15 Mar 2022, Laurent S. via clamav-users wrote:

> On Tuesday, March 15th, 2022 at 00:36, Micah Snyder wrote:
>
>> Starting with our own new language would let us maintain do that
>> but make it easier for new analysts to train up on ClamAV.
>
> I don't see at all the advantage of using a different, less used
> language. I don't know many people looking forward to learning a
> new language that is quite specific to one piece of software and
> used more or less nowhere else.


Well I can understand that features which are unique to ClamAV might
demand something more flexible than the Yara specification, although I
don't profess to have great insight into that.  I wonder if this means
there's a case for "ClamAV *extensions* to the Yara language" or some
variation on that theme.  I guess it wouldn't be too difficult to make
the extensions sufficiently non-Yara like to avoid clashes with future
developments of Yara itself.  In case it isn't obvious we already have
a "ClamAV *version* of the Yara language" so this suggestion might not
be as outrageous as it seems.


>> Using Yara's engine in clamav directly is something that has been
>> brought up time and again. It is possible. My understanding is that
>> the reason ClamAV's yara support isn't done this way is that it
>> would require a second pass over the file with Yara's pattern
>> matcher, after ClamAV's pattern matcher, and that the performance
>> concern made it make more sense to try and load yara rules into
>> ClamAV's matcher instead.


Speaking selfishly I wouldn't be greatly inconvenienced by an increase
in the scan times (even if it doubles) caused by separating the Yara
engine from the ClamAV engine.  That's because I only scan mail, and
the clamd server is well on top of it.  I can understand that people
who scan filesystems might have a different point of view; maybe both
could be accommodated with a config option.


>> I honestly don't have any numbers to back up this argument. It
>> sounds reasonable, but I'd love to see the numbers.


I occasionally run more than one clamd instance and I've seriously
considered running a separate one purely so that the Yara rules are
kept separate from the rest.  I always log scan times.  It will be a
bit fiddly, but when I get a minute I'll set something up to try to
give you some numbers.


> One big reason I like to use ClamAV is that it's possible to add
> other sources of signatures. Lots of people use the sanesecurity
> ones. I add a lot of my own.


+1

Finally, unashamed repetition:

(1) a plea for a way to test rules before they go live;

(2) another plea for a parser which is good at its job;

(3) a way to specify that a rule is to match in
   (a) mail headers only or
   (b) mail body only or
   (c) both;

and lastly

(4) it would be great to have a way to reload rulesets separately so
it isn't necessary to reload ten million signatures when you've only
added one Yara rule, only then to find clamd crashes the first time it
tries to scan anything because you broke that rule.  I understand this
might be asking a lot, and a decent parser which prevents attempts to
load garbage rules (point 2) would do a lot to alleviate this pain.

--

73,
Ged.



Re: [clamav-users] human friendly signatures

2022-03-15 Thread Laurent S. via clamav-users

On Tuesday, March 15th, 2022 at 00:36, Micah Snyder (micasnyd) wrote:

> Starting with our own new language would let us maintain do that but make it 
> easier for new analysts to train up on ClamAV.

I don't see at all the advantage of using a different, less used language. I 
don't know many people looking forward to learning a new language that is quite 
specific to one piece of software and used more or less nowhere else.

One big reason I like to use ClamAV is that it's possible to add other sources 
of signatures. Lots of people use the sanesecurity ones. I add a lot of my own. 
I suppose there are a large number of people who would love to add more (i.e. 
YARA) sources.

Is the goal for KDL to replace all of the existing ClamAV formats? I guess the 
transition would be a whole lot of effort from a LOT of people.

> What would be even more cool would be to be able to have an archive alert 
> because we found weak indicators in several of the contained files. 


I love the idea of weak indicators. But then I'd like a more fine-grained 
result in case of a hit: something less binary and more like a score, so that 
the tolerated rate of false positives could be tuned. That way my paranoid 
customers could be as happy as the ones who hit the roof at the first FP.
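
As a rough illustration of what I mean by a score, here is a minimal 
Python sketch. Everything in it is hypothetical: the indicator names, the 
weights and the `score`/`verdict` functions are invented for the example 
and do not correspond to anything ClamAV provides.

```python
from typing import Dict, Iterable

# Hypothetical weights per weak indicator; the names and values are
# invented for illustration and are not real ClamAV signatures.
WEIGHTS: Dict[str, int] = {
    "Weak.MacroAutoOpen": 40,
    "Weak.ObfuscatedJS": 30,
    "Weak.SuspiciousURL": 20,
}


def score(hits: Iterable[str], weights: Dict[str, int] = WEIGHTS) -> int:
    """Sum the weights of the distinct indicators that fired."""
    # Unknown indicator names contribute nothing; repeats count once.
    return sum(weights.get(name, 0) for name in set(hits))


def verdict(hits: Iterable[str], threshold: int) -> bool:
    """Alert only when the combined score reaches the chosen threshold."""
    return score(hits) >= threshold
```

A cautious site would pick a low threshold and accept more false 
positives; a site that cannot tolerate FPs would pick a high one. The 
hits could equally be collected across all the files inside one archive.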

Best regards,
Laurent S.
