> Execution time will be important for scanning filesystems, less so for
> scanning mail (at least for scanning low-volume mail) and readability
> can be hugely important if you're writing a lot of rules.  Perhaps we
> should be asking the development team for readable LDB rules? :)

Creating a new "human readable", or "human friendly", signature language is 
something that I've brought up many times over the past six months in our team 
meetings.  I think it's more feasible than trying to make Yara rules fully 
functional in ClamAV, or than trying to make our signatures look the same as 
Yara rules.

I toyed a bit with using the KDL document language 
(https://github.com/kdl-org/kdl) as a base for a new format.  My thought is that 
it could be "compiled" or converted to a more compact line of text prior to 
distribution, or unpacked/decompiled for readability as needed.  I am hoping we 
can spend some time these next few months investigating it further, once 0.105 
is out.  With our Rust language integration working rather nicely these days, 
we should be able to leverage the language and library ecosystem for this 
effort, making it far easier to implement than with C.

A disclaimer: This is purely brainstorming, and I have no idea if we would 
continue with the KDL idea or find something else.  Here are some examples from 
my short time spent brainstorming this a few months back.

// example logical signature
Win.Trojan.Badness-98 type="logical" {
    Engine 81 255
    Target 0
    terms {
        x1 "41414141" type="hex" nocase=true ascii=true
        x2 "deadbeefcafe" type="hex"
        a1 "evil"
        fih "ff000d0d80ff" type="phash"
    }
    Condition "( x1 or x2 ) and a1 and fih"
}

// example .ign signature
Its.Not.So.Bad-99 type="ignore-signature" "Win.Trojan.Badness-98"

// example .crb trusted cert
Trusted.CA.Microsoft-9875186-0 type="trusted-certificate" {
    Engine 81 255
    Subject "6a7c2a3146b0335e9a3e1f5fb193338cd71c072d"
    Serial "b9e065ba400a6eec327f5b8a4f47faa169a87d8e"
    PubKey "dd0cbba2e42e09e3e7c5f79669bc0021bd693333efad04cb5480ee0683bbc52084d9f7d28bf338b0aba4ad2d7c627905ffe34a3f04352070e3c4e76be09cc03675e98a31dd8d70e5dc37b5744696285b8760232cbfdc47a567f751279e72eb07a6c9b91e3b53357ce5d3ec27b9871cfeb9c923096fa84691c16e963c41d3cba33f5d026a4dec691f25285c36fffd43150a94e019b4cfdfc212e2c25b27ee2778308b5b2a096b22895360162cc0681d53baec49f39d618c85680973445d7da2542bdd79f715cf355d6c1c2b5ccebc9c238b6f6eb526d93613c34fd627aeb9323b41922ce1c7cd77e8aa544ef75c0b048765b44318a8b2e06d1977ec5a24fa4803"
    Exponent "010001"
    CodeSign true
    TimeSign true
    CertSign true
    NotBefore 0
    Comment "Microsoft Windows Production PCA 2011 SHA256 2011-2026 61:07:76:56:00:00:00:00:00:08"
}

// Example .crb revoked certificate
Blocklist.CRT.GluptebaRootkit-7910250-2 type="revoked-certificate" {
    Engine 81 255
    Subject "18df2f83e03d73694d4981d9ed4ac0b59c60ca3b"
    Serial "1ff8f31990f3244c29c955b3b56e340c43061807"
    PubKey "948adeb891aa3cca7db3fa09947f68db105ab50fccbf77cf43207d9c005ca24ecd35bd52ac0b4f0e48c77af7937d7185cc0c958551cb2b971892139c548b54bb50c96781dd3c6ade0ac2a0686efd5816ba68c144e24ae6579860de3daf70ac15b2332a5ff2874807a04983554f8e95ce034ac05c414fdc3e3f9f5eee778da849d8390d27876425d039c5cd70c6e710677ce9f63427771413f2d425fc4aac323fb5bf8905fa5df1895ec447d4bbff36001c8fdfb69d251f17befdb4fa1baf2dd4379a11935f9b9a6a47e5eee9ca2e84c5f96da9027f54f51ae85e7c250f423ac8de44d1a99aef9a9be014ef9b42794b01a6f2b297896583096233081fac6b4541"
    Exponent "010001"
    CodeSign true
    TimeSign false  // could be omitted to ignore
    CertSign false  // could be omitted to ignore
    NotBefore 0     // could be omitted to ignore
    Comment ""      // could be omitted to ignore
}
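
To make the "compile to a compact line" idea a little more concrete, here is a 
rough Python sketch (Python only for brevity; nothing here is existing ClamAV 
code) that flattens one certificate entry like the ones above into a single 
.crb-style line.  The function name to_crb_line and the dict layout are 
hypothetical, and the field order 
(Name;Trusted;Subject;Serial;Pubkey;Exponent;CodeSign;TimeSign;CertSign;NotBefore;Comment;minFL;maxFL) 
is an assumption based on the documented .crb format that should be checked 
before relying on it.

```python
# Hypothetical converter for illustration only.  The .crb field order used
# here is an assumption and should be verified against the ClamAV docs.
def to_crb_line(name, trusted, fields):
    """Flatten one parsed certificate entry into a ';'-separated line."""

    def flag(key):
        # Omitted boolean fields are treated as false, matching the
        # "could be omitted to ignore" comments in the example above.
        return "1" if fields.get(key, False) else "0"

    min_fl, max_fl = fields.get("Engine", (None, None))
    parts = [
        name,
        "1" if trusted else "0",
        fields["Subject"],
        fields["Serial"],
        fields["PubKey"],
        fields["Exponent"],
        flag("CodeSign"),
        flag("TimeSign"),
        flag("CertSign"),
        str(fields.get("NotBefore", 0)),
        fields.get("Comment", ""),
    ]
    # Append the functionality-level range only if one was given.
    if min_fl is not None:
        parts.append(str(min_fl))
        if max_fl is not None:
            parts.append(str(max_fl))
    return ";".join(parts)


# Placeholder values ("aa", "bb", "cc") stand in for the real hashes and key.
example = to_crb_line(
    "Blocklist.CRT.Example-1",
    False,
    {
        "Engine": (81, 255),
        "Subject": "aa",
        "Serial": "bb",
        "PubKey": "cc",
        "Exponent": "010001",
        "CodeSign": True,
    },
)
print(example)
```

Note that this sketch does no escaping, so a Comment containing ';' would 
produce a broken line; a real converter would need to reject or encode that.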

I haven't thought through all the implications of allowing plaintext logical 
condition terms (called "subsignatures" in LDB signatures), such as how to 
escape the special characters used for wildcarding, ranges, etc.  There is a 
lot to think about, but I feel this project is doable.
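
One possible way around the escaping question, sketched below in Python purely 
as an illustration, is for the converter to hex-encode every plaintext term 
before emitting the compact line, so that characters such as '*' and '?' never 
reach the LDB wildcard syntax.  The function name compile_logical, the tuple 
layout for terms, and the mapping of nocase/ascii to the "::i" and "::a" 
subsignature modifiers are all assumptions; the "phash" term type from the 
example is deliberately left unsupported because its compiled form is not 
settled here.

```python
import re


# Hypothetical compiler for illustration only; not existing ClamAV code.
def compile_logical(name, engine, target, terms, condition):
    """Compile a readable logical signature into one LDB-style line.

    terms is an ordered list of (term_name, value, options) tuples.
    """
    index = {}
    subsigs = []
    for i, (term, value, opts) in enumerate(terms):
        index[term] = str(i)
        kind = opts.get("type", "text")
        if kind == "hex":
            body = value
        elif kind == "text":
            # Hex-encoding plaintext means wildcard characters in the text
            # are emitted as literal bytes and need no escaping.
            body = value.encode("utf-8").hex()
        else:
            raise ValueError(f"unsupported term type: {kind}")
        # Assumed mapping of readable options to subsignature modifiers.
        mods = "".join(
            flag for key, flag in (("nocase", "i"), ("ascii", "a"))
            if opts.get(key)
        )
        subsigs.append(body + ("::" + mods if mods else ""))

    def repl(match):
        word = match.group(0)
        if word == "and":
            return "&"
        if word == "or":
            return "|"
        return index[word]  # KeyError if the condition names an unknown term

    expr = re.sub(r"[A-Za-z_]\w*", repl, condition).replace(" ", "")
    tdb = f"Engine:{engine[0]}-{engine[1]},Target:{target}"
    return ";".join([name, tdb, expr] + subsigs)


ldb = compile_logical(
    "Win.Trojan.Badness-98",
    (81, 255),
    0,
    [
        ("x1", "41414141", {"type": "hex", "nocase": True, "ascii": True}),
        ("x2", "deadbeefcafe", {"type": "hex"}),
        ("a1", "evil", {}),
    ],
    "( x1 or x2 ) and a1",
)
print(ldb)
# Win.Trojan.Badness-98;Engine:81-255,Target:0;(0|1)&2;41414141::ia;deadbeefcafe;6576696c
```

With this approach a plaintext term like "a*b?" would be emitted as 612a623f, 
at the cost of the compiled line no longer being readable on its own, which is 
where the unpack/decompile step would come in.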

I am particularly interested in feedback from those of you who write ClamAV 
signatures regularly, and from those of you who are new to writing signatures 
and can more easily spot the sharp edges to which many of us have been 
desensitized.

Cheers,
Micah


Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.
________________________________
From: clamav-users <clamav-users-boun...@lists.clamav.net> on behalf of G.W. 
Haywood via clamav-users <clamav-users@lists.clamav.net>
Sent: Saturday, February 26, 2022 1:56 AM
To: ClamAV users ML <clamav-users@lists.clamav.net>
Cc: G.W. Haywood <cla...@jubileegroup.co.uk>
Subject: Re: [clamav-users] Minor bug or working as intended?

Hi there,

On Fri, 25 Feb 2022, Laurent S. via clamav-users wrote:

> I've had the same issue. In the last two years, I was regularly
> writing YARA sigs in ClamAV and finding that it behaves in strange
> ways... Especially the regex integration.
>
> I specifically remember that counting regex wasn't possible and that
> I had to write those sigs either in strings or HEX.
>
> After too many timeouts and strange stuff ...

Sounds like you and I have been through the same pain.

> I decided to rewrite all of the sigs I had written to LDB. It's not
> easy to read, less fun to write... but damn it's much more reliable
> and fast.

Execution time will be important for scanning filesystems, less so for
scanning mail (at least for scanning low-volume mail) and readability
can be hugely important if you're writing a lot of rules.  Perhaps we
should be asking the development team for readable LDB rules? :)

> PS: This YARA might technically work, but might cost you lots of CPU:
> $a3 = /(<script type="text\/javascript">functionsendemail.?\(\)\{.*){3}/

I think it's generally best to avoid things like '.*' in Yara rules,
and possibly in regexes in general for use in scanning.  Even in mail
you can find yourself scanning fairly big base64-encoded texts which
are never going to match but still cost CPU, but in a filesystem there
may be files of gigabytes+ and some regexes will be *very* expensive.

> I personally think a better project for the community would be to
> improve YARA in ClamAV ...

+1

If I'd had the time I'd have done it myself already.

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml