> Execution time will be important for scanning filesystems, less so for
> scanning mail (at least for scanning low-volume mail) and readability
> can be hugely important if you're writing a lot of rules. Perhaps we
> should be asking the development team for readable LDB rules? :)

Creating a new "human readable", or "human friendly", signature language is something that I've brought up many times over the past six months in our team meetings. I think it's more feasible than trying to make Yara rules fully functional in ClamAV, or than trying to make our signatures look the same as Yara.

I toyed a bit with using the KDL document language (https://github.com/kdl-org/kdl) as a base for a new format. My thought is that it could be "compiled", or converted to a more compact line of text, prior to distribution, and unpacked/decompiled for readability as needed. I am hoping we can spend some time over the next few months investigating it further, once 0.105 is out. With our Rust language integration working rather nicely these days, we should be able to leverage the language and library ecosystem for this effort, making it far easier to implement than with C.

A disclaimer: This is purely brainstorming, and I have no idea if we would continue with the KDL idea or find something else.

Here are some examples from my short time spent brainstorming this a few months back.

    // example logical signature
    Win.Trojan.Badness-98 type="logical" {
        Engine 81 255
        Target 0
        terms {
            x1 "41414141" type="hex" nocase=true ascii=true
            x2 "deadbeefcafe" type="hex"
            a1 "evil"
            fih "ff000d0d80ff" type="phash"
        }
        Condition "( x1 or x2 ) and a1 and fih"
    }

    // example .ign signature
    Its.Not.So.Bad-99 type="ignore-signature" "Win.Trojan.Badness-98"

    // example .crb trusted cert
    Trusted.CA.Microsoft-9875186-0 type="trusted-certificate" {
        Engine 81 255
        Subject "6a7c2a3146b0335e9a3e1f5fb193338cd71c072d"
        Serial "b9e065ba400a6eec327f5b8a4f47faa169a87d8e"
        PubKey "dd0cbba2e42e09e3e7c5f79669bc0021bd693333efad04cb5480ee0683bbc52084d9f7d28bf338b0aba4ad2d7c627905ffe34a3f04352070e3c4e76be09cc03675e98a31dd8d70e5dc37b5744696285b8760232cbfdc47a567f751279e72eb07a6c9b91e3b53357ce5d3ec27b9871cfeb9c923096fa84691c16e963c41d3cba33f5d026a4dec691f25285c36fffd43150a94e019b4cfdfc212e2c25b27ee2778308b5b2a096b22895360162cc0681d53baec49f39d618c85680973445d7da2542bdd79f715cf355d6c1c2b5ccebc9c238b6f6eb526d93613c34fd627aeb9323b41922ce1c7cd77e8aa544ef75c0b048765b44318a8b2e06d1977ec5a24fa4803"
        Exponent "010001"
        CodeSign true
        TimeSign true
        CertSign true
        NotBefore 0
        Comment "Microsoft Windows Production PCA 2011 SHA256 2011-2026 61:07:76:56:00:00:00:00:00:08"
    }

    // Example .crb revoked certificate
    Blocklist.CRT.GluptebaRootkit-7910250-2 type="revoked-certificate" {
        Engine 81 255
        Subject "18df2f83e03d73694d4981d9ed4ac0b59c60ca3b"
        Serial "1ff8f31990f3244c29c955b3b56e340c43061807"
        PubKey "948adeb891aa3cca7db3fa09947f68db105ab50fccbf77cf43207d9c005ca24ecd35bd52ac0b4f0e48c77af7937d7185cc0c958551cb2b971892139c548b54bb50c96781dd3c6ade0ac2a0686efd5816ba68c144e24ae6579860de3daf70ac15b2332a5ff2874807a04983554f8e95ce034ac05c414fdc3e3f9f5eee778da849d8390d27876425d039c5cd70c6e710677ce9f63427771413f2d425fc4aac323fb5bf8905fa5df1895ec447d4bbff36001c8fdfb69d251f17befdb4fa1baf2dd4379a11935f9b9a6a47e5eee9ca2e84c5f96da9027f54f51ae85e7c250f423ac8de44d1a99aef9a9be014ef9b42794b01a6f2b297896583096233081fac6b4541"
        Exponent "010001"
        CodeSign true
        TimeSign false  // could be omitted to ignore
        CertSign false  // could be omitted to ignore
        NotBefore 0     // could be omitted to ignore
        Comment ""      // could be omitted to ignore
    }

I haven't thought through all the implications of allowing plaintext logical condition terms (called "subsignatures" in LDB signatures), such as how to escape the special characters used for wildcarding, ranges, etc. There is a lot to think about, but I feel this project is doable.

I am particularly interested in feedback from those of you who write ClamAV signatures regularly, and from those of you who are new to writing signatures and can more easily spot the sharp edges to which many of us have been desensitized.

Cheers,
Micah

Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.
________________________________
From: clamav-users <clamav-users-boun...@lists.clamav.net> on behalf of G.W. Haywood via clamav-users <clamav-users@lists.clamav.net>
Sent: Saturday, February 26, 2022 1:56 AM
To: ClamAV users ML <clamav-users@lists.clamav.net>
Cc: G.W. Haywood <cla...@jubileegroup.co.uk>
Subject: Re: [clamav-users] Minor bug or working as intended?

Hi there,

On Fri, 25 Feb 2022, Laurent S. via clamav-users wrote:

> I've had the same issue. In the last two years, I was regularly
> writing YARA sigs in ClamAV and finding that it behaves in strange
> ways... Especially the regex integration.
>
> I specifically remember that counting regex wasn't possible and that
> I had to write those sigs either in strings or HEX.
>
> After too many timeouts and strange stuff ...

Sounds like you and I have been through the same pain.

> I decided to rewrite all of the sigs I had written to LDB. It's not
> easy to read, less fun to write... but damn it's much more reliable
> and fast.

Execution time will be important for scanning filesystems, less so for
scanning mail (at least for scanning low-volume mail) and readability
can be hugely important if you're writing a lot of rules.
Perhaps we should be asking the development team for readable LDB rules? :)

> PS: This YARA might technically work, but might cost you lots of CPU:
> $a3 = /(<script type="text\/javascript">functionsendemail.?\(\)\{.*){3}/

I think it's generally best to avoid things like '.*' in Yara rules,
and possibly in regexes in general for use in scanning. Even in mail
you can find yourself scanning fairly big base64-encoded texts which
are never going to match but still cost CPU, but in a filesystem there
may be files of gigabytes+ and some regexes will be *very* expensive.

> I personally think a better project for the community would be to
> improve YARA in ClamAV ...

+1

If I'd had the time I'd have done it myself already.

--

73,
Ged.
_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
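
For anyone who wants to play with the "compile to a compact line" idea from the message above, here is a minimal Python sketch. It is not ClamAV code and not a KDL parser: the function name `compile_signature` and the tuple layout for terms are hypothetical, and the input is assumed to be already parsed. It flattens a readable signature into the general LDB shape (name; target description block; logical expression; subsignatures) and only handles hex and plain-text terms. The "phash" term from the example is left out because the thread does not say what its compact form would be.

```python
import re


def compile_signature(name, engine, target, terms, condition):
    """Flatten a readable signature into one LDB-style line (illustrative only).

    terms is a list of (term_name, value, kind) tuples, kind is "hex" or "text".
    This layout is an assumption made for the sketch, not a proposed format.
    """
    index = {}    # term name -> position in the subsignature list
    subsigs = []
    for pos, (term_name, value, kind) in enumerate(terms):
        index[term_name] = pos
        if kind == "hex":
            subsigs.append(value.lower())
        elif kind == "text":
            # Plain-text terms are stored as the hex encoding of their bytes.
            subsigs.append(value.encode("ascii").hex())
        else:
            raise ValueError(f"unsupported term type: {kind}")

    # Rewrite "( x1 or x2 ) and a1" as "(0|1)&2".
    operators = {"and": "&", "or": "|", "(": "(", ")": ")"}
    expr = []
    for token in re.findall(r"[()]|\w+", condition):
        if token in operators:
            expr.append(operators[token])
        elif token in index:
            expr.append(str(index[token]))
        else:
            raise ValueError(f"unknown term in condition: {token}")

    tdb = f"Engine:{engine[0]}-{engine[1]},Target:{target}"
    return ";".join([name, tdb, "".join(expr)] + subsigs)


line = compile_signature(
    "Win.Trojan.Badness-98",
    (81, 255),
    0,
    [("x1", "41414141", "hex"), ("x2", "deadbeefcafe", "hex"), ("a1", "evil", "text")],
    "( x1 or x2 ) and a1",
)
print(line)
```

The sketch ignores modifiers such as nocase and does no escaping of wildcard or range characters in plain-text terms, which is exactly the open question raised in the message.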