> Well I can understand that features which are unique to ClamAV might
> demand something more flexible than the Yara specification, although I
> don't profess to have great insight into that. I wonder if this means
> there's a case for "ClamAV *extensions* to the Yara language" or some
> variation on that theme. I guess it wouldn't be too difficult to make
> the extensions sufficiently non-Yara like to avoid clashes with future
> developments of Yara itself. In case it isn't obvious we already have
> a "ClamAV *version* of the Yara language" so this suggestion might not
> be as outrageous as it seems.

The status quo is a subset of what's possible with Yara. I think that's vastly different than adding features that won't make any sense to real-Yara.

That said, ClamAV extensions to the Yara language isn't a terrible idea. It's an idea my boss kicked around a bit when we were chatting last week. Some context: he started the Yara signature support effort in ClamAV, and he has friends in the Yara community.

I don't personally think that Yara + ClamAV extensions will be sufficient for all the different features we'll need. But I don't really have a vision in mind for how that would look. I'd be happy to be proven wrong with some proof-of-concept work demonstrating each of the different features we have now.

> (1) a plea for a way to test rules before they go live;

If you mean "for personal use" then I'd say, "What Maarten said." But if you mean so that Cisco-Talos malware analysts can do more extensive testing with, say, "hunting signatures" before publishing them as "malware signatures", then the answer is different. I'm probably not the best person to discuss what's in the works there. I'll leave that question open to my colleagues on the malware research side.

> (2) another plea for a parser which is good at its job;

I'm not sure what you mean here. Can you elaborate? If you simply want ClamAV to ignore garbage rules on load and continue with the rest of the file (see point #4), that's something we can easily improve regardless of what we do. And that's how our Yara rule loading logic works right now.

> (3) a way to specify that a rule is to match in
> (a) mail headers only or
> (b) mail body only or
> (c) both;

This is a neat idea. It is a new signature language feature request and is a great example of something that would be hard to implement in the current ClamAV signature language(s).
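
To make request (3) concrete, here is a minimal Python sketch of what scoping a match to headers, body, or both could mean. This is not ClamAV or Yara code: `match_in_scope` and its `scope` values are hypothetical names made up for illustration, and the body handling is deliberately naive (no MIME transfer decoding).

```python
import re
from email import message_from_string


def match_in_scope(raw_message: str, pattern: str, scope: str = "both") -> bool:
    """Return True if `pattern` matches in the requested part of the mail.

    `scope` is a hypothetical knob: "headers", "body", or "both".
    """
    msg = message_from_string(raw_message)

    # Rebuild the header block as "Name: value" lines.
    headers = "\n".join(f"{name}: {value}" for name, value in msg.items())

    # Collect the raw payload of every non-multipart part (no decoding).
    body = "\n".join(
        str(part.get_payload())
        for part in msg.walk()
        if not part.is_multipart()
    )

    if scope == "headers":
        haystacks = [headers]
    elif scope == "body":
        haystacks = [body]
    elif scope == "both":
        haystacks = [headers, body]
    else:
        raise ValueError(f"unknown scope: {scope!r}")

    return any(re.search(pattern, text) for text in haystacks)


if __name__ == "__main__":
    raw = "Subject: invoice overdue\nFrom: a@example.com\n\nPlease open the attachment.\n"
    print(match_in_scope(raw, r"invoice", "headers"))
    print(match_in_scope(raw, r"invoice", "body"))
```

The point of the sketch is only that the scope has to be carried alongside the pattern; how that would be spelled in a signature language is exactly the open question.
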

If you have any ideas on how this may be expressed, either in the "clamav yara extensions" idea or in the proposed "KDL-based signature language" or some other proposed format, I'd love some examples.

> (4) it would be great to have a way to reload rulesets separately so
> it isn't necessary to reload ten million signatures when you've only
> added one Yara rule, only then to find clamd crashes the first time it
> tries to scan anything because you broke that rule. I understand this
> might be asking a lot, and a decent parser which prevents attempts to
> load garbage rules (point 2) would do a lot to alleviate this pain.

Asking Clam to load additional rules into an existing engine while scans are ongoing is tricky, but potentially doable. It throws a wrench into the works for some hardening ideas I'm proposing for scan process sandboxing. Sort of. It's more to think about.

Asking Clam to unload specific rules from an active scanning engine has the same problem, plus considerations about how to drop stuff from the trie structures without breaking anything. It's also potentially doable.

Asking Clam to reload a modified signature database in-place is a different story. Let's say you have a ClamD running that has database version A. You modify the database file so now it's version B and version A is gone. And you want ClamD to look at version B and figure out which signatures have been added/removed/changed and update accordingly. I don't think that's something we can do. When signatures are loaded, we store bits of patterns in a variety of structures like tries, lists, hashmaps, etc. Figuring out which bits to remove and which to keep would be a bit of a nightmare. I would imagine that you would have to build a reference as big as the loaded engine while doing it just to sort it all out.

Regards,
Micah

Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.

________________________________
From: clamav-users <clamav-users-boun...@lists.clamav.net> on behalf of G.W.
Haywood via clamav-users <clamav-users@lists.clamav.net>
Sent: Tuesday, March 15, 2022 10:51 AM
To: ClamAV users ML <clamav-users@lists.clamav.net>
Cc: G.W. Haywood <cla...@jubileegroup.co.uk>
Subject: Re: [clamav-users] human friendly signatures

Hi there,

On Tue, 15 Mar 2022, Laurent S. via clamav-users wrote:
> On Tuesday, March 15th, 2022 at 00:36, Micah Snyder wrote:
>
>> Starting with our own new language would let us maintain do that
>> but make it easier for new analysts to train up on ClamAV.
>
> I don't see at all the advantage of using a different, less used
> language. I don't know many people looking forward to learn a new
> language that is quite specific to one software and used more or
> less nowhere else.

Well I can understand that features which are unique to ClamAV might demand something more flexible than the Yara specification, although I don't profess to have great insight into that. I wonder if this means there's a case for "ClamAV *extensions* to the Yara language" or some variation on that theme. I guess it wouldn't be too difficult to make the extensions sufficiently non-Yara like to avoid clashes with future developments of Yara itself. In case it isn't obvious we already have a "ClamAV *version* of the Yara language" so this suggestion might not be as outrageous as it seems.

>> using Yara's engine in clamav directly is something that has been
>> brought up time and again. It is possible. My understanding is that
>> the reason ClamAV's yara support isn't done this way is that it
>> would require a second pass over the file with a Yara's pattern
>> matcher, after ClamAV's pattern matcher, and that the performance
>> concern made it make more sense to try and load yara rules into
>> ClamAV's matcher instead.

Speaking selfishly I wouldn't be greatly inconvenienced by an increase in the scan times (even if it doubles) caused by separating the Yara engine from the ClamAV engine.
That's because I only scan mail, and the clamd server is well on top of it. I can understand that people who scan filesystems might have a different point of view; maybe both could be accommodated with a config option.

>> I honestly don't have any numbers to back up this argument. It
>> sounds reasonable, but I'd love to see the numbers.

I occasionally run more than one clamd instance and I've seriously considered running a separate one purely so that Yara rules are kept separate from the rest. I always log scan times. It will be a bit fiddly, but when I get a minute I'll set something up to try to give you some numbers.

> One big reason I like to use ClamAV is that it's possible to add
> other sources of signatures. Lots of people use the sanesecurity
> ones. I add a lot of my own.

+1

Finally, unashamed repetition:

(1) a plea for a way to test rules before they go live;

(2) another plea for a parser which is good at its job;

(3) a way to specify that a rule is to match in
    (a) mail headers only or
    (b) mail body only or
    (c) both;

and lastly

(4) it would be great to have a way to reload rulesets separately so it isn't necessary to reload ten million signatures when you've only added one Yara rule, only then to find clamd crashes the first time it tries to scan anything because you broke that rule. I understand this might be asking a lot, and a decent parser which prevents attempts to load garbage rules (point 2) would do a lot to alleviate this pain.

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml