On Wed, Nov 10, 2021, at 15:31, Mike Pattrick wrote:
> On Wed, Nov 10, 2021 at 6:30 AM Gaëtan Rivet <gr...@u256.net> wrote:
>>
>> On Tue, Nov 2, 2021, at 19:43, Mike Pattrick wrote:
>> > Recently there has been a lot of press about the "trojan source" attack,
>> > where Unicode characters are used to obfuscate the true functionality of
>> > code. This attack didn't effect OVS, but adding the check here will help
>> > guard against it sneaking in later.
>> >
>> > Signed-off-by: Mike Pattrick <m...@redhat.com>
>>
>> Hi,
>>
>> What did you base the selection of characters to blacklist on?
>
> I believe this list was sourced from https://unicode.org/reports/tr9/
>

Sure, I'm just thinking about zero-width chars, that are used to subtly
introduce off-by-ones. The bidir check seems incomplete.

>> Reading issues open on other languages, I haven't found a good comprehensive
>> set of characters that would need to be blacklisted. I'm not sure it is a
>> sufficient approach: getting creative and circumventing this kind of
>> blacklist would be a sport.
>>
>> Instead, shouldn't we take the reverse approach and whitelist single-byte
>> chars? (warn on multi-byte unicode sequence). It would be sufficient for
>> the vast majority of C sources (and scripts).
>
> I've been going back and forth on that idea. I'm afraid of making a
> change that seems exclusive to people with non-latin characters in
> their name. There are a few pre-canned lists of homoglyphs, maybe I
> could add those to the blacklist?
>

I understand, but the check should only execute on {.c,.h,.in} files, not on
the commit header itself. If restricted to sources, I think no name would
appear. Comments and doc are written in English.

-- 
Gaetan Rivet
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
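
For illustration only, the whitelist idea discussed above (warn on any multi-byte sequence, restricted to {.c,.h,.in} files) might look roughly like the sketch below. It is not taken from the patch or from OVS's checkpatch script: the names `CHECKED_SUFFIXES`, `is_checked_file` and `non_ascii_warnings` are hypothetical, and it assumes the checker sees the raw file contents as bytes.

```python
# Hypothetical sketch; names are illustrative, not from the actual patch.
CHECKED_SUFFIXES = (".c", ".h", ".in")


def is_checked_file(path):
    """Return True for the source files the check would apply to."""
    return path.endswith(CHECKED_SUFFIXES)


def non_ascii_warnings(path, data):
    """Yield (line, column, byte) for the first byte >= 0x80 on each line.

    Every byte of a UTF-8 multi-byte sequence is >= 0x80, so this flags
    bidi controls, zero-width characters and homoglyphs alike without
    having to enumerate them.
    """
    if not is_checked_file(path):
        return
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                yield lineno, col, byte
                break  # one warning per line is enough


if __name__ == "__main__":
    # U+200B (zero-width space) hidden after the identifier "b".
    sample = "int a = 1;\nint b\u200b = 2;\n".encode("utf-8")
    for lineno, col, byte in non_ascii_warnings("lib/foo.c", sample):
        print("WARNING: non-ASCII byte 0x%02x at line %d, column %d"
              % (byte, lineno, col))
```

Because the filter looks only at the file suffix, commit headers and author names never reach the check, which is the point made above about non-Latin names.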