Eric Blake <ebl...@redhat.com> writes: > On 11/18/2015 04:51 AM, Markus Armbruster wrote: >> Eric Blake <ebl...@redhat.com> writes: >> >>> We already documented that qapi names should match specific >>> patterns (such as starting with a letter unless it was an enum >>> value or a downstream extension). Tighten that from a suggestion >>> into a hard requirement, which frees up names beginning with a >>> single underscore for qapi internal usage. >> >> If we care enough about naming conventions to document them, enforcing >> them makes only sense. >> >>> Also restrict things >>> to avoid 'a__b' or 'a--b' (that is, dash and underscore must not >>> occur consecutively). >> >> I feel this is entering "foolish names" territory, where things like >> "IcAnFiNdNaMeSeVeNlEsSrEaDaBlEtHaNStudlyCaps" live. Catching fools > > IhAdToOmUcHfUnReAdInGtHaT > > [It's amazing that scholars can ever manage to correctly interpret old > written languages, thanks to omitted word breaks (scriptio continua) or > omitted vowels (such as in Hebrew)]
Indeed. >> automatically is generally futile, they're too creative :) >> >> Let's drop this part. > > Sure, I can do that. I'll post a fixup patch, as it will affect docs too. > >> >>> Add a new test, reserved-member-underscore, to demonstrate the >>> tighter checking. >>> >>> Signed-off-by: Eric Blake <ebl...@redhat.com> >>> >>> --- >>> v12: new patch >>> --- > >>> +incompatibly in a future release. All names must begin with a letter, >>> +and contain only ASCII letters, digits, dash, and underscore, where >>> +dash and underscore cannot occur consecutively. There are two > > Namely, right here. > >>> +# Names must be letters, numbers, -, and _. They must start with letter, >>> +# except for downstream extensions which must start with __RFQDN_. >>> +# Dots are only valid in the downstream extension prefix. >>> +valid_name = re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' >>> + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') >> >> This regexp consists of two parts: optional __RFQDN_ prefix and the name >> proper. >> >> The latter stays simpler if we don't attempt to catch adjacent [-_]. >> >> The former isn't quite right. According to RFC 822 Appendix 1 - Domain >> Name Syntax Specification: >> >> * We must accept '-' in addition to digits. >> >> * We require RFQDN to start with a letter. This is still a loose match. >> The tight match is "labels separated by dots", where labels consist of >> letters, digits and '-', starting with a letter. I wouldn't bother >> checking first characters are letters at all. >> >> Recommend >> >> valid_name = re.compile('^(__[a-zA-Z0-9.-]+_)?' >> '[a-zA-Z][a-zA-Z0-9_-]*$') > > Sure, that works for me. It's tighter than what we had before, but > looser than what I proposed so that it allows more RFQDN. It > potentially lets users confuse us by abusing 'foo__max' or similar, but > I'm less worried about that abuse - it's okay to stop the blatantly > obvious mistakes without worrying about the corner cases, if it makes > the maintenance easier for the cases we do catch. > >> >>> >>> >>> def check_name(expr_info, source, name, allow_optional=False, >>> @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, >>> allow_optional=False, >>> % (source, name)) >>> # Enum members can start with a digit, because the generated C >>> # code always prefixes it with the enum name >>> - if enum_member: >>> - membername = '_' + membername >>> + if enum_member and membername[0].isdigit(): >> >> What's wrong with the old condition? >> >>> + membername = 'D' + membername > > The old code prepended a lone '_', which doesn't work any more with the > tighter valid_name regex, so we have to use some other prefix (I picked > 'D'). > >>> # Reserve the entire 'q_' namespace for c_name() >>> if not valid_name.match(membername) or \ >>> c_name(membername, False).startswith('q_'): > > The other thing is that I realized that an enum value of 'q-int' was > getting munged to '_q-int' which no longer gets flagged by the c_name() > filter. But neither would 'Dq-int' get flagged. So limiting the > munging to just enum values that start with a digit (where we know it > doesn't start with q) means we don't weaken the second condition. Aha! Worth a comment, I think. > I guess I need to call that out in the commit message; all the more > reason for me to send a fixup. Yes, please!