Paolo Bonzini <[email protected]> writes:
> On Tue, Mar 3, 2026 at 1:31 PM Markus Armbruster <[email protected]> wrote:
>
>> >> You replied "In this case it doesn't really matter: public items (such
>> >> as QAPI enum entries, or struct fields) do not raise the unused warning
>> >> anyway."
>> >>
>> >> What gives us confidence rs_name() will only be used where it doesn't
>> >> really matter?
>> >
>> > The fact that all QAPI type definitions are (more or less by design)
>> > public.
>>
>> Any particular reason not to use the same 'q_' prefix as in C?
>
> The other Rust convention is that you usually have enums in CamelCase
> (starting with an uppercase letter). "q_" or "Q_" would also trigger a
> warning, and they would complicate to_camel_case a bit (because the
> prefix would have to be kept including the underscore).
Alright.
One of the QAPI generator's known shortcomings is weak protection
against name clashes in generated code.
The generator was built with cavalier disregard for name clashes. The
existing naming rules and their enforcement was retrofitted to limit the
damage.
There are two kinds:
* The user's names clashing with the generator's.
Naming rules help avoid such clashes.
Example: ['Frob'] makes the generator define a C type FrobList, which
could clash with a user-defined 'FrobList' if we didn't reject
user-defined names ending with 'List'. Test case reserved-type-list.
But there are gaps.
Example: a user-defined type QAPIEvent clashes with generated enum
QAPIEvent.
The naming rules should reflect the names the generator wants to use.
When we add new patterns of generated names, we should update the
rules. One pattern this series adds is names ending with 'Variant',
as I pointed out in the review of the generated code I sent yesterday.
* The user's names clashing with themselves.
Naming rules again help avoid such clashes.
Example: struct members 'a_b and 'a-b' would both map to C identifier
a_b if we didn't reject that. Test case struct-member-name-clash.
The rejection code checks for clashes among the values of c_name().
This guards against clashes in generated C. To also guard against
clashes in generated Rust, we'd need to check the values of rs_name().
I'd accept a TODO comment for now.
Again, there are gaps.
Example: value 'ni-cator' of enum 'Frob' and value 'cator' of enum
'FrobNi' both generate C enumeration constant FROB_NI_CATOR.
The more the generator massages names, the more we risk such clashes.
The C generator doesn't massage all that much, mostly funny characters
to '_', and enumeration constants. The Rust generator needs to
massage more, because Rust's naming rules differ from C's. I think we
should at least document the problem clearly. Anything else? Better
ideas?
See also docs/devel/qapi-code-gen.rst section "Naming rules and reserved
names".
>> >> This maps 'foo-0123-bar' to 'Foo_0123Bar'. Intentional? I'd kind of
>> >> expect 'Foo0123Bar'.
>> >
>> > Will fix (it is meant for 0123-45). New version is:
>> >
>> > def to_camel_case(value: str) -> str:
>> > result = ''
>> > for p in re.split(r'[-_]+', value):
>> > if not p:
>> > pass
>> > elif p[0].isalpha() or (result and result[-1].isalpha()):
>> > result += p[0].upper() + p[1:]
>> > else:
>> > result += '_' + p
>> > return result
>>
>> Maps '0123-45' to '_0123_45'. Is the leading '_' intentional?
>
> Yes, because otherwise the output is not an identifier; but it doesn't
> matter since the input is actually an identifier already and
> to_camel_case('_0123_45') does give '_0123_45'. In neither case you
> get the invalid identifier '0123_45'.
Shouldn't that be left to rs_name()?
>> I'm fine with not running rustfmt, and I'm fine with running it always
>> (makes it a hard requirement). Running it sometimes feels like more
>> trouble than it's worth.
>
> Yeah, I will drop it. I changed mcgen to allow removing empty lines in
> the middle of a declaration, like this:
>
> def mcgen(*code: T.Sequence[str], **kwds: object) -> str:
> '''
> Generate ``code`` with ``kwds`` interpolated. Separate
> positional arguments represent separate segments that could
> expand to empty strings; empty segments are omitted and no
> blank lines are introduced at their boundaries.
> '''
> last = len(code) - 1
> result = []
> for i, s in enumerate(code):
> if s.startswith('\n'):
> s = s[1:]
> if i != last:
> s = s.rstrip()
> s = cgen(s, **kwds)
> if s:
> result.append(s)
> return '\n'.join(result)
>
> For the case of a single argument, the result is equivalent to the
> existing implementation:
>
> # already checked by current mcgen()
> if s.startswith('\n'):
> s = s[1:]
> # skipped for a single argument
> if i != last:
> s = s.rstrip()
> s = cgen(s, **kwds)
> if s:
> result.append(s)
> # result has zero or a single element, no '\n' added
> return '\n'.join(result)
I wonder how the generated C would change if we used this there.