On Tue, 3 Mar 2026 15:12:30 +0000 "Loktionov, Aleksandr" <[email protected]> wrote:
> > -----Original Message-----
> > From: Mauro Carvalho Chehab <[email protected]>
> > Sent: Tuesday, March 3, 2026 3:53 PM
> > To: Jani Nikula <[email protected]>
> > Cc: Lobakin, Aleksander <[email protected]>; Jonathan
> > Corbet <[email protected]>; Kees Cook <[email protected]>; Mauro Carvalho
> > Chehab <[email protected]>; [email protected]; linux-
> > [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; Gustavo A. R. Silva
> > <[email protected]>; Loktionov, Aleksandr
> > <[email protected]>; Randy Dunlap <[email protected]>;
> > Shuah Khan <[email protected]>
> > Subject: Re: [PATCH 00/38] docs: several improvements to kernel-doc
> >
> > On Mon, 23 Feb 2026 15:47:00 +0200
> > Jani Nikula <[email protected]> wrote:
> >
> > > There's always the question, if you're putting a lot of effort into
> > > making kernel-doc closer to an actual C parser, why not put all that
> > > effort into using and adapting to, you know, an actual C parser?
> >
> > Playing with this idea, it is not that hard to write an actual C
> > parser - or at least a tokenizer. There is already an example of it
> > at:
> >
> > https://docs.python.org/3/library/re.html
> >
> > I did a quick implementation, and it seems to be able to do its job:
...
>
> As hobby C compiler writer, I must say that you need to implement C
> preprocessor first, because C preprocessor influences/changes the syntax.
> In your tokenizer I see right away that any line which begins from '#' must
> be just as C preprocessor command without further tokenizing.

Yeah, we may need to implement a C preprocessor parser in the future,
but this will require handling #include, which could be somewhat
complex. It is also tricky to handle conditional preprocessor macros,
as kernel-doc would either require a file with at least some defines
or would have to guess how to evaluate them to produce the right
documentation, as ifdefs interfere with C macros.

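
As a minimal sketch of the point about '#' lines: a regex-based
tokenizer in the style of the example in the Python re documentation
could treat a whole preprocessor directive, including backslash
continuations, as one opaque token. The token names and the tokenize()
helper below are hypothetical and are not the quick implementation
mentioned above:

```python
import re

# Hypothetical token table; names and patterns are illustrative only.
# Order matters: CPP and COMMENT must be tried before OP.
TOKEN_SPEC = [
    # A directive: '#' at line start, up to the first newline that is
    # not escaped by a backslash. Nothing inside it is tokenized.
    ("CPP",     r"^[ \t]*#(?:\\\n|[^\n])*"),
    ("COMMENT", r"/\*[\s\S]*?\*/|//[^\n]*"),
    ("STRING",  r'"(?:\\.|[^"\\])*"'),
    ("NUMBER",  r"\d\w*"),
    ("NAME",    r"[A-Za-z_]\w*"),
    ("OP",      r"[^\s\w]"),
    # Whitespace and newlines are separate so that '^' in CPP is
    # tested right after each newline.
    ("SPACE",   r"[ \t\r\f\v]+"),
    ("NEWLINE", r"\n"),
]

TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.M | re.A,
)


def tokenize(source):
    """Yield (kind, text) pairs, skipping whitespace and newlines."""
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind in ("SPACE", "NEWLINE"):
            continue
        yield kind, match.group()
```

This only keeps directives out of the normal token stream; it does not
evaluate them, so #include and #ifdef handling would still be needed
on top of it.
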
For now, I want to solve some specific problems:

- fix the trim_private_members() function, which is meant to handle
  /* private: */ and /* public: */ comments, as it currently has bugs
  when used on nested structs/unions, related to where the "private"
  scope finishes;

- properly parse nested struct/union and properly pick nested
  identifiers;

- detect and replace function arguments when macros with multiple
  arguments are used at the same prototype.

Plus, kernel-doc already has a table of transforms to "convert"
the C preprocessor macros that affect documentation into something
that will work.

So, I'm considering starting simple, for now ignoring cpp and
addressing the existing issues.

> But the real pain make C preprocessor substitutions IMHO

Agreed. For now, we're using a transforms list inside kernel-doc for
that purpose. So, those macros are manually "evaluated" there, like:

	(KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + struct_args_pattern + r'\)', re.S), r'dma_addr_t \1'),

This works fine on trivial cases, where the argument is just an ID,
but there are cases where we use macros like here:

	struct page_pool_params {
		struct_group_tagged(page_pool_params_fast, fast,
			unsigned int	order;
			unsigned int	pool_size;
			int		nid;
			struct device	*dev;
			struct napi_struct *napi;
			enum dma_data_direction dma_dir;
			unsigned int	max_len;
			unsigned int	offset;
		);
		struct_group_tagged(page_pool_params_slow, slow,
			struct net_device *netdev;
			unsigned int queue_idx;
			unsigned int	flags;
	/* private: used by test code only */
			void (*init_callback)(netmem_ref netmem, void *arg);
			void *init_arg;
		);
	};

To handle it, I'm thinking of using something like this(*):

	(CFunction('struct_group_tagged'), r'struct \1 { \3 } \2;')

E.g. teaching kernel-doc that, when:

	struct_group_tagged(a, b, c)

is used, it should convert it into:

	struct a { c } b;

which is basically what this macro does. In other words, hardcoding
kernel-doc with some rules to handle the cases where CPP macros need
to be evaluated.
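
A minimal sketch of how such a rule could be applied, assuming the
arguments are split only on commas outside any bracket nesting. The
split_args() and expand_macro() helpers are hypothetical; this is not
the CFunction code from the patch, and it ignores commas or brackets
that appear inside comments or strings:

```python
import re


def split_args(text, start):
    # 'start' is the index right after the macro's opening '('.
    # Returns (args, end), where 'end' is just past the matching ')'.
    # Only commas at nesting depth 0 separate arguments.
    args, depth, last = [], 0, start
    for pos in range(start, len(text)):
        ch = text[pos]
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            if depth == 0:
                args.append(text[last:pos].strip())
                return args, pos + 1
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(text[last:pos].strip())
            last = pos + 1
    raise ValueError("unbalanced parentheses")


def expand_macro(name, template, text):
    # Replace each 'name(...)' call in 'text' with 'template', where
    # \1, \2, ... stand for the first, second, ... macro argument.
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*\(")
    out, pos = [], 0
    while True:
        match = pattern.search(text, pos)
        if not match:
            break
        args, end = split_args(text, match.end())
        out.append(text[pos:match.start()])
        out.append(re.sub(r"\\(\d)",
                          lambda g: args[int(g.group(1)) - 1],
                          template))
        pos = end
    out.append(text[pos:])
    return "".join(out)


print(expand_macro("struct_group_tagged",
                   r"struct \1 { \3 } \2;",
                   "struct_group_tagged(a, b, c)"))
```

Because the third argument is taken up to the matching ')', function
pointer members such as init_callback, whose parameter list contains
a comma, stay inside a single argument.
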
As there aren't many cases where such macros affect documentation
(in lots of cases, just dropping the macros is enough), such an
approach kinda works.

(*) I already wrote a patch for it, but as Jani pointed out, perhaps
    using a tokenizer will make the logic simpler and easier to
    understand/maintain.

--
Thanks,
Mauro

