Re: [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
Aditya Srivastava writes:

> Currently kernel-doc does not identify some cases of probable kernel
> doc comments, e.g. pointer used as declaration type for identifier,
> space separated identifier, etc.
>
> Some examples of these cases in files can be:
> i)" * journal_t * jbd2_journal_init_dev() - creates and initialises a
> journal structure"
> in fs/jbd2/journal.c
>
> ii) "* dget, dget_dlock - get a reference to a dentry" in
> include/linux/dcache.h
>
> iii) " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
> in include/linux/seqlock.h
>
> Also improve identification for non-kerneldoc comments. E.g.,
>
> i) " * The following functions allow us to read data using a swap map"
> in kernel/power/swap.c does follow the kernel-doc like syntax, but the
> content inside does not adhere to the expected format.
>
> Improve parsing by adding support for these probable attempts to write
> kernel-doc comment.
>
> Suggested-by: Jonathan Corbet
> Link: https://lore.kernel.org/lkml/87mtujktl2@meer.lwn.net
> Signed-off-by: Aditya Srivastava
> ---
> scripts/kernel-doc | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)

OK, I've applied this, but I have a couple of comments...

> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 888913528185..37665aa41e6b 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -2110,17 +2110,25 @@ sub process_name($$) {
>      } elsif (/$doc_decl/o) {
>          $identifier = $1;
>          my $is_kernel_comment = 0;
> -        if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
> +        my $decl_start = qr{\s*\*};

I appreciate the attempt to make the regexes a bit more comprehensible,
but we can do better yet, methinks. This $decl_start is very much like
$doc_com defined globally. It would really help a lot if we could at
least take the incredible mass of regexes in this program and boil them
down to a smaller, unique set that is used throughout. kernel-doc might
still make brains explode, but perhaps the blast radius would be a bit
smaller.

> +        my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo *
> bar() - desc

Some of the lines in this change go way beyond the 80-character limit;
please try not to do that. I fixed up the offending comments this time
around.

Thanks,

jon
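The consolidation suggested above (one small set of regex fragments reused throughout, rather than near-duplicates such as $decl_start and $doc_com) might look roughly like the following. This is only a sketch, written in Python rather than Perl so it can be run standalone. The fragment table, the `build` helper and the `{name}` placeholder scheme are all hypothetical, and the definition given for `doc_com` is an assumption, since the thread only says it is "very much like" `$decl_start`.

```python
import re

# Hypothetical shared fragment table. The "doc_com" value is an assumption;
# the thread does not show the real definition of the global $doc_com.
FRAGMENTS = {
    "doc_com": r"\s*\*\s*",           # leading " * " of a comment line
    "fn_type": r"(?:\w+\s*\*\s*)",    # pointer return type, e.g. "foo * "
    "parens": r"(?:\(\w*\))",         # "()" or "(arg)"
    "decl_end": r"(?:[-:].*)",        # "- description" or ": description"
}


def build(template):
    """Expand {name} placeholders from FRAGMENTS and compile the result."""
    return re.compile(template.format(**FRAGMENTS))


# Every pattern is spelled in terms of the same named fragments.
PLAIN_FN = build(r"^{doc_com}{fn_type}?(\w+)\s*{parens}?\s*{decl_end}?$")

if __name__ == "__main__":
    m = PLAIN_FN.match(" * foo() - desc")
    print(m.group(1) if m else None)
```

With a table like this, a change to how a comment prefix is recognised would be made in one place instead of in each pattern that embeds its own copy.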
Re: [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
On 15/4/21 12:55 am, Aditya Srivastava wrote:
> Currently kernel-doc does not identify some cases of probable kernel
> doc comments, e.g. pointer used as declaration type for identifier,
> space separated identifier, etc.
>
> Some examples of these cases in files can be:
> i)" * journal_t * jbd2_journal_init_dev() - creates and initialises a
> journal structure"
> in fs/jbd2/journal.c
>
> ii) "* dget, dget_dlock - get a reference to a dentry" in
> include/linux/dcache.h
>
> iii) " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
> in include/linux/seqlock.h
>
> Also improve identification for non-kerneldoc comments. E.g.,
>
> i) " * The following functions allow us to read data using a swap map"
> in kernel/power/swap.c does follow the kernel-doc like syntax, but the
> content inside does not adhere to the expected format.
>
> Improve parsing by adding support for these probable attempts to write
> kernel-doc comment.
>
> Suggested-by: Jonathan Corbet
> Link: https://lore.kernel.org/lkml/87mtujktl2@meer.lwn.net
> Signed-off-by: Aditya Srivastava
> ---
> scripts/kernel-doc | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 888913528185..37665aa41e6b 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -2110,17 +2110,25 @@ sub process_name($$) {
>      } elsif (/$doc_decl/o) {
>          $identifier = $1;
>          my $is_kernel_comment = 0;
> -        if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
> +        my $decl_start = qr{\s*\*};
> +        my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo *
> bar() - desc
> +        my $parenthesis = qr{\(\w*\)};
> +        my $decl_end = qr{[-:].*};
> +        if (/^$decl_start\s*([\w\s]+?)$parenthesis?\s*$decl_end?$/) {
>              $identifier = $1;
> -            $decl_type = 'function';
> -            $identifier =~ s/^define\s+//;
> -            $is_kernel_comment = 1;
>          }
>          if ($identifier =~ m/^(struct|union|enum|typedef)\b\s*(\S*)/) {
>              $decl_type = $1;
>              $identifier = $2;
>              $is_kernel_comment = 1;
>          }
> +        elsif (/^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ ||
> # i.e. foo()
> +            /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) {
> # i.e. static void foo() - description; or misspelt identifier
> +            $identifier = $1;
> +            $decl_type = 'function';
> +            $identifier =~ s/^define\s+//;
> +            $is_kernel_comment = 1;
> +        }
>          $identifier =~ s/\s+$//;
>
>          $state = STATE_BODY;
>

Hi,

I have generated a diff of the kernel-doc warnings for all the files in
the kernel tree, before and after this patch. It can be found at:
https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/kernel-doc/kernel_doc_comment_syntax_improvement_diff.txt

Thanks
Aditya
[RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
Currently kernel-doc does not identify some cases of probable kernel
doc comments, e.g. pointer used as declaration type for identifier,
space separated identifier, etc.

Some examples of these cases in files can be:
i)" * journal_t * jbd2_journal_init_dev() - creates and initialises a
journal structure"
in fs/jbd2/journal.c

ii) "* dget, dget_dlock - get a reference to a dentry" in
include/linux/dcache.h

iii) " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
in include/linux/seqlock.h

Also improve identification for non-kerneldoc comments. E.g.,

i) " * The following functions allow us to read data using a swap map"
in kernel/power/swap.c does follow the kernel-doc like syntax, but the
content inside does not adhere to the expected format.

Improve parsing by adding support for these probable attempts to write
kernel-doc comment.

Suggested-by: Jonathan Corbet
Link: https://lore.kernel.org/lkml/87mtujktl2@meer.lwn.net
Signed-off-by: Aditya Srivastava
---
 scripts/kernel-doc | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 888913528185..37665aa41e6b 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2110,17 +2110,25 @@ sub process_name($$) {
     } elsif (/$doc_decl/o) {
         $identifier = $1;
         my $is_kernel_comment = 0;
-        if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
+        my $decl_start = qr{\s*\*};
+        my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc
+        my $parenthesis = qr{\(\w*\)};
+        my $decl_end = qr{[-:].*};
+        if (/^$decl_start\s*([\w\s]+?)$parenthesis?\s*$decl_end?$/) {
             $identifier = $1;
-            $decl_type = 'function';
-            $identifier =~ s/^define\s+//;
-            $is_kernel_comment = 1;
         }
         if ($identifier =~ m/^(struct|union|enum|typedef)\b\s*(\S*)/) {
             $decl_type = $1;
             $identifier = $2;
             $is_kernel_comment = 1;
         }
+        elsif (/^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ || # i.e. foo()
+            /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) { # i.e. static void foo() - description; or misspelt identifier
+            $identifier = $1;
+            $decl_type = 'function';
+            $identifier =~ s/^define\s+//;
+            $is_kernel_comment = 1;
+        }
         $identifier =~ s/\s+$//;
 
         $state = STATE_BODY;
--
2.17.1
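To see how the patterns in this patch treat the example lines from the commit message, the hunk can be approximated in Python. This is a hedged transliteration, not the script itself. It ignores the outer `$doc_decl` gate (whose definition is not shown in the patch) and starts from an empty identifier. The names `classify`, `LOOSE`, `KEYWORD`, `PLAIN_FN` and `DESCRIBED_FN` are made up for illustration. Each fragment is wrapped in a non-capturing group so that a trailing `?` applies to the whole fragment, as it does when a `qr{}` object is interpolated in Perl.

```python
import re

# Fragments from the patch, each wrapped in (?:...) to mimic qr{} interpolation.
DECL_START = r"(?:\s*\*)"
FN_TYPE = r"(?:\w+\s*\*\s*)"      # pointer declaration type, "foo * "
PARENTHESIS = r"(?:\(\w*\))"
DECL_END = r"(?:[-:].*)"

LOOSE = re.compile(
    rf"^{DECL_START}\s*([\w\s]+?){PARENTHESIS}?\s*{DECL_END}?$")
KEYWORD = re.compile(r"^(struct|union|enum|typedef)\b\s*(\S*)")
PLAIN_FN = re.compile(
    rf"^{DECL_START}\s*{FN_TYPE}?(\w+)\s*{PARENTHESIS}?\s*{DECL_END}?$")
DESCRIBED_FN = re.compile(
    rf"^{DECL_START}\s*{FN_TYPE}?(\w+.*){PARENTHESIS}?\s*{DECL_END}$")


def classify(line):
    """Return (decl_type, identifier, is_kernel_comment) for one comment line.

    Simplification: the real code runs only after $doc_decl has matched and
    has already set $identifier; here the identifier starts out empty.
    """
    identifier, decl_type, is_kernel_comment = "", None, False
    loose = LOOSE.match(line)
    if loose:
        identifier = loose.group(1)
    keyword = KEYWORD.match(identifier)
    if keyword:
        decl_type, identifier = keyword.group(1), keyword.group(2)
        is_kernel_comment = True
    else:
        fn = PLAIN_FN.match(line) or DESCRIBED_FN.match(line)
        if fn:
            identifier = re.sub(r"^define\s+", "", fn.group(1))
            decl_type = "function"
            is_kernel_comment = True
    return decl_type, identifier.rstrip(), is_kernel_comment


if __name__ == "__main__":
    for text in (
        " * journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure",
        "* dget, dget_dlock - get a reference to a dentry",
        " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t",
        " * The following functions allow us to read data using a swap map",
    ):
        print(classify(text))
```

Under these assumptions the first three lines are classified as function comments, with identifiers `jbd2_journal_init_dev`, `dget, dget_dlock` and `DEFINE_SEQLOCK`. The swap-map line matches only the loose pattern, so it is not flagged as a kernel-doc comment.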