Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-06 Thread Aditya
On 6/3/21 8:50 pm, Matthew Wilcox wrote:
> On Sat, Mar 06, 2021 at 01:18:38PM +0530, Aditya wrote:
>> On 6/3/21 11:55 am, Lukas Bulwahn wrote:
>>> I agree. That might be a suitable clean-up to keep the code for
>>> functions and struct/union parsing similar in style/spirit.
>>>
>>> Aditya, would you like to create a patch for that?
>>
>> Sure Lukas.
>> I have a doubt though, Can't we use a single expression separated by
>> "|" here, instead of multiple lines? i.e.,
>>
>> $x =~
>> s/__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)\s*//;
>>
>>
>> Probably we could do something similar for dump_function, i.e.,
>> -$prototype =~ s/^static +//;
>> -$prototype =~ s/^extern +//;
>> -$prototype =~ s/^asmlinkage +//;
>> -$prototype =~ s/^inline +//;
>> -$prototype =~ s/^__inline__ +//;
>> -$prototype =~ s/^__inline +//;
>> -$prototype =~ s/^__always_inline +//;
>> -$prototype =~ s/^noinline +//;
>>
>> +$prototype =~
>> s/^(?:static|extern|asmlinkage|__?inline__?|__always_inline|noinline) +//;
>> And so on for other regexps.
>>
>> What do you think?
> 
> I think there's a tradeoff between speed / compactness and readability.
> As someone who doesn't know perl particularly well, I can look at the
> series of lines and say "Oh, it's stripping out these unwanted things".
> Your one line, while undoubtedly more efficient, is considerably less
> easy to understand.
> 
> Maybe there's another way to do it that's more efficient while not
> sacrificing the readability?
> 
> Also, would your suggestion work for 'static inline void foo(void)'?
> I think it needs to remove multiple occurrences of the things in the
> regex.  

Ah, I get it.

But maybe that's what the ?: on the beginning is for?
> 

No, "?:" is just to use this regex for matching, without capturing.
So, the regex will just remove any of those 'starting' occurrences,
consequently, "static inline" occurrence will probably not be removed.
I think the reason for using multiple lines for substitution in
dump_function is for the same reason, ie, subsequent substitution.

But for dump_struct, it is probably not desired, i.e., subsequent
substitution/removal. Also, for the same reason, using it with
dump_struct may cause unwelcomed discrepancies, or cause the user to
understand that here multiple substitutions are required, but is not so.
So, I think that we should use a single expression substitution for
dump_struct, at best. But am not sure if that is required, as
currently also we are not capturing this part of the regex, as well
as, matching only at certain position of the definition expression.

But I am curious to what others think about this.
Will be happy to work on this patch :)

Thanks
Aditya


Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-06 Thread Matthew Wilcox
On Sat, Mar 06, 2021 at 01:18:38PM +0530, Aditya wrote:
> On 6/3/21 11:55 am, Lukas Bulwahn wrote:
> > I agree. That might be a suitable clean-up to keep the code for
> > functions and struct/union parsing similar in style/spirit.
> > 
> > Aditya, would you like to create a patch for that?
> 
> Sure Lukas.
> I have a doubt though, Can't we use a single expression separated by
> "|" here, instead of multiple lines? i.e.,
> 
> $x =~
> s/__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)\s*//;
> 
> 
> Probably we could do something similar for dump_function, i.e.,
> -$prototype =~ s/^static +//;
> -$prototype =~ s/^extern +//;
> -$prototype =~ s/^asmlinkage +//;
> -$prototype =~ s/^inline +//;
> -$prototype =~ s/^__inline__ +//;
> -$prototype =~ s/^__inline +//;
> -$prototype =~ s/^__always_inline +//;
> -$prototype =~ s/^noinline +//;
> 
> +$prototype =~
> s/^(?:static|extern|asmlinkage|__?inline__?|__always_inline|noinline) +//;
> And so on for other regexps.
> 
> What do you think?

I think there's a tradeoff between speed / compactness and readability.
As someone who doesn't know perl particularly well, I can look at the
series of lines and say "Oh, it's stripping out these unwanted things".
Your one line, while undoubtedly more efficient, is considerably less
easy to understand.

Maybe there's another way to do it that's more efficient while not
sacrificing the readability?

Also, would your suggestion work for 'static inline void foo(void)'?
I think it needs to remove multiple occurrences of the things in the
regex.  But maybe that's what the ?: on the beginning is for?


Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-05 Thread Aditya
On 6/3/21 11:55 am, Lukas Bulwahn wrote:
> On Sat, Mar 6, 2021 at 5:35 AM Matthew Wilcox  wrote:
>>
>> On Thu, Feb 25, 2021 at 08:20:33PM +0530, Aditya Srivastava wrote:
>>> +++ b/scripts/kernel-doc
>>> @@ -1201,12 +1201,23 @@ sub dump_union($$) {
>>>  sub dump_struct($$) {
>>>  my $x = shift;
>>>  my $file = shift;
>>> +my $decl_type;
>>> +my $members;
>>> +my $type = qr{struct|union};
>>> +# For capturing struct/union definition body, i.e. 
>>> "{members*}qualifiers*"
>>> +my $definition_body = 
>>> qr{\{(.*)\}(?:\s*(?:__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
>>> -if ($x =~ 
>>> /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/)
>>>  {
>>> - my $decl_type = $1;
>>> +if ($x =~ /($type)\s+(\w+)\s*$definition_body/) {
>>> + $decl_type = $1;
>>>   $declaration_name = $2;
>>> - my $members = $3;
>>> + $members = $3;
>>> +} elsif ($x =~ /typedef\s+($type)\s*$definition_body\s*(\w+)\s*;/) {
>>> + $decl_type = $1;
>>> + $declaration_name = $3;
>>> + $members = $2;
>>> +}
>>
>> In the same spirit as dump_function, would something like this work?
>>
> 
> I agree. That might be a suitable clean-up to keep the code for
> functions and struct/union parsing similar in style/spirit.
> 
> Aditya, would you like to create a patch for that?
> 

Sure Lukas.
I have a doubt though, Can't we use a single expression separated by
"|" here, instead of multiple lines? i.e.,

$x =~
s/__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)\s*//;


Probably we could do something similar for dump_function, i.e.,
-$prototype =~ s/^static +//;
-$prototype =~ s/^extern +//;
-$prototype =~ s/^asmlinkage +//;
-$prototype =~ s/^inline +//;
-$prototype =~ s/^__inline__ +//;
-$prototype =~ s/^__inline +//;
-$prototype =~ s/^__always_inline +//;
-$prototype =~ s/^noinline +//;

+$prototype =~
s/^(?:static|extern|asmlinkage|__?inline__?|__always_inline|noinline) +//;
And so on for other regexps.

What do you think?

Thanks
Aditya


Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-05 Thread Lukas Bulwahn
On Sat, Mar 6, 2021 at 5:35 AM Matthew Wilcox  wrote:
>
> On Thu, Feb 25, 2021 at 08:20:33PM +0530, Aditya Srivastava wrote:
> > +++ b/scripts/kernel-doc
> > @@ -1201,12 +1201,23 @@ sub dump_union($$) {
> >  sub dump_struct($$) {
> >  my $x = shift;
> >  my $file = shift;
> > +my $decl_type;
> > +my $members;
> > +my $type = qr{struct|union};
> > +# For capturing struct/union definition body, i.e. 
> > "{members*}qualifiers*"
> > +my $definition_body = 
> > qr{\{(.*)\}(?:\s*(?:__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
> > -if ($x =~ 
> > /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/)
> >  {
> > - my $decl_type = $1;
> > +if ($x =~ /($type)\s+(\w+)\s*$definition_body/) {
> > + $decl_type = $1;
> >   $declaration_name = $2;
> > - my $members = $3;
> > + $members = $3;
> > +} elsif ($x =~ /typedef\s+($type)\s*$definition_body\s*(\w+)\s*;/) {
> > + $decl_type = $1;
> > + $declaration_name = $3;
> > + $members = $2;
> > +}
>
> In the same spirit as dump_function, would something like this work?
>

I agree. That might be a suitable clean-up to keep the code for
functions and struct/union parsing similar in style/spirit.

Aditya, would you like to create a patch for that?

Lukas


Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-05 Thread Matthew Wilcox
On Thu, Feb 25, 2021 at 08:20:33PM +0530, Aditya Srivastava wrote:
> +++ b/scripts/kernel-doc
> @@ -1201,12 +1201,23 @@ sub dump_union($$) {
>  sub dump_struct($$) {
>  my $x = shift;
>  my $file = shift;
> +my $decl_type;
> +my $members;
> +my $type = qr{struct|union};
> +# For capturing struct/union definition body, i.e. 
> "{members*}qualifiers*"
> +my $definition_body = 
> qr{\{(.*)\}(?:\s*(?:__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
> -if ($x =~ 
> /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/)
>  {
> - my $decl_type = $1;
> +if ($x =~ /($type)\s+(\w+)\s*$definition_body/) {
> + $decl_type = $1;
>   $declaration_name = $2;
> - my $members = $3;
> + $members = $3;
> +} elsif ($x =~ /typedef\s+($type)\s*$definition_body\s*(\w+)\s*;/) {
> + $decl_type = $1;
> + $declaration_name = $3;
> + $members = $2;
> +}

In the same spirit as dump_function, would something like this work?

-if ($x =~ 
/(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/)
 {
+$x =~ s/__packed +//;
+$x =~ s/__aligned +//;
+$x =~ s/cacheline_aligned_in_smp +//;
+$x =~ s/cacheline_aligned +//;
+$x =~ s/__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)//;
+
+if ($x =~ /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*)*/) {



Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-03-01 Thread Jonathan Corbet
Aditya Srivastava  writes:

> Currently, there are ~1290 occurrences in 447 files in the kernel tree
> 'typedef struct/union' syntax for defining some struct/union. However,
> kernel-doc currently does not support that syntax. Of the ~1290
> occurrences, there are four occurrences in ./include/linux/zstd.h with
> typedef struct/union syntax and a preceding kernel-doc; all other
> occurrences have no preceding kernel-doc.
>
> Add support for parsing struct/union following this syntax.
>
> Signed-off-by: Aditya Srivastava 

Applied, thanks.

jon


[RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

2021-02-25 Thread Aditya Srivastava
Currently, there are ~1290 occurrences in 447 files in the kernel tree
'typedef struct/union' syntax for defining some struct/union. However,
kernel-doc currently does not support that syntax. Of the ~1290
occurrences, there are four occurrences in ./include/linux/zstd.h with
typedef struct/union syntax and a preceding kernel-doc; all other
occurrences have no preceding kernel-doc.

Add support for parsing struct/union following this syntax.

Signed-off-by: Aditya Srivastava 
---
Changes in v3:
- Modify commit message

Changes in v2:
- Split recurring regex into multiple variables
- Modify commit message

 scripts/kernel-doc | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 8b5bc7bf4bb8..68df17877384 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1201,12 +1201,23 @@ sub dump_union($$) {
 sub dump_struct($$) {
 my $x = shift;
 my $file = shift;
+my $decl_type;
+my $members;
+my $type = qr{struct|union};
+# For capturing struct/union definition body, i.e. "{members*}qualifiers*"
+my $definition_body = 
qr{\{(.*)\}(?:\s*(?:__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
 
-if ($x =~ 
/(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|cacheline_aligned_in_smp|cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/)
 {
-   my $decl_type = $1;
+if ($x =~ /($type)\s+(\w+)\s*$definition_body/) {
+   $decl_type = $1;
$declaration_name = $2;
-   my $members = $3;
+   $members = $3;
+} elsif ($x =~ /typedef\s+($type)\s*$definition_body\s*(\w+)\s*;/) {
+   $decl_type = $1;
+   $declaration_name = $3;
+   $members = $2;
+}
 
+if ($members) {
if ($identifier ne $declaration_name) {
print STDERR "${file}:$.: warning: expecting prototype for 
$decl_type $identifier. Prototype was for $decl_type $declaration_name 
instead\n";
return;
-- 
2.17.1