Re: [RFC PATCH] coccinelle: misc: add flexible_array.cocci script

2020-08-07 Thread Gustavo A. R. Silva
Hi Denis,

Thanks a lot for working on this. Please, see some comments below...

On 8/6/20 17:03, Denis Efremov wrote:
> Commit 68e4cd17e218 ("docs: deprecated.rst: Add zero-length and one-element
> arrays") marks one-element and zero-length arrays as deprecated. Kernel
> code should always use "flexible array members" instead.
> 
> The script warns about one-element and zero-length arrays in structs.
> 
> Cc: Kees Cook 
> Cc: Gustavo A. R. Silva 
> Signed-off-by: Denis Efremov 
> ---
> 
> Currently, it's just a draft. I've placed a number of questions in the
> script and marked them as TODO. Kees, Gustavo, if you could help me with
> my questions I think that this rule will be enough to close:
> https://github.com/KSPP/linux/issues/76
> 
> BTW, it's possible to not warn about files in the uapi folder if
> this is relevant. Do I need to do it in the script?
> 

I think the script should warn about new additions of zero-length/one-element
arrays in UAPI.

>  scripts/coccinelle/misc/flexible_array.cocci | 158 +++
>  1 file changed, 158 insertions(+)
>  create mode 100644 scripts/coccinelle/misc/flexible_array.cocci
> 
> diff --git a/scripts/coccinelle/misc/flexible_array.cocci b/scripts/coccinelle/misc/flexible_array.cocci
> new file mode 100644
> index ..1e7165c79e60
> --- /dev/null
> +++ b/scripts/coccinelle/misc/flexible_array.cocci
> @@ -0,0 +1,158 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +///
> +/// Zero-length and one-element arrays are deprecated, see
> +/// Documentation/process/deprecated.rst
> +/// Flexible-array members should be used instead.
> +///
> +//
> +// Confidence: High
> +// Copyright: (C) 2020 Denis Efremov ISPRAS.
> +// Comments:
> +// Options: --no-includes --include-headers
> +
> +virtual context
> +virtual report
> +virtual org
> +virtual patch
> +
> +@r depends on !patch@
> +identifier name, size, array;
> +// TODO: We can additionally restrict size and array to:
> +// identifier size =~ ".*(num|len|count|size|ncpus).*";
> +// identifier array !~ ".*(pad|reserved).*";
> +// Do we need it?
> +type TS, TA;
> +position p;
> +@@
> +
> +(
> +  // This will also match: typedef struct name { ...
> +  // However nested structs are not matched, i.e.:
> +  //   struct name1 { struct name2 { int s; int a[0]; } st; int i; }
> +  // will not be matched. Do we need to handle it?

It's fine. I think this would be a different script: one that
exclusively looks for all three (zero-length arrays, one-element
arrays and flexible-array members) in nested structures, because
"A structure containing a flexible array member, or a union
containing such a structure (possibly recursively), may not be
a member of a structure or an element of an array. (However
these uses are permitted by GCC as extensions.)"[1]
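
As a rough sketch of why the nested case differs (the struct names below
are hypothetical, not taken from the tree): the one-element form nests
fine in ISO C, while the flexible form would only nest via the GCC
extension quoted above, so a blind a[1] -> a[] rewrite is not safe there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
struct inner_legacy {
	int s;
	int a[1];	/* one-element array: complete type, may be nested */
};

struct outer_legacy {
	struct inner_legacy st;	/* valid ISO C */
	int i;
};

struct inner_flex {
	int s;
	int a[];	/* flexible-array member: adds no elements to sizeof */
};

/*
 * ISO C does not allow struct inner_flex to be a member of another
 * struct (GCC accepts it as an extension), so no "outer_flex" is
 * declared in this sketch.
 */

/* Bytes the one-element form reserves beyond the flexible form. */
static inline size_t legacy_extra_bytes(void)
{
	return sizeof(struct inner_legacy) - sizeof(struct inner_flex);
}
```

On common ABIs the difference is one element (sizeof(int) here), which is
also why size calculations need a second look after any conversion.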

> +  struct name {
> +...  // TODO: Maybe simple ... is enough? It will match structs with a

Yep; simple is always better at first. :)

> +TS size; // single field, e.g.
> +...  // https://elixir.bootlin.com/linux/v5.8/source/arch/arm/include/uapi/asm/setup.h#L127
> +(
> +*TA array@p[0];
> +|
> + // TODO: It seems that there are exception cases for array[1], e.g.
> + //  https://elixir.bootlin.com/linux/v5.8/source/arch/powerpc/boot/rs6000.h#L152
> + //  https://elixir.bootlin.com/linux/v5.8/source/include/uapi/linux/cdrom.h#L292
> + //  https://elixir.bootlin.com/linux/v5.8/source/drivers/net/wireless/ath/ath6kl/usb.c#L108
> + // We could either drop array[1] checking from this rule or
> + // restrict array name with regexp and add, for example, an "allowlist"
> + // with struct names where we allow this code pattern.
> + // TODO: How to handle: u8 data[1][MAXLEN_PSTR6]; ?
> +*TA array@p[1];
> +)
> +  };
> +|
> +  struct {
> +...
> +TS size;
> +...
> +(
> +*TA array@p[0];
> +|
> +*TA array@p[1];
> +)
> +  };
> +|
> +  // TODO: do we need to handle unions?

Yep; we should warn about this in unions, too.

However, I think unions cannot have members with
incomplete type, so we should not suggest the use
of flexible-array members in unions, because
flexible arrays have incomplete type.
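
A minimal sketch of the distinction, assuming hypothetical type names
(some compilers may accept a flexible array directly in a union as an
extension, but ISO C only describes it as the last member of a struct):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only. */
union legacy_msg {
	unsigned int size;
	unsigned char data[1];	/* worth a warning, but "data[]" here is not ISO C */
};

struct msg {
	unsigned int size;
	unsigned char data[];	/* valid: last member of a struct with another named member */
};

/* Allocation size for a struct msg carrying n payload bytes. */
static inline size_t msg_alloc_size(size_t n)
{
	return sizeof(struct msg) + n * sizeof(unsigned char);
}
```

So for unions the script can still report the array, just without
proposing the [] replacement.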

> +  union name {
> +...
> +TS size;
> +...
> +(
> +*TA array@p[0];
> +|
> +*TA array@p[1];
> +)
> +  };
> +|
> +  union {
> +...
> +TS size;
> +...
> +(
> +*TA array@p[0];
> +|
> +*TA array@p[1];
> +)
> +  };
> +)
> +
> +// FIXME: Patch mode doesn't work as expected.
> +// Coccinelle handles formatting incorrectly.
> +// Patch mode in this rule should be disabled until
> +// proper formatting is supported.
> +@depends on patch exists@
> +identifier name, size, array;
> +type TS, TA;
> +@@
> +
> +(
> +  struct name {
> +...
> +TS size;
> +...
> +(
> +-TA array[0];
> +|
> +-TA array[1];
> +)
> ++TA array[];
> +  };
> +|
> +  struct {
> +...
> +TS size;
> +...
> +(
> +-TA array[0];
> +|
> +-TA array[1];
> +)
> ++  

[RFC PATCH] coccinelle: misc: add flexible_array.cocci script

2020-08-06 Thread Denis Efremov
Commit 68e4cd17e218 ("docs: deprecated.rst: Add zero-length and one-element
arrays") marks one-element and zero-length arrays as deprecated. Kernel
code should always use "flexible array members" instead.

The script warns about one-element and zero-length arrays in structs.

Cc: Kees Cook 
Cc: Gustavo A. R. Silva 
Signed-off-by: Denis Efremov 
---

Currently, it's just a draft. I've placed a number of questions in the
script and marked them as TODO. Kees, Gustavo, if you could help me with
my questions I think that this rule will be enough to close:
https://github.com/KSPP/linux/issues/76

BTW, it's possible to not warn about files in the uapi folder if
this is relevant. Do I need to do it in the script?

 scripts/coccinelle/misc/flexible_array.cocci | 158 +++
 1 file changed, 158 insertions(+)
 create mode 100644 scripts/coccinelle/misc/flexible_array.cocci

diff --git a/scripts/coccinelle/misc/flexible_array.cocci b/scripts/coccinelle/misc/flexible_array.cocci
new file mode 100644
index ..1e7165c79e60
--- /dev/null
+++ b/scripts/coccinelle/misc/flexible_array.cocci
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0-only
+///
+/// Zero-length and one-element arrays are deprecated, see
+/// Documentation/process/deprecated.rst
+/// Flexible-array members should be used instead.
+///
+//
+// Confidence: High
+// Copyright: (C) 2020 Denis Efremov ISPRAS.
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual report
+virtual org
+virtual patch
+
+@r depends on !patch@
+identifier name, size, array;
+// TODO: We can additionally restrict size and array to:
+// identifier size =~ ".*(num|len|count|size|ncpus).*";
+// identifier array !~ ".*(pad|reserved).*";
+// Do we need it?
+type TS, TA;
+position p;
+@@
+
+(
+  // This will also match: typedef struct name { ...
+  // However nested structs are not matched, i.e.:
+  //   struct name1 { struct name2 { int s; int a[0]; } st; int i; }
+  // will not be matched. Do we need to handle it?
+  struct name {
+...  // TODO: Maybe simple ... is enough? It will match structs with a
+TS size; // single field, e.g.
+...  // https://elixir.bootlin.com/linux/v5.8/source/arch/arm/include/uapi/asm/setup.h#L127
+(
+*TA array@p[0];
+|
+ // TODO: It seems that there are exception cases for array[1], e.g.
+ //  https://elixir.bootlin.com/linux/v5.8/source/arch/powerpc/boot/rs6000.h#L152
+ //  https://elixir.bootlin.com/linux/v5.8/source/include/uapi/linux/cdrom.h#L292
+ //  https://elixir.bootlin.com/linux/v5.8/source/drivers/net/wireless/ath/ath6kl/usb.c#L108
+ // We could either drop array[1] checking from this rule or
+ // restrict array name with regexp and add, for example, an "allowlist"
+ // with struct names where we allow this code pattern.
+ // TODO: How to handle: u8 data[1][MAXLEN_PSTR6]; ?
+*TA array@p[1];
+)
+  };
+|
+  struct {
+...
+TS size;
+...
+(
+*TA array@p[0];
+|
+*TA array@p[1];
+)
+  };
+|
+  // TODO: do we need to handle unions?
+  union name {
+...
+TS size;
+...
+(
+*TA array@p[0];
+|
+*TA array@p[1];
+)
+  };
+|
+  union {
+...
+TS size;
+...
+(
+*TA array@p[0];
+|
+*TA array@p[1];
+)
+  };
+)
+
+// FIXME: Patch mode doesn't work as expected.
+// Coccinelle handles formatting incorrectly.
+// Patch mode in this rule should be disabled until
+// proper formatting is supported.
+@depends on patch exists@
+identifier name, size, array;
+type TS, TA;
+@@
+
+(
+  struct name {
+...
+TS size;
+...
+(
+-TA array[0];
+|
+-TA array[1];
+)
++TA array[];
+  };
+|
+  struct {
+...
+TS size;
+...
+(
+-TA array[0];
+|
+-TA array[1];
+)
++TA array[];
+  };
+|
+  union name {
+...
+TS size;
+...
+(
+-TA array[0];
+|
+-TA array[1];
+)
++TA array[];
+  };
+|
+  union {
+...
+TS size;
+...
+(
+-TA array[0];
+|
+-TA array[1];
+)
++TA array[];
+  };
+)
+
+@script: python depends on report@
+p << r.p;
+@@
+
+msg = "WARNING: use flexible-array member instead"
+coccilib.report.print_report(p[0], msg)
+
+@script: python depends on org@
+p << r.p;
+@@
+
+msg = "WARNING: use flexible-array member instead"
+coccilib.org.print_todo(p, msg)
-- 
2.26.2