On Sat, Nov 11, 2023 at 08:56:34AM -0800, Stephen Hemminger wrote:
> While examining the use of VLA in DPDK, ran into a bug in sfc driver.
> 
> If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work
> as written. Experimenting with a better more portable version of that macro
> as:
>       #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
> 
> revealed that the SFC driver was calling RTE_BUILD_BUG_ON with non constant
> expression.
> 
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:585:20: warning: comparison of integer 
> expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
>   585 |                 _a < _b ? _a : _b; \
>       |                    ^
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro 
> ‘RTE_BUILD_BUG_ON’
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro 
> ‘RTE_MIN’
>   566 |                                  RTE_MIN((unsigned 
> int)EFX_MAC_PDU_MAX,
>       |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:585:32: warning: operand of ‘?:’ changes 
> signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand 
> [-Wsign-compare]
>   585 |                 _a < _b ? _a : _b; \
>       |                                ^~
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro 
> ‘RTE_BUILD_BUG_ON’
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro 
> ‘RTE_MIN’
>   566 |                                  RTE_MIN((unsigned 
> int)EFX_MAC_PDU_MAX,
>       |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:498:44: error: expression in static assertion 
> is not constant
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                            ^~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:565:17: note: in expansion of macro 
> ‘RTE_BUILD_BUG_ON’
> 
> 
> The problem is that Gcc does not evaluate a ternary operator expression
> with all constants at compile time to produce a constant value! Apparently,
> the language standards leave this as ambiguous.

the problem is the expression being asserted was *never* constant or at
least not the expression you thought was being evaluated.

when the non-constant value is provided to the macro it produces a VLA
where the sizeof that VLA is the result of the expansion. i'm aware of
multiple instances of this bug outside of the use of this macro in dpdk.

the reason the expression is non-constant is because all the RTE_XXX
macros are expanded to statement expressions which themselves do not
evaluate as constant expressions. any RTE_BUILD_BUG_ON() expanding an
RTE_XXX macro will be impacted.

ideally we would move the source tree to disable VLAs entirely to wipe
out this and other similar bugs i've found. it would also be nice to
find a way to eliminate the use of statement expressions because of the
extremely subtle ways they can explode when seemingly innocuous control
flow mechanisms are present.

we should immediately convert to _Static_assert even if that means
temporarily disabling some of the RTE_BUILD_BUG_ON entries (which as we
can see have no value as written).

we should enable -Wvla and then handle re-enablement as an exception to
allow us to transition to a tree without VLAs to squash new additions
and catch other unexpected VLA generation.

> 
> If the code is expanded into two assertions as:
> 
> diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
> index 1b6374775f07..25e6633d6679 100644
> --- a/drivers/net/sfc/sfc_ef100_tx.c
> +++ b/drivers/net/sfc/sfc_ef100_tx.c
> @@ -562,9 +562,8 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
>                  * Make sure that the first segment does not need 
> fragmentation
>                  * (split into many Tx descriptors).
>                  */
> -               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
> -                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
> -                                SFC_MBUF_SEG_LEN_MAX));
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < 
> EFX_MAC_PDU_MAX);
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < 
> SFC_MBUF_SEG_LEN_MAX);
>         }
>  
>         if (m->ol_flags & sfc_dp_mport_override) {
> 
> Then a new problem arises:
> In file included from ../lib/mbuf/rte_mbuf.h:36,
>                  from ../drivers/net/sfc/sfc_ef100_tx.c:12:
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:498:29: error: static assertion failed: 
> "SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX"
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                             ^~~~~~~~~~~~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:566:17: note: in expansion of macro 
> ‘RTE_BUILD_BUG_ON’
>   566 |                 RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < 
> SFC_MBUF_SEG_LEN_MAX);
>       |                 ^~~~~~~~~~~~~~~~
> 
> Building a little program to unwind the #defines reveals:
> 
> SFC_EF100_TX_SEND_DESC_LEN_MAX = 16383
> EFX_MAC_PDU_MAX = 9240
> SFC_MBUF_SEG_LEN_MAX = 65535
> 
> I.e:
>       RTE_BUILD_BUG_ON(16383 < RTE_MIN(9240, 65535));
>       
> 
> Therefore the current driver should be getting build bug, but the existing 
> macro
> hides it.

Reply via email to