On Mon, Nov 09, 2020 at 08:18:51PM +0530, Syed Nayyar Waris wrote:
> On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
> <vilhelm.g...@gmail.com> wrote:
> >
> > On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > > > <vilhelm.g...@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris 
> > > > > > > <syednwa...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > > > a time and save cycles.
> > > > > > >
> > > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > > >
> > > > > > Hi Arnd,
> > > > > >
> > > > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > > > these warnings so I suspect I'm using a different version than you.
> > > > >
> > > > > I originally saw it with the binaries from
> > > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > > also been able to reproduce it with a minimal test case on the
> > > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > > >
> > > > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > > > > of start + nbits is larger than the length of the map bitmap memory
> > > > > > > region. This is because index (or index + 1) could be outside the range
> > > > > > > of the bitmap memory region passed in as map. Is my understanding
> > > > > > > correct here?
> > > > >
> > > > > Yes, that seems to be the case here.
> > > > >
> > > > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > > > > length of the bitmap memory region named old and thus result in a stack
> > > > > > > smash.
> > > > > >
> > > > > > > I don't know if invalid width values are actually possible for the
> > > > > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > > > > is actually a possibility. We should verify that the combined value of
> > > > > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > > > >
> > > > > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > > > > enough to know that we've already checked the boundary and width[0] and
> > > > > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > > > > refactor bitmap_set_value() to increment map separately and then set it:
> > > > >
> > > > > As I understand it, part of the problem is that gcc sees the possible
> > > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > > in particular the shift in BIT_WORD() that put an upper bound on
> > > > > the index, but then it sees that the upper bound is higher than the
> > > > > upper bound of the array, i.e. element zero.
> > > > >
> > > > > I added a check
> > > > >
> > > > >       if (start >= 64 || start + size >= 64) return;
> > > > >
> > > > > in the godbolt.org testcase, which does help limit the start
> > > > > index appropriately, but it is not sufficient to let the compiler
> > > > > see that the 'if (space >= nbits) ' condition is guaranteed to
> > > > > be true for all values here.
> > > > >
> > > > > > static inline void bitmap_set_value(unsigned long *map,
> > > > > >                                     unsigned long value,
> > > > > >                                     unsigned long start, unsigned 
> > > > > > long nbits)
> > > > > > {
> > > > > >         const unsigned long offset = start % BITS_PER_LONG;
> > > > > > >         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > > >         const unsigned long space = ceiling - start;
> > > > > >
> > > > > >         map += BIT_WORD(start);
> > > > > >         value &= GENMASK(nbits - 1, 0);
> > > > > >
> > > > > >         if (space >= nbits) {
> > > > > >                 *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > > >                 *map |= value << offset;
> > > > > >         } else {
> > > > > >                 *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > >                 *map |= value << offset;
> > > > > >                 map++;
> > > > > >                 *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > >                 *map |= value >> space;
> > > > > >         }
> > > > > > }
> > > > > >
> > > > > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > > > > well-defined and obvious boundaries.
> > > > > >
> > > > > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > > > > warnings?
> > > > >
> > > > > Unfortunately, it does not seem to make a difference, as gcc still
> > > > > knows that this compiles to the same result, and it produces the same
> > > > > warning as before (see https://godbolt.org/z/rjx34r)
> > > > >
> > > > >          Arnd
> > > >
> > > > Hi Arnd,
> > > >
> > > > > Sharing a different version of the bitmap_set_value() function. See below.
> > > >
> > > > Let me know if the below solution looks good to you and if it resolves
> > > > the above compiler warning.
> > > >
> > > >
> > > > @@ -1,5 +1,5 @@
> > > >  static inline void bitmap_set_value(unsigned long *map,
> > > > -                                    unsigned long value,
> > > > +                                    unsigned long value, const size_t 
> > > > length,
> > > >                                      unsigned long start, unsigned long 
> > > > nbits)
> > > >  {
> > > >          const size_t index = BIT_WORD(start);
> > > > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > >          const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > >          const unsigned long space = ceiling - start;
> > > >
> > > > +       if (index >= length)
> > > > +               return;
> > > > +
> > > >          value &= GENMASK(nbits - 1, 0);
> > > >
> > > >          if (space >= nbits) {
> > > > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long 
> > > > *map,
> > > >          } else {
> > > >                  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > >                  map[index + 0] |= value << offset;
> > > > +
> > > > +               if (index + 1 >= length)
> > > > +                       return;
> > > > +
> > > > >                  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > >                  map[index + 1] |= value >> space;
> > > >          }
> > >
> > > > One of my concerns is that we're incurring the latency of two additional
> > > conditional checks just to suppress a compiler warning about a case that
> > > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > > there's a way for us to suppress these warnings without adding onto the
> > > latency of this function; given that bitmap_set_value() is intended to
> > > be used in loops, conditionals here could significantly increase latency
> > > in drivers.
> > >
> > > I wonder if array_index_nospec() might have the side effect of
> > > suppressing these warnings for us. For example, would this work:
> > >
> > > static inline void bitmap_set_value(unsigned long *map,
> > >                                   unsigned long value,
> > > >                                   unsigned long start, unsigned long nbits)
> > > {
> > >       const unsigned long offset = start % BITS_PER_LONG;
> > >       const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > >       const unsigned long space = ceiling - start;
> > >       size_t index = BIT_WORD(start);
> > >
> > >       value &= GENMASK(nbits - 1, 0);
> > >
> > >       if (space >= nbits) {
> > >               index = array_index_nospec(index, index + 1);
> > >
> > >               map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > >               map[index] |= value << offset;
> > >       } else {
> > >               index = array_index_nospec(index, index + 2);
> > >
> > >               map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >               map[index + 0] |= value << offset;
> > >               map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > >               map[index + 1] |= value >> space;
> > >       }
> > > }
> > >
> > > Or is this going to produce the same warning because we're not using an
> > > explicit check against the map array size?
> > >
> > > William Breathitt Gray
> >
> > After testing my suggestion, it looks like the warnings are still
> > present. :-(
> >
> > Something else I've also considered is perhaps using the GCC built-in
> > function __builtin_unreachable() instead of returning. So in Syed's code
> > we would have the following instead:
> >
> > if (index + 1 >= length)
> >         __builtin_unreachable();
> >
> > This might allow GCC to optimize better and avoid the conditional check
> > > altogether, thus avoiding latency while also hinting enough context to
> > the compiler to suppress the warnings.
> >
> > William Breathitt Gray
> 
> I also thought of another optimization. Arnd, William, let me know
> what you think about it.
> 
> Since exceeding the array limit is a rather rare event, we can use the
> kernel's unlikely() macro (built on GCC's __builtin_expect) for the
> boundary checks. We can use it at the two places where 'index' and
> 'index + 1' are being checked against the boundary limit.
> 
> It might help optimize the code. Wouldn't it?
> 
> Syed Nayyar Waris

We probably don't need unlikely() because __builtin_unreachable() should
suffice to inform GCC that this condition will never occur -- in other
words, GCC will compile optimized code to avoid the conditional
entirely.
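
To illustrate the difference, here is a rough userspace sketch rather than
kernel code. The helper names read_checked() and read_assumed() are
hypothetical, and unlikely() is redefined locally as a stand-in for the
kernel macro. With unlikely() the check still has to be compiled in because
the out-of-range path has observable behavior; with __builtin_unreachable()
the compiler is allowed to assume the path never runs.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() macro. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: the bounds check is kept, only hinted as rare. */
static inline unsigned long read_checked(const unsigned long *map,
					 size_t length, size_t index)
{
	if (unlikely(index >= length))
		return 0;	/* this path has defined behavior, so the test stays */
	return map[index];
}

/* Hypothetical helper: promise the compiler that index is in range. */
static inline unsigned long read_assumed(const unsigned long *map,
					 size_t length, size_t index)
{
	if (index >= length)
		__builtin_unreachable();	/* undefined behavior if ever reached */
	return map[index];
}
```

The trade-off is that read_assumed() must never be called with an
out-of-range index, so the caller has to guarantee the bound elsewhere.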

By the way, I think we only need the (index + 1 >= length) check; the
first index conditional check is not needed and does not affect the
warnings at all, so we might as well get rid of it.
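
Putting that together, the function might look roughly like the sketch
below. This is a standalone userspace approximation, not the final patch:
the macros are local stand-ins for the kernel definitions, the length
parameter and its position are taken from Syed's diff, and space is computed
directly instead of via round_up(). Whether it actually silences the warning
would still need to be confirmed against Arnd's godbolt test case.

```c
#include <limits.h>
#include <stddef.h>

/* Local stand-ins for the kernel macros, for illustration only. */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
#define BITMAP_FIRST_WORD_MASK(start) \
	(~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) \
	(~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* length is the number of unsigned long words in map; 1 <= nbits <= BITS_PER_LONG. */
static inline void bitmap_set_value(unsigned long *map,
				    unsigned long value, const size_t length,
				    unsigned long start, unsigned long nbits)
{
	const size_t index = BIT_WORD(start);
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long space = BITS_PER_LONG - offset;

	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		/* The value fits entirely in one word. */
		map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		/* The value straddles two words. */
		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index + 0] |= value << offset;

		/* Only remaining check: tell GCC the second word is in range. */
		if (index + 1 >= length)
			__builtin_unreachable();

		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		map[index + 1] |= value >> space;
	}
}
```

For example, with a two-word map, writing 8 bits starting 4 bits below the
end of the first word puts the low nibble in the top of map[0] and the high
nibble in the bottom of map[1].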

William Breathitt Gray
