On Thu, Sep 24, 2020 at 03:32:48PM +0800, Kent Gibson wrote:
> On Wed, Sep 23, 2020 at 07:18:08PM +0300, Andy Shevchenko wrote:
> > On Tue, Sep 22, 2020 at 5:36 AM Kent Gibson <warthog...@gmail.com> wrote:
> > >
> > > Add support for the GPIO_V2_LINE_SET_VALUES_IOCTL.
> > 
> > > +static long linereq_set_values_unlocked(struct linereq *lr,
> > > +                                       struct gpio_v2_line_values *lv)
> > > +{
> > > +       DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> > > +       struct gpio_desc **descs;
> > > +       unsigned int i, didx, num_set;
> > > +       int ret;
> > > +
> > > +       bitmap_zero(vals, GPIO_V2_LINES_MAX);
> > > +       for (num_set = 0, i = 0; i < lr->num_lines; i++) {
> > > +               if (lv->mask & BIT_ULL(i)) {
> > 
> > Similar idea
> > 
> > DECLARE_BITMAP(mask, 64) = BITMAP_FROM_U64(lv->mask);
> > 
> > num_set = bitmap_weight();
> > 
> 
> I had played with this option, but bitmap_weight() counts all
> the bits set in the mask - which considers bits >= lr->num_lines.
> So you would need to mask lv->mask before converting it to a bitmap.
> (I'm ok with ignoring those bits in case userspace wants to be lazy and
> use an all 1s mask.)
> 
> But since we're looping over the bitmap anyway we may as well just
> count as we go.
> 
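
To make the masking point above concrete, here is a minimal userspace
sketch, not kernel code.  clamp_mask(), count_in_loop() and weight64()
are hypothetical names used only for illustration; weight64() merely
stands in for taking bitmap_weight() over all 64 bits of the mask.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low num_lines bits of mask. */
static uint64_t clamp_mask(uint64_t mask, unsigned int num_lines)
{
	if (num_lines >= 64)
		return mask;
	return mask & ((UINT64_C(1) << num_lines) - 1);
}

/*
 * Count as we go, in the style of the loop in the patch: bits at or
 * above num_lines are never visited, so they are ignored for free.
 */
static unsigned int count_in_loop(uint64_t mask, unsigned int num_lines)
{
	unsigned int i, num_set = 0;

	for (i = 0; i < num_lines && i < 64; i++)
		if (mask & (UINT64_C(1) << i))
			num_set++;
	return num_set;
}

/*
 * Weight of the whole mask (userspace stand-in for a 64-bit
 * bitmap_weight()): counts every set bit, including bits >= num_lines.
 */
static unsigned int weight64(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With an all 1s mask and four lines, count_in_loop() returns 4 whereas
weight64() returns 64, so the two only agree once the mask has been
passed through something like clamp_mask() first.
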
> > for_each_set_bit(i, mask, lr->num_lines)
> > 
> 
> Yeah, that should work.  I vaguely recall trying this and finding it
> generated larger object code, but I'll give it another try and if it
> works out then include it in v10.
> 

Tried it again and, while it works, it does increase the size of
gpiolib-cdev.o as follows:

          u64   ->   bitmap
x86_64   28360       28616
i386     22056       22100
aarch64  37392       37600
mips32   28008       28016

So for 64-bit platforms changing to bitmap generates larger code,
probably as we are forcing them to use 32-bit array semantics where
before they could use the native u64.  For 32-bit there is a much
smaller difference as they were already using 32-bit array semantics
to realise the u64.

Those are for some of my test builds, so obviously YMMV.

Those figures also only cover changing linereq_get_values(), which has
three instances of the loop.  linereq_set_values_unlocked() has another
two, so you could expect a further increase of ~2/3 of that seen here if
we change that as well.
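
For reference, the two loop shapes being compared look roughly like the
following userspace sketch.  It is not the kernel code and does not
reproduce the size measurements; next_set_bit() is a hypothetical,
naive stand-in for the search that for_each_set_bit() performs, and
visit_u64(), visit_bitmap() and mask_to_words() are likewise made-up
names.  Each visit function returns the set of bit indices it visited.

```c
#include <limits.h>
#include <stdint.h>

#define WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((64 + WORD_BITS - 1) / WORD_BITS)

/* Split a u64 into unsigned long words, least significant word first. */
static void mask_to_words(uint64_t mask, unsigned long *words)
{
	unsigned int w;

	for (w = 0; w < MASK_WORDS; w++) {
		words[w] = (unsigned long)mask;
		/* Shift in two halves so each shift count stays below 64. */
		mask >>= WORD_BITS / 2;
		mask >>= WORD_BITS / 2;
	}
}

/* First set bit at or after start, or nbits if there is none. */
static unsigned int next_set_bit(const unsigned long *words,
				 unsigned int nbits, unsigned int start)
{
	unsigned int i;

	for (i = start; i < nbits; i++)
		if (words[i / WORD_BITS] & (1UL << (i % WORD_BITS)))
			return i;
	return nbits;
}

/* u64 form: test each bit of the native 64-bit mask directly. */
static uint64_t visit_u64(uint64_t mask, unsigned int num_lines)
{
	uint64_t seen = 0;
	unsigned int i;

	for (i = 0; i < num_lines && i < 64; i++)
		if (mask & (UINT64_C(1) << i))
			seen |= UINT64_C(1) << i;
	return seen;
}

/* bitmap form: convert to an unsigned long array, then search it. */
static uint64_t visit_bitmap(uint64_t mask, unsigned int num_lines)
{
	unsigned long words[MASK_WORDS];
	uint64_t seen = 0;
	unsigned int i;

	if (num_lines > 64)
		num_lines = 64;
	mask_to_words(mask, words);
	for (i = next_set_bit(words, num_lines, 0); i < num_lines;
	     i = next_set_bit(words, num_lines, i + 1))
		seen |= UINT64_C(1) << i;
	return seen;
}
```

Both forms visit the same bits; the bitmap form simply adds the
conversion and the array search on top of what the u64 form does.
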

The sizeable increase in x86_64 was what made me revert this last time,
and I'm still satisfied with that choice.  Are you still eager to switch
to for_each_set_bit()?

Cheers,
Kent.
