On Fri, Jan 19, 2018 at 02:53:25PM -0800, Linus Torvalds wrote:

> It would probably be good to add the size too, just to explain why
> it's potentially expensive.
> 
> That said, apparently we do have hundreds of them, with just
> cpufreq_frequency_table having a ton. Maybe some are hidden in macros
> and removing one removes a lot.

cpufreq_table_find_index_...(), mostly.

> The real problem is that sometimes the subtraction is simply the right
> thing to do, and there's no sane way to say "yeah, this is one of
> those cases you shouldn't warn about".

FWIW, the sizes of the most common ones are
     91 sizeof struct cpufreq_frequency_table = 12
Almost all of those come from
        cpufreq_for_each_valid_entry(pos, table)
                if (....)
                        return pos - table;
and I wonder if we would be better off with something like
#define cpufreq_for_each_valid_entry(pos, table, idx)                           
        \
        for (pos = table, idx = 0; pos->frequency != CPUFREQ_TABLE_END; pos++, 
idx++)   \
                if (pos->frequency == CPUFREQ_ENTRY_INVALID)            \
                        continue;                                       \
                else
so that those loops would become
        cpufreq_for_each_valid_entry(pos, table, idx)
                if (....)
                        return idx;
     36 sizeof struct Indirect = 24
     21 sizeof struct ips_scb = 216
     18 sizeof struct runlist_element = 24
     13 sizeof struct zone = 1728
Some are from
#define zone_idx(zone)          ((zone) - (zone)->zone_pgdat->node_zones)
but there's
static inline int zone_id(const struct zone *zone)
{ 
        struct pglist_data *pgdat = zone->zone_pgdat;

        return zone - pgdat->node_zones;
}
and a couple of places where we have
        for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
Those bloody well ought to be
        for (zone = node_zones, end = node_zones + MAX_NR_ZONES; zone < end; 
zone++) {
     13 sizeof struct vring = 40
     11 sizeof struct usbhsh_device = 24
     10 sizeof struct xpc_partition = 888
      9 sizeof struct skge_element = 40
      9 sizeof struct lock_class = 400
      9 sizeof struct hstate = 29872
That little horror comes from get_hstate_idx() and hstate_index().  Code 
generated
for division is
        movabsq $-5542915600080909725, %rax
        sarq    $4, %rdi
        imulq   %rdi, %rax
      7 sizeof struct nvme_rdma_queue = 312
      7 sizeof struct iso_context = 208
      6 sizeof struct i915_power_well = 48
      6 sizeof struct hpet_dev = 168
      6 sizeof struct ext4_extent = 12
      6 sizeof struct esas2r_target = 120
      5 sizeof struct iio_chan_spec = 152
      5 sizeof struct hwspinlock = 96
      4 sizeof struct myri10ge_slice_state = 704
      4 sizeof struct ext4_extent_idx = 12
Another interesting-looking one is struct vhost_net_virtqueue (18080 bytes)

Note that those sizes are rather sensitive to lockdep, spinlock debugging, etc.

Reply via email to