On Fri, Nov 06, 2020 at 10:58:58AM +0800, Li, Aubrey wrote:
> > -- workload D, new added syscall workload, performance drop in cs_on:
> > +----------------------+------+-------------------------------+
> > |                      |  **  | will-it-scale * 192           |
> > |                      |      | (pipe based context_switch)   |
> > +======================+======+===============================+
> > | cgroup               |  **  | cg_will-it-scale              |
> > +----------------------+------+-------------------------------+
> > | record_item          |  **  | threads_avg                   |
> > +----------------------+------+-------------------------------+
> > | coresched_normalized |  **  | 0.2                           |
> > +----------------------+------+-------------------------------+
> > | default_normalized   |  **  | 1                             |
> > +----------------------+------+-------------------------------+
> > | smtoff_normalized    |  **  | 0.89                          |
> > +----------------------+------+-------------------------------+
>
> will-it-scale may be a very extreme case. The story here is:
> - On one sibling, a reader/writer gets blocked and tries to schedule
>   another reader/writer in.
> - The other sibling tries to wake up a reader/writer.
>
> Both CPUs are acquiring rq->__lock.
>
> So when coresched is off, these are two different locks; lock stat
> (1 second delta) below:
>
> class name   con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
> &rq->__lock:         210          210          0.10          3.04          180.87          0.86          797      79165021          0.03         20.69     60650198.34          0.77
>
> But when coresched is on, they are actually the same lock; lock stat
> (1 second delta) below:
>
> class name   con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
> &rq->__lock:     6479459      6484857          0.05        216.46     60829776.85          9.38      8346319      15399739          0.03         95.56     81119515.38          5.27
>
> This nature of core scheduling may degrade the performance of similar
> workloads with frequent context switching.
When core sched is off, is SMT off as well? From the above table, it seems
to be. So even with core sched off, there will be a single lock per
physical CPU core (assuming SMT is also off), right? Or did I miss
something?

thanks,

- Joel