On 2020/11/7 1:54, Joel Fernandes wrote:
> On Fri, Nov 06, 2020 at 10:58:58AM +0800, Li, Aubrey wrote:
>
>>>
>>> -- workload D, newly added syscall workload, performance drop in cs_on:
>>> +----------------------+------+-------------------------------+
>>> |                      | **   | will-it-scale * 192           |
>>> |                      |      | (pipe based context_switch)   |
>>> +======================+======+===============================+
>>> | cgroup               | **   | cg_will-it-scale              |
>>> +----------------------+------+-------------------------------+
>>> | record_item          | **   | threads_avg                   |
>>> +----------------------+------+-------------------------------+
>>> | coresched_normalized | **   | 0.2                           |
>>> +----------------------+------+-------------------------------+
>>> | default_normalized   | **   | 1                             |
>>> +----------------------+------+-------------------------------+
>>> | smtoff_normalized    | **   | 0.89                          |
>>> +----------------------+------+-------------------------------+
>>
>> will-it-scale may be a very extreme case. The story here is:
>> - On one sibling, the reader/writer gets blocked and tries to schedule
>>   another reader/writer in.
>> - The other sibling tries to wake up the reader/writer.
>>
>> Both CPUs are acquiring rq->__lock.
>>
>> So when coresched is off, they are two different locks; lock stat
>> (1-second delta) below:
>>
>> class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
>> &rq->__lock:          210          210          0.10          3.04          180.87          0.86          797      79165021          0.03         20.69     60650198.34          0.77
>>
>> But when coresched is on, they are actually the same lock; lock stat
>> (1-second delta) below:
>>
>> class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
>> &rq->__lock:      6479459      6484857          0.05        216.46     60829776.85          9.38      8346319      15399739          0.03         95.56     81119515.38          5.27
>>
>> This nature of core scheduling may degrade the performance of similar
>> workloads with frequent context switching.
>
> When core sched is off, is SMT off as well? From the above table, it seems to
> be. So even for core sched off, there will be a single lock per physical CPU
> core (assuming SMT is also off), right? Or did I miss something?
>
The table includes 3 cases:
- default: SMT on, coresched off
- coresched: SMT on, coresched on
- smtoff: SMT off, coresched off

I was comparing the default (coresched off & SMT on) case with the
(coresched on & SMT on) case. If SMT is off, then the reader and writer
on different cores have different rq->__lock instances, so the lock
contention is not that serious:

class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&rq->__lock:           60           60          0.11          1.92           41.33          0.69          127      67184172          0.03         22.95     33160428.37          0.49

Does this address your concern?

Thanks,
-Aubrey