On Thu, Jan 28, 2016 at 09:55:45AM +0000, Dario Faggioli wrote:
> On Wed, 2016-01-27 at 15:53 +0000, George Dunlap wrote:
> > On 27/01/16 15:27, Konrad Rzeszutek Wilk wrote:
> > >
> > > So Elena started looking at the CPU bound and seeing how Xen
> > > behaves then and if we can improve the floating situation as
> > > she saw some abnormal behaviours.
> >
> > OK -- if the focus was on the two cases where the Xen credit1
> > scheduler (apparently) co-located two cpu-burning vcpus on sibling
> > threads, then yeah, that's behavior we should probably try to get
> > to the bottom of.
> >
> Well, let's see the trace.

Hey Dario,

Please disregard the previous email with the topology information.
It was incorrect; I am attaching the topology that actually results
from applying Joao's SMT patches.

Elena

>
> In any case, I'm up for trying to hook the SMT load balancer in
> runq_tickle (which would mean doing it upon every vcpu wakeup).
>
> My gut feeling is that the overhead may outweigh the benefit, and that
> it will actually prove useful only in a minority of the
> cases/workloads, but it's maybe worth a try.
>
> Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>
processor        : 0
vendor_id        : GenuineIntel
cpu family       : 6
model            : 62
model name       : Genuine Intel(R) CPU @ 2.80GHz
stepping         : 2
microcode        : 0x209
cpu MHz          : 2793.360
cache size       : 25600 KB
physical id      : 0
siblings         : 16
core id          : 0
cpu cores        : 8
apicid           : 0
initial apicid   : 0
fpu              : yes
fpu_exception    : yes
cpuid level      : 13
wp               : yes
flags            : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm fsgsbase smep erms xsaveopt
bugs             :
bogomips         : 5586.72
clflush size     : 64
cache_alignment  : 64
address sizes    : 46 bits physical, 48 bits virtual
power management :

[The records for processors 1 to 15 carry the same values as the record
above in every field except the four listed here.]

processor   core id   apicid   initial apicid
    0          0         0           0
    1          0         1           1
    2          1         2           2
    3          1         3           3
    4          2         4           4
    5          2         5           5
    6          3         6           6
    7          3         7           7
    8          4         8           8
    9          4         9           9
   10          5        10          10
   11          5        11          11
   12          6        12          12
   13          6        13          13
   14          7        14          14
   15          7        15          15
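For reading the dump above, the SMT sibling pairs can be derived by grouping logical CPUs on (physical id, core id). The following is only a minimal sketch, assuming the usual blank-line-separated /proc/cpuinfo layout; the function name smt_siblings and the SAMPLE text are hypothetical and only mirror the processor/core id values reported above.

```python
from collections import defaultdict

# Hypothetical sample mirroring the dump: 16 logical CPUs, one package,
# two threads per core (core id = processor // 2).
SAMPLE = "\n\n".join(
    f"processor\t: {n}\nphysical id\t: 0\ncore id\t\t: {n // 2}\napicid\t\t: {n}"
    for n in range(16)
)

REQUIRED = {"processor", "physical id", "core id"}


def smt_siblings(cpuinfo_text):
    """Map (physical id, core id) -> list of logical CPU numbers."""
    groups = defaultdict(list)
    record = {}
    # The trailing "" flushes the last record even without a final blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if REQUIRED <= record.keys():
                key = (int(record["physical id"]), int(record["core id"]))
                groups[key].append(int(record["processor"]))
            record = {}
            continue
        name, _, value = line.partition(":")
        record[name.strip()] = value.strip()
    return dict(groups)


if __name__ == "__main__":
    for (pkg, core), cpus in sorted(smt_siblings(SAMPLE).items()):
        print(f"package {pkg} core {core}: cpus {cpus}")
```

On a live guest the same function could be fed the contents of /proc/cpuinfo instead of SAMPLE; with the values above it should pair CPUs (0,1), (2,3), and so on up to (14,15).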
cat /proc/sys/kernel/sched_domain/cpu*/domain*/flags
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559
4783 559

cat /proc/sys/kernel/sched_domain/cpu*/domain*/names
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
SMT MC
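The two flag values can be decoded bit by bit. This is a rough sketch only: the guest kernel version is not stated in this mail, so the SD_* bit assignments below are an assumption taken from the definitions in include/linux/sched.h of kernels around 3.16 to 4.4, and they differ in other versions. The function name decode_sd_flags is hypothetical.

```python
# ASSUMPTION: SD_* bit values as in Linux ~3.16-4.4 include/linux/sched.h.
# Other kernel versions assign these bits differently.
SD_FLAGS = {
    0x0001: "SD_LOAD_BALANCE",
    0x0002: "SD_BALANCE_NEWIDLE",
    0x0004: "SD_BALANCE_EXEC",
    0x0008: "SD_BALANCE_FORK",
    0x0010: "SD_BALANCE_WAKE",
    0x0020: "SD_WAKE_AFFINE",
    0x0080: "SD_SHARE_CPUCAPACITY",
    0x0100: "SD_SHARE_POWERDOMAIN",
    0x0200: "SD_SHARE_PKG_RESOURCES",
    0x0400: "SD_SERIALIZE",
    0x0800: "SD_ASYM_PACKING",
    0x1000: "SD_PREFER_SIBLING",
    0x2000: "SD_OVERLAP",
    0x4000: "SD_NUMA",
}


def decode_sd_flags(value):
    """Return the names of the set bits, lowest bit first.

    Bits missing from SD_FLAGS are reported as 'unknown(0x...)'.
    """
    names = []
    for shift in range(value.bit_length()):
        bit = 1 << shift
        if value & bit:
            names.append(SD_FLAGS.get(bit, f"unknown({bit:#06x})"))
    return names


if __name__ == "__main__":
    for domain, flags in (("SMT", 4783), ("MC", 559)):
        print(f"{domain} {flags} ({flags:#x}): {' '.join(decode_sd_flags(flags))}")
```

Under that assumption, the SMT value (4783, 0x12af) differs from the MC value (559, 0x22f) by SD_SHARE_CPUCAPACITY and SD_PREFER_SIBLING, which would be consistent with the guest seeing sibling threads at the SMT level.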
Advisory to Users on system topology enumeration

This utility is for demonstration purpose only. It assumes the hardware topology
configuration within a coherent domain does not change during the life of an OS
session. If an OS support advanced features that can change hardware topology
configurations, more sophisticated adaptation may be necessary to account for
the hardware configuration change that might have added and reduced the number
of logical processors being managed by the OS.

User should also be aware that the system topology enumeration algorithm is
based on the assumption that CPUID instruction will return raw data reflecting
the native hardware configuration. When an application runs inside a virtual
machine hosted by a Virtual Machine Monitor (VMM), any CPUID instructions
issued by an app (or a guest OS) are trapped by the VMM and it is the VMM's
responsibility and decision to emulate/supply CPUID return data to the virtual
machines. When deploying topology enumeration code based on querying CPUID
inside a VM environment, the user must consult with the VMM vendor on how an VMM
will emulate CPUID instruction relating to topology enumeration.

Software visible enumeration in the system:
Number of logical processors visible to the OS: 16
Number of logical processors visible to this process: 16
Number of processor cores visible to this process: 8
Number of physical packages visible to this process: 1

Hierarchical counts by levels of processor topology:
 # of cores in package 0 visible to this process: 8 .
     # of logical processors in Core 0 visible to this process: 2 .
     # of logical processors in Core 1 visible to this process: 2 .
     # of logical processors in Core 2 visible to this process: 2 .
     # of logical processors in Core 3 visible to this process: 2 .
     # of logical processors in Core 4 visible to this process: 2 .
     # of logical processors in Core 5 visible to this process: 2 .
     # of logical processors in Core 6 visible to this process: 2 .
     # of logical processors in Core 7 visible to this process: 2 .

Affinity masks per SMT thread, per core, per package:
Individual:
    P:0, C:0, T:0 --> 1
    P:0, C:0, T:1 --> 2
Core-aggregated:
    P:0, C:0 --> 3
Individual:
    P:0, C:1, T:0 --> 4
    P:0, C:1, T:1 --> 8
Core-aggregated:
    P:0, C:1 --> c
Individual:
    P:0, C:2, T:0 --> 10
    P:0, C:2, T:1 --> 20
Core-aggregated:
    P:0, C:2 --> 30
Individual:
    P:0, C:3, T:0 --> 40
    P:0, C:3, T:1 --> 80
Core-aggregated:
    P:0, C:3 --> c0
Individual:
    P:0, C:4, T:0 --> 100
    P:0, C:4, T:1 --> 200
Core-aggregated:
    P:0, C:4 --> 300
Individual:
    P:0, C:5, T:0 --> 400
    P:0, C:5, T:1 --> 800
Core-aggregated:
    P:0, C:5 --> c00
Individual:
    P:0, C:6, T:0 --> 1z3
    P:0, C:6, T:1 --> 2z3
Core-aggregated:
    P:0, C:6 --> 3z3
Individual:
    P:0, C:7, T:0 --> 4z3
    P:0, C:7, T:1 --> 8z3
Core-aggregated:
    P:0, C:7 --> cz3
Pkg-aggregated:
    P:0 --> ffff

APIC ID listings from affinity masks
OS cpu  0, Affinity mask 000001 - apic id 0
OS cpu  1, Affinity mask 000002 - apic id 1
OS cpu  2, Affinity mask 000004 - apic id 2
OS cpu  3, Affinity mask 000008 - apic id 3
OS cpu  4, Affinity mask 000010 - apic id 4
OS cpu  5, Affinity mask 000020 - apic id 5
OS cpu  6, Affinity mask 000040 - apic id 6
OS cpu  7, Affinity mask 000080 - apic id 7
OS cpu  8, Affinity mask 000100 - apic id 8
OS cpu  9, Affinity mask 000200 - apic id 9
OS cpu 10, Affinity mask 000400 - apic id a
OS cpu 11, Affinity mask 000800 - apic id b
OS cpu 12, Affinity mask 001000 - apic id c
OS cpu 13, Affinity mask 002000 - apic id d
OS cpu 14, Affinity mask 004000 - apic id e
OS cpu 15, Affinity mask 008000 - apic id f

Package 0 Cache and Thread details

Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
       CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')

L1D is Level 1 Data cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 8
L1I is Level 1 Instruction cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 8
L2 is Level 2 Unified cache, size(KBytes)= 256, Cores/cache= 2, Caches/package= 8
L3 is Level 3 Unified cache, size(KBytes)= 25600, Cores/cache= 16, Caches/package= 1

      +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |
OScpu#|    0     1|    2     3|    4     5|    6     7|    8     9|   10    11|   12    13|   14    15|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|c4_t0 c4_t1|c5_t0 c5_t1|c6_t0 c6_t1|c7_t0 c7_t1|
AffMsk|    1     2|    4     8|   10    20|   40    80|  100   200|  400   800|  1z3   2z3|  4z3   8z3|
CmbMsk|    3      |    c      |   30      |   c0      |  300      |  c00      |  3z3      |  cz3      |
      +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+

Cache |  L1I      |  L1I      |  L1I      |  L1I      |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+

Cache |  L2       |  L2       |  L2       |  L2       |  L2       |  L2       |  L2       |  L2       |
Size  |  256K     |  256K     |  256K     |  256K     |  256K     |  256K     |  256K     |  256K     |
      +-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+

Cache |  L3                                                                                           |
Size  |  25M                                                                                          |
CmbMsk|  ffff                                                                                         |
      +-----------------------------------------------------------------------------------------------+
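The "extended hex" notation described in the box legend ('z#' stands for # trailing zero hex digits) can be expanded mechanically, which makes masks such as 3z3 or cz3 easier to compare with the OS cpu numbers. A small sketch follows; the helper names ext_hex_to_int and mask_to_cpus are hypothetical and not part of the utility.

```python
def ext_hex_to_int(text):
    """Expand extended hex: '8z5' -> 0x800000, plain 'c0' -> 0xc0."""
    digits, sep, zeros = text.strip().lower().partition("z")
    value = int(digits, 16)
    if sep:
        # Each trailing zero hex digit is a shift by 4 bits.
        value <<= 4 * int(zeros)
    return value


def mask_to_cpus(mask):
    """List the OS cpu numbers whose bit is set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    for ext in ("3", "c00", "3z3", "cz3", "ffff"):
        mask = ext_hex_to_int(ext)
        print(f"{ext:>4} = {mask:#06x} -> cpus {mask_to_cpus(mask)}")
```

For instance, the core-aggregated mask cz3 for C:7 expands to 0xc000, i.e. OS cpus 14 and 15, matching the OScpu# row of the L1D box.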
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel