Re: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-11 Thread Tony Luck
> What I understand from above in intel 64 Arch software Developer's manual are:
> 1) this manual is written for software developer;
> 2) It says that MCE handler only requires to synchronize among the logical
> cores in the same package/core(what I assume here is same CPU socket).
>
> I have two

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Ming Lei
y hardware standard or specification for it?

Ming

-----Original Message-----
From: Luck, Tony [mailto:tony.l...@intel.com]
Sent: Friday, May 10, 2013 3:42 PM
To: Ming Lei; linux-kernel@vger.kernel.org
Cc: mche...@redhat.com; b...@alien8.de
Subject: RE: x86_mce: mce_start uses number of phs

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> So only one socket gets the machine check. So is there still a problem but
> the fix will be different?
> I think the error inject creates a real machine check, but since each CPU has
> its own memory controller,
> the machine check may only send to the CPU the error happens. If there is a rea

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Ming Lei
-----Original Message-----
From: Luck, Tony [mailto:tony.l...@intel.com]
Sent: Friday, May 10, 2013 2:05 PM
To: Ming Lei; linux-kernel@vger.kernel.org
Cc: mche...@redhat.com; b...@alien8.de
Subject: RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

> I used intel edac er

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> I used intel edac error injector and saw the same problem. I actually wrote
> down the core numbers
> and I saw mce got to 0-5 and 12-17, but not the others. I have 2 sockets, 24
> logical cores.

Mauro: How does the EDAC injector work on E5645 (Westmere-EP)? Does it create a real error in m

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Ming Lei
nt: Friday, May 10, 2013 12:10 PM
To: Ming Lei; linux-kernel@vger.kernel.org
Cc: mche...@redhat.com; b...@alien8.de
Subject: RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

> With hyperthread turns on, the num_online_cpus reports the number of all
> logical cores

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> With hyperthread turns on, the num_online_cpus reports the number of all
> logical cores.
> What I found in testing is only half the cores receives the mce broadcast, so
> I assume only the physical cores get broadcast.

See Intel Software Developer Manual Volume 3B Section 15.10.4.1, 3rd bulle

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Ming Lei
Number E5645
# of Cores 6
# of Threads 12

Ming

-----Original Message-----
From: Luck, Tony [mailto:tony.l...@intel.com]
Sent: Friday, May 10, 2013 11:14 AM
To: Ming Lei; linux-kernel@vger.kernel.org
Cc: mche...@redhat.com; b...@alien8.de
Subject: RE: x86_mce: mce_start uses number of phsical

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> +#if NR_CPUS > 1
> +	cpus /= cpumask_weight(cpu_core_mask(0)) / cpu_data(0).booted_cores;
> +#endif

Not entirely sure what you are trying to do here (apart from making "cpus" be a smaller number). What is the reasoning behind the right hand side of this expression? Is this problem more rel
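
For readers following the arithmetic, the right-hand side of the quoted expression can be sketched in plain user-space C. This is only an illustration, not kernel code: `adjusted_cpus` and its parameter names are hypothetical, and it assumes that `cpumask_weight(cpu_core_mask(0))` counts the logical CPUs in CPU 0's package and that `cpu_data(0).booted_cores` counts the cores in that package.

```c
#include <assert.h>

/*
 * Illustrative stand-in for the adjustment in the quoted patch.
 * Hypothetical helper, not a kernel function.
 *
 * online_cpus:    what num_online_cpus() would report
 * package_weight: assumed value of cpumask_weight(cpu_core_mask(0))
 * booted_cores:   assumed value of cpu_data(0).booted_cores
 */
static int adjusted_cpus(int online_cpus, int package_weight, int booted_cores)
{
	int threads_per_core;

	assert(booted_cores > 0);
	/* Integer division, mirroring the right-hand side of the patch. */
	threads_per_core = package_weight / booted_cores;
	assert(threads_per_core > 0);

	return online_cpus / threads_per_core;
}
```

With the figures reported in this thread (2 sockets, 24 logical CPUs, an E5645 with 6 cores and 12 threads per package), `adjusted_cpus(24, 12, 6)` gives 24 / (12 / 6) = 12, which is the number of CPUs (0-5 and 12-17) reported as receiving the machine check. With hyperthreading off, the divisor would be 1 and the count would be unchanged.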