Sorry for my delayed response. I was away on vacation.

What platform is this? What do you mean by crashing? Do you see a system freeze or an oops?

thanks,
suresh

On Mon, Jan 22, 2007 at 01:42:48PM +0530, Srinivasa Ds wrote:
> I saw cpuhotplug operations on 32-bit mode of xeon-64bit processors
> crashing the system. This happens on latest 2.6.20-rc5 kernel also. Same
> (i386 cpuhotplug code) runs fine on xeon-32bit processors.
> Steps to reproduce.
> ====================
> echo 0 > /sys/devices/system/cpu/cpu6/online
> echo 1 > /sys/devices/system/cpu/cpu6/online
> ================================
> dmesg shows.
> ==============
> Breaking affinity for irq 4
> cpu_mask_to_apicid: Not a valid mask!
> CPU 6 is now offline
> =======================
>
> On debugging the problem, I found that problem is not in cpuhotplug code
> but in apic part. Execution of "stale" IPI's by onlined cpus(which we
> offlined earlier) is causing the crash. Now we need to debug,why IPI's
> are reaching the offlined cpu's too.
>
> 1) During the calculation of apicid's, if cpu to which IPI has to
> deliver is not in
> same apic cluster,it prints "Not a valid mask" error and returns "0xFF"
> which means broadcast the IPI's to all cpus(which are offlined too) and
> hence the problem.
>
> 2) I booted the system with maxcpus=2 boot parameter, and tried cpu
> hotplugging on it.
> but still problem recreates(I think there is no concept of apic clusters
> if there are only 2 cpus). Hence it makes me to conclude that problem is
> in delivery of IPI's.
>
> So Iam completely stuck here. Iam not able to move forward in debugging.
> So could someone(may be intel folks) please throw some light on this.
>
> Thanks in advance
> Srinivasa DS
> LTC-IBM
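
For anyone trying to reproduce this, the two echo commands from the quoted steps could be wrapped in a small helper along these lines. This is only a sketch: the function name cycle_cpu and the SYSFS_CPU_ROOT override are illustrative assumptions, not something from the original report. Only the sysfs path itself comes from the report.

```shell
#!/bin/sh
# Sketch: offline and then re-online one CPU via sysfs, as in the quoted steps.
# SYSFS_CPU_ROOT is a hypothetical override so the helper can be tried against
# a scratch directory; by default it points at the path used in the report.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

cycle_cpu() {
    cpu="$1"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (not root, or CPU has no online control)" >&2
        return 1
    fi
    echo 0 > "$ctl" || return 1   # take the CPU offline
    echo 1 > "$ctl" || return 1   # bring it back online
    cat "$ctl"                    # print the final state of the control file
}
```

Run as root, "cycle_cpu 6" performs the same offline/online cycle as the quoted steps; checking dmesg afterwards should show whether the "Not a valid mask!" message appears.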