On 01/24/2013 03:47 PM, Mike Galbraith wrote:
> On Thu, 2013-01-24 at 15:15 +0800, Michael Wang wrote:
>> On 01/24/2013 02:51 PM, Mike Galbraith wrote:
>>> On Thu, 2013-01-24 at 14:01 +0800, Michael Wang wrote:
>>>
>>>> I've enabled WAKE flag on my box like you did, but still can't see
>>>> regression, and I've just tested on a power server with 64 cpu, also
>>>> failed to reproduce the issue (not compared with virgin yet, but can't
>>>> see collapse).
>>>
>>> I'm not surprised. I'm seeing enough inconsistent crap to come to the
>>> conclusion that stock scheduler knobs flat can't be used on a largish
>>> box, they're just too preempt-happy, leading to weird crap.
>>>
>>> My 2 missing nodes came back, and the very same kernel that highly
>>> repeatably collapsed with 2 nodes does not with 4 nodes, and 2 nodes
>>> does not collapse with only preemption knob tweaking, and that's
>>> bullshit. Virgin shows instability in the mid-range, make a tiny tweak
>>> that should have little if any effect there, and that instability
>>> vanishes entirely. Test runs are not consistent enough boot to boot etc
>>> etc. Either stock knobs suck on NUMA boxen, or this box is possessed.
>>
>> Mike, I wonder the reason why change back to the old way make collapse
>> away may not because there are logical error in new balance path, it's
>> just changed the cost of select_task_rq(), whatever it's more or less,
>> it's accidentally achieve the same effect as you tweak the knob, so
>> that's the reason why it looks like old is better than new.
>
> That's what I'm saying, it's a useless crap side-effect of a preempt
> happy kernel. Results with these knobs are just not stable. Results go
> wildly unstable with 2 nodes vs 4 in this box, but can be stabilized in
> all with preemption knob adjustment.. or phase of moon might make them
> appear stable.. or not.

Yeah, it's time to stop blaming the patch now; it's not the real killer
on your box.

Well, at least the torture was worth it: we found several points I
missed, we are more familiar with the balance path, and we found some
places where we could do better. All of this is thanks to your kind
help; it's nice to work with you ;-)

Now it's time to work on v3, I think. Let's see what we get this time.

Regards,
Michael Wang

>
> -Mike
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>