On 07/05/2013 12:08 PM, Mike Galbraith wrote:
[snip]
>>
>> Wow, I used to think such issue is very hard to be tracked by
>> benchmarks, is this regression stable?
>
> Yeah, seems to be.  I was curious as to why you saw an improvement to
> hackbench, didn't seem there should be any, so though I'd try it on my
> little box on the way to a long weekend.  The unexpected happened.

Oh, I think I failed to explain things clearly in the comments... It's
not the patch that brings the 15% benefit to hackbench, but the
wake-affine stuff itself.

In the previous test, I removed the whole thing and found that hackbench
dropped 15%, which means that with wake-affine enabled we gain a 15%
benefit (and that's actually the reason why we don't kill the stuff).

This idea tries not to harm that 15% benefit while regaining the
performance pgbench lost. Thus, applying this patch to mainline won't
improve hackbench performance, but it will improve pgbench performance.

But this regression is really unexpected... I can hardly believe it's
caused by a cache issue alone, since the number is not small (10% at
most?).

Have you tried using more loops and groups? Would that show even bigger
regressions?

BTW, are these the results of 10 groups and 40 sockets == 400 tasks?

Regards,
Michael Wang

>
>>> pahole said...
>>>
>>> marge:/usr/local/src/kernel/linux-3.x.git # tail virgin
>>>         long unsigned int          timer_slack_ns;          /*  1512     8 */
>>>         long unsigned int          default_timer_slack_ns;  /*  1520     8 */
>>>         atomic_t                   ptrace_bp_refcnt;        /*  1528     4 */
>>>
>>>         /* size: 1536, cachelines: 24, members: 125 */
>>>         /* sum members: 1509, holes: 6, sum holes: 23 */
>>>         /* bit holes: 1, sum bit holes: 26 bits */
>>>         /* padding: 4 */
>>>         /* paddings: 1, sum paddings: 4 */
>>> };
>>>
>>> marge:/usr/local/src/kernel/linux-3.x.git # tail michael
>>>         long unsigned int          default_timer_slack_ns;  /*  1552     8 */
>>>         atomic_t                   ptrace_bp_refcnt;        /*  1560     4 */
>>>
>>>         /* size: 1568, cachelines: 25, members: 128 */
>>>         /* sum members: 1533, holes: 8, sum holes: 31 */
>>>         /* bit holes: 1, sum bit holes: 26 bits */
>>>         /* padding: 4 */
>>>         /* paddings: 1, sum paddings: 4 */
>>>         /* last cacheline: 32 bytes */
>>> };
>>>
>>> ..but plugging holes, didn't help, moving this/that around neither, nor
>>> did letting pahole go wild to get the line back.
>>> It's plus signs I tell ya, the evil things must die ;-)
>>
>> Hmm...so the new members kicked some tail members to a new line...or may
>> be totally different when compiler take part in...
>>
>> It's really hard to estimate the influence, especially when the
>> task_struct is still keep changing...
>
> Yeah, could be memory layout crud that disappears with the next
> pull/build.  Wouldn't be the first time.
>
> -Mike
>
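
For anyone following the pahole numbers quoted above: the jump from 24
to 25 cachelines follows directly from the two struct sizes. Below is a
minimal Python sketch of that arithmetic. It assumes a 64-byte cacheline
(consistent with "size: 1536, cachelines: 24" in the output) and a
struct that starts on a cacheline boundary. The helper name
cacheline_footprint is made up for illustration and is not part of
pahole.

```python
# Assumption: 64-byte cachelines, as implied by 1536 bytes == 24 lines
# in the quoted pahole output.
CACHELINE_BYTES = 64


def cacheline_footprint(struct_size, line_bytes=CACHELINE_BYTES):
    """Return (lines_touched, bytes_used_in_last_line).

    Assumes the struct starts on a cacheline boundary.
    """
    full_lines, remainder = divmod(struct_size, line_bytes)
    if remainder == 0:
        # The struct ends exactly on a line boundary: last line is full.
        return full_lines, line_bytes
    # A partial tail spills into one extra line.
    return full_lines + 1, remainder


# Sizes taken from the "virgin" and "michael" pahole tails quoted above.
for label, size in (("virgin", 1536), ("michael", 1568)):
    lines, tail = cacheline_footprint(size)
    print(f"{label}: size {size} -> {lines} cachelines, "
          f"last line holds {tail} bytes")
```

With the patched size of 1568 bytes this gives 25 lines with 32 bytes in
the last one, matching the "last cacheline: 32 bytes" note. It says
nothing about which members are hot, so it cannot by itself explain the
hackbench regression.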