* Andy Lutomirski <[email protected]> wrote:
> On Mon, Dec 7, 2015 at 1:51 PM, Andy Lutomirski <[email protected]> wrote:
>
> > This is kind of like the 32-bit and compat code, except that I preserved
> > the fast path this time. I was unable to measure any significant
> > performance change on my laptop in the fast path.
> >
> > What do you all think?
>
> For completeness, if I zap the fast path entirely (see attached), I lose 20
> cycles (148 cycles vs 128 cycles) on Skylake. Switching between movq and
> pushq for stack setup makes no difference whatsoever, interestingly. I
> haven't tried to figure out exactly where those 20 cycles go.

So I asked for this before, and I'll do so again: could you please stick the
cycle-granular system call performance test into a 'perf bench' variant so
that:

 1) More people can run it on various pieces of hardware and help quantify
    the patches.

 2) We can keep an eye on not regressing base system call performance in the
    future, with a good in-tree testcase.
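
To illustrate the kind of measurement meant here, below is a minimal
user-space sketch. It is not the test referred to above and not an existing
'perf bench' subcommand: the names now_ns() and bench_syscall_ns() are
hypothetical, getpid is only an assumed stand-in for a cheap system call, and
it reports average nanoseconds per call rather than cycles.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: average wall-clock cost, in nanoseconds, of one
 * getpid system call over 'iters' back-to-back invocations.
 *
 * syscall(SYS_getpid) is used instead of getpid() so that each iteration
 * really enters the kernel rather than possibly returning a value cached
 * by the C library.
 */
static double bench_syscall_ns(unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	if (iters == 0)
		return 0.0;

	start = now_ns();
	for (i = 0; i < iters; i++)
		syscall(SYS_getpid);
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

Calling bench_syscall_ns() with a few million iterations and printing the
result gives a rough per-call figure. The loop overhead and the two clock
reads are included in the average, so differences of a handful of cycles
would need a cycle counter and a pinned, otherwise idle CPU to show up
reliably.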
Thanks!!
Ingo