Re: [Maria-developers] Next steps in improving single-threaded performance

2014-05-01 Thread Vangelis Katsikaros
Hi Kristian On 04/30/2014 02:11 PM, Kristian Nielsen wrote: Vangelis Katsikaros vkatsika...@yahoo.gr writes: What do you mean by the phrase [code|server] falls over? I am referring to a common phenomenon in high-concurrency benchmarks. See for example the graph in this email:

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-04-30 Thread Kristian Nielsen
Vangelis Katsikaros vkatsika...@yahoo.gr writes: What do you mean by the phrase [code|server] falls over? I am referring to a common phenomenon in high-concurrency benchmarks. See for example the graph in this email: https://lists.launchpad.net/maria-developers/msg06799.html

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-04-29 Thread Vangelis Katsikaros
Hi Kristian On 04/22/2014 04:11 PM, Kristian Nielsen wrote: As a side effect, if we improve the code performance of a single thread, we effectively increase the concurrency in the critical spots - threads spend less time executing the real code, hence more time in concurrency bottlenecks. The

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-04-22 Thread Kristian Nielsen
Kristian Nielsen kniel...@knielsen-hq.org writes: Axel Schwenke a...@askmonty.org writes: Benchmark 1 is good old sysbench OLTP. I tested 10.0.7 vs. 10.0.7-pgo. With low concurrency there is about 10% win by PGO; however this is completely reversed at higher concurrency by mutex contention

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-02-12 Thread Axel Schwenke
Kristian Nielsen wrote: I implemented a simple program to generate some profile load: https://github.com/knielsen/gen_profile_load I propose the following change to make it work with MariaDB releases before 10.0.4 (and MySQL) that lack the SHUTDOWN statement: --- gen_profile_load.c.orig

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-02-12 Thread Kristian Nielsen
Axel Schwenke a...@askmonty.org writes: I propose the following change to make it work with MariaDB releases before 10.0.4 (and MySQL) that lack the SHUTDOWN statement: Nice, thanks Axel! - Kristian.

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-01-28 Thread Kristian Nielsen
Sergey Vojtovich s...@mariadb.org writes: Still, a question mostly to educate myself. According to proc mysqld executable size is something like: VmExe: 12228 kB VmLib: 6272 kB I assume the above refers to overall instructions. Level 1 instruction cache size is like

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-01-28 Thread Axel Schwenke
Hi Kristian, Kristian Nielsen wrote: I have been analysing CPU bottlenecks in single-threaded sysbench read-only load. I found that icache misses are the main bottleneck, and that profile-guided compiler optimisation (PGO) with GCC gives a large speedup, 25% or more. (More details in my

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-01-28 Thread Kristian Nielsen
Axel Schwenke a...@askmonty.org writes: Wow. 25% is a lot. Have you also tried compiling MySQL 5.6 with PGO? No. Because if that gets the same improvement, we haven't won anything in the comparison. On the contrary, if the same works for MySQL 5.6 (and it seems likely it will), then we have

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-01-27 Thread Sergey Vojtovich
Hi Kristian, just out of curiosity: is it possible to find out which functions cause the highest number of icache misses? Can it have anything to do with branch misprediction? Regards, Sergey On Fri, Jan 24, 2014 at 03:51:25PM +0100, Kristian Nielsen wrote: I have been analysing CPU bottlenecks in

Re: [Maria-developers] Next steps in improving single-threaded performance

2014-01-27 Thread Kristian Nielsen
Sergey Vojtovich s...@mariadb.org writes: just out of curiosity: is it possible to find out which functions cause the highest number of icache misses? Yes, see the second post, the profiles marked "Icache misses (ICACHE.MISSES), before PGO" and "Icache misses (ICACHE.MISSES), after PGO". These are

[Maria-developers] Next steps in improving single-threaded performance

2014-01-24 Thread Kristian Nielsen
I have been analysing CPU bottlenecks in single-threaded sysbench read-only load. I found that icache misses are the main bottleneck, and that profile-guided compiler optimisation (PGO) with GCC gives a large speedup, 25% or more. (More details in my blog posts: