On Sun, Jan 19, 2014 at 4:59 PM, Tim Shen <timshe...@gmail.com> wrote:
> Tested and committed.

It's quite interesting that after this change in the patch:

    -  this->_M_quantifier();
    +  while (this->_M_quantifier());

...g++ -O3 generates slower code than -O2, even when the regex has
nothing to do with quantifiers at all (say, regex re(" ")):

    ~ # g++ -O2 perf.cc && time ./a.out
    ./a.out  0.46s user 0.00s system 99% cpu 0.461 total
    ~ # g++ -O3 perf.cc && time ./a.out
    ./a.out  0.56s user 0.00s system 99% cpu 0.569 total

perf.cc is almost the same as testsuite/performance/28_regex/split.cc.

Following the man page, I found that, according to g++, the differences
between -O3 and -O2 are:

    ~ # /usr/bin/g++ -c -Q -O3 --help=optimizers > /tmp/O3-opts
    ~ # /usr/bin/g++ -c -Q -O2 --help=optimizers > /tmp/O2-opts
    ~ # diff /tmp/O2-opts /tmp/O3-opts | grep enabled
    >   -fgcse-after-reload               [enabled]
    >   -finline-functions                [enabled]
    >   -fipa-cp-clone                    [enabled]
    >   -fpredictive-commoning            [enabled]
    >   -ftree-loop-distribute-patterns   [enabled]
    >   -ftree-loop-vectorize             [enabled]
    >   -ftree-partial-pre                [enabled]
    >   -ftree-slp-vectorize              [enabled]
    >   -funswitch-loops                  [enabled]

However, adding those flags to -O2 does not reproduce the slowdown; the
result is as fast as plain -O2:

    ~ # g++ -O2 perf.cc -fgcse-after-reload -finline-functions \
          -fipa-cp-clone -fpredictive-commoning \
          -ftree-loop-distribute-patterns -ftree-loop-vectorize \
          -ftree-partial-pre -ftree-slp-vectorize -funswitch-loops \
          && time ./a.out
    ./a.out  0.45s user 0.01s system 99% cpu 0.460 total

By the way, my "g++" is aliased to "g++ -g -Wall -std=c++11", and I'm
sure -g doesn't matter.

I don't know much about this. Are there any explanations as interesting
as the question? ;)

Thank you!

-- 
Regards,
Tim Shen
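
P.S. For anyone who wants to reproduce the timings without the testsuite at hand, here is a minimal sketch of the kind of benchmark involved. It is not the actual perf.cc, which is not included in this message; the names count_fields and time_split, the choice to recompile the pattern every round, and the use of std::chrono instead of the shell's time are all assumptions made for illustration.

```cpp
// Hypothetical stand-in for perf.cc; the real file is not shown in this thread.
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count the fields obtained by splitting `text` on every match of `re`.
std::size_t count_fields(const std::string& text, const std::regex& re)
{
  // Submatch index -1 selects the text between matches, i.e. a split.
  std::sregex_token_iterator first(text.begin(), text.end(), re, -1), last;
  return static_cast<std::size_t>(std::distance(first, last));
}

// Compile `pattern` and split `text`, `rounds` times over.
// Returns the elapsed wall-clock time in seconds; the total number of
// fields seen is stored in `fields_out` so the work cannot be optimized away.
double time_split(const std::string& pattern, const std::string& text,
                  int rounds, std::size_t& fields_out)
{
  auto start = std::chrono::steady_clock::now();
  std::size_t fields = 0;
  for (int i = 0; i < rounds; ++i)
    {
      std::regex re(pattern);  // assumption: pattern compilation is timed too
      fields += count_fields(text, re);
    }
  auto stop = std::chrono::steady_clock::now();
  fields_out = fields;
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_split(" ", some_long_string, some_round_count, n) from main and printing the result, once for a -O2 build and once for a -O3 build, should give numbers comparable in spirit to the ones above, though the absolute values will differ with the input and the machine.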