On 28/04/14 17:14 -0400, Tim Shen wrote:
On Mon, Apr 28, 2014 at 4:18 PM, Jonathan Wakely <[email protected]> wrote:
I thought I'd make a 5x speedup to the run-time of the regex matching,
but I was comparing the wrong version and the improvement actually
came from one of your patches yesterday - maybe this one:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01725.html
Nice work!
That's surprising. May I ask for the performance testcase?
See below. These are just microbenchmarks, so not very good or
reliable, but the <regex> code is large enough and complicated enough
that the results are fairly consistent and reproducible.
(I use __builtin_printf and __builtin_puts when I'm too lazy to
include a file, there's no other reason for that!)
I was using something like this to test basic_regex<char>:
#include <regex>
int main()
{
auto re = std::regex("[abc]{3}d*x?");
int count = 0;
for (int i = 0; i < 100000; ++i)
if (std::regex_match("bab", re)
&& std::regex_search("aacdddddddddddddddddddddddxaa", re))
++count;
__builtin_printf("%d\n", count);
}
This runs much faster on trunk than with 4.9.0, so we might want to
backport your recent patches to the gcc-4_9-branch.
I was using examples like this for testing _BracketMatcher with
basic_regex<wchar_t>
#include <regex>
int main()
{
auto re = std::wregex(L'[' + std::wstring(300, L'a') + L"bc"
+ std::wstring(1000, 'a') + L"d]");
int count = 0;
for (int i = 0; i < 100000; ++i)
if (std::regex_match(L"b", re) && std::regex_match(L"d", re))
++count;
__builtin_printf("%d\n", count);
}
This runs faster with my patch to use std::sort and std::unique on the
_M_char_set vector, because the regex compiles to be equivalent to
std::wregex(L"[abcd]")