http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53630

             Bug #: 53630
           Summary: C++11 regex compiler produces SIGSEGV
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: bisq...@iki.fi

This simple code produces a segmentation fault. Tested in GCC 4.7.0, GCC 4.6.3,
and Clang 3.1 (where the latter uses libstdc++ from GCC 4.7).

    #include <regex>

    int main()
    {
        std::regex r("(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))",
                     std::regex::extended);
        return 0;
    }

Omitting the std::regex::extended option does not make a difference.

Replacing every "|)" with ")" lets the regex be constructed without crashing,
but it is then obviously a completely different expression.

As of now, libstdc++ does not yet support the '?' operator, so the expression
cannot be rewritten as "(go )?((n(orth)?)|(s(outh)?)|(w(est)?)|(e(ast)?))".
There is also no non-capturing grouping operator, so writing e.g. (n(?:orth|))
is not an option.

A minimal regexp that reproduces the crash is: "((a(b|))|x)". Simple
reorderings such as "((a(|b))|x)" or "(x|(a(|b)))" do not make a difference.

GDB backtrace below:

(gdb) bt
#0  0x00007ffff732cdbd in malloc_consolidate (av=0x7ffff7639e60) at malloc.c:5169
#1  0x00007ffff732f2a4 in _int_malloc (av=0x7ffff7639e60, bytes=1280) at malloc.c:4373
#2  0x00007ffff7331960 in *__GI___libc_malloc (bytes=1280) at malloc.c:3660
#3  0x00007ffff7b39e6d in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x0000000000404ce4 in __gnu_cxx::new_allocator<std::__regex::_State>::allocate (this=0x7fffffffe980, __n=16)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/ext/new_allocator.h:94
#5  0x00000000004046b0 in std::_Vector_base<std::__regex::_State, std::allocator<std::__regex::_State> >::_M_allocate (this=0x7fffffffe980, __n=16)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:169
#6  0x0000000000404338 in _ZNSt6vectorINSt7__regex6_StateESaIS1_EE19_M_emplace_back_auxIJS1_EEEvDpOT_ (this=0x7fffffffe980, __args=0x7fffffffe010)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:402
#7  0x0000000000404270 in _ZNSt6vectorINSt7__regex6_StateESaIS1_EE12emplace_backIJS1_EEEvDpOT_ (this=0x7fffffffe980, __args=0x7fffffffe010)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:102
#8  0x0000000000403690 in std::vector<std::__regex::_State, std::allocator<std::__regex::_State> >::push_back(std::__regex::_State&&) (this=0x7fffffffe980, __x=0x7fffffffe010)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:900
#9  0x0000000000403052 in std::__regex::_Nfa::_M_insert_subexpr_begin(std::function<void (std::__regex::_PatternCursor const&, std::__regex::_Results&)> const&) (this=0x7fffffffe978, __t=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_nfa.h:312
#10 0x000000000040848f in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_atom (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:943
#11 0x0000000000407b98 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_term (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793
#12 0x0000000000405be9 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_alternative (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771
#13 0x0000000000403119 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_disjunction (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:756
#14 0x00000000004084d5 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_atom (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:945
#15 0x0000000000407b98 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_term (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793
#16 0x0000000000405be9 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_alternative (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771
#17 0x0000000000405c2f in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_alternative (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:774
#18 0x0000000000403119 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_M_disjunction (this=0x7fffffffe928)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:756
#19 0x0000000000402cae in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_Compiler (this=0x7fffffffe928,
    __b=@0x7fffffffea70: 0x40cb10 "(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", __e=@0x7fffffffea50: 0x40cb41 "", __traits=..., __flags=64)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:727
#20 0x0000000000401f25 in std::__regex::_Compiler<char const*, std::regex_traits<char> >::_Compiler (this=0x7fffffffe928,
    __b=@0x7fffffffea70: 0x40cb10 "(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", __e=@0x7fffffffea50: 0x40cb41 "", __traits=..., __flags=64)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:735
#21 0x0000000000401d80 in std::__regex::__compile<char const*, std::regex_traits<char> > (
    __b=@0x7fffffffea70: 0x40cb10 "(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", __e=@0x7fffffffea50: 0x40cb41 "", __t=..., __f=64)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:1102
#22 0x0000000000401cc1 in std::basic_regex<char, std::regex_traits<char> >::basic_regex (this=0x7fffffffeac8,
    __p=0x40cb10 "(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", __f=64)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex.h:405
#23 0x0000000000401a23 in std::basic_regex<char, std::regex_traits<char> >::basic_regex (this=0x7fffffffeac8,
    __p=0x40cb10 "(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", __f=64)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex.h:407
#24 0x0000000000401915 in main () at test.cc:5

I also tested the same expression with boost::regex, which does not crash.

This bug may be related to PR52719.