https://bugs.exim.org/show_bug.cgi?id=2707
Bug ID: 2707
Summary: pcre2-posix library provides regex symbols which clash with system regex if a program links to pcre2-posix indirectly
Product: PCRE
Version: 10.36 (PCRE2)
Hardware: x86
OS: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: philip.ha...@gmail.com
Reporter: ppi...@redhat.com
CC: pcre-dev@exim.org

I got a report that pacemaker, which links to the ncurses library, which in turn links to the pcre2-posix library, has problems with calling regex functions. I don't have a reproducer at hand, but here is how I understand it:

pacemaker does not use the pcre2-posix library at all. It includes <regex.h>, calls regexec(), uses the REG_NOMATCH constant and does not link to the pcre2-posix library. Thus the pacemaker program is compiled with REG_NOMATCH = 1 (see /usr/include/regex.h) and has an undefined regexec symbol which is expected to be resolved to the regex function of libc.

But pacemaker also uses and links to the ncurses library, which can optionally be compiled against pcre2-posix. Thus when the pacemaker program is loaded, the ncurses library and its dependencies are mapped and their symbols are available to the pacemaker process. One of those dependencies is pcre2-posix, which provides its own regexec symbol.

So pacemaker ends up with two regexec symbols (from libc and from pcre2-posix) in its symbol space and the dynamic linker must decide which one to use. If it binds pacemaker's reference to libc's regexec(), everything is fine. But if it binds the reference to pcre2-posix's regexec(), bad things happen. Namely, pcre2-posix's REG_NOMATCH = 17 (/usr/include/pcre2posix.h) does not match pacemaker's REG_NOMATCH = 1.

All this happens because pcre2-posix decided to keep the regex functions defined. From pcre2posix(3):

    Although they are not defined as prototypes in pcre2posix.h, the
    library does contain functions with the POSIX names regcomp() etc.
    These simply pass their arguments to the PCRE2 functions. These
    functions are provided for backwards compatibility with earlier
    versions of PCRE2, so that existing programs do not have to be
    recompiled.

and at the same time libc's and pcre2-posix's ABIs differ.

We already tackled a related problem in bug #1830, and a similar issue with PCRE1 was reported in bug #2654.

I must confess that I cannot reproduce this issue because my libc (glibc-2.33) versions the regex symbols:

    $ nm -D /usr/lib64/libc.so.6 | grep regexec
    000000000016b530 T regexec@GLIBC_2.2.5
    00000000000e4f10 T regexec@@GLIBC_2.3.4

So they differ from the pcre2-posix ones:

    $ nm -D /usr/lib64/libpcre2-posix.so.2 | grep regexec
    0000000000001590 T pcre2_regexec
    0000000000001750 T regexec

But in general, the standard library does not have to version its symbols, and then the problem can emerge.

I propose removing the POSIX regex functions from src/pcre2posix.c:

    #undef regexec
    PCRE2POSIX_EXP_DECL int regexec(const regex_t *, const char *, size_t,
      regmatch_t *, int);
    PCRE2POSIX_EXP_DEFN int PCRE2_CALL_CONVENTION
    regexec(const regex_t *preg, const char *string, size_t nmatch,
      regmatch_t pmatch[], int eflags)
    {
    return pcre2_regexec(preg, string, nmatch, pmatch, eflags);
    }

while retaining the redefinitions in src/pcre2posix.h:

    #define regexec pcre2_regexec

That would mean that programs could still build against PCRE2 by including <pcre2posix.h> without rewriting their regex function calls, but old programs which happened to include <regex.h> would stop working against PCRE2. I actually wonder whether any such programs exist and work against PCRE2, given the ABI differences.

If you agree, the only open question is whether we should bump the pcre2-posix SONAME or not. Technically the library would lose the symbols and change its ABI, but since the regex functions were never part of the (current) API, they could be perceived as a non-public internal implementation detail.
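The REG_NOMATCH mismatch described in this report can be illustrated with a minimal C sketch. It does not call libc or pcre2-posix; the two stand-in functions and all identifiers are hypothetical, and only the values 1 and 17 come from the headers cited in the report. It assumes the caller was compiled against <regex.h> and compares the return value against its own compiled-in constant.

```c
#include <assert.h>
#include <stdio.h>

/* Only the numbers are taken from the report; the names are hypothetical. */
enum {
    CALLER_REG_NOMATCH = 1,  /* REG_NOMATCH per /usr/include/regex.h      */
    PCRE2_REG_NOMATCH  = 17  /* REG_NOMATCH per /usr/include/pcre2posix.h */
};

/* Stand-in for libc's regexec() reporting "no match". */
static int fake_libc_regexec(void)
{
    return CALLER_REG_NOMATCH;
}

/* Stand-in for pcre2-posix's regexec() reporting "no match". */
static int fake_pcre2_regexec(void)
{
    return PCRE2_REG_NOMATCH;
}

/* The comparison baked into the caller at compile time.
   Returns 1 if the caller recognises rc as "no match", 0 otherwise. */
static int caller_sees_no_match(int rc)
{
    return rc == CALLER_REG_NOMATCH;
}

/* Prints how the caller interprets the result under each binding. */
static void demo(void)
{
    assert(CALLER_REG_NOMATCH != PCRE2_REG_NOMATCH);
    printf("bound to libc:        no-match recognised = %d\n",
           caller_sees_no_match(fake_libc_regexec()));
    printf("bound to pcre2-posix: no-match recognised = %d\n",
           caller_sees_no_match(fake_pcre2_regexec()));
}
```

Calling demo() from a main() shows the caller recognising "no match" only in the libc case. With the pcre2-posix stand-in, the value 17 falls through the caller's check and would be treated as some other error.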
-- ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev