Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check)
* I wrote: If there is interest in fixing this issue that is apparently limited to VS 2013, I will try and produce a proper patch. Patch. Perhaps surprisingly, the bug is in the failing test cases themselves, not ecpg. The CRT has two modes for its locale implementation: When called in a "global" mode thread, setlocale() affects all global mode threads; when called from a per-thread mode thread, it affects that thread only. [1] In the threaded test cases, many global mode threads call setlocale() simultaneously, getting in each other's way. The fix is to switch the worker threads to per-thread mode first. Without this patch, I see crashes in approximately 25 percent of runs (four crashes in 16 cycles of vcregress ecpgcheck, ~10 in 50 runs of thread.exe alone). With the patch, I have no crashes in 100 ecpgcheck runs. On the other hand, this affects only the buildfarm VM I mentioned earlier. I have another VM where it does not ever crash even without the patch -- such are the joys of multithreading. Both of them should have plenty of cores, both physical and virtual. There are some alternatives to fixing it this way, but I think this is the best approach: - Selecting per-thread mode in ecpglib takes the choice away from the developer who might want shared locale. - Adding locking around setlocale() is difficult because ecpglib already uses a wrapper around the CRT function, provided by libpgport. These test cases are the only consumers of the wrapper that have any concept of multithreading. Supporting it in libpgport for the sole benefit of threaded ECPG applications on Windows does not seem to be a good idea, and re-wrapping the wrapper in ecpglib is not only beyond my abilities to write, but is also going to be unmaintainable. - Adding locking around every setlocale() call in ecpglib is just ugly. While working on this, I also noticed that there seem to be two separate partial implementations of a pthread synchronization emulation for Win32. One is in ecpglib, uses mutexes and provides PTHREAD_MUTEX_INITIALIZER and pthread_once(), the other has the header in src/port and the implementation in libpq, uses critical sections and does not cover either feature. Should the two be merged at some point? [1] https://msdn.microsoft.com/en-us/library/ms235302(v=vs.120).aspx -- Christian From 5dee698f4cef684a320ced59b19cd4fea61319fb Mon Sep 17 00:00:00 2001 From: Christian Ullrich Date: Wed, 14 Jun 2017 22:18:18 +0200 Subject: [PATCH] Make setlocale() aware of multithreading to avoid crash. --- src/interfaces/ecpg/test/expected/thread-alloc.c | 39 +++-- .../ecpg/test/expected/thread-descriptor.c | 19 +++--- src/interfaces/ecpg/test/expected/thread-prep.c| 67 -- src/interfaces/ecpg/test/expected/thread-thread.c | 60 ++- .../ecpg/test/expected/thread-thread_implicit.c| 60 ++- src/interfaces/ecpg/test/thread/alloc.pgc | 5 ++ src/interfaces/ecpg/test/thread/descriptor.pgc | 5 ++ src/interfaces/ecpg/test/thread/prep.pgc | 5 ++ src/interfaces/ecpg/test/thread/thread.pgc | 6 ++ .../ecpg/test/thread/thread_implicit.pgc | 6 ++ 10 files changed, 163 insertions(+), 109 deletions(-) diff --git a/src/interfaces/ecpg/test/expected/thread-alloc.c b/src/interfaces/ecpg/test/expected/thread-alloc.c index 49f1cd1..1312580 100644 --- a/src/interfaces/ecpg/test/expected/thread-alloc.c +++ b/src/interfaces/ecpg/test/expected/thread-alloc.c @@ -22,6 +22,7 @@ main(void) #define WIN32_LEAN_AND_MEAN #include #include +#include #else #include #endif @@ -99,7 +100,7 @@ struct sqlca_t *ECPGget_sqlca(void); #endif -#line 24 "alloc.pgc" +#line 25 "alloc.pgc" #line 1 "regression.h" @@ -109,14 +110,14 @@ struct sqlca_t *ECPGget_sqlca(void); -#line 25 "alloc.pgc" +#line 26 "alloc.pgc" /* exec sql whenever sqlerror sqlprint ; */ -#line 27 "alloc.pgc" +#line 28 "alloc.pgc" /* exec sql whenever not found sqlprint ; */ -#line 28 "alloc.pgc" +#line 29 "alloc.pgc" #ifdef WIN32 @@ -127,59 +128,63 @@ static void* fn(void* arg) { int i; +#ifdef WIN32 + _configthreadlocale(_ENABLE_PER_THREAD_LOCALE); +#endif + /* exec sql begin declare section */ -#line 39 "alloc.pgc" +#line 44 "alloc.pgc" int value ; -#line 40 "alloc.pgc" +#line 45 "alloc.pgc" char name [ 100 ] ; -#line 41 "alloc.pgc" +#line 46 "alloc.pgc" char ** r = NULL ; /* exec sql end declare section */ -#line 42 "alloc.pgc" +#line 47 "alloc.pgc" value = (long)arg; sprintf(name, "Connection: %d", value); { ECPGconnect(__LINE__, 0, "ecpg1_regression" , NULL, NULL , name, 0); -#line 47 "alloc.pgc" +#line 52 "alloc.pgc" if (sqlca.sqlcode < 0) sqlprint();} -#line 47 "alloc.pgc" +#line 52 "alloc.pgc" { ECPGsetcommit(__LINE__, "on", NULL); -#line 48 "al
Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check)
* Andrew Dunstan wrote: On 06/11/2017 11:33 AM, Christian Ullrich wrote: To build correctly, it requires defining _WIN32_WINNT to be 0x600 or above (and using an SDK that knows about InitOnceExecuteOnce()). We already define _WIN32_WINNT to be 0x0600 on all appropriate platforms (Vista/2008 and above), so I think you could probably just check for that value. Not quite; the definition depends on the build toolset, not the build platform. When building with VS 2015 and above, _WIN32_WINNT is set to 0x600 (Vista/2008), while with older compilers, the value is 0x501 (XP/2003). This is also due to locale issues, but of a different kind, and is apparently coincidental. The build platform does not figure into the target platform, which is clearly a good idea for binary distribution reasons. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check)
On 06/11/2017 11:33 AM, Christian Ullrich wrote: > Hello, > > my buildfarm animal woodlouse (Visual Studio 2013 on Windows 7) > stopped working correctly some months ago. After Tom kindly prodded me > into fixing it, I noticed that I had configured it to skip the > ecpg-check step because one of the tests in the "thread" section (not > always the same) nearly always failed. > > I had set it up in circa 2015, so I reenabled the step to see whether > anything had changed since, and it started failing again. > > Through some trial and error, and with the help of Microsoft's > Application Verifier tool, I found what I think is the cause: It looks > like the VS 2013 CRT's setlocale() function is not entirely > thread-safe; the heap debugging options make it crash when > manipulating per-thread locale state, and according to the comments > surrounding that spot in the CRT source, the developers were aware the > code is fragile. > > I crudely hacked a critical section around the setlocale() call in > pgwin32_setlocale(). After this change, the test does not crash, while > without it, it crashes every time. > > If there is interest in fixing this issue that is apparently limited > to VS 2013, I will try and produce a proper patch. I notice, however, > that there is a pthread compatibility layer available that I have no > experience with at all, and I assume using it is preferred over > straight Win32? > > My hack is attached; please feel free to tell me I'm on the wrong track. > To build correctly, it requires defining _WIN32_WINNT to be 0x600 or > above (and using an SDK that knows about InitOnceExecuteOnce()). > > It's certainly worth doing. I turned off testing ecpg ages ago on bowerbird because I was getting errors. That's an older version of the toolset. We already define _WIN32_WINNT to be 0x0600 on all appropriate platforms (Vista/2008 and above), so I think you could probably just check for that value. I have no opinion on the pthread question. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Buildfarm failures on woodlouse (in ecpg-check)
Hello, my buildfarm animal woodlouse (Visual Studio 2013 on Windows 7) stopped working correctly some months ago. After Tom kindly prodded me into fixing it, I noticed that I had configured it to skip the ecpg-check step because one of the tests in the "thread" section (not always the same) nearly always failed. I had set it up in circa 2015, so I reenabled the step to see whether anything had changed since, and it started failing again. Through some trial and error, and with the help of Microsoft's Application Verifier tool, I found what I think is the cause: It looks like the VS 2013 CRT's setlocale() function is not entirely thread-safe; the heap debugging options make it crash when manipulating per-thread locale state, and according to the comments surrounding that spot in the CRT source, the developers were aware the code is fragile. I crudely hacked a critical section around the setlocale() call in pgwin32_setlocale(). After this change, the test does not crash, while without it, it crashes every time. If there is interest in fixing this issue that is apparently limited to VS 2013, I will try and produce a proper patch. I notice, however, that there is a pthread compatibility layer available that I have no experience with at all, and I assume using it is preferred over straight Win32? My hack is attached; please feel free to tell me I'm on the wrong track. To build correctly, it requires defining _WIN32_WINNT to be 0x600 or above (and using an SDK that knows about InitOnceExecuteOnce()). -- Christian diff --git a/src/port/win32setlocale.c b/src/port/win32setlocale.c index cbf109836b..278d836b4d 100644 --- a/src/port/win32setlocale.c +++ b/src/port/win32setlocale.c @@ -164,19 +164,33 @@ map_locale(const struct locale_map * map, const char *locale) return locale; } +static CRITICAL_SECTION setlocale_cs; +static INIT_ONCE init_once; +static BOOL CALLBACK init_setlocale_cs(PINIT_ONCE pInitOnce, PVOID pParam, PVOID pCtx) +{ + InitializeCriticalSection((PCRITICAL_SECTION)pParam); + return TRUE; +} + char * pgwin32_setlocale(int category, const char *locale) { const char *argument; char *result; + if (InitOnceExecuteOnce(&init_once, init_setlocale_cs, (PVOID)&setlocale_cs, NULL) == 0) { + abort(); + } + if (locale == NULL) argument = NULL; else argument = map_locale(locale_map_argument, locale); /* Call the real setlocale() function */ + EnterCriticalSection(&setlocale_cs); result = setlocale(category, argument); + LeaveCriticalSection(&setlocale_cs); /* * setlocale() is specified to return a "char *" that the caller is -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers