Re: pgbench stopped supporting large number of client connections on Windows

Fabien COELHO Fri, 06 Nov 2020 14:03:17 -0800


Hello Marina,

While trying to test a patch that adds a synchronization barrier in pgbench[1] on Windows,

Thanks for trying that. I do not have a Windows setup for testing, and the sync code I wrote for Windows is basically blind coding :-(

I found that since the commit "Use ppoll(2), if available, to wait for input in pgbench." [2] I cannot use a large number of client connections in pgbench on my Windows virtual machines (Windows Server 2008 R2 and Windows 2019), for example:
bin\pgbench.exe -c 90 -S -T 3 postgres
starting vacuum...end.


ISTM that 1 thread with 90 clients is a bad idea, see below.

Almost the same thing happens with reindexdb and vacuumdb (build on commit [3]):

The Windows fd implementation is somehow buggy because it does not return the smallest number available. Combined with the assumption that select uses a dense array indexed by fd (true on Linux, less so on Windows, which probably uses a sparse array), the number gets over the limit even if fewer are actually in use, hence the catch, as you noted.

Another point is that Windows has a hardcoded number of objects one thread can really wait for, typically 64, so that waiting for more requires actually forking threads to do the waiting. But if you are ready to fork threads just to wait, then probably you could have started pgbench with more threads in the first place. Now it would probably not make the problem go away, because fd numbers would be per process, not per thread, but it really suggests that one should not load a thread with more than 64 clients.
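To make the 64-clients-per-thread arithmetic concrete, here is a minimal sketch of such a check. The helper names and the MAX_CLIENTS_PER_THREAD constant are hypothetical, not pgbench code, and the sketch assumes clients are spread evenly over threads:

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumption: the per-thread wait limit discussed above. */
#define MAX_CLIENTS_PER_THREAD 64

/* Largest number of clients one thread gets when clients are spread evenly. */
int
max_clients_per_thread(int nclients, int nthreads)
{
    return (nclients + nthreads - 1) / nthreads;
}

/* Smallest thread count that keeps every thread at or below the limit. */
int
min_threads_for_clients(int nclients)
{
    return (nclients + MAX_CLIENTS_PER_THREAD - 1) / MAX_CLIENTS_PER_THREAD;
}

/* Hypothetical check: warn and return false if some thread is overloaded. */
bool
check_client_load(int nclients, int nthreads)
{
    int per_thread;

    if (nclients <= 0 || nthreads <= 0)
        return false;

    per_thread = max_clients_per_thread(nclients, nthreads);
    if (per_thread > MAX_CLIENTS_PER_THREAD)
    {
        fprintf(stderr,
                "warning: up to %d clients per thread (limit %d), "
                "use at least %d threads\n",
                per_thread, MAX_CLIENTS_PER_THREAD,
                min_threads_for_clients(nclients));
        return false;
    }
    return true;
}
```

With these definitions, 90 clients on 1 thread fail the check while 90 clients on 2 threads pass it.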

IIUC the checks below are not correct on Windows, since on this system sockets can have values equal to or greater than FD_SETSIZE (see Windows documentation [4] and pgbench debug output in attached pgbench_debug.txt).


Okay.

But then, how may one detect that there are too many fds in the set?

I think that an earlier version of the code needed to make assumptions about the internal implementation of Windows (there is a counter somewhere in the Windows fd_set struct), which was rejected because it was breaking the interface. Now your patch is basically resurrecting that. Why not, if there is no other solution, but this is quite depressing, and because it breaks the interface it would be broken if Windows changed its internals for some reason :-(
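One possible way to detect an overfull set without reading fd_set internals is to keep a separate count next to the set. The sketch below is only an illustration under that assumption: counted_fd_set, sock_t and both functions are hypothetical names, not what pgbench or the attached patch does, and the Windows branch is untested.

```c
#include <stdbool.h>
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#else
#include <sys/select.h>
typedef int sock_t;
#endif

/* Hypothetical wrapper: an fd_set plus our own count of sockets added. */
typedef struct
{
    fd_set fds;
    int    count;
} counted_fd_set;

void
counted_fd_set_clear(counted_fd_set *set)
{
    FD_ZERO(&set->fds);
    set->count = 0;
}

/*
 * Add a socket, returning false instead of overflowing the set.
 * The caller must not add the same socket twice, or the count is off.
 */
bool
counted_fd_set_add(counted_fd_set *set, sock_t fd)
{
#ifdef _WIN32
    /* Winsock fd_set is a list of sockets: the limit is on how many. */
    if (set->count >= FD_SETSIZE)
        return false;
#else
    /* POSIX fd_set is indexed by descriptor: the limit is on the value. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return false;
#endif
    FD_SET(fd, &set->fds);
    set->count++;
    return true;
}
```

The count lives outside fd_set, so the check does not depend on how a given platform lays out the structure.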

Doesn't Windows have "ppoll"? Should we implement the stuff above Windows polling capabilities and coldly skip its failed POSIX portability attempts? This raises again the issue that you should not have more than 64 clients per thread anyway, because it is an intrinsic limit on Windows.
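For reference, a poll-style wait takes an array of descriptors rather than a set indexed by descriptor value, so FD_SETSIZE does not come into play. Below is a minimal POSIX sketch; wait_for_input is a hypothetical helper, not pgbench's code. Windows has no ppoll, but Winsock provides WSAPoll with a similar array-based interface; whether it would suit pgbench is not something this sketch checks.

```c
#include <poll.h>
#include <stdlib.h>

/*
 * Hypothetical helper: wait until at least one of the nsocks descriptors
 * has input (or an error/hangup condition), or timeout_ms elapses.
 * Returns the number of descriptors with events, 0 on timeout, -1 on error.
 */
int
wait_for_input(const int *socks, int nsocks, int timeout_ms)
{
    struct pollfd *pfds;
    int            i;
    int            rc;

    if (nsocks < 0)
        return -1;

    /* Allocate at least one entry so that malloc(0) is never requested. */
    pfds = malloc(sizeof(struct pollfd) * (nsocks > 0 ? nsocks : 1));
    if (pfds == NULL)
        return -1;

    for (i = 0; i < nsocks; i++)
    {
        pfds[i].fd = socks[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    rc = poll(pfds, (nfds_t) nsocks, timeout_ms);
    free(pfds);
    return rc;
}
```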

I think that at one point it was suggested to error or warn if nclients/nthreads is too great, but that was not kept in the end.

I tried to fix this, see attached fix_max_client_conn_on_Windows.patch (based on commit [3]). I checked it for reindexdb and vacuumdb, and it works for simple databases (1025 jobs are not allowed and 1024 jobs is ok). Unfortunately, pgbench was getting connection errors when it tried to use 1000 jobs on my virtual machines, although there were no errors for fewer jobs (500) and the same number of clients (1000)...

It seems that the max number of threads you can start depends on available memory, because each thread is given its own stack, so it would depend on your VM settings?

Any suggestions are welcome!


Use ppoll, and start more threads but not too many?

--
Fabien.
