Assuming this really happens at thread start of an HTTP/2 worker, the following change was made in revision 1874909. The stack trace indicates a 64-bit system.
Is someone making assumptions about connection->id content here? winnt mpm? Another module that freaks out? Or do I just not see the problem...

--- httpd/httpd/branches/2.4.x/modules/http2/h2_task.c 2020/03/06 16:14:06 1874908
+++ httpd/httpd/branches/2.4.x/modules/http2/h2_task.c 2020/03/06 16:15:17 1874909
@@ -555,37 +555,36 @@ apr_status_t h2_task_do(h2_task *task, a
     task->worker_started = 1;
 
     if (c->master) {
-        /* Each conn_rec->id is supposed to be unique at a point in time. Since
+        /* See the discussion at <https://github.com/icing/mod_h2/issues/195>
+         *
+         * Each conn_rec->id is supposed to be unique at a point in time. Since
          * some modules (and maybe external code) uses this id as an identifier
          * for the request_rec they handle, it needs to be unique for slave
          * connections also.
-         * The connection id is generated by the MPM and most MPMs use the formula
-         *    id := (child_num * max_threads) + thread_num
-         * which means that there is a maximum id of about
-         *    idmax := max_child_count * max_threads
-         * If we assume 2024 child processes with 2048 threads max, we get
-         *    idmax ~= 2024 * 2048 = 2 ** 22
-         * On 32 bit systems, we have not much space left, but on 64 bit systems
-         * (and higher?) we can use the upper 32 bits without fear of collision.
-         * 32 bits is just what we need, since a connection can only handle so
-         * many streams.
+         *
+         * The MPM module assigns the connection ids and mod_unique_id is using
+         * that one to generate identifier for requests. While the implementation
+         * works for HTTP/1.x, the parallel execution of several requests per
+         * connection will generate duplicate identifiers on load.
+         *
+         * The original implementation for slave connection identifiers used
+         * to shift the master connection id up and assign the stream id to the
+         * lower bits. This was cramped on 32 bit systems, but on 64bit there was
+         * enough space.
+         *
+         * As issue 195 showed, mod_unique_id only uses the lower 32 bit of the
+         * connection id, even on 64bit systems. Therefore collisions in request ids.
+         *
+         * The way master connection ids are generated, there is some space "at the
+         * top" of the lower 32 bits on allmost all systems. If you have a setup
+         * with 64k threads per child and 255 child processes, you live on the edge.
+         *
+         * The new implementation shifts 8 bits and XORs in the worker
+         * id. This will experience collisions with > 256 h2 workers and heavy
+         * load still. There seems to be no way to solve this in all possible
+         * configurations by mod_h2 alone.
          */
-        int slave_id, free_bits;
-
-        task->id = apr_psprintf(task->pool, "%ld-%d", c->master->id,
-                                task->stream_id);
-        if (sizeof(unsigned long) >= 8) {
-            free_bits = 32;
-            slave_id = task->stream_id;
-        }
-        else {
-            /* Assume we have a more limited number of threads/processes
-             * and h2 workers on a 32-bit system. Use the worker instead
-             * of the stream id. */
-            free_bits = 8;
-            slave_id = worker_id;
-        }
-        task->c->id = (c->master->id << free_bits)^slave_id;
+        task->c->id = (c->master->id << 8)^worker_id;
     }
 
     h2_beam_create(&task->output.beam, c->pool, task->stream_id, "output",

Stefan Eissing

<green/>bytes GmbH
Hafenweg 16
48155 Münster
www.greenbytes.de

> On 14.04.2020 at 14:12, Eric Covener <cove...@gmail.com> wrote:
>
> On Tue, Apr 14, 2020 at 8:09 AM Ruediger Pluem <rpl...@apache.org> wrote:
>>
>> On 4/14/20 12:22 PM, Steffen wrote:
>>>
>>> This is the post above of backtrace
>>
>> Thanks.
>>
>>>
>>> By accident I've seen that Perl comes with GDB. This might help as well.
>>> I called httpd.exe from GDB with "-X -e debug" and then called a Perl URL
>>> in the browser.
>>>
>>> Excerpt below:
>>>
>>
>> Somehow the below wasn't visible in the original mail.
>>
>>> Thread 100 received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 4936.0x23e0]
>>> 0x00007ffbe57515d9 in libhttpd!ap_get_server_built () from X:\Apps\Apache24\bin\libhttpd.dll
>>> (gdb) bt
>>> #0 0x00007ffbe57515d9 in libhttpd!ap_get_server_built () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #1 0x00007ffbe44d14aa in ?? () from X:\Apps\Apache24\modules\mod_cgi.so
>>> #2 0x00007ffbe575ee85 in libhttpd!ap_run_handler () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #3 0x00007ffbe575da7f in libhttpd!ap_invoke_handler () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #4 0x00007ffbe575a62a in libhttpd!ap_internal_redirect_handler () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #5 0x00007ffbe575a6af in libhttpd!ap_process_request () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #6 0x00007ffbe22888ef in ?? () from X:\Apps\Apache24\modules\mod_http2.so
>>> #7 0x00007ffbe5761545 in libhttpd!ap_run_process_connection () from X:\Apps\Apache24\bin\libhttpd.dll
>>> #8 0x00007ffbe22885ba in ?? () from X:\Apps\Apache24\modules\mod_http2.so
>>> #9 0x00007ffbe228c36e in ?? () from X:\Apps\Apache24\modules\mod_http2.so
>>> #10 0x00007ffbe9e30e72 in ucrtbase!_beginthreadex () from C:\Windows\System32\ucrtbase.dll
>>> #11 0x00007ffbea107bd4 in KERNEL32!BaseThreadInitThunk () from C:\Windows\System32\kernel32.dll
>>> #12 0x00007ffbebecced1 in ntdll!RtlUserThreadStart () from C:\Windows\SYSTEM32\ntdll.dll
>>> #13 0x0000000000000000 in ?? ()
>>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>>> (gdb)
>>>
>>
>> Unfortunately this stacktrace does not help. One reason might be that the
>> debugging symbols are missing.
>> It is very strange that it segfaults in ap_get_server_built, a simple
>> function just returning a pointer to a static string constant.
>> Furthermore ap_get_server_built is not called by mod_cgi.
>> Can the crash be repeated against a binary with debugging symbols that are
>> then used to generate the stacktrace?
>> As I am not a Windows guy, I unfortunately cannot provide any instructions
>> how to do this.
>
> My experience on windows is that if the PDB's are not 110% right you
> will get all kinds of misleading stuff above the first ?? in the
> displayed backtrace.
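
For reference, the id arithmetic described in the h2_task.c comment can be sketched in isolation. This is only a minimal illustration, not httpd code: the helper names old_slave_id, new_slave_id, low32 and demo are hypothetical, uint64_t stands in for the type of conn_rec->id, and low32 merely models "a consumer that keeps only the lower 32 bits", which is what the comment says mod_unique_id does.

```c
#include <assert.h>
#include <stdint.h>

/* Removed 64-bit scheme: master id shifted up 32 bits, stream id below.
 * Hypothetical helper; uint64_t is an assumption standing in for the
 * type of conn_rec->id. */
static uint64_t old_slave_id(uint64_t master_id, uint32_t stream_id)
{
    return (master_id << 32) ^ stream_id;
}

/* Scheme after r1874909: shift 8 bits and XOR in the worker id. */
static uint64_t new_slave_id(uint64_t master_id, uint32_t worker_id)
{
    return (master_id << 8) ^ worker_id;
}

/* Models a consumer that only looks at the lower 32 bits of the id. */
static uint32_t low32(uint64_t id)
{
    return (uint32_t)id;
}

static void demo(void)
{
    /* Old scheme: two different master connections, both on stream 1,
     * are indistinguishable once truncated to 32 bits. */
    assert(low32(old_slave_id(5, 1)) == low32(old_slave_id(6, 1)));

    /* New scheme: the same two masters stay distinct in the low 32 bits
     * as long as worker ids fit in 8 bits. */
    assert(low32(new_slave_id(5, 1)) != low32(new_slave_id(6, 1)));

    /* A worker id of 256 or more spills into the master id bits:
     * (5 << 8) ^ 256 == (4 << 8) ^ 0. */
    assert(new_slave_id(5, 256) == new_slave_id(4, 0));
}
```

Calling demo() from a small main() exercises all three cases. The last assertion is the "> 256 h2 workers" collision the comment warns about.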