> On Jul 20 16:16, David Allsopp wrote:
> > I've pushed a repro case for this to
> > https://github.com/dra27/cygwin-nanosleep-bug.git
> >
> > Originally noticed as the main CI system for OCaml has been failing
> > sporadically for the signal.ml test mentioned in that repo. This
> > morning I tried hammering that test on my dev machine and discovered
> > that it fails very frequently. No idea if that's drivers, Windows 10
> > updates, number of cores or what, but it was definitely happening, and
> > easily.
> >
> > Drilling further, it appears that NtQueryTimer is able to return a
> > negative value in the TimeRemaining field even when SignalState is
> > false. The values I've seen have always been < 15ms - i.e. less than
> > the timer resolution, so I wonder if there is a point at which the
> > timer has elapsed but has not been signalled, but WaitForMultipleObjects
> > returns because of the EINTR signal.
> > Mildly surprising that it seems to be so reproducible.
> >
> > Anyway, a patch is attached which simply guards a negative return
> > value. The test on tbi.SignalState is in theory unnecessary.
>
> Thanks for the patch, I think your patch is fine. However, I'd like to
> dig a bit into this to see what exactly happens. Do you have a very
> simple testcase in plain C, by any chance?

https://github.com/dra27/cygwin-nanosleep-bug/blob/main/signal.c is as
simple as I've got it at this stage (eliminating OCaml from the equation!).

It might be possible to get it to happen without all the pthreads stuff,
but I stopped simplifying once I'd confirmed it definitely wasn't OCaml and
had put the appropriate system_printf calls into cygwait to see that
NtQueryTimer really was returning this small negative value.

Does that repro case trigger on your system too?

Best,

D
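
For reference, the kind of guard I mean can be sketched roughly as below. This is not the Cygwin source or the attached patch: remaining_to_timespec and HNS_PER_SEC are names made up for illustration, and it assumes the remaining time is a signed count of 100ns units, which is how TimeRemaining appears to behave.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumption: remaining time is reported in 100ns units. */
#define HNS_PER_SEC 10000000LL

/* Hypothetical helper (not Cygwin's code): convert a remaining time in
   100ns units into a timespec, treating a negative value as "nothing
   left to sleep" so the caller never sees a negative remainder. */
static void
remaining_to_timespec (int64_t remaining_100ns, struct timespec *rem)
{
  assert (rem != NULL);

  /* The guard: a timer that has elapsed but is not yet signalled may
     report a small negative remaining time; clamp it to zero. */
  if (remaining_100ns < 0)
    remaining_100ns = 0;

  rem->tv_sec = (time_t) (remaining_100ns / HNS_PER_SEC);
  rem->tv_nsec = (long) (remaining_100ns % HNS_PER_SEC) * 100;
}
```

With a value such as -140000 (14ms past expiry) this yields a zero timespec rather than a negative one, which is all the guard is meant to achieve.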