Milan, I satisfied myself by running the final version of the test some 8k times and then pushed the change. Thanks for your patience and persistence. I saw your question on the net-dev and nio-dev mailing lists. Thanks.
-Pavel > On 24 Sep 2019, at 13:41, Milan Mimica <milan.mim...@gmail.com> wrote: > > Pavel, > > Deal. Handling early returns too: > http://cr.openjdk.java.net/~mmimica/8228580/webrev.05/ > I will ask there about socket timeout semantics. > > On Tue, 24 Sep 2019 at 12:51, Pavel Rappo <pavel.ra...@oracle.com> wrote: >> >> Milan, >> >> Thanks for looking into this. I think you should ask a question on the >> expected timing semantics and guarantees on net-dev (with maybe a cc to >> nio-dev). >> As for our test. I agree with you that we should simply work a possibility >> of early returns into the check. >> >> ... >> >> /* The acceptable variation of early returns from timed socket operations. */ >> private static final long PREMATURE_RETURN = adjustTimeout(100); >> >> ... >> >> long elapsed = NANOSECONDS.toMillis(System.nanoTime() - startNanos); >> if (elapsed < timeout - PREMATURE_RETURN || elapsed > timeout + TOLERANCE) { >> String msg = String.format( >> "elapsed=%s, timeout=%s, TOLERANCE=%s, PREMATURE_RETURN=%s", >> timeout, timeout, TOLERANCE, PREMATURE_RETURN); >> throw new RuntimeException(msg); >> } >> >> ... >> >> Thoughts? >> >> -Pavel >> >>> On 24 Sep 2019, at 09:12, Milan Mimica <milan.mim...@gmail.com> wrote: >>> >>> Hi Pavel >>> >>> Wow, I find this awesome. I don't have a Windows machine to play with, >>> but I think I may have found something. >>> The difference is how Java_sun_nio_ch_Net_poll is implemented. On unix >>> it uses poll(2), on Windows it uses select(2). Regarding timeouts, >>> poll() has "wait at least" semantics and overruns by design[1], while >>> select() on windows has "waits at most" semantics, or how they put >>> it[2]: "specifies the maximum time that select should wait before >>> returning.". It returns early by design! Is this a known thing? I >>> don't think there is much one can do here. It probably makes no sense >>> to loop it and wait for time remainder. >>> Java's soTimeout does not specify[3] should it wait at least or at >>> most the specified timeout, so it's fine I guess. Old, "plain" socket >>> impl are not much different. >>> >>> If the above is correct, should I just add a tolerance for the lower bound? >>> >>> [1] http://man7.org/linux/man-pages/man2/poll.2.html >>> [2] >>> https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select >>> [3] >>> https://docs.oracle.com/javase/9/docs/api/java/net/Socket.html#setSoTimeout-int- >>> >>> On Mon, 23 Sep 2019 at 16:15, Pavel Rappo <pavel.ra...@oracle.com> wrote: >>>> >>>> Milan, >>>> >>>> I'm observing the latest version (.04) of this test failing quite >>>> frequently (4/100) on Windows (Windows Server 2012 R2 6.3 (amd64)) >>>> machines. The test passes fine on macOS and Linux. Here's the typical >>>> output I see in the logs: >>>> >>>> java.lang.RuntimeException: Query took 4997 ms. . Timeout value is 5000 >>>> java.lang.RuntimeException: Query took 4999 ms. . Timeout value is 5000 >>>> java.lang.RuntimeException: Query took 4995 ms. . Timeout value is 5000 >>>> java.lang.RuntimeException: Query took 4998 ms. . Timeout value is 5000 >>>> ... >>>> >>>> Now, there might be many reasons for that. One of which would be that the >>>> DnsClient code is buggy. The other reason would be that the accuracy >>>> guaranteed by Windows implementation of `read` is not what we would >>>> expect. Would you be able to investigate that? >>>> >>>> P.S. The good news is that the CSR has been approved: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8230965 >>>> >>>>> On 23 Sep 2019, at 14:20, Milan Mimica <milan.mim...@gmail.com> wrote: >>>>> >>>>> Got it. Thanks Pavel! >>>>> >>>>> >>>>> On Mon, 23 Sep 2019 at 13:37, Pavel Rappo <pavel.ra...@oracle.com> wrote: >>>>>> >>>>>> Milan, >>>>>> >>>>>> How do you check which tests are run? That's what I see in the >>>>>> /test-support/jtreg_open_test_jdk_com_sun_jndi_dns_ConfigTests_TcpTimeout_java/com/sun/jndi/dns/ConfigTests/TcpTimeout.jtr >>>>>> file after I have run the test locally on my machine: >>>>>> >>>>>> ----------messages:(5/233)---------- >>>>>> command: main TcpTimeout >>>>>> reason: User specified action: run main TcpTimeout >>>>>> Mode: othervm >>>>>> Additional options from @modules: --add-modules java.base --add-exports >>>>>> java.base/sun.security.util=ALL-UNNAMED >>>>>> elapsed time (seconds): 1.751 >>>>>> >>>>>> ... >>>>>> >>>>>> ----------messages:(5/313)---------- >>>>>> command: main TcpTimeout -Dcom.sun.jndi.dns.timeout.initial=5000 >>>>>> reason: User specified action: run main TcpTimeout >>>>>> -Dcom.sun.jndi.dns.timeout.initial=5000 >>>>>> Mode: othervm >>>>>> Additional options from @modules: --add-modules java.base --add-exports >>>>>> java.base/sun.security.util=ALL-UNNAMED >>>>>> elapsed time (seconds): 5.498 >>>>>> >>>>>> ------------------------------------ >>>>>> >>>>>> Which is consistent with what I would expect given the timeout values. >>>>>> >>>>>> The following output does not tell the full story, just the name of the >>>>>> test: >>>>>> >>>>>> ============================== >>>>>> Test summary >>>>>> ============================== >>>>>> TEST TOTAL PASS FAIL ERROR >>>>>> jtreg:open/test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java >>>>>> 1 1 0 0 >>>>>> ============================== >>>>>> TEST SUCCESS >>>>>> >>>>>> -Pavel >>>>>> >>>>>>> On 20 Sep 2019, at 15:42, Milan Mimica <milan.mim...@gmail.com> wrote: >>>>>>> >>>>>>> Pavel, >>>>>>> >>>>>>> Here it is: http://cr.openjdk.java.net/~mmimica/8228580/webrev.04/ >>>>>>> I don't see the test is run twice when I execute "make test >>>>>>> TEST=jtreg:test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java". Am >>>>>>> I missing something? >>>>>>> >>>>> >>>>> >>>>> -- >>>>> Milan Mimica >>>> >>> >>> >>> -- >>> Milan Mimica >> > > > -- > Milan Mimica