Milan,

Thanks for looking into this. I think you should ask about the expected 
timing semantics and guarantees on net-dev (perhaps with a cc to nio-dev).
As for our test, I agree with you that we should simply build the possibility 
of early returns into the check.

...

/* The acceptable variation of early returns from timed socket operations. */
private static final long PREMATURE_RETURN = adjustTimeout(100);

...

long elapsed = NANOSECONDS.toMillis(System.nanoTime() - startNanos);
if (elapsed < timeout - PREMATURE_RETURN || elapsed > timeout + TOLERANCE) {
    String msg = String.format(
            "elapsed=%s, timeout=%s, TOLERANCE=%s, PREMATURE_RETURN=%s",
            elapsed, timeout, TOLERANCE, PREMATURE_RETURN);
    throw new RuntimeException(msg);
}

...
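
To make the intent concrete, here is a minimal standalone sketch of the 
two-sided check. It is an illustration only, not the actual test code: the 
class name TimeoutWindow, the helper names, and the constant values are 
assumptions (the real test would derive the constants via adjustTimeout, and 
the TOLERANCE value below is made up). The sample elapsed times include one 
taken from the Windows failure logs quoted below.

```java
public class TimeoutWindow {

    // Assumed values for illustration; the real test scales them with adjustTimeout(...).
    static final long PREMATURE_RETURN = 100; // ms a timed operation may return early
    static final long TOLERANCE = 500;        // ms a timed operation may overrun (hypothetical)

    // True if elapsed lies in [timeout - PREMATURE_RETURN, timeout + TOLERANCE].
    static boolean withinWindow(long elapsed, long timeout) {
        return elapsed >= timeout - PREMATURE_RETURN
                && elapsed <= timeout + TOLERANCE;
    }

    // Builds the diagnostic message; note that the first argument is elapsed.
    static String describe(long elapsed, long timeout) {
        return String.format(
                "elapsed=%s, timeout=%s, TOLERANCE=%s, PREMATURE_RETURN=%s",
                elapsed, timeout, TOLERANCE, PREMATURE_RETURN);
    }

    public static void main(String[] args) {
        long timeout = 5000;
        // 4997 is one of the values seen on Windows; the others probe both bounds.
        for (long elapsed : new long[] {4997, 4899, 5600}) {
            String verdict = withinWindow(elapsed, timeout) ? "ok" : "FAIL";
            System.out.println(verdict + ": " + describe(elapsed, timeout));
        }
    }
}
```

With a lower bound like this, a return a few milliseconds early would be 
accepted, while a return that is early by more than PREMATURE_RETURN would 
still fail the test.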

Thoughts?

-Pavel

> On 24 Sep 2019, at 09:12, Milan Mimica <milan.mim...@gmail.com> wrote:
> 
> Hi Pavel
> 
> Wow, I find this awesome. I don't have a Windows machine to play with,
> but I think I may have found something.
> The difference is how Java_sun_nio_ch_Net_poll is implemented. On unix
> it uses poll(2), on Windows it uses select(2). Regarding timeouts,
> poll() has "wait at least" semantics and overruns by design[1], while
> select() on Windows has "wait at most" semantics, or as they put
> it[2]: "specifies the maximum time that select should wait before
> returning.". It returns early by design! Is this a known thing? I
> don't think there is much one can do here. It probably makes no sense
> to loop it and wait for the remaining time.
> Java's soTimeout does not specify[3] whether it should wait at least or
> at most the specified timeout, so it's fine I guess. The old "plain"
> socket impls are not much different.
> 
> If the above is correct, should I just add a tolerance for the lower bound?
> 
> [1] http://man7.org/linux/man-pages/man2/poll.2.html
> [2] https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select
> [3] https://docs.oracle.com/javase/9/docs/api/java/net/Socket.html#setSoTimeout-int-
> 
> On Mon, 23 Sep 2019 at 16:15, Pavel Rappo <pavel.ra...@oracle.com> wrote:
>> 
>> Milan,
>> 
>> I'm observing the latest version (.04) of this test failing quite frequently 
>> (4/100) on Windows (Windows Server 2012 R2 6.3 (amd64)) machines. The test 
>> passes fine on macOS and Linux. Here's the typical output I see in the logs:
>> 
>>    java.lang.RuntimeException: Query took 4997 ms. . Timeout value is 5000
>>    java.lang.RuntimeException: Query took 4999 ms. . Timeout value is 5000
>>    java.lang.RuntimeException: Query took 4995 ms. . Timeout value is 5000
>>    java.lang.RuntimeException: Query took 4998 ms. . Timeout value is 5000
>>    ...
>> 
>> Now, there might be many reasons for that. One of which would be that the 
>> DnsClient code is buggy. The other reason would be that the accuracy 
>> guaranteed by Windows implementation of `read` is not what we would expect. 
>> Would you be able to investigate that?
>> 
>> P.S. The good news is that the CSR has been approved:
>> 
>>    https://bugs.openjdk.java.net/browse/JDK-8230965
>> 
>>> On 23 Sep 2019, at 14:20, Milan Mimica <milan.mim...@gmail.com> wrote:
>>> 
>>> Got it. Thanks Pavel!
>>> 
>>> 
>>> On Mon, 23 Sep 2019 at 13:37, Pavel Rappo <pavel.ra...@oracle.com> wrote:
>>>> 
>>>> Milan,
>>>> 
>>>> How do you check which tests are run? That's what I see in the 
>>>> /test-support/jtreg_open_test_jdk_com_sun_jndi_dns_ConfigTests_TcpTimeout_java/com/sun/jndi/dns/ConfigTests/TcpTimeout.jtr
>>>>  file after I have run the test locally on my machine:
>>>> 
>>>> ----------messages:(5/233)----------
>>>> command: main TcpTimeout
>>>> reason: User specified action: run main TcpTimeout
>>>> Mode: othervm
>>>> Additional options from @modules: --add-modules java.base --add-exports 
>>>> java.base/sun.security.util=ALL-UNNAMED
>>>> elapsed time (seconds): 1.751
>>>> 
>>>> ...
>>>> 
>>>> ----------messages:(5/313)----------
>>>> command: main TcpTimeout -Dcom.sun.jndi.dns.timeout.initial=5000
>>>> reason: User specified action: run main TcpTimeout 
>>>> -Dcom.sun.jndi.dns.timeout.initial=5000
>>>> Mode: othervm
>>>> Additional options from @modules: --add-modules java.base --add-exports 
>>>> java.base/sun.security.util=ALL-UNNAMED
>>>> elapsed time (seconds): 5.498
>>>> 
>>>> ------------------------------------
>>>> 
>>>> Which is consistent with what I would expect given the timeout values.
>>>> 
>>>> The following output does not tell the full story, just the name of the 
>>>> test:
>>>> 
>>>> ==============================
>>>> Test summary
>>>> ==============================
>>>>  TEST                                              TOTAL  PASS  FAIL ERROR
>>>>  jtreg:open/test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java
>>>>                                                        1     1     0     0
>>>> ==============================
>>>> TEST SUCCESS
>>>> 
>>>> -Pavel
>>>> 
>>>>> On 20 Sep 2019, at 15:42, Milan Mimica <milan.mim...@gmail.com> wrote:
>>>>> 
>>>>> Pavel,
>>>>> 
>>>>> Here it is: http://cr.openjdk.java.net/~mmimica/8228580/webrev.04/
>>>>> I don't see the test is run twice when I execute "make test
>>>>> TEST=jtreg:test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java". Am
>>>>> I missing something?
>>>>> 
>>> 
>>> 
>>> --
>>> Milan Mimica
>> 
> 
> 
> -- 
> Milan Mimica
