Hi Vyom, ok, thanks. I have one addition to my proposal: I don't think there's a need for a global NET_GetCurrentTime function. This one should probably be a static function inside net_util_md.c (static long getCurrentTime()).
Best Christoph > -----Original Message----- > From: Vyom Tewari [mailto:vyom.tew...@oracle.com] > Sent: Mittwoch, 7. September 2016 18:31 > To: Langer, Christoph <christoph.lan...@sap.com> > Cc: net-dev@openjdk.java.net; Chris Hegarty <chris.hega...@oracle.com> > Subject: Re: RFR 8075484:SocketInputStream.socketRead0 can hang even with > soTimeout set > > Hi Chris, > > Sorry I was mostly focusing on our test failure first, i will check > your suggestions. > > Thanks, > > Vyom > > > On Wednesday 07 September 2016 08:44 PM, Langer, Christoph wrote: > > Hi Vyom, > > > > did you have a look at my suggestions regarding AIX and refactoring? I can't > find none of it in the new webrev nor did you comment on it. > > > > Best regards > > Christoph > > > >> -----Original Message----- > >> From: net-dev [mailto:net-dev-boun...@openjdk.java.net] On Behalf Of > Vyom > >> Tewari > >> Sent: Mittwoch, 7. September 2016 17:10 > >> To: Chris Hegarty <chris.hega...@oracle.com> > >> Cc: net-dev@openjdk.java.net > >> Subject: Re: RFR 8075484:SocketInputStream.socketRead0 can hang even > with > >> soTimeout set > >> > >> Hi All, > >> > >> Please find the latest > >> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.4/index.html > >> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.4/index.html>). > >> there were some test failures in JPRT and Chris also pointed the same > >> issue in modified code. Now with latest code JPRT is clean. > >> > >> Thanks, > >> > >> Vyom > >> > >> > >> On Wednesday 07 September 2016 03:18 PM, Chris Hegarty wrote: > >>> > >>> On 07/09/16 07:45, Vyom Tewari wrote: > >>>> Hi Chris, > >>>> > >>>> Please find the latest > >>>> > webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.3/index.html > >>>> > <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.3/index.html>). I > >>>> did some code refactoring, JPRT is in progress. > >>> In terms of NET_Timeout**, I'm happy with this version, but I > >>> have an additional comment. > >>> > >>> If NET_ReadWithTimeout returns -1 because > NET_TimeoutWithCurrentTime > >>> returns <= 0, then a JNI pending exception will be set. So the code > >>> calling NET_ReadWithTimeout should, a) free bufP, and b) return -1, > >>> immediately rather than continuing and possibly trying to set another > >>> JNI pending exception. > >>> > >>> One option is to check the return value from NET_ReadWithTimeout and > >>> also (*env)->ExceptionCheck(env). Another is to possibly consolidate > >>> the setting of JNI pending exceptions in one place, towards the end > >>> of Java_java_net_SocketInputStream_socketRead0, for example EBADF is > >>> handled in two places. > >>> > >>> -Chris. > >>> > >>>> I need help from some one to build and test latest changes on AIX, may > >>>> be Christoph can help. > >>>> > >>>> Thanks, > >>>> > >>>> Vyom > >>>> > >>>> > >>>> On Tuesday 06 September 2016 06:25 PM, Chris Hegarty wrote: > >>>>> Vyom, > >>>>> > >>>>>> On 6 Sep 2016, at 10:20, Vyom Tewari <vyom.tew...@oracle.com> > wrote: > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> Please find the latest > >>>>>> > >> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.2/index.html > >>>>>> > >> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.2/index.html>). > >>>>>> I > >>>>>> have incorporated the review comments. > >>>>> Your changes completely avoid NET_Timeout, which is quite complex on > >>>>> Linux, Mac, and AIX, for various reasons. ( NET_Timeout will use the > >>>>> async close mechanism to signal/interrupt a thread blocked in a poll / > >>>>> select ). Also, select is used on Mac, which will use poll after your > >>>>> changes ( I do remember that there was a bug in poll on Mac around > >>>>> the time of the original Mac port ). > >>>>> > >>>>> Would is be better, rather than replicating the logic in NET_Timeout, > >>>>> to have a version that supports passing a function pointer, to run if > >>>>> the poll/select returns before the timeout? Just an idea. > >>>>> > >>>>> -Chris. > >>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Vyom > >>>>>> > >>>>>> > >>>>>> On Monday 05 September 2016 08:37 PM, Chris Hegarty wrote: > >>>>>>> On 05/09/16 15:37, Mark Sheppard wrote: > >>>>>>>> if the desire is to avoid making double calls on gettimeofday in the > >>>>>>>> NET_ReadWithTimeout's while loop for its main call flow, > >>>>>>> Yes, the desire is to make no more calls to gettimeofday, than are > >>>>>>> already done. > >>>>>>> > >>>>>>>> then consider a modification to NET_Timeout and create a version > >>>>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval * > >>>>>>>> current_time) > >>>>>>>> > >>>>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval * > >>>>>>>> current_time) { > >>>>>>>> int result; > >>>>>>>> struct timeval t; > >>>>>>>> long prevtime, newtime; > >>>>>>>> struct pollfd pfd; > >>>>>>>> pfd.fd = s; > >>>>>>>> pfd.events = POLLIN; > >>>>>>>> > >>>>>>>> if (timeout > 0) { > >>>>>>>> prevtime = (current_time->tv_sec * 1000) + > >>>>>>>> current_time->tv_usec / 1000; > >>>>>>>> } > >>>>>>>> > >>>>>>>> for(;;) { > >>>>>>>> result = poll(&pfd, 1, timeout); > >>>>>>>> if (result < 0 && errno == EINTR) { > >>>>>>>> if (timeout > 0) { > >>>>>>>> gettimeofday(&t, NULL); > >>>>>>>> newtime = (t.tv_sec * 1000) + t.tv_usec /1000; > >>>>>>>> timeout -= newtime - prevtime; > >>>>>>>> if (timeout <= 0) > >>>>>>>> return 0; > >>>>>>>> prevtime = newtime; > >>>>>>>> } > >>>>>>>> } else { > >>>>>>>> return result; > >>>>>>>> } > >>>>>>>> } > >>>>>>>> } > >>>>>>>> > >>>>>>>> the NET_ReadWithTimeout is modified with. > >>>>>>>> > >>>>>>>> while (timeout > 0) { > >>>>>>>> result = NET_TimeoutWithCurrentTime(fd, timeout, &t); > >>>>>>>> > >>>>>>>> in any case there is the potential for multiple invocation of > >>>>>>>> gettimeofday due to a need to make > >>>>>>>> poll restartable, > >>>>>>> Yes, and that is fine. Just no more than is already there. > >>>>>>> > >>>>>>> and adjust the originally specified timeout > >>>>>>>> accordingly, and for the edge case > >>>>>>>> after the non blocking read. > >>>>>>> After an initial skim, your suggestion seems reasonable. > >>>>>>> > >>>>>>> -Chris. > >>>>>>> > >>>>>>>> regards > >>>>>>>> Mark > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On 05/09/2016 12:02, Chris Hegarty wrote: > >>>>>>>>> Vyom, > >>>>>>>>> > >>>>>>>>> > >> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> There is some concern about the potential performance effect, etc, > >>>>>>>>> of gettimeofday, maybe there is a way out of this. The reuse of > >>>>>>>>> NET_Timeout is good, but it also calls gettimeofday. It seems that > >>>>>>>>> a specific NET_ReadWithTimeout could be written to NOT reuse > >>>>>>>>> NET_Timeout, but effectively inline its interruptible operation. > >>>>>>>>> Or write a variant of NET_Timeout that takes a function to > >>>>>>>>> execute. Rather than effectively two loops conditioned on the > >>>>>>>>> timeout. Disclaimer: I have not actually tried to do this, but > >>>>>>>>> it seems worth considering / evaluating. > >>>>>>>>> > >>>>>>>>> -Chris. > >>>>>>>>> > >>>>>>>>> On 02/09/16 04:39, Vyom Tewari wrote: > >>>>>>>>>> hi Dimitry, > >>>>>>>>>> > >>>>>>>>>> thanks for review, I did consider to use a monotonically > >>>>>>>>>> increasing > >>>>>>>>>> clock like "clock_gettime" but existing nearby > >>>>>>>>>> code("NET_Timeout") uses > >>>>>>>>>> "gettimeofday" so i choose to be consistent with the existing > >>>>>>>>>> code. > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> > >>>>>>>>>> Vyom > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote: > >>>>>>>>>>> Vyom, > >>>>>>>>>>> > >>>>>>>>>>> Did you consider to use select() to calculate timeout instead of > >>>>>>>>>>> gettimeofday ? > >>>>>>>>>>> > >>>>>>>>>>> gettimeofday is affected by system time changes, so running ntpd > >>>>>>>>>>> can > >>>>>>>>>>> cause unpredictable behavior of this code. Also it's rather > >>>>>>>>>>> expensive > >>>>>>>>>>> syscall. > >>>>>>>>>>> > >>>>>>>>>>> -Dmitry > >>>>>>>>>>> > >>>>>>>>>>> On 2016-09-01 19:03, Vyom Tewari wrote: > >>>>>>>>>>>> please find the updated > >>>>>>>>>>>> > >> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>). > >>>>>>>>>>>> > >>>>>>>>>>>> I > >>>>>>>>>>>> incorporated the review comments. > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> > >>>>>>>>>>>> Vyom > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote: > >>>>>>>>>>>>> Hi > >>>>>>>>>>>>> perhaps there is an opportunity to do some refactoring > >>>>>>>>>>>>> here (... > >>>>>>>>>>>>> for me a "goto " carries a code smell! ) > >>>>>>>>>>>>> > >>>>>>>>>>>>> along the lines > >>>>>>>>>>>>> > >>>>>>>>>>>>> if (timeout) { > >>>>>>>>>>>>> nread = NET_ReadWithTimeout(...); > >>>>>>>>>>>>> } else { > >>>>>>>>>>>>> nread = NET_Read(...); > >>>>>>>>>>>>> } > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> the NET_ReadWithTimeout (...) function will contain a > >>>>>>>>>>>>> restructuring of > >>>>>>>>>>>>> your goto loop > >>>>>>>>>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout); > >>>>>>>>>>>>> if (nread <= 0) { > >>>>>>>>>>>>> if (nread == 0) { > >>>>>>>>>>>>> JNU_ThrowByName(env, JNU_JAVANETPKG > >>>>>>>>>>>>> "SocketTimeoutException", > >>>>>>>>>>>>> "Read timed out"); > >>>>>>>>>>>>> } else if (nread == -1) { > >>>>>>>>>>>>> if (errno == EBADF) { > >>>>>>>>>>>>> JNU_ThrowByName(env, JNU_JAVANETPKG > >>>>>>>>>>>>> "SocketException", "Socket closed"); > >>>>>>>>>>>>> } else if (errno == ENOMEM) { > >>>>>>>>>>>>> JNU_ThrowOutOfMemoryError(env, > >>>>>>>>>>>>> "NET_Timeout > >>>>>>>>>>>>> native heap allocation failed"); > >>>>>>>>>>>>> } else { > >>>>>>>>>>>>> JNU_ThrowByNameWithMessageAndLastError > >>>>>>>>>>>>> (env, JNU_JAVANETPKG > >>>>>>>>>>>>> "SocketException", > >>>>>>>>>>>>> "select/poll failed"); > >>>>>>>>>>>>> } > >>>>>>>>>>>>> } > >>>>>>>>>>>>> // release buffer in main call flow > >>>>>>>>>>>>> // if (bufP != BUF) { > >>>>>>>>>>>>> // free(bufP); > >>>>>>>>>>>>> // } > >>>>>>>>>>>>> nread = -1; > >>>>>>>>>>>>> break; > >>>>>>>>>>>>> } else { > >>>>>>>>>>>>> nread = NET_NonBlockingRead(fd, bufP, len); > >>>>>>>>>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno == > >>>>>>>>>>>>> EWOULDBLOCK))) { > >>>>>>>>>>>>> gettimeofday(&t, NULL); > >>>>>>>>>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000; > >>>>>>>>>>>>> _timeout -= newtime - prevtime; > >>>>>>>>>>>>> if(_timeout > 0){ > >>>>>>>>>>>>> prevtime = newtime; > >>>>>>>>>>>>> } > >>>>>>>>>>>>> } else { break; } } } return nread; > >>>>>>>>>>>>> > >>>>>>>>>>>>> e&oe > >>>>>>>>>>>>> > >>>>>>>>>>>>> regards > >>>>>>>>>>>>> Mark > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 29/08/2016 10:58, Vyom Tewari wrote: > >>>>>>>>>>>>>> gentle reminder, please review the below code change. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Vyom > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote: > >>>>>>>>>>>>>>> Hi All, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Please review the code changes for below issue. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Bug : > >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8075484 > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> webrev : > >>>>>>>>>>>>>>> > >> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> This issue is SocketInputStream.socketread0() hangs even > with > >>>>>>>>>>>>>>> "soTimeout" set.the implementation of > >>>>>>>>>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes > that > >>>>>>>>>>>>>>> read() > >>>>>>>>>>>>>>> won't block after poll() reports that a read is possible. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> This assumption does not hold, as noted on the man page for > >>>>>>>>>>>>>>> select > >>>>>>>>>>>>>>> (referenced by the man page for poll): Under Linux, select() > >>>>>>>>>>>>>>> may > >>>>>>>>>>>>>>> report a socket file descriptor as "ready for reading", while > >>>>>>>>>>>>>>> nevertheless a subsequent read blocks. This could for > example > >>>>>>>>>>>>>>> happen > >>>>>>>>>>>>>>> when data has arrived but upon examination has wrong > >> checksum > >>>>>>>>>>>>>>> and is > >>>>>>>>>>>>>>> discarded. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Vyom > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>