Throwing my two cents in here just to state my opinion. This is an effort that could pay dividends if it were done with a view toward the future - absolutely no pun intended. I do not know the details of the code but if futures could be useful here, they would be welcomed by myself.
MG On Apr 29, 2011, at 1:33 PM, Christopher Dolan wrote: > Thanks, Tom. > > I don't really understand your Future suggestion. Are you suggesting > changing the async handshake to a Future? If so, that sounds like a very > involved change, touching a lot of code in Mux and its subclasses. > > setDown() changes the value of the muxDown boolean to true, so it's a > valid way out of the loop. > > Chris > > -----Original Message----- > From: Tom Hobbs [mailto:[email protected]] > Sent: Friday, April 29, 2011 2:27 PM > To: [email protected] > Subject: Re: client hang in com.sun.jini.jeri.internal.mux.Mux.start() > > The proposed code looks fine to me. Two points leap out, more > discussion > points than anything else. > > For some reason I've recently developed an aversion to writing my on > timeout > logic. did you consider using something like a Future here or might > that be > serious overkill (it wouldn't surprise me if it was)? > > Also is Setdown intended to break out of the while loop? Because I > can't > see a way to escape it. (I don't have the rest of the code in front of > me) > > Thanks for keep raising these issues - especially because you usually > supply > fixes for them! > > Tom > > On 29 Apr 2011 19:41, "Christopher Dolan" <[email protected]> > wrote: >> I've experienced occasional cases where clients get stuck in the >> following block of code in Mux.start. Has anyone experienced this >> problem? I have a proposed solution below. Has anyone thought about a >> similar solution already? >> >> -- Current code -- >> 1 asyncSendClientConnectionHeader(); >> 2 synchronized (muxLock) { >> 3 while (!muxDown && !clientConnectionReady) { >> 4 try { >> 5 muxLock.wait(); // REMIND: timeout? >> 6 } catch (InterruptedException e) { >> 7 ... >> 8 } >> 9 } >> 10 if (muxDown) { >> 11 IOException ioe = new IOException(muxDownMessage); >> 12 ioe.initCause(muxDownCause); >> 13 throw ioe; >> 14 } >> 15 } >> >> -- Explanation of the code -- >> This code handles the initial client-server handshake that starts a > JERI >> connection. In line 1, the client sends its 8-byte greeting to the >> server. Then in the loop on lines 3-9, it waits for the server's >> response. If the reader thread gets a satisfactory response from the >> server, it sets clientConnectionReady=true and calls >> muxLock.notifyAll(). In all other cases (aborted connection, > mismatched >> protocol version, etc) the reader invokes Mux.setDown() which sets >> muxDown=true and calls muxLock.notifyAll(). In lines 10-14, it throws > if >> the handshake was a failure. >> >> In my scenario (which uses simple TCP sockets, nothing fancy), the >> invoker thread sits on line 5 indefinitely. My problem hard to >> reproduce, so I haven't found out what the server is doing in this > case. >> I hope to figure that out eventually, but presently I'm interested in >> the "REMIND: timeout?" comment. >> >> -- Timeout solution -- >> It seems obvious to me that there should be a timeout here. There are >> lots of imaginable cases where the client could get stuck here: >> server-side deadlock, abrupt server crash, logic error in client Mux >> code. You'd expect that the server would either respond with its > 8-byte >> handshake very quickly or never, so a modest timeout (like 15 or 30 >> seconds) should be good. If that timeout is triggered, I would expect >> that the code above would call Mux.setDown() and throw an IOException. >> That exception would either cause a retry or be thrown up to the > invoker >> as a RemoteException. >> >> -- Proposed code (untested) -- >> 3 long now = System.currentTimeMillis(); >> 4 long endTime = now + timeoutMillis; >> 5 while (!muxDown && !clientConnectionReady) { >> 6 if (now >= endTime) { >> 7 setDown("timeout waiting for server to respond >> to handshake", null); >> 8 } else { >> 9 try { >> 10 muxLock.wait(endTime - now); >> 11 now = System.currentTimeMillis(); >> 12 } catch (InterruptedException e) { >> 13 setDown("interrupt waiting for connection >> header", e); >> 14 } >> 15 } >> 16 } >> >> This code assumes a configurable timeoutMillis parameter has been set >> earlier. >> >> I can't think of any alternative solutions. Putting the timeout in the >> Reader logic seems higher risk. There's incomplete code in JERI to >> implement a ping packet (see Mux.asyncSendPing, never used), but that >> would only be relevant after the initial handshake and wouldn't help >> here. >> >> Thanks, >> Chris >> Michael McGrady Chief Systems Architect Topia Technology, Inc. Cel 1.253.720.3365 Work 1.253.572.9712 extension 2037 Skype ID: michael.mcgrady5 [email protected]
