Hello,
I need some help integrating my custom service with Knox. Here is how the
service is deployed:
                                +--> Knox (Instance 1) --+              +--> My Service1 (Master)
                                |                        |              |
UI Client --> Load Balancer ----+                        +--> HA Proxy -+
              for availability  |                        |              |
              (HA Proxy)        +--> Knox (Instance 2) --+              +--> My Service2 (Standby)
My custom service's UI client uses both WebSockets and REST calls to talk to
the My Service backend. I have written the service definition XML and rewrite
XML files and deployed them on a test setup that follows the architecture above.
My Knox version is 1.0 (HDP 3.1). I have added the service to the Knox default
topology, which uses the Shiro provider for authentication, and
‘gateway.websocket.feature.enabled’ is set to true.
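For reference, this is roughly how the WebSocket flag would appear in
gateway-site.xml. It is only a sketch that assumes the standard Hadoop-style
property layout; on HDP the file is normally managed through Ambari, so the
actual file may contain many more entries.

```xml
<!-- gateway-site.xml (fragment; surrounding <configuration> element omitted) -->
<property>
    <name>gateway.websocket.feature.enabled</name>
    <value>true</value>
</property>
```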
When the Knox service is started, I see the following behavior:
* When I launch the UI from a browser, I get the Shiro authentication dialog
popup, enter my credentials, and get redirected to my service UI. For the first
couple of minutes the UI works fine, and all calls from the UI are routed
properly to the backend.
* After the first few minutes, I start seeing the following exception in the
Knox gateway logs:
2019-08-10 15:37:32,053 ERROR gateway.websockets
(ProxyWebSocketAdapter.java:onWebSocketConnect(105)) - Unable to connect to
websocket server: java.io.IOException: Connect failure
java.io.IOException: Connect failure
Caused by: org.eclipse.jetty.websocket.api.UpgradeException: 0 null
at org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.onComplete(WebSocketUpgradeRequest.java:515)
Caused by: java.io.EOFException:
HttpConnectionOverHTTP@1e8a6128::SocketChannelEndPoint@5526bc43{rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443<->/192.168.141.31:49860,ISHUT,fill=-,flush=-,to=3/0}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@1e8a6128(l:/192.168.141.31:49860
<->
r:rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443,closed=false)=>HttpChannelOverHTTP@500cd5d(exchange=HttpExchange@75bf9768
req=TERMINATED/null@null
res=PENDING/null@null)[send=HttpSenderOverHTTP@6678aae8(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@6dcd6c0{s=START}],recv=HttpReceiverOverHTTP@5358963(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0
of -1}]]
... 13 more
* The UI starts slowing down and response times increase.
* I also see DEBUG-level exceptions such as:
DEBUG io.FillInterest (FillInterest.java:onFail(134)) - onFail
FillInterest@1d803962{null}
java.util.concurrent.TimeoutException: Idle timeout expired: 300000/300000 ms
at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
DEBUG io.WriteFlusher (WriteFlusher.java:onFail(471)) - ignored:
WriteFlusher@3a2f4fc0{IDLE}->null
java.nio.channels.ClosedChannelException
at org.eclipse.jetty.io.WriteFlusher.onClose(WriteFlusher.java:502)
at org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:353)
at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:216)
at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225)
at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1132)
at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
DEBUG ssl.SslConnection (SslConnection.java:flush(950)) -
SslConnection@1cb9d1f9::SocketChannelEndPoint@6fa560e8{/192.168.141.31:44258<->/192.168.141.31:8443,ISHUT,fill=-,flush=-,to=8/300000}{io=0/0,kio=0,kro=1}->SslConnection@1cb9d1f9{NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@205f9957{/192.168.141.31:44258<->/192.168.141.31:8443,CLOSED,fill=-,flush=-,to=9/300000}=>HttpConnection@11b5199c[p=HttpParser{s=CLOSED,0
of
-1},g=HttpGenerator@40706618{s=START}]=>HttpChannelOverHttp@655d0b98{r=0,c=false,a=IDLE,uri=null,age=0}
java.io.IOException: Broken pipe
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:847)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doShutdownOutput(SslConnection.java:1076)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1131)
at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
at org.eclipse.jetty.io.AbstractConnection.close(AbstractConnection.java:248)
at org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:234)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1551)
at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:360)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:250)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
* A few minutes later, the UI hangs. I start seeing the following exceptions in
the Knox gateway log:
2019-08-10 15:47:41,366 WARN knox.gateway
(DefaultDispatch.java:executeOutboundRequest(147)) - Connection exception
dispatching request:
https://rafint001-mgt-01.cloud.in.guavus.com:9443/_sock/iframe.html
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
connection from pool
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:313)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
2019-08-10 15:47:41,371 ERROR knox.gateway
(AbstractGatewayFilter.java:doFilter(63)) - Failed to execute filter:
java.io.IOException: Service connectivity error.
java.io.IOException: Service connectivity error.
at org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:148)
at org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
at org.apache.knox.gateway.dispatch.DefaultDispatch.doGet(DefaultDispatch.java:278)
* A few minutes later, when I refresh the UI page (after clearing any cookies),
I see an HTTP 401 error in the UI. This time there is no Shiro authentication
dialog popup, and nothing works from this point on. The only entries I see in
gateway.log are:
2019-08-10 15:40:10,667 DEBUG knox.gateway
(GatewayFilter.java:doFilter(119)) - Received request: GET /pdie/cdap/
2019-08-10 15:40:10,667 DEBUG authc.BasicHttpAuthenticationFilter
(BasicHttpAuthenticationFilter.java:sendChallenge(274)) - Authentication
required: sending 401 Authentication challenge response.
* When I restart the Knox service from Ambari or run a ‘touch’ command on the
topology file (/etc/knox/conf/topologies/default.xml), the UI starts working
again and then the same sequence of issues repeats.
* I tried increasing the gateway threads (gateway.threadpool.max to 500) and
HTTP connections (gateway.httpclient.maxConnections to 100). With these
settings the UI works a little longer than with the default values, but it
eventually hits the same issues as above.
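The two overrides from the previous point, written as gateway-site.xml
properties. This is a sketch that assumes the standard Hadoop-style property
layout; the names and values are the ones quoted above, and nothing else about
the file is implied.

```xml
<!-- gateway-site.xml (fragment): values tried while investigating the hang -->
<property>
    <name>gateway.threadpool.max</name>
    <value>500</value>
</property>
<property>
    <name>gateway.httpclient.maxConnections</name>
    <value>100</value>
</property>
```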
* When the UI hung, I took a thread dump of the Knox service. It shows that many
of the Knox threads are stuck in socket read calls, with the following trace:
"qtp2099051403-139" #139 prio=5 os_prio=0 tid=0x00007f4d7c002000 nid=0x6ed8 runnable [0x00007f4dee0e0000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
- locked <0x00000006f471e218> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x00000006f471edc0> (a sun.security.ssl.AppInputStream)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:130)
at org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
at org.apache.knox.gateway.dispatch.DefaultDispatch.doPost(DefaultDispatch.java:305)
at org.apache.knox.gateway.dispatch.GatewayDispatchFilter$PostAdapter.doMethod(GatewayDispatchFilter.java:177)
at org.apache.knox.gateway.dispatch.GatewayDispatchFilter.doFilter(GatewayDispatchFilter.java:122)
at org.apache.knox.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)
at org.apache.knox.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:372)
at org.apache.knox.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:272)
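To put a number on "many of the threads", a dump can be scanned for threads
that are blocked in a socket read underneath the Knox dispatch call. The
following is only a minimal sketch: StuckThreadCounter and findStuckThreads are
hypothetical names (not part of Knox), and it assumes the dump is in the usual
jstack text format, where each thread entry starts with the quoted thread name
and entries are separated by blank lines.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Knox: lists the threads in a jstack-style
// dump whose stack contains both a given frame and a given caller.
class StuckThreadCounter {

    static List<String> findStuckThreads(String dump, String frame, String caller) {
        List<String> names = new ArrayList<>();
        // Assumption: thread entries are separated by one or more blank lines.
        for (String entry : dump.split("\\r?\\n\\s*\\r?\\n")) {
            String e = entry.trim();
            if (!e.startsWith("\"")) {
                continue; // dump header or footer text, not a thread entry
            }
            if (e.contains(frame) && e.contains(caller)) {
                int end = e.indexOf('"', 1);
                if (end > 1) {
                    names.add(e.substring(1, end)); // thread name between the quotes
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java StuckThreadCounter <thread-dump-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> stuck = findStuckThreads(dump,
                "SocketInputStream.socketRead0",
                "DefaultDispatch.executeOutboundRequest");
        System.out.println(stuck.size() + " thread(s) blocked in a backend read:");
        stuck.forEach(name -> System.out.println("  " + name));
    }
}
```

The two marker strings are taken from the trace above. Comparing the count
against gateway.httpclient.maxConnections would show whether every pooled
connection is held by a thread still waiting for backend response headers.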
Can someone please help with resolving the issues above:
1. With the deployment architecture described above (UI -> HAProxy -> Knox ->
HAProxy -> Service backend), are any specific configurations required in
HAProxy and Knox to make this work? Is this the right way to deploy?
2. What causes ‘org.eclipse.jetty.websocket.api.UpgradeException’ and how can
it be fixed?
3. How can I debug and fix the hanging Knox Jetty threads?
Thanks & Regards,
Rajat