Hello Rajat,

Do you see WebSocket traffic working as expected at first, while the UI
still works? Can you check in the browser's developer console whether the
initial WebSocket connection was established and data is flowing correctly?

It looks like you are connecting to a secure backend; is the WebSocket
backend secure as well?
Can you also post a redacted gateway.log file with DEBUG logging? It would
be helpful to see the entire log file with the stack trace.
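
In case it helps with the redaction, here is a minimal sketch in Python. It
masks IPv4 addresses and any hostname ending in a suffix you pass in. The
function names and the ".example.com" suffix are placeholders of mine, not
anything Knox provides, and the IPv4 pattern is loose, so it would also mask a
dotted version string such as 1.2.3.4. Please review the output before posting.

```python
import re

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_line(line, domain_suffixes=()):
    """Mask hostnames ending in any given suffix, then mask IPv4 addresses."""
    for suffix in domain_suffixes:
        # Hostname = one or more name characters followed by the suffix.
        pattern = r"[\w-][\w.-]*" + re.escape(suffix) + r"\b"
        line = re.sub(pattern, "<host>", line)
    return IPV4.sub("<ip>", line)


def redact_file(src_path, dst_path, domain_suffixes=()):
    """Write a redacted copy of src_path to dst_path, line by line."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(redact_line(line, domain_suffixes))
```

For example, redact_file("gateway.log", "gateway.redacted.log",
[".example.com"]) would write a masked copy next to the original; substitute
your own domain suffix.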

Also, do you see any errors in your service log files?
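
If the service logs are large, a quick tally of WARN and ERROR lines per logger
can show where to look first. The sketch below is only an illustration and
assumes the service log uses the same layout as the gateway.log lines in your
mail (timestamp, level, logger name); if your service logs differ, the regular
expression will need adjusting. The function name is my own.

```python
import re
from collections import Counter

# Assumed layout: "2019-08-10 15:37:32,053 ERROR gateway.websockets (...) - msg"
LOG_LINE = re.compile(
    r"^\s*\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+(ERROR|WARN)\s+(\S+)"
)


def summarize(lines):
    """Count WARN/ERROR entries, keyed by (level, logger name)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def summarize_file(path):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize(handle)
```

Stack-trace lines and DEBUG/INFO entries are skipped, so the counts reflect
only the log events themselves.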

Best,
Sandeep


On Mon, Aug 12, 2019 at 1:08 AM Rajat Goel <[email protected]>
wrote:

> Hello,
>
>
>
> I need some help while trying to integrate my custom service with Knox.
> Here’s how the service is deployed:
>
>
>
>                               +- Knox (Instance 1) -+                        +- My Service1 (Master)
>                               |                     |                        |
> UI Client -> Load Balancer -> |                     | -- For Availability -> |
>                (HA Proxy)     |                     |       (HA Proxy)       |
>                               +- Knox (Instance 2) -+                        +- My Service2 (Standby)
>
>
>
>
> My custom service's UI client uses both WebSockets and REST calls to
> talk to the My Service backend. I have written the service definition XML
> and rewrite XML files and deployed them on a test setup as per the above
> architecture. My Knox version is 1.0 (HDP 3.1). I have added the service to
> the Knox default topology, which uses the Shiro provider for
> authentication, and ‘gateway.websocket.feature.enabled’ is set to true.
>
>
>
> When the Knox service is started, I see the following behavior:
>
>    - When I launch the UI from a browser, I get the Shiro authentication
>    dialog popup. I enter my credentials and get redirected to my service UI.
>    Initially, for the first couple of minutes, the UI works fine. All calls
>    from the UI are routed properly to the backend.
>    - After the first few minutes, I start seeing the following exception in
>    the Knox gateway logs:
>
>                        2019-08-10 15:37:32,053 ERROR gateway.websockets
> (ProxyWebSocketAdapter.java:onWebSocketConnect(105)) - Unable to connect to
> websocket server: java.io.IOException: Connect failure
>
> java.io.IOException: Connect failure
>
> Caused by: org.eclipse.jetty.websocket.api.UpgradeException: 0 null
>
>         at
> org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.onComplete(WebSocketUpgradeRequest.java:515)
>
> Caused by: java.io.EOFException: HttpConnectionOverHTTP@1e8a6128
> ::SocketChannelEndPoint@5526bc43{
> rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443<->/
> 192.168.141.31:49860
> ,ISHUT,fill=-,flush=-,to=3/0}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@1e8a6128
> (l:/192.168.141.31:49860 <-> r:
> rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443,closed=false)=
> >HttpChannelOverHTTP@500cd5d(exchange=HttpExchange@75bf9768
> req=TERMINATED/null@null res=PENDING/null@null
> )[send=HttpSenderOverHTTP@6678aae8
> (req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@6dcd6c0
> {s=START}],recv=HttpReceiverOverHTTP@5358963(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0
> of -1}]]
>
>         ... 13 more
>
>
>
>    - The UI starts slowing down and response times increase.
>    - I also see DEBUG exceptions such as:
>
> DEBUG io.FillInterest (FillInterest.java:onFail(134)) - onFail
> FillInterest@1d803962{null}
>
> java.util.concurrent.TimeoutException: Idle timeout expired: 300000/300000
> ms
>
>         at
> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>
>         at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
>         at java.lang.Thread.run(Thread.java:748)
>
>
>
>
>
> DEBUG io.WriteFlusher (WriteFlusher.java:onFail(471)) - ignored:
> WriteFlusher@3a2f4fc0{IDLE}->null
>
> java.nio.channels.ClosedChannelException
>
>         at org.eclipse.jetty.io.WriteFlusher.onClose(WriteFlusher.java:502)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:353)
>
>         at
> org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:216)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1132)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
>
>
>
> DEBUG ssl.SslConnection (SslConnection.java:flush(950)) -
> SslConnection@1cb9d1f9::SocketChannelEndPoint@6fa560e8{/
> 192.168.141.31:44258<->/192.168.141.31:8443
> ,ISHUT,fill=-,flush=-,to=8/300000}{io=0/0,kio=0,kro=1}->SslConnection@1cb9d1f9
> {NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@205f9957
> {/192.168.141.31:44258<->/192.168.141.31:8443
> ,CLOSED,fill=-,flush=-,to=9/300000}=>HttpConnection@11b5199c[p=HttpParser{s=CLOSED,0
> of -1},g=HttpGenerator@40706618{s=START}]=>HttpChannelOverHttp@655d0b98
> {r=0,c=false,a=IDLE,uri=null,age=0}
>
> java.io.IOException: Broken pipe
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:847)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doShutdownOutput(SslConnection.java:1076)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1131)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
>
>         at
> org.eclipse.jetty.io.AbstractConnection.close(AbstractConnection.java:248)
>
>         at
> org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:234)
>
>         at
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1551)
>
>         at
> org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:360)
>
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:250)
>
>         at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>
>         at
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
>
>
>
>    - A few minutes later, the UI hangs. I start seeing the following
>    exceptions in the Knox gateway log:
>
>                               2019-08-10 15:47:41,366 WARN  knox.gateway
> (DefaultDispatch.java:executeOutboundRequest(147)) - Connection exception
> dispatching request:
> https://rafint001-mgt-01.cloud.in.guavus.com:9443/_sock/iframe.html
> org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
> connection from pool
>
>         at
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:313)
>
>         at
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
>
>
>
>
>
>                                2019-08-10 15:47:41,371 ERROR knox.gateway
> (AbstractGatewayFilter.java:doFilter(63)) - Failed to execute filter:
> java.io.IOException: Service connectivity error.
>
> java.io.IOException: Service connectivity error.
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:148)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.doGet(DefaultDispatch.java:278)
>
>
>
>    - A few minutes later, when I refresh the UI page (after clearing any
>    cookies), I see an HTTP 401 error in the UI. This time there is no Shiro
>    authentication dialog popup. Nothing seems to work after this. I see only
>    the following entries in gateway.log:
>
>                      2019-08-10 15:40:10,667 DEBUG knox.gateway
> (GatewayFilter.java:doFilter(119)) - Received request: GET /pdie/cdap/
>
> 2019-08-10 15:40:10,667 DEBUG authc.BasicHttpAuthenticationFilter
> (BasicHttpAuthenticationFilter.java:sendChallenge(274)) - Authentication
> required: sending 401 Authentication challenge response.
>
>
>
>    - When I restart the Knox service from Ambari or run a ‘touch’ command on
>    the topology file (/etc/knox/conf/topologies/default.xml), the UI starts
>    working again and the above set of issues repeats.
>    - I tried increasing gateway threads (gateway.threadpool.max to 500)
>    and HTTP connections (gateway.httpclient.maxConnections to 100). With these
>    settings, the UI works for a little longer than with the default values
>    but eventually hits the same issues as above.
>    - When the UI hangs, I took a thread dump of the Knox service. I see that
>    many of the Knox threads are stuck in socket read calls with the following
>    trace:
>
>                                   "qtp2099051403-139" #139 prio=5
> os_prio=0 tid=0x00007f4d7c002000 nid=0x6ed8 runnable [0x00007f4dee0e0000]
>
>    java.lang.Thread.State: RUNNABLE
>
>         at java.net.SocketInputStream.socketRead0(Native Method)
>
>         at
> java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>
>         at java.net.SocketInputStream.read(SocketInputStream.java:171)
>
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>
>         at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>
>         at sun.security.ssl.InputRecord.read(InputRecord.java:503)
>
>         at
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
>
>         - locked <0x00000006f471e218> (a java.lang.Object)
>
>         at
> sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
>
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>
>         - locked <0x00000006f471edc0> (a sun.security.ssl.AppInputStream)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
>
>         at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>
>         at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>
>         at
> org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>
>         at
> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>
>         at
> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>
>         at
> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>
>         at
> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>
>         at
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>
>         at
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>
>         at
> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>
>         at
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>
>         at
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:130)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.doPost(DefaultDispatch.java:305)
>
>         at
> org.apache.knox.gateway.dispatch.GatewayDispatchFilter$PostAdapter.doMethod(GatewayDispatchFilter.java:177)
>
>         at
> org.apache.knox.gateway.dispatch.GatewayDispatchFilter.doFilter(GatewayDispatchFilter.java:122)
>
>         at
> org.apache.knox.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)
>
>         at
> org.apache.knox.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:372)
>
>         at
> org.apache.knox.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:272)
>
>
>
> Can someone please help with resolving the above set of issues:
>
>    1. With the deployment architecture described above (UI -> HAProxy
>    -> Knox -> HAProxy -> service backend), are there any specific
>    configurations required in HAProxy and Knox to make this work? Is this
>    the right way to deploy?
>    2. What causes ‘org.eclipse.jetty.websocket.api.UpgradeException’ and how
>    can it be fixed?
>    3. How can I debug and fix the hanging Knox Jetty threads?
>
>
>
> Thanks & Regards,
>
> Rajat
>
