Hello Rajat,

I agree with you, Knox should receive a "Connection: close" header from the
backend to terminate the connection.
Something like this - "DEBUG http.wire (Wire.java:wire(73)) -
http-outgoing-35 << "Connection: close[\r][\n]""
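
To illustrate why the connection stays open, here is a minimal Python sketch (not Knox or HttpClient code; the function name decode_chunked is made up for this example). It decodes a chunked body and reports whether the zero-length last chunk was ever seen. A client reading a chunked response keeps waiting until it gets that last chunk or the connection is closed. The heartbeat bytes are taken from the wire log quoted below, where each heartbeat is announced with the chunk size "1c".

```python
def decode_chunked(raw: bytes):
    """Decode an HTTP/1.1 chunked body (simplified: no trailers, no strict CRLF checks).

    Returns (chunks, complete). complete is True only if the
    zero-length last chunk was seen in the data received so far.
    """
    chunks = []
    pos = 0
    while True:
        line_end = raw.find(b"\r\n", pos)
        if line_end == -1:
            return chunks, False  # still waiting for the next chunk-size line
        # Chunk size is hex; ignore any chunk extensions after ';'
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        if not size_field:
            return chunks, False
        size = int(size_field, 16)
        if size == 0:
            return chunks, True  # last chunk: the body is finished
        start = line_end + 2
        end = start + size
        if end + 2 > len(raw):
            return chunks, False  # chunk data (or its trailing CRLF) not fully received
        chunks.append(raw[start:end])
        pos = end + 2


# One heartbeat as seen in the wire log: 28 bytes, i.e. 0x1c
heartbeat = b'<script>\np("h");\n</script>\r\n'
frame = b"1c\r\n" + heartbeat + b"\r\n"

# Heartbeats only: the body never completes, so the reader keeps waiting
only_heartbeats = decode_chunked(frame * 2)

# Same data followed by the terminating zero-length chunk
terminated = decode_chunked(frame * 2 + b"0\r\n\r\n")
```

With heartbeats only, the second element of the result is False however many heartbeats arrive; it becomes True only once "0[\r][\n][\r][\n]" is received.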

You initially reported issues with WebSockets hanging as well. Do you see
the same there, or is it just the HTTP connections?

Thanks,
Sandeep

On Mon, Aug 19, 2019 at 1:45 PM Rajat Goel <[email protected]>
wrote:

> Hi Sandeep,
>
>
>
> I tried removing all load balancers, HA etc so now there is just UI ->
> Knox -> Backend, but the same issue is seen again.
>
>
>
> On debugging the issue further, I found that the following is happening:
>
>    1. UI client sends HTTPS request to Knox gateway (Eg. GET
>    /_sock/493/tet0wey4/htmlfile?c=_jp.ab0qmfk HTTP/1.1)
>    2. Knox proxies the request to the UI backend server, which is a Node.js
>    server
>    3. The backend Node.js server returns the response with the header
>    ‘Transfer-Encoding: chunked’. The Node.js server uses the SockJS library for
>    socket connections. I looked at Knox's HTTP wire logs for the response
>    received from the Node.js server; the content is as follows:
>
>
>
> *2019-08-17 10:34:42,699 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "Transfer-Encoding: chunked[\r][\n]"*
>
> *2019-08-17 10:34:42,699 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:34:42,700 DEBUG http.wire (Wire.java:wire(87)) -
> http-outgoing-15 << "40e"*
>
> *2019-08-17 10:34:42,700 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(122)) -
> http-outgoing-15 << HTTP/1.1 200 OK*
>
> *2019-08-17 10:34:42,701 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(125)) -
> http-outgoing-15 << Cache-Control: no-store, no-cache, no-transform,
> must-revalidate, max-age=0*
>
> *2019-08-17 10:34:42,701 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(125)) -
> http-outgoing-15 << Content-Type: text/html; charset=UTF-8*
>
> *2019-08-17 10:34:42,701 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(125)) -
> http-outgoing-15 << Date: Sat, 17 Aug 2019 10:34:42 GMT*
>
> *2019-08-17 10:34:42,702 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(125)) -
> http-outgoing-15 << Connection: keep-alive*
>
> *2019-08-17 10:34:42,702 DEBUG http.headers
> (LoggingManagedHttpClientConnection.java:onResponseReceived(125)) -
> http-outgoing-15 << **Transfer-Encoding: chunked*
>
> *2019-08-17 10:34:42,706 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:34:42,706 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "<!doctype html>[\n]"*
>
> *2019-08-17 10:34:42,707 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "<html><head>[\n]"*
>
> *2019-08-17 10:34:42,707 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "  <meta http-equiv="X-UA-Compatible" content="IE=edge"
> />[\n]"*
>
> *2019-08-17 10:34:42,707 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "  <meta http-equiv="Content-Type" content="text/html;
> charset=UTF-8" />[\n]"*
>
> *2019-08-17 10:34:42,707 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "</head><body><h2>Don't panic!</h2>[\n]"*
>
> *2019-08-17 10:34:42,708 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "  <script>[\n]"*
>
> *2019-08-17 10:34:42,708 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "    document.domain = document.domain;[\n]"*
>
> *2019-08-17 10:34:42,708 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "    var c = parent._jp.ab0qmfk;[\n]"*
>
> *2019-08-17 10:34:42,708 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "    c.start();[\n]"*
>
> *2019-08-17 10:34:42,709 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "    function p(d) {c.message(d);};[\n]"*
>
> *2019-08-17 10:34:42,709 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "    window.onload = function() {c.stop();};[\n]"*
>
> *2019-08-17 10:34:42,709 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "  </script>*
>
> *
> [\r][\n]"*
>
> *2019-08-17 10:34:42,710 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:34:42,713 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:34:42,714 DEBUG http.wire (Wire.java:wire(87)) -
> http-outgoing-15 << "1c"*
>
> *2019-08-17 10:34:42,714 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:34:42,715 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "<script>[\n]"*
>
> *2019-08-17 10:34:42,715 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "p("o");[\n]"*
>
> *2019-08-17 10:34:42,715 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "</script>[\r][\n]"*
>
> *2019-08-17 10:34:42,716 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:35:07,696 DEBUG http.wire (Wire.java:wire(87)) -
> http-outgoing-15 << "1c"*
>
> *2019-08-17 10:35:07,696 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *2019-08-17 10:35:07,697 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "<script>[\n]"*
>
> *2019-08-17 10:35:07,697 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "p("h");[\n]"*
>
> *2019-08-17 10:35:07,697 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "</script>[\r][\n]"*
>
> *2019-08-17 10:35:07,698 DEBUG http.wire (Wire.java:wire(73)) -
> http-outgoing-15 << "[\r][\n]"*
>
> *….*
>
>
>
> The last few messages in bold are recurring heartbeat data that the Node.js
> backend server sends every 25 seconds. Notice that the backend server never
> sends a last/end chunk message, and the Knox server just keeps waiting for
> it. This results in the connection hanging. Similar messages are seen for
> the rest of the connections that are held up. It looks like the cause here
> is a non-standard implementation of chunked transfer encoding in the SockJS
> library. Am I on the right path here, or do you feel the issue might be
> something else?
>
>
>
> I am looking further into the issue with the SockJS library and how it can
> be fixed. Thanks for the help so far.
>
>
>
> Regards,
>
> Rajat
>
>
>
> *From: *Sandeep Moré <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Saturday, 17 August 2019 at 8:00 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> This could be multiple sockets, this is weird, looks like for some reason
> there are multiple sockets opened between Knox and the backend.
>
> Can you try using just Knox and the backend and see what you get, i.e. get
> rid of the load balancers and HA to isolate the problem.
>
>
>
> On Sat, Aug 17, 2019 at 8:14 AM Rajat Goel <[email protected]>
> wrote:
>
> Hi Sandeep,
>
>
>
> While reproducing the issue, I observed that ‘netstat’ command shows 32
> connections in ESTABLISHED state with UI backend service:
>
> *tcp6       0      0 192.168.133.69:48510     192.168.133.69:9443     ESTABLISHED 19122/java*
>
> *tcp6       0      0 192.168.133.69:53848     192.168.133.69:9443     ESTABLISHED 19122/java*
>
> *tcp6       0      0 192.168.133.69:52994     192.168.133.69:9443     ESTABLISHED 19122/java*
>
> *…*
>
>
>
> However, when I search for the port number in the gateway.log file (full
> debug logs enabled), there is only one log entry mentioning that particular
> port:
>
> *[root@rafd001-mst-01 knox]# cat gateway.log | grep -w 48510*
>
> *2019-08-17 10:34:40,717 DEBUG conn.DefaultHttpClientConnectionOperator
> (DefaultHttpClientConnectionOperator.java:connect(146)) - Connection
> established 192.168.133.69:48510<->192.168.133.69:9443*
>
>
>
> Looks like the connection was opened but not used? Other connections in
> the netstat output show the same trend. There are no logs in the gateway.log
> file showing how the connection is being used, and these connections stay in
> ESTABLISHED state forever. They don’t even time out. Any thoughts on why this
> could be happening?
>
>
>
> Regards,
>
> Rajat
>
>
>
> *From: *Rajat Goel <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Friday, 16 August 2019 at 11:18 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hi Sandeep,
>
>
>
> You are right, websocket connections are not being established properly. I
> debugged this more today and looked at Knox’s GatewayWebsocketHandler.java
> code. From the code, I found that Knox uses the service URL protocol to create
> the backend ws or wss websocket connection, whereas my service URL had ‘https’,
> as I thought Knox would use the incoming request URL (protocol as well as path)
> to construct the backend websocket URL. After making some changes, I was able
> to fix the UpgradeException issues. Just a thought there: why not use the request
> URI in the getMatchedBackendURL() API {GatewayWebsocketHandler.java} to
> generate the backend URL? This way we won’t have to write a separate service
> definition for http(s) and ws requests in many cases.
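
The suggestion above could be sketched roughly as follows. This is a hypothetical Python illustration, not Knox's actual getMatchedBackendURL() logic: the helper name to_websocket_url and the scheme table are assumptions made for this example, and the backend path is only illustrative. It derives the ws/wss backend URL from an http(s) service URL plus the incoming request path.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping from service URL scheme to websocket scheme
SCHEME_MAP = {"http": "ws", "https": "wss", "ws": "ws", "wss": "wss"}


def to_websocket_url(service_url: str, request_path: str = "") -> str:
    """Build a backend websocket URL from a service URL and a request path."""
    parts = urlsplit(service_url)
    scheme = SCHEME_MAP[parts.scheme.lower()]  # KeyError for unsupported schemes
    # Prefer the path of the incoming request; fall back to the service URL's own path
    path = request_path or parts.path
    return urlunsplit((scheme, parts.netloc, path, "", ""))


backend_url = to_websocket_url(
    "https://rafint001-mgt-01.cloud.in.guavus.com:9443",
    "/_sock/443/qu5llmni/websocket",  # illustrative backend path
)
```

With a mapping like this, a single https service URL would be enough for both REST and websocket requests, at the cost of assuming the backend serves both on the same host and port.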
>
>
>
> Coming back to the original issue, after fixing the UpgradeException I still
> see ClosedChannel exceptions, ConnectionPool timeouts, Service connectivity
> errors, and the UI slowing down and eventually hanging.
>
> ‘netstat’ output shows 32 connections in ESTABLISHED state between Knox
> and the UI backend server, all idle. I had configured the socket timeout
> as well as the websocket idle timeout to 60 seconds, yet the connections stay
> in ESTABLISHED state. Why are so many connections getting established? When
> the UI is accessed directly, i.e. there is no Knox proxy between UI client and
> backend, I only see 4-5 TCP connections between UI client and UI backend.
>
>
>
> Please check.
>
>
>
> Regards,
>
> Rajat
>
>
>
> *From: *Sandeep Moré <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Friday, 16 August 2019 at 10:37 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Rajat,
>
> I could not see data flowing through websocket in the logs. I also noticed
> that your backend uses https but your websocket connection is using ws://,
> shouldn't it be using wss:// ?
>
>
>
> On Fri, Aug 16, 2019 at 7:02 AM Rajat Goel <[email protected]>
> wrote:
>
> Hi Sandeep, Knox Team,
>
>
>
> Need urgent help/pointers in debugging this issue as this is very critical
> for us. So request you to please check on the same.
>
>
>
> Thanks & Regards,
>
> Rajat
>
>
>
> *From: *Rajat Goel <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Thursday, 15 August 2019 at 12:02 AM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hi Sandeep,
>
>
>
> I added the Knox certificate as a trusted certificate on my UI Client machine
> and "javax.net.ssl.SSLException: Received fatal alert: certificate_unknown"
> errors no longer appear. However, other issues such as
> ‘org.eclipse.jetty.websocket.api.UpgradeException’ and ‘java.io.IOException:
> Service connectivity error’ are still coming, and the UI eventually slows down
> and hangs.
>
>
>
> On UpgradeException, I had posted in my other mail, copying snippet here,
> please check:
>
>
>
> <snip>
>
>
>
> *On analysing the debug logs further, I see that Websocket connection is
> not being established properly. Websocket connection from Service Client UI
> to Knox is being established properly but from Knox to Service backend is
> failing with UpgradeException. Not sure why. One thing I noticed is that
> upgrade request from Knox to Service Backend for websocket connection HTTP
> Destination is coming in logs as
> ‘ws://rafint001-mgt-01.cloud.in.guavus.com:9443’ (only till hostname and
> port) *
>
>
>
> *2019-08-11 04:43:33,208 DEBUG client.WebSocketClient
> (WebSocketClient.java:connect(372)) - connect websocket
> org.eclipse.jetty.websocket.jsr356.endpoints.EndpointInstance@3cd5ba15 to
> ws://rafint001-mgt-01.cloud.in.guavus.com:9443/*
>
> *2019-08-11 04:43:33,221 DEBUG component.ContainerLifeCycle
> (ContainerLifeCycle.java:addBean(347)) - HttpClient@aa3cf89{STARTED} added
> {HttpDestination[ws://rafint001-mgt-01.cloud.in.guavus.com:9443]@33202ded,queue=0,pool=null,MANAGED}*
>
> *2019-08-11 04:43:33,221 DEBUG component.AbstractLifeCycle
> (AbstractLifeCycle.java:setStarting(185)) - starting
> HttpDestination[ws://rafint001-mgt-01.cloud.in.guavus.com:9443]@33202ded,queue=0,pool=null*
>
>
>
>
>
> *whereas from UI to Knox this comes as
> ‘https://rafint001-mgt-01.cloud.in.guavus.com:8443/gateway/default/pdie/_sock/443/qu5llmni/websocket’.*
>
>
>
> *2019-08-11 04:43:33,060 DEBUG server.HttpChannel
> (HttpChannel.java:onRequest(691)) - REQUEST for
> //rafint001-mgt-01.cloud.in.guavus.com:8443/gateway/default/pdie/_sock/443/qu5llmni/websocket
> on
> HttpChannelOverHttp@7440214c{r=1,c=false,a=IDLE,uri=//rafint001-mgt-01.cloud.in.guavus.com:8443/gateway/default/pdie/_sock/443/qu5llmni/websocket,age=0}*
>
> *GET
> //rafint001-mgt-01.cloud.in.guavus.com:8443/gateway/default/pdie/_sock/443/qu5llmni/websocket
> HTTP/1.1*
>
> *Host: rafint001-mgt-01.cloud.in.guavus.com:8443^M*
>
> *Connection: Upgrade^M*
>
> *Pragma: no-cache^M*
>
> *Cache-Control: no-cache^M*
>
> *User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0)
> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36^M*
>
> *Upgrade: websocket^M*
>
> *Origin: https://rafint001-mgt-01.cloud.in.guavus.com:8443^M*
>
> *Sec-WebSocket-Version: 13^M*
>
> *Accept-Encoding: gzip, deflate, br^M*
>
> *Accept-Language: en-GB,en-US;q=0.9,en;q=0.8^M*
>
> *Cookie: KNOXSESSIONID=node0d0mr30gaibvc1y10buxlzrvkz0.node0;
> DEFAULT_UI=NEW^M*
>
> *Sec-WebSocket-Key: A/FYAfIjeAwb4QqqO2pepg==^M*
>
> *Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits*
>
>
>
> *Does this look right ? If not, please provide any pointers on how to
> debug this further.*
>
>
>
> </snip>
>
>
>
> Attaching a log file from a new instance run after fixing
> *certificate_unknown* error. Sorry about sending such big files. Please
> check.
>
>
>
> Regards,
>
> Rajat
>
>
>
> *From: *Rajat Goel <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Wednesday, 14 August 2019 at 8:20 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hi Sandeep,
>
>
>
> Thanks for the reply. I see that this exception is coming for connections
> from the UI Client to the Knox Gateway:
>
>
>
> *2019-08-11 04:43:27,432 DEBUG ssl.SslConnection
> (SslConnection.java:handshakeFailed(764)) - handshake failed
> SslConnection@1d3697f3::SocketChannelEndPoint@6d000fcf{/192.168.141.33:35102<->/192.168.141.31:8443,OPEN,fill=-,flush=-,to=2/300000}{io=0/0,kio=0,kro=1}->SslConnection@1d3697f3{NEED_UNWRAP,eio=0/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@20e999e9{/192.168.141.33:35102<->/192.168.141.31:8443,OPEN,fill=-,flush=-,to=99/300000}=>HttpConnection@264e99b9[p=HttpParser{s=START,0
> of
> -1},g=HttpGenerator@717c736d{s=START}]=>HttpChannelOverHttp@570d89da{r=0,c=false,a=IDLE,uri=null,age=0}
> javax.net.ssl.SSLException: Received fatal alert: certificate_unknown*
>
>
>
> Now, to fix this, do I have to add the Knox certificate to the cacerts on my
> UI Client machine (i.e. my laptop), or something else?
>
>
>
> One more query: for any SSL-enabled service to be integrated with Knox,
> is the same step, i.e. adding the Knox certificate to the UI Client’s
> truststore, mandatory?
>
>
>
> Regards,
>
> Rajat
>
>
>
> *From: *Sandeep Moré <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Wednesday, 14 August 2019 at 7:36 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hello Rajat,
>
>
>
> I see "javax.net.ssl.SSLException: Received fatal alert:
> certificate_unknown" errors in the logs; this appears to be the root cause.
>
> It appears to be a certificate issue.
>
>
>
> On Mon, Aug 12, 2019 at 1:51 PM Rajat Goel <[email protected]>
> wrote:
>
> Attaching full debug log as well.
>
>
>
> *From: *Rajat Goel <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Monday, 12 August 2019 at 10:50 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hi Sandeep,
>
>
>
> Thanks for your reply.
>
>
>
> Yes, the websocket traffic works as expected initially. The connection is
> established and data flows correctly. The websocket backend is secured as
> well. To check whether the issue is due to SSL, I disabled SSL on my service,
> and the issue was reproduced with the non-secure (non-SSL) setup as well.
>
>
>
> Attaching instance logs from one cycle of reproduction. This instance uses
> SSOCookieProvider (for SSO) and had DEBUG logs enabled only for
> org.apache.knox.gateway package. Full debug logs file instance is large in
> size so cannot send via mail but let me know if you need that as well. Also
> attaching thread dump of Knox for one particular instance when Knox threads
> were stuck.
>
>
>
> I don’t see any errors in service log files. The issue is reproducible
> every time so if you need any other information, please do let me know.
>
>
>
> Thanks & Regards,
>
> Rajat
>
>
>
> *From: *Sandeep Moré <[email protected]>
> *Reply to: *"[email protected]" <[email protected]>
> *Date: *Monday, 12 August 2019 at 7:37 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Knox jetty threads stuck and Seeing
> ConnectionPoolTimeoutException while proxying a custom service
>
>
>
> Hello Rajat,
>
>
>
> Do you see Websocket traffic work as expected the first time when the UI
> works ? can you check using the developer console if the initial Websocket
> connection was established and data is flowing correctly?
>
>
>
> Looks like you are connecting to a secure backend, is the Websocket
> backend secure as well ?
>
> Can you also post a redacted gateway.log file with DEBUG logging, it would
> be helpful to see the entire log file with the stacktrace.
>
>
>
> Also, do you see any errors in your service log files ?
>
>
>
> Best,
>
> Sandeep
>
>
>
>
>
> On Mon, Aug 12, 2019 at 1:08 AM Rajat Goel <[email protected]>
> wrote:
>
> Hello,
>
>
>
> I need some help while trying to integrate my custom service with Knox.
> Here’s how the service is deployed:
>
>
>
>                                +-> Knox (Instance 1) -+                       +-> My Service1 (Master)
>                                |                      |                       |
> UI Client -> Load Balancer ->  |                      | -> For Availability ->|
>              (HA Proxy)        |                      |    (HA Proxy)         |
>                                +-> Knox (Instance 2) -+                       +-> My Service2 (Standby)
>
>
>
>
>
> My custom service UI client uses both web sockets as well as REST calls to
> talk to My Service backend. I have written service definition xml and
> rewrite xml files and deployed them on a test setup as per the above
> architecture. My Knox version is 1.0 (HDP 3.1). I have added the service in
> Knox default topology and  default topology uses Shiro Provider for
> authentication and ‘gateway.websocket.feature.enabled’ is set to true.
>
>
>
> When Knox service is started, I see the following behavior:
>
>    - When I launch the UI from browser, I get Shiro authentication dialog
>    popup, I enter my credentials and get redirected to my Service UI.
>    Initially, for the first couple of minutes, UI works fine. All calls from
>    UI are getting routed properly to backend.
>    - After first few minutes, I start seeing following exception in Knox
>    gateway logs:
>
>                        2019-08-10 15:37:32,053 ERROR gateway.websockets
> (ProxyWebSocketAdapter.java:onWebSocketConnect(105)) - Unable to connect to
> websocket server: java.io.IOException: Connect failure
>
> java.io.IOException: Connect failure
>
> Caused by: org.eclipse.jetty.websocket.api.UpgradeException: 0 null
>
>         at
> org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.onComplete(WebSocketUpgradeRequest.java:515)
>
> Caused by: java.io.EOFException: HttpConnectionOverHTTP@1e8a6128
> ::SocketChannelEndPoint@5526bc43{
> rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443<->/
> 192.168.141.31:49860
> ,ISHUT,fill=-,flush=-,to=3/0}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@1e8a6128
> (l:/192.168.141.31:49860 <-> r:
> rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443,closed=false)=
> >HttpChannelOverHTTP@500cd5d(exchange=HttpExchange@75bf9768
> req=TERMINATED/null@null res=PENDING/null@null
> )[send=HttpSenderOverHTTP@6678aae8
> (req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@6dcd6c0
> {s=START}],recv=HttpReceiverOverHTTP@5358963(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0
> of -1}]]
>
>         ... 13 more
>
>
>
>    - UI starts slowing down and response times increase.
>    - I also see DEBUG exceptions such as:
>
> DEBUG io.FillInterest (FillInterest.java:onFail(134)) - onFail
> FillInterest@1d803962{null}
>
> java.util.concurrent.TimeoutException: Idle timeout expired: 300000/300000
> ms
>
>         at
> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>
>         at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
>         at java.lang.Thread.run(Thread.java:748)
>
>
>
>
>
> DEBUG io.WriteFlusher (WriteFlusher.java:onFail(471)) - ignored:
> WriteFlusher@3a2f4fc0{IDLE}->null
>
> java.nio.channels.ClosedChannelException
>
>         at org.eclipse.jetty.io.WriteFlusher.onClose(WriteFlusher.java:502)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:353)
>
>         at
> org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:216)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1132)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
>
>
>
> DEBUG ssl.SslConnection (SslConnection.java:flush(950)) -
> SslConnection@1cb9d1f9::SocketChannelEndPoint@6fa560e8{/
> 192.168.141.31:44258<->/192.168.141.31:8443
> ,ISHUT,fill=-,flush=-,to=8/300000}{io=0/0,kio=0,kro=1}->SslConnection@1cb9d1f9
> {NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@205f9957
> {/192.168.141.31:44258<->/192.168.141.31:8443
> ,CLOSED,fill=-,flush=-,to=9/300000}=>HttpConnection@11b5199c[p=HttpParser{s=CLOSED,0
> of -1},g=HttpGenerator@40706618{s=START}]=>HttpChannelOverHttp@655d0b98
> {r=0,c=false,a=IDLE,uri=null,age=0}
>
> java.io.IOException: Broken pipe
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:847)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doShutdownOutput(SslConnection.java:1076)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1131)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)
>
>         at
> org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)
>
>         at
> org.eclipse.jetty.io.AbstractConnection.close(AbstractConnection.java:248)
>
>         at
> org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:234)
>
>         at
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1551)
>
>         at
> org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:360)
>
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:250)
>
>         at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>
>         at
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>
>         at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
>
>
>
>    - A few minutes later, the UI hangs. I start seeing the following
>    exceptions in the Knox gateway log:
>
>                               2019-08-10 15:47:41,366 WARN  knox.gateway
> (DefaultDispatch.java:executeOutboundRequest(147)) - Connection exception
> dispatching request:
> https://rafint001-mgt-01.cloud.in.guavus.com:9443/_sock/iframe.html
> org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
> connection from pool
>
>         at
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:313)
>
>         at
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
>
>
>
>
>
>                                2019-08-10 15:47:41,371 ERROR knox.gateway
> (AbstractGatewayFilter.java:doFilter(63)) - Failed to execute filter:
> java.io.IOException: Service connectivity error.
>
> java.io.IOException: Service connectivity error.
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:148)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.doGet(DefaultDispatch.java:278)
>
>
>
>    - A few minutes later, when I refresh the UI page (after clearing any
>    cookies), I see an HTTP 401 error on the UI. This time there is no Shiro
>    authentication dialog popup. Nothing seems to work hereafter. I see only
>    the following logs in gateway.log:
>
>                      2019-08-10 15:40:10,667 DEBUG knox.gateway
> (GatewayFilter.java:doFilter(119)) - Received request: GET /pdie/cdap/
>
> 2019-08-10 15:40:10,667 DEBUG authc.BasicHttpAuthenticationFilter
> (BasicHttpAuthenticationFilter.java:sendChallenge(274)) - Authentication
> required: sending 401 Authentication challenge response.
>
>
>
>    - When I restart the Knox service from Ambari or run a ‘touch’ command on
>    the topology file (/etc/knox/conf/topologies/default.xml), the UI starts
>    working again and the above set of issues repeats.
>    - I tried increasing gateway threads (gateway.threadpool.max to 500)
>    and HTTP connections (gateway.httpclient.maxConnections to 100). With these
>    settings, the UI works fine for a little longer than with the default
>    values but eventually hits the same issues as above.
>    - When the UI hangs, I took a thread dump of the Knox service. I see that
>    many of the threads in Knox are stuck in socket read calls with the
>    following trace:
>
>                                   "qtp2099051403-139" #139 prio=5
> os_prio=0 tid=0x00007f4d7c002000 nid=0x6ed8 runnable [0x00007f4dee0e0000]
>
>    java.lang.Thread.State: RUNNABLE
>
>         at java.net.SocketInputStream.socketRead0(Native Method)
>
>         at
> java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>
>         at java.net.SocketInputStream.read(SocketInputStream.java:171)
>
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>
>         at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>
>         at sun.security.ssl.InputRecord.read(InputRecord.java:503)
>
>         at
> sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
>
>         - locked <0x00000006f471e218> (a java.lang.Object)
>
>         at
> sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
>
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>
>         - locked <0x00000006f471edc0> (a sun.security.ssl.AppInputStream)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>
>         at
> org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
>
>         at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>
>         at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>
>         at
> org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>
>         at
> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>
>         at
> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>
>         at
> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>
>         at
> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>
>         at
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>
>         at
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>
>         at
> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>
>         at
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>
>         at
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
>
>         at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:130)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
>
>         at
> org.apache.knox.gateway.dispatch.DefaultDispatch.doPost(DefaultDispatch.java:305)
>
>         at
> org.apache.knox.gateway.dispatch.GatewayDispatchFilter$PostAdapter.doMethod(GatewayDispatchFilter.java:177)
>
>         at
> org.apache.knox.gateway.dispatch.GatewayDispatchFilter.doFilter(GatewayDispatchFilter.java:122)
>
>         at
> org.apache.knox.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)
>
>         at
> org.apache.knox.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:372)
>
>         at
> org.apache.knox.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:272)
>
>
>
> Can someone please help with resolving the above set of issues:
>
>    1. With the deployment architecture described above (UI -> HAProxy
>    -> Knox -> HAProxy -> Service backend), are any specific configurations
>    required in HA Proxy and Knox to make this work? Is this the right way
>    to deploy?
>    2. What causes ‘org.eclipse.jetty.websocket.api.UpgradeException’ and how
>    can it be fixed?
>    3. How can the Knox Jetty thread hang be debugged and fixed?
>
>
>
> Thanks & Regards,
>
> Rajat
>
>
