[ 
https://issues.apache.org/jira/browse/NIFI-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898390#comment-15898390
 ] 

Dima Kovalyov commented on NIFI-2699:
-------------------------------------

I hit this problem all the time when I work with large NiFi flows: 6 groups, 
up to 8 processors in each, with 30000+ flow files in queue.

> Improve handling of response timeouts in cluster
> ------------------------------------------------
>
>                 Key: NIFI-2699
>                 URL: https://issues.apache.org/jira/browse/NIFI-2699
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Core UI
>            Reporter: Jeff Storck
>            Priority: Minor
>
> When running as a cluster, if a node is unable to respond within the socket 
> timeout (e.g., hitting a breakpoint while debugging), an 
> IllegalClusterStateException will be thrown that causes the UI to show the 
> "check config and fix errors" page.  Once the node is communicating with the 
> cluster again (i.e., breakpoint in the code is passed), the UI can be 
> reloaded and the cluster recovers from the timeout without any user 
> intervention at the service level. However, user experience could be 
> improved.  If a user initiates a replicated request to a node that is unable 
> to respond within the socket timeout duration, the user might think NiFi 
> crashed, when it in fact didn't.
> Here is the stack trace that was encountered during testing:
> {code}
> 2016-08-29 11:36:59,041 DEBUG [NiFi Web Server-22] 
> o.a.n.w.a.c.IllegalClusterStateExceptionMapper
> org.apache.nifi.cluster.manager.exception.IllegalClusterStateException: Node 
> localhost:8443 is unable to fulfill this request due to: Unexpected Response 
> Code 500
>         at 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$2.onCompletion(ThreadPoolRequestReplicator.java:471)
>  ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>         at 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:729)
>  ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_92]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_92]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
> Caused by: com.sun.jersey.api.client.ClientHandlerException: 
> java.net.SocketTimeoutException: Read timed out
>         at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
>  ~[jersey-client-1.19.jar:1.19]
>         at com.sun.jersey.api.client.Client.handle(Client.java:652) 
> ~[jersey-client-1.19.jar:1.19]
>         at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682) 
> ~[jersey-client-1.19.jar:1.19]
>         at 
> com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) 
> ~[jersey-client-1.19.jar:1.19]
>         at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:560) 
> ~[jersey-client-1.19.jar:1.19]
>         at 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:537)
>  ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>         at 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:720)
>  ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>         ... 5 common frames omitted
> Caused by: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method) 
> ~[na:1.8.0_92]
>         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) 
> ~[na:1.8.0_92]
>         at java.net.SocketInputStream.read(SocketInputStream.java:170) 
> ~[na:1.8.0_92]
>         at java.net.SocketInputStream.read(SocketInputStream.java:141) 
> ~[na:1.8.0_92]
>         at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) 
> ~[na:1.8.0_92]
>         at sun.security.ssl.InputRecord.read(InputRecord.java:503) 
> ~[na:1.8.0_92]
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) 
> ~[na:1.8.0_92]
>         at 
> sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) 
> ~[na:1.8.0_92]
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) 
> ~[na:1.8.0_92]
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) 
> ~[na:1.8.0_92]
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) 
> ~[na:1.8.0_92]
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:345) 
> ~[na:1.8.0_92]
>         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) 
> ~[na:1.8.0_92]
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) 
> ~[na:1.8.0_92]
>         at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
>  ~[na:1.8.0_92]
>         at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
>  ~[na:1.8.0_92]
>         at 
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) 
> ~[na:1.8.0_92]
>         at 
> sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
>  ~[na:1.8.0_92]
>         at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
>  ~[jersey-client-1.19.jar:1.19]
>         at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
>  ~[jersey-client-1.19.jar:1.19]
>         ... 11 common frames omitted
> {code}
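
The root cause at the bottom of the trace, a java.net.SocketTimeoutException
on a socket read, can be reproduced outside NiFi. The following is only a
minimal sketch using plain JDK sockets, not NiFi's replication code: the class
name ReadTimeoutDemo, the method readFromSilentPeer, and the 200 ms timeout
are hypothetical and chosen for illustration. A local server accepts the
connection but never writes a response, standing in for a node paused at a
breakpoint.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; it does not use or mirror any NiFi API.
public class ReadTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection but never
     * writes a response, and reports how the blocking read ended.
     */
    public static String readFromSilentPeer(int readTimeoutMillis) throws IOException {
        // Port 0 asks the OS for any free port on the loopback interface.
        try (ServerSocket silentNode = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentNode.getInetAddress(), silentNode.getLocalPort())) {
            // This is the read timeout, not the connect timeout.
            client.setSoTimeout(readTimeoutMillis);
            // Keep the accepted socket open so the read cannot end with EOF.
            try (Socket accepted = silentNode.accept()) {
                InputStream in = client.getInputStream();
                in.read(); // blocks until data arrives or the timeout expires
                return "responded";
            } catch (SocketTimeoutException e) {
                // The peer is still connected; it just did not answer in time.
                return "timed out";
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFromSilentPeer(200));
    }
}
```

The point of the sketch is that the timeout says nothing about whether the
peer is down: the connection is still open when the exception is thrown, which
matches the description above, where the node resumes normally once it is able
to respond.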



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
