[ https://issues.apache.org/jira/browse/NIFI-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590726#comment-16590726 ]

Mark Payne commented on NIFI-5522:
----------------------------------

[~Diego Queiroz] thanks for the detailed info in this ticket. I have been able 
to reproduce this issue (extremely consistently - so far 100% of the time) by 
following these steps:

1) Build two flows: the first is HandleHttpRequest -> HandleHttpResponse (reply 
with 200 status code). The second is GenerateFlowFile (generate 1 MB FlowFiles) 
-> PostHTTP (configure the URL to point to the HandleHttpRequest processor)

2) Set a breakpoint in PostHTTP, line 607, where it calls StreamUtils.copy(). 
Attach debugger.

3) Start all Processors.

4) Step into StreamUtils.copy. Step to line 36, where it calls 
{{destination.write(buffer, 0, len);}}. Before running that line, use the 
debugger to change the value of the {{len}} variable from {{8192}} to {{1}} and 
then resume the application. This results in a {{Content-Length}} header of 
1 MB being written out while less data than that is actually sent (see the 
sketches after these steps). As a result, the {{HandleHttpRequest}} processor 
will block, waiting for that data to become available. Eventually, it will 
time out.

5) Once the timeout occurs, the Jetty Server gets into a bad state that it does 
not recover from. Stack trace for HandleHttpRequest is identical to that in 
NIFI-5132.
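
For anyone following along without the NiFi source open, the copy that gets 
interrupted in step 4 is essentially a standard buffered stream copy, roughly 
along these lines (a paraphrase for illustration, not the exact 
{{StreamUtils}} code):

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopySketch {

    // Copies everything from source to destination in 8 KB chunks.
    // Shrinking 'len' to 1 in the debugger just before the write() call means
    // one chunk is written almost entirely short, so the total number of bytes
    // sent ends up smaller than the Content-Length that was already declared.
    public static long copy(final InputStream source, final OutputStream destination) throws IOException {
        final byte[] buffer = new byte[8192];
        long totalWritten = 0;
        int len;
        while ((len = source.read(buffer)) > 0) {
            destination.write(buffer, 0, len); // <-- the call manipulated in step 4
            totalWritten += len;
        }
        return totalWritten;
    }
}
{code}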

At this point, regardless of what we do with PostHTTP, or any other processor, 
HandleHttpRequest is blocked indefinitely and won't recover.
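
The same on-the-wire condition can also be produced without a debugger by 
hand-rolling a POST whose {{Content-Length}} promises more bytes than are ever 
sent and then holding the connection open. A rough sketch (the host, port, and 
sizes here are just assumptions matching the flow above, and I haven't 
verified that this drives Jetty into exactly the same unrecoverable state as 
the debugger approach):

{code:java}
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class TruncatedBodyClient {

    public static void main(final String[] args) throws Exception {
        // Assumed host/port of the HandleHttpRequest listener.
        try (final Socket socket = new Socket("127.0.0.1", 64080)) {
            final OutputStream out = socket.getOutputStream();

            // Declare a 1 MB request body...
            final String headers = "POST / HTTP/1.1\r\n"
                    + "Host: 127.0.0.1:64080\r\n"
                    + "Content-Type: application/octet-stream\r\n"
                    + "Content-Length: 1048576\r\n"
                    + "\r\n";
            out.write(headers.getBytes(StandardCharsets.US_ASCII));

            // ...but send only 1 KB of it, then keep the connection open so the
            // server blocks waiting for the remaining bytes until it times out.
            out.write(new byte[1024]);
            out.flush();
            Thread.sleep(120_000L);
        }
    }
}
{code}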

Unfortunately, this appears to be a bug in Jetty. I can also verify that this 
issue did not occur in older versions of NiFi, before Jetty was upgraded.

Now for the good news! I pulled down PR 2961 (for NIFI-5479). I did a full 
build of that PR and repeated the above steps (numerous times) and have not 
been able to cause the poor behavior. The upgrade to the latest version of 
Jetty appears to have addressed this bug. I've not fully finished reviewing PR 
2961 yet, but from my testing so far all is looking great.

> HandleHttpRequest enters in fault state and does not recover
> ------------------------------------------------------------
>
>                 Key: NIFI-5522
>                 URL: https://issues.apache.org/jira/browse/NIFI-5522
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.7.0, 1.7.1
>            Reporter: Diego Queiroz
>            Priority: Critical
>              Labels: security
>         Attachments: HandleHttpRequest_Error_Template.xml, 
> image-2018-08-15-21-10-27-926.png, image-2018-08-15-21-10-33-515.png, 
> image-2018-08-15-21-11-57-818.png, image-2018-08-15-21-15-35-364.png, 
> image-2018-08-15-21-19-34-431.png, image-2018-08-15-21-20-31-819.png, 
> test_http_req_resp.xml
>
>
> HandleHttpRequest randomly enters a fault state and does not recover until 
> I restart the node. I suspect the problem is triggered when some exception 
> occurs (e.g. broken request, connection issues, etc.), but I am usually able 
> to reproduce this behavior by stressing the node with tons of simultaneous 
> requests:
> {{# example script to stress the server}}
>  {{for i in `seq 1 10000`; do}}
>  {{   wget -T10 -t10 -qO- 'http://127.0.0.1:64080/' >/dev/null &}}
>  {{done}}
> When this happens, HandleHttpRequest starts to return "HTTP ERROR 503 - 
> Service Unavailable" and does not recover from this state:
> !image-2018-08-15-21-10-33-515.png!
> If I try to stop the HandleHttpRequest processor, the running threads do 
> not terminate:
> !image-2018-08-15-21-11-57-818.png!
> If I force them to terminate, the listen port continues to be bound by NiFi:
> !image-2018-08-15-21-15-35-364.png!
> If I try to connect again, I get an HTTP ERROR 500:
> !image-2018-08-15-21-19-34-431.png!
>  
> If I try to start the HandleHttpRequest processor again, it fails to start 
> with the message:
> {code}
> ERROR [Timer-Driven Process Thread-11] o.a.n.p.standard.HandleHttpRequest HandleHttpRequest[id=9bae326b-5ac3-3e9f-2dac-c0399d8f2ddb] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server: org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
> org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:501)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:433)
>     at sun.nio.ch.Net.bind(Net.java:425)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:298)
>     at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
>     at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.eclipse.jetty.server.Server.doStart(Server.java:431)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.initializeServer(HandleHttpRequest.java:430)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:489)
>     ... 11 common frames omitted
> {code}
> !image-2018-08-15-21-20-31-819.png!
>  
> The only way to work around this when it happens is to change the port it 
> listens on or to restart the NiFi service. I flagged this as a security issue 
> because it allows someone to cause a DoS on the service.
> I found several similar issues, but most of them are related to older 
> versions; I can confirm this affects versions 1.7.0 and 1.7.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
