[ 
https://issues.apache.org/jira/browse/NIFI-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582895#comment-16582895
 ] 

Diego Queiroz commented on NIFI-5522:
-------------------------------------

I can confirm this has nothing to do with the state of the queue or with the 
script itself. I assure you there is no way to recover from this error once it 
happens, except by restarting NiFi.

I would ask you to run this script several times (with higher numbers, like 
100k requests), and even from another machine, until you are able to fully 
stress your system. Keep in mind that I am using an average shared server, 
which does not perform like a high-end personal computer.

Also, you said you're using a Mac. I didn't mention it, but I am using Linux 
(Oracle Linux 7, specifically), so the two may behave differently. I am not a 
Mac expert, but I think you aren't using wget (AFAIK, wget isn't available in 
macOS by default). If you adapted my script to use curl, just make sure all 
requests are made at the same time, in parallel.
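
For reference, a curl adaptation could look roughly like the sketch below. It 
is only an approximation of the wget loop in the issue description: the helper 
name stress_http is made up for illustration, --max-time caps the total 
request time (wget -T10 sets per-phase timeouts instead), and --retry only 
retries errors that curl considers transient.

```shell
#!/bin/sh
# Sketch only: curl-based approximation of the wget stress loop.
# stress_http is a hypothetical helper; it takes a request count and a URL.
stress_http() {
    count="$1"
    url="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        # The trailing '&' is what makes the requests run in parallel.
        # --max-time 10 roughly mirrors wget -T10, --retry 10 roughly -t10.
        curl --silent --max-time 10 --retry 10 --output /dev/null "$url" &
        i=$((i + 1))
    done
    # Block until every background request has finished.
    wait
}
```

It would be called as: stress_http 10000 'http://127.0.0.1:64080/'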

If you still can't reproduce it after doing this, my guess is that it is 
related to a bug that was already fixed in master but is still present in the 
latest release (I was able to reproduce it on 1.6.0, 1.7.0 and 1.7.1). I also 
think this bug may be related to the Jetty backend. Maybe your Jetty version 
is different from mine?
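
One way to compare Jetty versions is to look at the jar file names shipped 
with each installation, since the version is normally part of the name. The 
sketch below assumes the NiFi install directory is passed in (for example via 
a NIFI_HOME variable, which is an assumption here) and simply searches it; 
list_jetty_jars is a made-up helper name, and the jars may live in different 
subdirectories depending on the NiFi layout.

```shell
#!/bin/sh
# Sketch only: list jetty-server jars found anywhere under a NiFi install dir.
# list_jetty_jars is a hypothetical helper; $1 is the directory to search.
list_jetty_jars() {
    # The jar name normally embeds the version: jetty-server-<version>.jar
    find "$1" -type f -name 'jetty-server-*.jar' 2>/dev/null | sort
}
```

It would be called as: list_jetty_jars "$NIFI_HOME"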

> HandleHttpRequest enters in fault state and does not recover
> ------------------------------------------------------------
>
>                 Key: NIFI-5522
>                 URL: https://issues.apache.org/jira/browse/NIFI-5522
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.7.0, 1.7.1
>            Reporter: Diego Queiroz
>            Priority: Critical
>              Labels: security
>         Attachments: HandleHttpRequest_Error_Template.xml, 
> image-2018-08-15-21-10-27-926.png, image-2018-08-15-21-10-33-515.png, 
> image-2018-08-15-21-11-57-818.png, image-2018-08-15-21-15-35-364.png, 
> image-2018-08-15-21-19-34-431.png, image-2018-08-15-21-20-31-819.png, 
> test_http_req_resp.xml
>
>
> HandleHttpRequest randomly enters a fault state and does not recover until 
> I restart the node. I suspect the problem is triggered when some exception 
> occurs (e.g. broken request, connection issues, etc.), but I am usually able 
> to reproduce this behavior by stressing the node with tons of simultaneous 
> requests:
> {{# example script to stress server}}
>  {{for i in `seq 1 10000`; do}}
>  {{   wget -T10 -t10 -qO- 'http://127.0.0.1:64080/' >/dev/null &}}
>  {{done}}
> When this happens, HandleHttpRequest starts returning "HTTP ERROR 503 - 
> Service Unavailable" and does not recover from this state:
> !image-2018-08-15-21-10-33-515.png!
> If I try to stop the HandleHttpRequest processor, the running threads do 
> not terminate:
> !image-2018-08-15-21-11-57-818.png!
> If I force them to terminate, the listening port remains bound by NiFi:
> !image-2018-08-15-21-15-35-364.png!
> If I try to connect again, I get an HTTP ERROR 500:
> !image-2018-08-15-21-19-34-431.png!
>  
> If I try to start the HandleHttpRequest processor again, it fails to start 
> with the message:
> ERROR [Timer-Driven Process Thread-11] o.a.n.p.standard.HandleHttpRequest 
> HandleHttpRequest[id=9bae326b-5ac3-3e9f-2dac-c0399d8f2ddb] Failed to process 
> session due to org.apache.nifi.processor.exception.ProcessException: Failed 
> to initialize the server: 
> org.apache.nifi.processor.exception.ProcessException: Failed to initialize 
> the server
> org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:501)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:433)
>     at sun.nio.ch.Net.bind(Net.java:425)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:298)
>     at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
>     at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.eclipse.jetty.server.Server.doStart(Server.java:431)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.initializeServer(HandleHttpRequest.java:430)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:489)
>     ... 11 common frames omitted
> !image-2018-08-15-21-20-31-819.png!
>  
> The only way to work around this when it happens is to change the port it 
> listens on or to restart the NiFi service. I flagged this as a security 
> issue because it allows someone to cause a DoS against the service.
> I found several similar issues, but most of them relate to old versions. I 
> can confirm this affects versions 1.7.0 and 1.7.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
