Andre,

In that case, I agree with you that 4 would be the proper response. Things
that I can think of that may cause it not to respond:

1) A long garbage collection pause
2) Stuck in some sort of infinite loop, or the CPU is just way overtaxed
3) Too many open files, which prevents it from accepting the connection

Not sure what else may cause this...
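
For what it's worth, here is a rough sketch of the mapping I have in mind.
The class, method, and parameter names are made up for illustration and are
not taken from RunNiFi.java or the PR. It also ignores LSB codes 1 and 2
(dead with a stale pid or lock file) to keep things short.

```java
// Hypothetical sketch only: none of these names come from RunNiFi.java or PR#1093.
final class LsbStatusSketch {
    // Exit codes taken from the LSB "status" table quoted further down in this thread.
    static final int RUNNING_OK = 0;      // program is running or service is OK
    static final int NOT_RUNNING = 3;     // program is not running
    static final int STATUS_UNKNOWN = 4;  // program or service status is unknown

    /**
     * Maps two observations to an LSB status exit code.
     *
     * @param pidAlive       assumed input: whether the NiFi JVM process is alive
     * @param respondsToPing assumed input: whether that process answered a ping
     */
    static int exitCode(boolean pidAlive, boolean respondsToPing) {
        if (!pidAlive) {
            return NOT_RUNNING;
        }
        // The process is alive; without a ping reply we cannot call it healthy.
        return respondsToPing ? RUNNING_OK : STATUS_UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(exitCode(true, true));   // alive and answering pings
        System.out.println(exitCode(true, false));  // alive but silent
        System.out.println(exitCode(false, false)); // no process at all
    }
}
```

The point is only that "alive but silent" gets its own code instead of being
folded into 0, so a caller checking the status can tell the two cases apart.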

Thanks
-Mark

> On Oct 14, 2016, at 9:08 PM, Andre <andre-li...@fucs.org> wrote:
> 
> devs,
> 
> I am reviewing PR#1093, which happens to be a great contribution towards an
> LSB-compliant NiFi (something the overall community seems to be eager to
> have).
> 
> The PR basically changes RunNiFi.java so that it returns a numeric exit
> code compatible with the LSB specifications.
> 
> I am happy with the overall code but there's one sticking point:
> 
> Should we return 0 (i.e. "healthy") when "Apache NiFi is running at PID {}
> but is not responding to ping requests"?
> 
> The LSB defines:
> 
> "
> If the status action is requested, the init script will return the
> following exit status codes.
> 
> 0 program is running or service is OK
> 1 program is dead and /var/run pid file exists
> 2 program is dead and /var/lock lock file exists
> 3 program is not running
> 4 program or service status is unknown
> 5-99 reserved for future LSB use
> 100-149 reserved for distribution use
> 150-199 reserved for application use
> 200-254 reserved
> "
> 
> My reading is that we should return 4: the JVM PID is currently running,
> but the absence of a ping response could signal that the NiFi program
> running within the JVM is not healthy. (The PR contribution returns 0.)
> 
> Would anyone have a view on what usually would cause a NiFi instance to be
> "running" but unable to respond to pings? Whenever that happens should we
> return 0 (running/service ok) or 4 (program/service status unknown)?
> 
> I thank you in advance
