Hi David,

Pull request 177 <https://github.com/qos-ch/logback/pull/177> has a fairly thorough implementation of a shutdown hook that handles flushing the queue as discussed (a queue idle period, used by the hook to decide that other threads are done logging, and a maximum runtime, to prevent an excessive delay during shutdown). I added a few unit tests to cover the queue idle period, the maximum runtime, and verification of the worker and hook threads. I just thought of one more unit test that should be added (to verify shutdown hook installation and removal) - I'll add that tomorrow.
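In rough terms, the hook behaves like the sketch below. This is an illustrative sketch only, not the code from pull request 177; the names queueIdleTime and maxRuntime and the poll-based idle detection are assumptions about one way such a hook can work.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    // Hypothetical sketch - not the actual pull request 177 code.
    class QueueFlushingShutdownHook<E> extends Thread {

        interface EventSink<E> { void append(E e); } // stands in for appendLoopOnAppenders

        private final BlockingQueue<E> queue;
        private final EventSink<E> sink;
        private final long queueIdleTime; // ms without new events before assuming logging is done
        private final long maxRuntime;    // ms hard cap so the hook never delays shutdown excessively

        QueueFlushingShutdownHook(BlockingQueue<E> queue, EventSink<E> sink,
                                  long queueIdleTime, long maxRuntime) {
            this.queue = queue;
            this.sink = sink;
            this.queueIdleTime = queueIdleTime;
            this.maxRuntime = maxRuntime;
        }

        @Override
        public void run() {
            final long start = System.currentTimeMillis();
            long lastEvent = start;
            while (true) {
                long now = System.currentTimeMillis();
                if (now - start >= maxRuntime) {
                    break; // out of time: hand control back to the JVM shutdown sequence
                }
                if (now - lastEvent >= queueIdleTime) {
                    break; // queue has been idle: other shutdown hooks are probably done logging
                }
                try {
                    E e = queue.poll(50, TimeUnit.MILLISECONDS);
                    if (e != null) {
                        sink.append(e);
                        lastEvent = System.currentTimeMillis();
                    }
                } catch (InterruptedException ie) {
                    break;
                }
            }
        }
    }

The appender would register such a hook via Runtime.getRuntime().addShutdownHook(hook) on start and remove it via Runtime.getRuntime().removeShutdownHook(hook) on a clean stop.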
Regards,

Mike Reinhold

On Thu, Mar 6, 2014 at 2:16 PM, David Roussel <[email protected]> wrote:

> Sounds good. Create a pull request and link back to it from this thread.
>
> On 6 Mar 2014, at 18:48, Michael Reinhold <[email protected]> wrote:
>
> Hi David, Chris,
>
> Thanks for the feedback. In the test case that I ran with the AsyncAppender, LoggerContext.stop did not wait for the async appender to finish flushing before exiting - the stop method returned immediately. I just retested this scenario to be sure. My test should generate exactly 135 log events (also going to a local file appender, so I can confirm whether entries are present or missing). Without the async appender I get all of the expected events (at much lower throughput); with the async appender (using the call to LoggerContext.stop prior to exiting) I get a variable number (generally fewer than 20 events). It's possible that the lower latency of flushing to local files lets the async appender flush completely to disk before the JVM exits, whereas the latency to Loggly is simply too high. Queue size probably also has an impact - I allow for a fairly large queue because the penalty of falling back to synchronous logging for Loggly is large, and my logging tends to come in bursts.
>
> If stop() did wait for async appenders to flush, that would improve the situation, although the use case involving shutdown hooks that want to log would still need to be addressed. The only approach I can see that allows shutdown hooks to log is a queue-flushing shutdown hook (requiring a call to LoggerContext.stop won't help for shutdown hooks or for applications that rely on implicit JVM shutdown). My opinion is that a shutdown hook can satisfy both needs with minimal impact on other classes or on the semantics of LoggerContext.stop. One hook per async appender would be simple to implement and would also allow async appender queues to be flushed in parallel.
>
> A default shutdown timeout as you described makes sense to me and fits with what I was thinking for the implementation of a shutdown hook.
>
> I could probably throw together a quick and dirty implementation tonight as a proof of concept. Thoughts?
>
> Regards,
>
> Mike Reinhold
>
> On Thu, Mar 6, 2014 at 1:01 PM, Chris Pratt <[email protected]> wrote:
>
>> Another option might be to require calling LoggerContext.stop when using AsyncAppenders. The stop method could signal to the AsyncAppenders that they should drain their queues and exit, which would allow the application to shut down naturally (if the AsyncAppenders aren't started as daemons). (*Chris*)
>>
>> On Thu, Mar 6, 2014 at 9:55 AM, David Roussel <[email protected]> wrote:
>>
>>> Mike,
>>>
>>> I would expect the shutdown of logback to wait for async appenders to drain their queues before we can consider the shutdown done.
>>>
>>> I have used async appenders to provide faster logging when writing to local files, and that's what I expected to happen.
>>>
>>> Of course there is the problem of waiting too long. How about giving an async appender a default shutdown timeout which can be overridden by config? It could wait 2 seconds by default, but be overridden to 30 seconds for the Loggly use case.
>>>
>>> The simple solution is to put this all in AsyncAppender(Base), but if we want to allow multiple async appenders to shut down in parallel, then the fix might involve more changes.
>>>
>>> David
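For concreteness, David's bounded-wait idea can be sketched without any logback specifics, as below. The maxFlushTime name and the join-with-timeout approach are assumptions here, not logback API at the time of this thread.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical demo: stop() waits for the worker to drain the queue,
    // but never longer than a configurable maxFlushTime.
    class TimedDrainDemo {
        static final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(256);
        static volatile boolean started = true;
        static long maxFlushTime = 2000; // ms; imagine 30000 configured for the Loggly case

        static final Thread worker = new Thread() {
            public void run() {
                while (started) {
                    try { append(queue.take()); }
                    catch (InterruptedException ie) { break; }
                }
                for (String e : queue) append(e); // flush whatever is left
            }
        };

        static void append(String e) { System.out.println(e); }

        static void stop() throws InterruptedException {
            started = false;
            worker.interrupt();        // unblock take()
            worker.join(maxFlushTime); // bounded wait, so shutdown can never hang indefinitely
        }

        public static void main(String[] args) throws InterruptedException {
            worker.setDaemon(true);
            worker.start();
            for (int i = 0; i < 100; i++) queue.put("event " + i);
            stop(); // with the bounded join, the queued events still get flushed
        }
    }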
>>> On 6 Mar 2014, at 12:51, Michael Reinhold <[email protected]> wrote:
>>>
>>> Hi David,
>>>
>>> I forgot to mention this in my original note, and it is a good thought, but yes, I have tried to shut down Logback cleanly. Even when the LoggerContext stop method is invoked prior to JVM exit, the async queue is not flushed completely. When Logback is stopped cleanly, the async worker thread exits the "waiting for events in the queue" while loop and moves on to the "flush remaining elements from the queue" for loop (confirmed by the info message when debug is enabled, and via breakpoint). Ultimately, because the thread is a daemon thread, this process is still interrupted by JVM shutdown. There is no guarantee that the async worker will get enough time to process all events.
>>>
>>> From a theoretical standpoint, stopping Logback properly does not solve flushing issues with the AsyncAppender in a number of cases (it may be fair to argue that it should not attempt to cover all of these cases):
>>>
>>> - Queue is relatively full or wrapped appender latency is high - the async worker may not be able to finish processing the queued items between when the LoggerContext.stop method is invoked and when the JVM actually exits.
>>> - Implicit JVM exit upon end of the last user thread - some applications don't explicitly call System.exit but rather rely on implicit JVM exit (a minimal illustration of this case follows below). In that case, where would you cleanly shut down Logback? It can probably be done, but would require an application redesign to avoid reliance on implicit exit, or some form of thread-tracking thread.
>>> - Shutdown hooks are used to clean up resources - some applications have components or modules that need to clean up after themselves when the JVM shuts down. Typically, shutdown hooks are used for this purpose; however, the AsyncAppender's worker thread is unable to process queued events after shutdown has been initiated (while the shutdown hooks are running). This also prevents shutdown hooks from being able to log events.
>>> - Signal handling on *nix systems - signals can be sent to the application for a number of reasons, most often to request graceful shutdown or to terminate something that is non-responsive. If the signal corresponds to something that generally means "graceful shutdown" on the host system, normal cleanup routines should execute in the application (typically implemented as shutdown hooks, so maybe this is the same as the previous case).
>>>
>>> From a practical standpoint, I am running into the first and the third scenarios. Even shutting down cleanly, the latency and queue depth can cause elements to be missed; additionally, I have shutdown hooks that clean up resources and expect the ability to log events and errors.
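The implicit-exit scenario in the list above is easy to reproduce without logback at all. In this hypothetical demo, a daemon consumer stands in for the async worker; when main returns, the JVM exits and the tail of the queue is silently dropped.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical demo of the implicit-exit case: no System.exit, no stop() call.
    class ImplicitExitDemo {
        public static void main(String[] args) throws InterruptedException {
            final BlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(1000);
            Thread worker = new Thread() {
                public void run() {
                    try {
                        while (true) {
                            Integer e = queue.take();
                            Thread.sleep(5); // simulate wrapped-appender latency (e.g. Loggly)
                            System.out.println("appended " + e);
                        }
                    } catch (InterruptedException ie) {
                        // abandoned on JVM exit
                    }
                }
            };
            worker.setDaemon(true);
            worker.start();
            for (int i = 0; i < 100; i++) queue.put(i); // burst of logging
            // main returns here: implicit JVM exit. Typically only a handful of
            // "appended" lines make it out before the daemon is abandoned.
        }
    }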
>>> My thought is that a modified AsyncAppenderBase implementation could (optionally) install a shutdown hook that continues watching the queue for new events. A configurable "queue idle time" could be used to tell the shutdown hook that it has probably caught all events and can safely shut down (not foolproof - I'm still thinking about other ways of handling it). Additionally, the hook could have a configurable "max processing time" so that it doesn't keep the JVM alive for an undesirable amount of time (thinking in the context of automation software, where shutdown requests typically have a max execution time before they are considered "failed online").
>>>
>>> I hope this helps clarify the scenario I am considering and a proposed solution!
>>>
>>> Regards,
>>>
>>> Mike Reinhold
>>>
>>> On Wed, Mar 5, 2014 at 5:30 PM, David Roussel <[email protected]> wrote:
>>>
>>>> Did you try shutting down logback cleanly? Like this: http://stackoverflow.com/questions/3678755/do-i-need-to-flush-events-when-shutting-down-using-logback
>>>>
>>>> David
>>>>
>>>> On 5 Mar 2014, at 20:44, Michael Reinhold <[email protected]> wrote:
>>>>
>>>> Hi Ceki,
>>>>
>>>> I am currently using the AsyncAppender in combination with the LogglyAppender from the Logback extensions project. While working on a number of aspects of my application, I found that I was not consistently getting all of the log messages that I expected. In particular, when the application shuts down immediately (or very shortly) after a burst of logging activity, the tail of those log events is often not present in Loggly. From a number of tests, this issue is not restricted to use with the LogglyAppender, but is simply more evident because of the latency involved.
>>>>
>>>> Looking through the source code for the AsyncAppenderBase, I saw the following.
>>>>
>>>> You create the async sender thread as a daemon thread:
>>>>
>>>>     addInfo("Setting discardingThreshold to " + discardingThreshold);
>>>>     worker.setDaemon(true);
>>>>     worker.setName("AsyncAppender-Worker-" + worker.getName());
>>>>     // make sure this instance is marked as "started" before starting the worker Thread
>>>>     super.start();
>>>>     worker.start();
>>>>
>>>> In the sender thread, once you determine that the parent appender has stopped or the async sender thread has been interrupted, you allow the worker thread to flush the remaining log events in the queue:
>>>>
>>>>     while (parent.isStarted()) {
>>>>       try {
>>>>         E e = parent.blockingQueue.take();
>>>>         aai.appendLoopOnAppenders(e);
>>>>       } catch (InterruptedException ie) {
>>>>         break;
>>>>       }
>>>>     }
>>>>
>>>>     addInfo("Worker thread will flush remaining events before exiting. ");
>>>>     for (E e : parent.blockingQueue) {
>>>>       aai.appendLoopOnAppenders(e);
>>>>     }
>>>>
>>>>     aai.detachAndStopAllAppenders();
>>>>
>>>> From what I can tell, during JVM shutdown (for a standalone SE instance; the same is probably not true for EE instances running in application servers), daemon threads may be terminated without allowing the AsyncAppenderBase to escape the while loop and proceed to the queue-flushing for loop.
>>>>
>>>> From Brian Goetz in Java Concurrency in Practice: "When a thread exits, the JVM performs an inventory of running threads, and if the only threads that are left are daemon threads, it initiates an orderly shutdown. When the JVM halts, any remaining daemon threads are abandoned: finally blocks are not executed, stacks are not unwound; the JVM just exits. Daemon threads should be used sparingly; few processing activities can be safely abandoned at any time with no cleanup. In particular, it is dangerous to use daemon threads for tasks that might perform any sort of I/O. Daemon threads are best saved for "housekeeping" tasks, such as a background thread that periodically removes expired entries from an in-memory cache."
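The abandonment described in the quote is straightforward to demonstrate in isolation. In this hypothetical demo, the daemon's finally block never runs when the JVM halts:

    // Hypothetical demo: finally blocks in daemon threads do not run on JVM exit.
    class DaemonFinallyDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread daemon = new Thread() {
                public void run() {
                    try {
                        Thread.sleep(60000);
                    } catch (InterruptedException ie) {
                        // never reached in this demo
                    } finally {
                        System.out.println("finally ran"); // never printed on implicit exit
                    }
                }
            };
            daemon.setDaemon(true);
            daemon.start();
            Thread.sleep(100);
            // main ends, the JVM halts, and the daemon is abandoned mid-sleep:
            // "finally ran" does not appear.
        }
    }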
>>>> To test this, I inserted a breakpoint in the AsyncAppenderBase at the call to addInfo and ran a variety of different scenarios. At no point was I able to get the breakpoint to stop at the addInfo line.
>>>>
>>>> I don't think there are any clear-cut solutions to this. Making the worker thread a user thread instead of a daemon thread has its own implications, particularly that if all other user threads have exited, the async threads would keep the JVM instance alive (unless System.exit has been called, in which case I believe you will still lose log events even if the async processing thread is not a daemon). It might be possible to create a shutdown hook that does the queue flushing for the async worker - though you may need to consider the possibility of other shutdown hooks wanting to log events as well.
>>>>
>>>> I'm currently mocking up a version of the AsyncAppenderBase and AsyncAppender that installs a shutdown hook as described previously. I think a "queue idle time" configuration might be the best way to handle other shutdown hooks adding log events (i.e., after processing whatever was in the queue, if no new events are added within x ms, then the shutdown hook can assume nothing else will be adding log events and can exit). An alternative might be to have the shutdown hook query the ThreadMXBean API to determine whether other shutdown hooks are still executing and possibly adding log events (though the threads that are expected to be running during shutdown may be not only application-specific but also JVM-implementation-specific... I'm not sure).
>>>>
>>>> Let me know what you think. I'll let you know if I feel that my mockup may be viable...
>>>>
>>>> Regards,
>>>>
>>>> Mike Reinhold
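For what it's worth, the ThreadMXBean probe mentioned above might look like the sketch below. The class and method names are hypothetical; only the java.lang.management calls are the standard API. As the message says, which threads are alive during shutdown is application- and JVM-specific, so treating any runnable thread as a potential logger is only a heuristic.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Hypothetical heuristic a flushing hook could poll between queue checks.
    class ShutdownThreadProbe {
        static boolean othersMayStillLog() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            long self = Thread.currentThread().getId();
            for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
                if (info == null || info.getThreadId() == self) continue;
                if (info.getThreadState() == Thread.State.RUNNABLE) {
                    return true; // something is still running and might enqueue events
                }
            }
            return false;
        }
    }

Note that this also counts JVM-internal and otherwise unrelated runnable threads, which is exactly the implementation-specific uncertainty raised above.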
_______________________________________________
Logback-user mailing list
[email protected]
http://mailman.qos.ch/mailman/listinfo/logback-user
