[ 
https://issues.apache.org/jira/browse/LOG4J2-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823030#comment-15823030
 ] 

Charles Hache commented on LOG4J2-1784:
---------------------------------------

Thanks for the prompt replies!

Changing the configuration status to Trace is a good idea that I overlooked in 
my haste.  Thanks for the tip.  Simply discovering that the logger was 
affecting the application was a revelation that took a long time to realize, 
especially since the symptoms it caused are the same ones I'm trying to debug 
(namely funny inconsistent timing in written CSV files).

I did see the notes about renaming in the documentation, but I naively assumed 
that by choosing a max value that was large enough to never be hit I would 
sidestep any performance issues (since if the max is never reached, no renaming 
is ever done).  Of course upon reflection it's obvious that any such 
implementation would have to look at the potential existence of the files 
before creating the new one.

Now that I have a better understanding of how it is working, I can see that 
there isn't really a way to safely get around the issue completely 
(unreasonably large max sizes will always have the potential to cause this 
problem).

I propose a few rough changes to the docs which, to me at least, would have 
made things a bit clearer:

"When the ascending attribute is set to true (the default) then the counter 
will be incremented and the current log file will be renamed to include the 
counter value."  This ideally would more clearly indicate that a counter is not 
really being incremented, but that all the files in the directory are going to 
be looked at and a new counter will be calculated.  Maybe sometime like 
"...then a new counter will be calculated to be one larger than the existing 
largest in the target directory..."  Maybe that's a bit verbose, but something 
along those lines.

"Note that with this counting strategy specifying a large maximum value may 
entirely avoid renaming files."  This could be amended to "...may entirely 
avoid renaming files, but note that the logging system needs to check for the 
possible existence of log files up to the maximum value which could cause 
performance issues for very large maximum values."

Finally "Given that this rollover algorithm requires as many file renaming 
operations as the window size, large window sizes are discouraged."  This could 
be changed to specify that it has to do file lookups as well (not just 
potential renaming), so maybe like "Given that this rollover algorithm requires 
as many file lookups (and potential renaming operations) as the window size, 
large window sizes are discouraged."  

Hopefully this helps.  Thanks for all the great work!

> Large values for max in DefaultRolloverStrategy blocks application
> ------------------------------------------------------------------
>
>                 Key: LOG4J2-1784
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1784
>             Project: Log4j 2
>          Issue Type: Improvement
>          Components: Appenders
>    Affects Versions: 2.7
>         Environment: Played around a lot in 2.5, once realized issue was 
> related to log4j, upgraded to 2.7 and confirmed it is still happening.  Using 
> on a raspberry pi.
>            Reporter: Charles Hache
>            Assignee: Ralph Goers
>
> Hi all,
> I'm trying to debug an application which is using log4j2 for logging.  After 
> discovering that by default the RollingRandomAccessFile only kept the last 7 
> logs using the counter, I found the DefaultRolloverStrategy option and set 
> the max to something like 99999, so in theory I wouldn't have to worry about 
> it.
> A couple days later I started looking at everything again (log files from 
> log4j, timestamped CSV files from the application, etc) I can see something 
> is going terribly wrong.  What I'm finding is that when it comes time to 
> rotate the logs, it looks like the whole application hangs.  I see the 
> following:
> - All logging to the log file stops;
> - All my data going to the CSV files stop;
> - CPU usage goes to 100%;
> - I cannot stop my application cleanly;
> After a while (~4-5 minutes) everything goes back to normal (until the next 
> log rotation), but then the first lines of the logs have timestamps like this:
> 2017-01-14 07:43:00,001 INFO ....
> 2017-01-14 07:42:11,329 DEBUG ....
> 2017-01-14 07:47:26,648 TRACE ....
> 2017-01-14 07:47:26,695 INFO ....
> 2017-01-14 07:47:26,697 TRACE ....
> 2017-01-14 07:47:26,737 INFO ....
> 2017-01-14 07:47:26,739 TRACE ....
> 2017-01-14 07:47:26,741 TRACE ....
> 2017-01-14 07:47:26,744 TRACE ....
> 2017-01-14 07:47:26,768 TRACE ....
> Of note is that there is a gap of a few minutes in the first couple lines of 
> the file (I'm not too concerned about the out-of-order timestamps as lots of 
> different threads are logging).  The application always writes many (dozens, 
> maybe hundreds) of lines per second, so a gap of a couple minutes is 
> uncharacteristic.  The rest of the log file looks normal (has no gaps, has 
> log lines every few milliseconds).  From my specific log lines and where they 
> are w.r.t. each other, I looks to me like calls to the logger are blocking 
> (eg: in cases I'll have a couple log lines that usually log 1-2 milliseconds 
> apart, but when they occur at rollover time they'll be minutes apart).
> I shrunk the file size rollover to something small (256K), which I hit in 
> maybe 15 seconds or so, and it is clearly reproducible.  When I reduced the 
> default rollover strategy max file to 40, the problem when away and I can see 
> it rolling over the file practically instantly.
> The documentation for the DefaultRolloverStrategy says "When the ascending 
> attribute is set to true (the default) then the counter will be incremented 
> and the current log file will be renamed to include the counter value. If the 
> counter hits the maximum value then the oldest file, which will have the 
> smallest counter, will be deleted, all other files will be renamed to have 
> their counter decremented and then the current file will be renamed to have 
> the maximum counter value. Note that with this counting strategy specifying a 
> large maximum value may entirely avoid renaming files."
> I took this to mean that having a value so large that it would never be hit 
> would be more efficient, since it doesn't have to do anything, just keep 
> counting up.  I'm wondering now if it's doing something like "check if 
> log-1.log exists... check if log-2.log exist ....... check if log-99999.log 
> exists" but I don't see any iobound activity in iotop or anything (although I 
> can't say for sure if FS calls to see if something exists would show up 
> there, for example).
> This might not necessarily be a bug in the logger per se, but it would at 
> least be a documentation issue if there is some O(n) operation based on the 
> max value for the DefaultRolloverStrategy.  
> I'd usually dig deeper, but now I still have to debug the original issues 
> with the actual application.  Let me know if there is any more info I should 
> provide or if there is anything else I can do!
> Thanks,
> Charles



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org

Reply via email to