Paul,

Makes sense. What I had coded and was going to propose follows the same path. It could be generalized into a composite rollover strategy that defers actions until the end of the composite strategy. I would also suggest that various fields (such as filePattern) be exposed via getters so that strategies could use them to locate expired log files as well.

James
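[A minimal sketch of the composite-strategy idea described above, for illustration only. The interfaces here are placeholders, not the actual log4j2 RolloverStrategy/Action APIs, whose signatures differ.]

```java
import java.util.ArrayList;
import java.util.List;

// Placeholder types, loosely modeled on log4j2's strategy/action split;
// the real interfaces in log4j-core have different shapes.
interface Action {
    void execute();
}

interface RolloverStrategy {
    // Each strategy inspects the configured filePattern (exposed via a getter in the
    // real implementation) and returns the actions it wants to run
    // (rename, compress, delete expired backups, ...).
    List<Action> rollover(String filePattern);
}

/** Collects actions from every child strategy and defers execution until all have contributed. */
class CompositeRolloverStrategy implements RolloverStrategy {
    private final List<RolloverStrategy> children;

    CompositeRolloverStrategy(List<RolloverStrategy> children) {
        this.children = children;
    }

    @Override
    public List<Action> rollover(String filePattern) {
        List<Action> deferred = new ArrayList<>();
        for (RolloverStrategy child : children) {
            deferred.addAll(child.rollover(filePattern)); // collect, do not execute yet
        }
        // The caller runs the combined list only after the whole composite has finished,
        // so no single child's action (e.g. deleting old backups) fires prematurely.
        return deferred;
    }
}
```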
On Wed, May 28, 2014 at 3:48 PM, Paul Benedict <[email protected]> wrote:
> I believe we need to recognize that perhaps we have 3 different behaviors here. One is appending the log file, another is rotating it, another is discarding unnecessary backups. There should be a clear separation of concerns, and that will be possible if separate configurations exist.
>
> Cheers,
> Paul
>
> On Wed, May 28, 2014 at 2:40 PM, David Hoa <[email protected]> wrote:
>> Thanks, Paul, Ralph, Matt, Remko. All good thoughts. With log4j1, we had to have an external process do the compression, but with log4j2, the size-based trigger provides interesting possibilities, avoiding the extremely-large-file problem in exchange for many timestamped files where there used to be one. It also might require a mindset change from retaining N days' worth of logs to retaining M GB of logs (probably a good thing, as it creates an incentive not to have unnecessary log messages).
>>
>> thanks,
>> David
>>
>> On Wed, May 28, 2014 at 11:00 AM, Paul Benedict <[email protected]> wrote:
>>> I would definitely think that if the file is too large, it should be rolled more frequently. At least with log4j 1, you could set it to roll not only daily, but half-daily and hourly (and maybe finer). That's the ideal way to handle this. Personally, I don't like the idea of writing directly to a compressed archive since, conceptually, it's not "archived" until it's closed; rather, it should be compressed when it rolls over.
>>>
>>> Cheers,
>>> Paul
>>>
>>> On Wed, May 28, 2014 at 12:55 PM, Ralph Goers <[email protected]> wrote:
>>>> So you are proposing writing two logs - one compressed and one uncompressed - to handle this. I am wondering what the break-even point of this would be. Many users use a size-based trigger instead so that a) the compression won't take long and b) manipulating a large file is not so much of a problem.
>>>>
>>>> What has me wondering about the usefulness of this is that when the file gets so large that compression at rollover is a problem, the file is probably too large to manipulate effectively in something like vi.
>>>>
>>>> Ralph
>>>>
>>>> Sent from my iPad
>>>>
>>>> On May 28, 2014, at 10:27 AM, David Hoa <[email protected]> wrote:
>>>> Yup, the tricky part would come on crash before close, interrupt, etc., because I assume that the partially compressed file would be irrecoverable (haven't verified this). Ideally, we'd be able to close it properly, but if not, the log could, on startup, be recovered and compressed from the parallel uncompressed log that was simultaneously being written by another/the same appender.
>>>>
>>>> That would incur startup time to recover, which may be more acceptable in the rare case of a crash. Else, if there's another compression technique that leaves behind readable files even if not closed properly, that'd eliminate the need for recovery.
>>>>
>>>> I'll open a jira ticket. Thanks for letting me share my thoughts on this.
>>>>
>>>> - David
>>>>
>>>> On Wed, May 28, 2014 at 9:39 AM, Matt Sicker <[email protected]> wrote:
>>>>> We can use GZIPOutputStream, DeflaterOutputStream, and ZipOutputStream all out of the box.
>>>>>
>>>>> What happens if you interrupt a stream in progress? No idea! But Gzip at least has CRC32 checksums on hand, so it can be detected if it's corrupted.
>>>>> We'll have to experiment a bit to see what really happens. I couldn't find anything in zlib.net's FAQ.
>>>>>
>>>>> On 28 May 2014 08:56, Ralph Goers <[email protected]> wrote:
>>>>>> What would happen to the file if the system crashed before the file is closed? Would the file be able to be decompressed, or would it be corrupted?
>>>>>>
>>>>>> Sent from my iPad
>>>>>>
>>>>>> On May 28, 2014, at 6:35 AM, Remko Popma <[email protected]> wrote:
>>>>>> David, thank you for the clarification. I understand better what you are trying to achieve now.
>>>>>>
>>>>>> Interesting idea to have an appender that writes to a GZipOutputStream. Would you mind raising a Jira <https://issues.apache.org/jira/browse/LOG4J2> ticket for that feature request?
>>>>>>
>>>>>> I would certainly be interested in learning about efficient techniques for compressing very large files. Not sure if or how the dd/direct I/O mentioned in the blog you linked to could be leveraged from Java. If you find a way that works well for log file rollover, and you're interested in sharing it, please let us know.
>>>>>>
>>>>>> On Wed, May 28, 2014 at 3:42 PM, David Hoa <[email protected]> wrote:
>>>>>>> Hi Remko,
>>>>>>>
>>>>>>> My point about gzip, which we've experienced, is that compressing very large files (multi-GB) does have considerable impact on the system. The dd/direct I/O workaround avoids putting that much log data into your filesystem cache. For that problem, after I sent the email, I did look at the log4j2 implementation and saw that DefaultRolloverStrategy::rollover() calls GZCompressionAction, so I see how I can write my own strategy and Action to customize how gzip is called.
>>>>>>>
>>>>>>> My second question was not about adding to existing gzip files; from what I know, that's not possible. But if the GZipOutputStream is kept open and written to until closed by a rollover event, then the cost of gzipping is amortized over time rather than incurred when the rollover event gets triggered. The benefit is amortization of gzip so there's no resource usage spike; the downside would be writing both compressed and uncompressed log files and maintaining rollover strategies for both of them. So a built-in appender that wrote directly to gz files would be useful for this.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On Tue, May 27, 2014 at 4:52 PM, Remko Popma <[email protected]> wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> I read the blog post you linked to. It seems that the author was very, very upset that a utility called cp only uses a 512 byte buffer. He then goes on to praise gzip for having a 32KB buffer. So just based on your link, gzip is actually pretty good.
>>>>>>>>
>>>>>>>> That said, there are plans to improve the file rollover mechanism. These plans are currently spread out over a number of Jira tickets. One existing request is to delete archived log files that are older than some number of days (https://issues.apache.org/jira/browse/LOG4J2-656, https://issues.apache.org/jira/browse/LOG4J2-524). This could be extended to cover your request to keep M compressed files.
>>>>>>>> I'm not sure about appending to existing gzip files. Why is this desirable / what are you trying to accomplish with that?
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On 2014/05/28, at 3:22, David Hoa <[email protected]> wrote:
>>>>>>>> hi Log4j Dev,
>>>>>>>>
>>>>>>>> I am interested in the log rollover and compression feature in log4j2. I read the documentation online, and still have some questions.
>>>>>>>>
>>>>>>>> - gzipping large files has performance impact on latencies/cpu/file cache, and there's a workaround for that using dd and direct I/O. Is it possible to customize how log4j2 gzips files (or does log4j2 already do this)? See this link for a description of the common problem: http://kevinclosson.wordpress.com/2007/02/23/standard-file-utilities-with-direct-io/
>>>>>>>>
>>>>>>>> - is it possible to use the existing appenders to output directly to their final gzipped files, maintain M of those gzipped files, and rollover/maintain N of the uncompressed logs? I suspect that the complicated part would be in JVM crash recovery / application restart. Any suggestions on how best to add/extend/customize support for this?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>
>>>>> --
>>>>> Matt Sicker <[email protected]>
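[For reference, the stream classes Matt mentions are plain JDK java.util.zip classes. Below is a minimal, hypothetical sketch of the "write directly to a gzipped file and flush as you go" idea discussed in the thread; the file name, flush-per-write policy, and recovery notes are assumptions, not log4j2 code.]

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipLogSketch {
    public static void main(String[] args) throws IOException {
        // syncFlush = true (Java 7+) makes flush() emit a complete deflate block,
        // so data written so far can usually be recovered even if the stream
        // is never closed cleanly.
        try (GZIPOutputStream gz = new GZIPOutputStream(
                new BufferedOutputStream(new FileOutputStream("app.log.gz")), true)) {
            for (int i = 0; i < 100; i++) {
                gz.write(("log line " + i + "\n").getBytes(StandardCharsets.UTF_8));
                gz.flush(); // amortizes compression per write instead of paying it all at rollover
            }
        }
        // close() writes the gzip trailer (CRC32 + uncompressed length). If the JVM dies
        // before close(), gunzip typically recovers data up to the last flush but reports
        // an unexpected end of file, which is why the crash-recovery question matters.
    }
}
```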
