I believe we need to recognize that we have perhaps three different behaviors here: one is appending to the log file, another is rotating it, and another is discarding unnecessary backups. There should be a clear separation of concerns, and that becomes possible if each behavior has its own configuration.
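To make that split concrete, here is a very rough sketch of what three independently configured pieces might look like. The names below are purely illustrative, not log4j2 API; the triggering policies and rollover strategies we already have cover parts of the first two, and the retention piece is essentially what the Jira tickets mentioned further down the thread (LOG4J2-524, LOG4J2-656) are asking for.

    import java.io.File;
    import java.io.IOException;
    import java.util.List;

    // Illustrative only -- not log4j2 interfaces. The point is that each concern
    // is configured, tested, and swapped independently of the other two.
    interface AppendPolicy {
        // Concern 1: writing formatted events to the active log file.
        void append(String formattedEvent) throws IOException;
    }

    interface RolloverPolicy {
        // Concern 2a: deciding when the active file should be rotated.
        boolean isTriggered(File activeFile);

        // Concern 2b: performing the rotation (rename, reindex, compress).
        File rollover(File activeFile) throws IOException;
    }

    interface RetentionPolicy {
        // Concern 3: deciding which backups to discard (keep N files, M GB, or D days).
        void purge(List<File> backups) throws IOException;
    }

If those three were separate configuration elements, a user could change retention without touching how or when files roll.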
Cheers,
Paul

On Wed, May 28, 2014 at 2:40 PM, David Hoa <[email protected]> wrote:

> Thanks, Paul, Ralph, Matt, Remko. All good thoughts. With log4j1, we had to have an external process do the compression, but with log4j2, the size-based trigger provides interesting possibilities, avoiding the extremely large file problem in exchange for many timestamped files where there used to be one. It also might require a mindset change from retaining N days' worth of logs to retaining M GB of logs (probably a good thing, as it creates an incentive to not have unnecessary log messages).
>
> thanks,
> David
>
> On Wed, May 28, 2014 at 11:00 AM, Paul Benedict <[email protected]> wrote:
>
>> I would definitely think if the file is too large, it should be rolled more frequently. At least with log4j 1, you could set it to roll not only daily, but half-day, and hourly (and maybe finer). That's the ideal way to handle this. Personally, I don't like the idea of writing directly to a compressed archive since, conceptually, it's not "archived" until it's closed; rather, it should be compressed when it rolls over.
>>
>> Cheers,
>> Paul
>>
>> On Wed, May 28, 2014 at 12:55 PM, Ralph Goers <[email protected]> wrote:
>>
>>> So you are proposing writing two logs - one compressed and one uncompressed - to handle this. I am wondering what the break-even point of this would be. Many users use a size-based trigger instead so that a) the compression won't take long and b) manipulating a large file is not so much of a problem.
>>>
>>> What has me wondering about the usefulness of this is that when the file gets so large that compression at rollover is a problem, the file is probably too large to manipulate effectively in something like vi.
>>>
>>> Ralph
>>>
>>> Sent from my iPad
>>>
>>> On May 28, 2014, at 10:27 AM, David Hoa <[email protected]> wrote:
>>>
>>> Yup, the tricky part would come on crash before close, interrupt, etc., because I assume that the partially compressed file would be irrecoverable (haven't verified this). Ideally, we'd be able to close it properly, but if not, the log could, on startup, be recovered and compressed from the parallel uncompressed log that was simultaneously being written by another/the same appender.
>>>
>>> That would incur startup time to recover, which may be more acceptable in the rare case of a crash. Else, if there's another compression technique that leaves behind readable files even if not closed properly, that'd eliminate the need for recovery.
>>>
>>> I'll open a Jira ticket. Thanks for letting me share my thoughts on this.
>>>
>>> - David
>>>
>>> On Wed, May 28, 2014 at 9:39 AM, Matt Sicker <[email protected]> wrote:
>>>
>>>> We can use GZIPOutputStream, DeflaterOutputStream, and ZipOutputStream all out of the box.
>>>>
>>>> What happens if you interrupt a stream in progress? No idea! But gzip at least has CRC32 checksums on hand, so it can be detected if it's corrupted. We'll have to experiment a bit to see what really happens. I couldn't find anything in zlib.net's FAQ.
>>>>
>>>> On 28 May 2014 08:56, Ralph Goers <[email protected]> wrote:
>>>>
>>>>> What would happen to the file if the system crashed before the file is closed? Would the file be able to be decompressed or would it be corrupted?
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On May 28, 2014, at 6:35 AM, Remko Popma <[email protected]> wrote:
>>>>>
>>>>> David, thank you for the clarification. I understand better what you are trying to achieve now.
>>>>>
>>>>> Interesting idea to have an appender that writes to a GZipOutputStream. Would you mind raising a Jira ticket <https://issues.apache.org/jira/browse/LOG4J2> for that feature request?
>>>>>
>>>>> I would certainly be interested in learning about efficient techniques for compressing very large files. Not sure if or how the dd/direct I/O mentioned in the blog you linked to could be leveraged from Java. If you find a way that works well for log file rollover, and you're interested in sharing it, please let us know.
>>>>>
>>>>> On Wed, May 28, 2014 at 3:42 PM, David Hoa <[email protected]> wrote:
>>>>>
>>>>>> Hi Remko,
>>>>>>
>>>>>> My point about gzip, which we've experienced, is that compressing very large files (multi-GB) does have considerable impact on the system. The dd/direct I/O workaround avoids putting that much log data into your filesystem cache. For that problem, after I sent the email, I did look at the log4j2 implementation and saw that DefaultRolloverStrategy::rollover() calls GZCompressionAction, so I see how I can write my own strategy and Action to customize how gzip is called.
>>>>>>
>>>>>> My second question was not about adding to existing gzip files; from what I know, that's not possible. But if the GZipOutputStream is kept open and written to until closed by a rollover event, then the cost of gzipping is amortized over time rather than incurred when the rollover event gets triggered. The benefit is amortization of gzip so there's no resource usage spike; the downside would be writing both compressed and uncompressed log files and maintaining rollover strategies for both of them. So a built-in appender that wrote directly to gz files would be useful for this.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On Tue, May 27, 2014 at 4:52 PM, Remko Popma <[email protected]> wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> I read the blog post you linked to. It seems that the author was very, very upset that a utility called cp only uses a 512-byte buffer. He then goes on to praise gzip for having a 32KB buffer. So just based on your link, gzip is actually pretty good.
>>>>>>>
>>>>>>> That said, there are plans to improve the file rollover mechanism. These plans are currently spread out over a number of Jira tickets. One existing request is to delete archived log files that are older than some number of days (https://issues.apache.org/jira/browse/LOG4J2-656, https://issues.apache.org/jira/browse/LOG4J2-524). This could be extended to cover your request to keep M compressed files.
>>>>>>>
>>>>>>> I'm not sure about appending to existing gzip files. Why is this desirable / what are you trying to accomplish with that?
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On 2014/05/28, at 3:22, David Hoa <[email protected]> wrote:
>>>>>>>
>>>>>>> hi Log4j Dev,
>>>>>>>
>>>>>>> I am interested in the log rollover and compression feature in log4j2. I read the documentation online and still have some questions.
>>>>>>>
>>>>>>> - gzipping large files has a performance impact on latencies/CPU/file cache, and there's a workaround for that using dd and direct I/O. Is it possible to customize how log4j2 gzips files (or does log4j2 already do this)? See this link for a description of the common problem:
>>>>>>> http://kevinclosson.wordpress.com/2007/02/23/standard-file-utilities-with-direct-io/
>>>>>>>
>>>>>>> - is it possible to use the existing appenders to output directly to their final gzipped files, maintain M of those gzipped files, and rollover/maintain N of the uncompressed logs? I suspect that the complicated part would be in JVM crash recovery / application restart. Any suggestions on how best to add/extend/customize support for this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>
>>>> --
>>>> Matt Sicker <[email protected]>
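Coming back to the "appender that writes straight to a GZipOutputStream" idea in the thread above: here is a minimal sketch of the stream handling involved, using only java.util.zip (this is not a log4j2 appender, just the raw idea). With syncFlush enabled, each flush pushes the compressed data out to the .gz file, so the gzip cost is spread out over time as David describes; the catch is that the gzip trailer (CRC32 plus uncompressed length) is only written on close, which is exactly where the crash-before-close concern comes from.

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    // Sketch only: the raw stream handling a "write directly to .gz" appender would need.
    public class GzipLogWriter implements AutoCloseable {

        private final GZIPOutputStream out;

        public GzipLogWriter(String path) throws IOException {
            OutputStream file = new BufferedOutputStream(new FileOutputStream(path));
            // syncFlush=true (Java 7+): flush() performs a SYNC_FLUSH on the deflater,
            // so everything written so far actually reaches the .gz file on disk.
            out = new GZIPOutputStream(file, 8192, true);
        }

        public void write(String line) throws IOException {
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }

        public void flush() throws IOException {
            // Costs some compression ratio, but bounds how much is lost on a crash.
            out.flush();
        }

        @Override
        public void close() throws IOException {
            // finish() writes the gzip trailer (CRC32 + uncompressed length); only a
            // closed file is a complete archive. A crash before this point leaves a
            // truncated member with no trailer.
            out.finish();
            out.close();
        }
    }

If the process dies before close(), the data up to the last flush should still be extractable (gzip -dc will emit it and then complain about an unexpected end of file), but gzip -t will flag the file as truncated. A startup recovery pass could therefore stream each leftover .gz through GZIPInputStream and treat an EOFException as "truncated, rebuild from the parallel uncompressed copy", along the lines David suggested.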
