David, thank you for the clarification. I understand better what you are trying to achieve now.

Interesting idea to have an appender that writes to a GZIPOutputStream. Would you mind raising a Jira ticket <https://issues.apache.org/jira/browse/LOG4J2> for that feature request?
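To make the idea concrete, here is a rough, untested sketch of what such an appender's core could look like, using only plain JDK classes. The class and method names below are invented for illustration, and the syncFlush constructor argument requires Java 7:

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    // Keeps one GZIPOutputStream open for the life of the log file, so the
    // compression cost is paid incrementally on each write instead of in
    // one burst at rollover. Names are invented for this sketch.
    public class GzipLogWriter implements AutoCloseable {

        private final Writer out;

        public GzipLogWriter(String fileName) throws IOException {
            // syncFlush=true (Java 7+): flush() pushes buffered compressed
            // bytes through, so the file stays mostly readable even if the
            // JVM crashes before the gzip trailer is written.
            this.out = new OutputStreamWriter(
                    new GZIPOutputStream(
                            new BufferedOutputStream(new FileOutputStream(fileName)),
                            8192, true),
                    StandardCharsets.UTF_8);
        }

        public void append(String formattedEvent) throws IOException {
            out.write(formattedEvent);
            out.write('\n');
        }

        public void flush() throws IOException {
            out.flush(); // emits a sync-flush point in the gzip stream
        }

        // Called by the rollover event; writes the gzip trailer.
        @Override
        public void close() throws IOException {
            out.close();
        }
    }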
I would certainly be interested in learning about efficient techniques for compressing very large files. I'm not sure whether or how the dd/direct I/O approach mentioned in the blog you linked to could be leveraged from Java. If you find a way that works well for log file rollover, and you're interested in sharing it, please let us know.
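The closest pure-Java approximation I can think of is to stream the rolled-over file through gzip with a much larger buffer and the fastest compression level. That shortens the CPU spike at rollover, but it does not bypass the filesystem cache the way direct I/O does, since the JDK exposes no portable O_DIRECT. Another rough, untested sketch (class name invented; requires Java 7 for try-with-resources):

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.Deflater;
    import java.util.zip.GZIPOutputStream;

    // Streams a finished (rolled-over) log file through gzip using a
    // large buffer and the fastest compression level, trading compression
    // ratio for a shorter CPU spike. Plain JDK I/O only.
    public final class ChunkedGzipCompressor {

        public static void compress(String source, String destination)
                throws IOException {
            byte[] buffer = new byte[1024 * 1024]; // 1 MiB, vs. cp's 512 bytes
            try (FileInputStream in = new FileInputStream(source);
                 GZIPOutputStream out = new GZIPOutputStream(
                         new FileOutputStream(destination), buffer.length) {
                     // 'def' is the protected Deflater inherited from
                     // DeflaterOutputStream; lower its compression level.
                     { def.setLevel(Deflater.BEST_SPEED); }
                 }) {
                int n;
                while ((n = in.read(buffer)) > 0) {
                    out.write(buffer, 0, n);
                }
            }
        }
    }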
On Wed, May 28, 2014 at 3:42 PM, David Hoa <[email protected]> wrote:

> Hi Remko,
>
> My point about gzip, which we've experienced, is that compressing very
> large files (multi-GB) does have considerable impact on the system. The
> dd/direct I/O workaround avoids putting that much log data into your
> filesystem cache. For that problem, after I sent the email, I did look at
> the log4j2 implementation and saw that DefaultRolloverStrategy::rollover()
> calls GZCompressionAction, so I see how I can write my own strategy and
> Action to customize how gzip is called.
>
> My second question was not about adding to existing gzip files; as far as
> I know, that's not possible. But if the GZIPOutputStream is kept open and
> written to until closed by a rollover event, then the cost of gzipping is
> amortized over time rather than incurred all at once when the rollover
> event is triggered, so there's no resource-usage spike. The downside would
> be writing both compressed and uncompressed log files and maintaining
> rollover strategies for both of them. So a built-in appender that wrote
> directly to gz files would be useful for this.
>
> Thanks,
> David
>
> On Tue, May 27, 2014 at 4:52 PM, Remko Popma <[email protected]> wrote:
>
>> Hi David,
>>
>> I read the blog post you linked to. It seems that the author was very,
>> very upset that a utility called cp only uses a 512-byte buffer. He then
>> goes on to praise gzip for having a 32KB buffer. So just based on your
>> link, gzip is actually pretty good.
>>
>> That said, there are plans to improve the file rollover mechanism. These
>> plans are currently spread out over a number of Jira tickets. One
>> existing request is to delete archived log files that are older than
>> some number of days
>> (https://issues.apache.org/jira/browse/LOG4J2-656,
>> https://issues.apache.org/jira/browse/LOG4J2-524).
>> This could be extended to cover your request to keep M compressed files.
>>
>> I'm not sure about appending to existing gzip files. Why is this
>> desirable? What are you trying to accomplish with that?
>>
>> Sent from my iPhone
>>
>> On 2014/05/28, at 3:22, David Hoa <[email protected]> wrote:
>>
>> hi Log4j Dev,
>>
>> I am interested in the log rollover and compression feature in log4j2.
>> I read the documentation online and still have some questions.
>>
>> - gzipping large files has a performance impact on latency/CPU/file
>> cache, and there's a workaround for that using dd and direct I/O. Is it
>> possible to customize how log4j2 gzips files (or does log4j2 already do
>> this)? See this link for a description of the common problem:
>> http://kevinclosson.wordpress.com/2007/02/23/standard-file-utilities-with-direct-io/
>>
>> - is it possible to use the existing appenders to output directly to
>> their final gzipped files, maintain M of those gzipped files, and
>> rollover/maintain N of the uncompressed logs? I suspect that the
>> complicated part would be in JVM crash recovery / application restart.
>> Any suggestions on how best to add/extend/customize support for this?
>>
>> Thanks,
>> David
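P.S. The retention half of your original question (keep M gzipped files, N uncompressed ones) is mostly bookkeeping once the compression itself is sorted out. A rough, untested sketch of the pruning step, with invented names and the simplest possible policy (delete oldest first):

    import java.io.File;
    import java.io.FilenameFilter;
    import java.util.Arrays;
    import java.util.Comparator;

    // Keeps at most maxFiles compressed archives in a directory by
    // deleting the oldest first. Names and layout are invented here.
    public final class GzArchivePruner {

        public static void prune(File logDir, int maxFiles) {
            File[] archives = logDir.listFiles(new FilenameFilter() {
                public boolean accept(File dir, String name) {
                    return name.endsWith(".gz");
                }
            });
            if (archives == null || archives.length <= maxFiles) {
                return;
            }
            // Sort oldest first by last-modified time.
            Arrays.sort(archives, new Comparator<File>() {
                public int compare(File a, File b) {
                    return Long.compare(a.lastModified(), b.lastModified());
                }
            });
            for (int i = 0; i < archives.length - maxFiles; i++) {
                archives[i].delete(); // return value ignored for brevity
            }
        }
    }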
