[
https://issues.apache.org/jira/browse/LOG4J2-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
tzachi updated LOG4J2-1076:
---------------------------
Description:
Recently I was testing log4j2-flume appender performance and thought to share
the results, as I believe they reflect some bugs/flaws in the current
implementation. I conducted a series of tests in which I used the same 2.2 GB
base file. I wrote a small Java application that read the file line by line and
logged each line through log4j2 with the Flume appender, which sends it to
another Flume instance on a remote machine. I measured the time it took and the
traffic between the endpoints, as I was mainly curious about the Avro
compression abilities.
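For reference, the kind of test driver described above can be sketched roughly
as follows. This is not the original test application: the class name
LineReplayBenchmark, the Result type and the sink parameter are hypothetical
names chosen for illustration. The sink stands in for the logging call, so the
same loop can be timed with no appender, a file appender, or the Flume appender.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

public class LineReplayBenchmark {

    /** Outcome of one run: lines replayed and elapsed wall-clock time. */
    public static final class Result {
        public final long lines;
        public final long elapsedMillis;

        Result(long lines, long elapsedMillis) {
            this.lines = lines;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /**
     * Reads the file line by line and hands every line to the sink.
     * In the setup described above the sink would be a logging call, for
     * example line -> logger.info(line) on a log4j2 Logger (assumption).
     */
    public static Result replay(Path file, Consumer<String> sink) throws IOException {
        long lines = 0;
        long start = System.nanoTime();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line);
                lines++;
            }
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(lines, elapsedMillis);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LineReplayBenchmark <file>");
            return;
        }
        // Baseline run with a no-op sink, comparable to "no appenders".
        Result r = replay(Paths.get(args[0]), line -> { });
        System.out.println(r.lines + " lines in " + r.elapsedMillis + " ms");
    }
}
```

Keeping the sink as a parameter means the read loop stays identical across
runs, so differences in elapsed time come from the logging path only.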
First, the Log4j2 Flume appender supports only GZIP compression (and only for
the body), so I wanted to know whether this feature is actually compatible with
Flume's deflate compression method for Avro. I found out that it isn't, and in
order to make it work I would have to write my own gzipDecoder on the other
side.
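The mismatch can be illustrated with the JDK alone. The sketch below is not
Flume or log4j2 code; it assumes the receiving side expects zlib/deflate-framed
data, and the class and method names are hypothetical. It shows that a
GZIP-compressed body is rejected by a plain zlib inflater, while a GZIP-aware
decoder reads it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;
import java.util.zip.ZipException;

public class GzipVsDeflate {

    /** Compresses data in GZIP format (GZIP header + deflate stream + trailer). */
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns true if a zlib-format inflater accepts the bytes. */
    public static boolean zlibCanDecode(byte[] compressed) throws IOException {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            readAll(in);
            return true;
        } catch (ZipException e) {
            // The GZIP header is not a valid zlib header.
            return false;
        }
    }

    /** Decodes GZIP data with a GZIP-aware stream. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return readAll(in);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "a sample log line".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(body);
        System.out.println("zlib inflater accepts gzip body: " + zlibCanDecode(packed));
        System.out.println("gzip decoder round-trips: " + Arrays.equals(body, gunzip(packed)));
    }
}
```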
Type Avro:
1. The process of sending the logs was VERY VERY long (over 2 hours), and it
crashed several times.
2. The more surprising part was that the traffic measured on the link was over
2 GB (when I used GZIP compression), and even closer to 3 GB (without
compression). I am not even sure why there was such an overhead, but that’s
what I saw several times.
3. The events were sent one by one even when I defined a batch size of 1000.
After reading the code a little bit, I found out that batch mode is currently
not supported and might become possible in the next release -
https://issues.apache.org/jira/browse/LOG4J2-1044?jql=project%20%3D%20LOG4J2%20AND%20priority%20%3D%20Major%20AND%20resolution%20%3D%20Unresolved%20AND%20text%20~%20%22flume%22%20ORDER%20BY%20key%20DESC.
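To make the batching point concrete, here is a generic sketch of what a batch
size of 1000 would be expected to do: collect events and hand them to the
sender in groups instead of one call per event. This is not the log4j2 or Flume
implementation; EventBatcher and its sender callback are hypothetical names
used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class EventBatcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sender;
    private List<T> pending;

    public EventBatcher(int batchSize, Consumer<List<T>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sender = sender;
        this.pending = new ArrayList<>(batchSize);
    }

    /** Buffers one event and sends the buffer once it reaches batchSize. */
    public void add(T event) {
        pending.add(event);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered, for example at shutdown. */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>(batchSize);
        sender.accept(batch);
    }
}
```

With a batch size of 1000, adding 2500 events and then calling flush() results
in three sender calls (1000, 1000 and 500 events) rather than 2500 separate
sends.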
Type Persistent:
1. Crashed several times after a little while and stopped sending messages. It
didn’t look like the Flume instance was the one that crashed, so my guess is
that it was the BerkeleyDB thread or something like that. I could not figure
out what exactly.
2. At the same time it crashed (which seems pretty connected to the issue) I
got alerts stating Disk IO > 90%. Again, not sure why it happened, but it
happened on several occasions and only when I tried the persistent type.
3. Batch mode did work, though.
Type Embedded:
1. The documentation on the log4j website does not reflect the way to
configure this type. I had to work my way through the errors until I got my
code to run, and even then it didn’t really seem to send anything. Not sure
why, and I probably need to look deeper into it.
Since the Avro type is the only one that seems to work without a significant
crash, I tested this mode of operation by adding a local Flume, which gets the
data from the log4j2 appender and ships it to the remote Flume using deflate
compression. Using this setup it took 1276484 ms ~ 21 minutes.
Another important thing I wanted to point out is that once I removed all the
appenders, it took only 10781 ms (about 10-11 seconds) to read the file. With a
file appender it took 99682 ms (about 1.7 minutes). So the performance drawback
when using the Flume appender seems pretty huge, but it can probably be reduced
using the async logger mode.
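The size of the gap is easier to see as ratios against the no-appender
baseline. The small sketch below only redoes the arithmetic on the three
timings reported above; the class and method names are hypothetical. It works
out to roughly 9 times the baseline for the file appender and roughly 118 times
for the Flume setup.

```java
import java.util.Locale;

public class SlowdownFactors {

    /** How many times longer a run took compared with the baseline. */
    public static double ratio(long millis, long baselineMillis) {
        return (double) millis / baselineMillis;
    }

    /** Converts milliseconds to minutes. */
    public static double minutes(long millis) {
        return millis / 60000.0;
    }

    public static void main(String[] args) {
        long noAppender = 10781;     // ms, read the file with all appenders removed
        long fileAppender = 99682;   // ms, file appender
        long flumeAppender = 1276484; // ms, Avro type via local Flume with deflate

        System.out.println(String.format(Locale.ROOT,
                "file appender:  %.1f min, %.1fx baseline",
                minutes(fileAppender), ratio(fileAppender, noAppender)));
        System.out.println(String.format(Locale.ROOT,
                "flume appender: %.1f min, %.1fx baseline",
                minutes(flumeAppender), ratio(flumeAppender, noAppender)));
    }
}
```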
was:
Recently I was testing log4j2-flume appender performance and thought to share
the results, as I believe they reflect some bugs/flaws in the current
implementation. I conducted a series of tests in which I used the same 2.2 GB
base file. I wrote a small Java application that read the file line by line and
logged each line through log4j2 with the Flume appender, which sends it to
another Flume instance on a remote machine. I measured the time it took and the
traffic between the endpoints, as I was mainly curious about the Avro
compression abilities.
First, the Log4j2 Flume appender supports only GZIP compression (and only for
the body), so I wanted to know whether this feature is actually compatible with
Flume's deflate compression method for Avro. I found out that it isn't, and in
order to make it work I would have to write my own gzipDecoder on the other
side.
Type Avro:
1. The process of sending the logs was VERY VERY long (over 2 hours), and it
crashed several times.
2. The more surprising part was that the traffic measured on the link was over
2 GB (when I used GZIP compression), and even closer to 3 GB (without
compression). I am not even sure why there was such an overhead, but that’s
what I saw several times.
3. The events were sent one by one even when I defined a batch size of 1000.
After reading the code a little bit, I found out that batch mode is currently
not supported and might become possible in the next release -
https://issues.apache.org/jira/browse/LOG4J2-1044?jql=project%20%3D%20LOG4J2%20AND%20priority%20%3D%20Major%20AND%20resolution%20%3D%20Unresolved%20AND%20text%20~%20%22flume%22%20ORDER%20BY%20key%20DESC.
Type Persistent:
1. Crashed several times after a little while and stopped sending messages. It
didn’t look like the Flume instance was the one that crashed, so my guess is
that it was the BerkeleyDB thread or something like that. I could not figure
out what exactly.
2. At the same time it crashed (which seems pretty connected to the issue) I
got alerts stating Disk IO > 90%. Again, not sure why it happened, but it
happened on several occasions and only when I tried the persistent type.
3. Batch mode did work, though.
Type Embedded:
1. The documentation on the log4j website does not reflect the way to
configure this type. I had to work my way through the errors until I got my
code to run, and even then it didn’t really seem to send anything. Not sure
why, and I probably need to look deeper into it.
Since the Avro type is the only one that seems to work without a significant
crash, I tested this mode of operation by adding a local Flume, which gets the
data from the log4j2 appender and ships it to the remote Flume using deflate
compression.
I tried 2 modes of the Flume appender - Async and Sync. The good news is that
both modes, although sending events 1 by 1 (as batch is not supported), used
only about 730 MB of traffic between the local and remote Flume. The Sync mode
took 1276484 ms ~ 21 minutes, and the Async mode 1339501 ms ~ 22 minutes.
Another important thing I wanted to point out is that once I removed all the
appenders, it took only 10781 ms (about 10-11 seconds) to read the file. With a
file appender it took 99682 ms (about 1.7 minutes). So the performance drawback
when using the Flume appender seems pretty huge.
> Flume appender fails to preform
> -------------------------------
>
> Key: LOG4J2-1076
> URL: https://issues.apache.org/jira/browse/LOG4J2-1076
> Project: Log4j 2
> Issue Type: Bug
> Reporter: tzachi