(logging-log4j2) 02/13: Rewrite `performance.adoc`

vy Thu, 16 May 2024 00:42:17 -0700

This is an automated email from the ASF dual-hosted git repository.

vy pushed a commit to branch doc/perf-main
in repository https://gitbox.apache.org/repos/asf/logging-log4j2.git


commit 2240806aa2b2277f0391a5670ed2ddd9e387783e
Author: Volkan Yazıcı <vol...@yazi.ci>
AuthorDate: Mon May 13 10:39:25 2024 +0200

    Rewrite `performance.adoc`
---
 src/site/antora/antora.tmpl.yml                    |   1 +
 src/site/antora/antora.yml                         |   1 +
 .../antora/modules/ROOT/pages/manual/async.adoc    |  45 +-----
 .../antora/modules/ROOT/pages/manual/layouts.adoc  |  28 +---
 .../modules/ROOT/pages/manual/performance.adoc     | 166 +++++++++++++++++++++
 .../ROOT/partials/manual/async-drawbacks.adoc      |  36 +++++
 .../manual/garbagefree-intro.adoc}                 |   5 +-
 .../manual/layouts-location-information.adoc       |  49 ++++++
 8 files changed, 263 insertions(+), 68 deletions(-)

diff --git a/src/site/antora/antora.tmpl.yml b/src/site/antora/antora.tmpl.yml
index 61e9a581d9..afd7e6aba4 100644
--- a/src/site/antora/antora.tmpl.yml
+++ b/src/site/antora/antora.tmpl.yml
@@ -45,6 +45,7 @@ asciidoc:
     java-target-version: "${maven.compiler.target}"
     java-compiler-version: "${minimalJavaBuildVersion}"
     logging-services-url: "https://logging.apache.org";
+    lmax-disruptor-url: "https://lmax-exchange.github.io/disruptor";
     # Dependency versions
     commons-logging-version: "${site-commons-logging.version}"
     log4j-api-version: "${log4j-api.version}"
diff --git a/src/site/antora/antora.yml b/src/site/antora/antora.yml
index 68647cec44..50ccf7e30b 100644
--- a/src/site/antora/antora.yml
+++ b/src/site/antora/antora.yml
@@ -45,6 +45,7 @@ asciidoc:
     java-target-version: "17"
     java-compiler-version: "[17,18)"
     logging-services-url: "https://logging.apache.org";
+    lmax-disruptor-url: "https://lmax-exchange.github.io/disruptor";
     # Dependency versions
     commons-logging-version: "1.4.0"
     log4j-api-version: "1.2.3-api"
diff --git a/src/site/antora/modules/ROOT/pages/manual/async.adoc 
b/src/site/antora/modules/ROOT/pages/manual/async.adoc
index fdcece779c..4edd39c7c2 100644
--- a/src/site/antora/modules/ROOT/pages/manual/async.adoc
+++ b/src/site/antora/modules/ROOT/pages/manual/async.adoc
@@ -66,48 +66,9 @@ than synchronous loggers or even queue-based asynchronous 
appenders.
 
 === Drawbacks
 
-* Error handling. If a problem happens during the logging process and an
-exception is thrown, it is less easy for an asynchronous logger or
-appender to signal this problem to the application. This can partly be
-alleviated by configuring an `ExceptionHandler`, but this may still not
-cover all cases. For this reason, if logging is part of your business
-logic, for example if you are using Log4j as an audit logging framework,
-we would recommend to synchronously log those audit messages. (Note that
-you can still link:#MixedSync-Async[combine] them and use asynchronous
-logging for debug/trace logging in addition to synchronous logging for
-the audit trail.)
-* In some rare cases, care must be taken with mutable messages. Most of
-the time you don't need to worry about this. Log4 will ensure that log
-messages like `logger.debug("My object is {}", myObject)` will use the
-state of the `myObject` parameter at the time of the call to
-`logger.debug()`. The log message will not change even if `myObject` is
-modified later. It is safe to asynchronously log mutable objects because
-most
-link:../javadoc/log4j-api/org/apache/logging/log4j/message/Message.html[`Message`]
-implementations built-in to Log4j take a snapshot of the parameters.
-There are some exceptions however:
-link:../javadoc/log4j-api/org/apache/logging/log4j/message/MapMessage.html[`MapMessage`]
-and
-link:../javadoc/log4j-api/org/apache/logging/log4j/message/StructuredDataMessage.html[`StructuredDataMessage`]
-are mutable by design: fields can be added to these messages after the
-message object was created. These messages should not be modified after
-they are logged with asynchronous loggers or asynchronous appenders; you
-may or may not see the modifications in the resulting log output.
-Similarly, custom
-link:../javadoc/log4j-api/org/apache/logging/log4j/message/Message.html[`Message`]
-implementations should be designed with asynchronous use in mind, and
-either take a snapshot of their parameters at construction time, or
-document their thread-safety characteristics.
-* If your application is running in an environment where CPU resources
-are scarce, like a machine with one CPU with a single core, starting
-another thread is not likely to give better performance.
-* If the _sustained rate_ at which your application is logging messages
-is faster than the maximum sustained throughput of the underlying
-appender, the queue will fill up and the application will end up logging
-at the speed of the slowest appender. If this happens, consider
-selecting a xref:manual/performance.adoc#whichAppender[faster appender], or
-logging less. If neither of these is an option, you may get better
-throughput and fewer latency spikes by logging synchronously.
+There are certain drawbacks associated with asynchronous logging:
+
+include::partial$manual/async-drawbacks.adoc[]
 
 [#AllAsync]
 == Making All Loggers Asynchronous
diff --git a/src/site/antora/modules/ROOT/pages/manual/layouts.adoc 
b/src/site/antora/modules/ROOT/pages/manual/layouts.adoc
index 1c55650043..7a13314f58 100644
--- a/src/site/antora/modules/ROOT/pages/manual/layouts.adoc
+++ b/src/site/antora/modules/ROOT/pages/manual/layouts.adoc
@@ -1782,28 +1782,6 @@ within the message text.
 |===
 
 [#LocationInformation]
-== Location Information
-
-If one of the layouts is configured with a location-related attribute
-like HTML link:#HtmlLocationInfo[locationInfo], or one of the patterns
-link:#PatternClass[%C or %class], link:#PatternFile[%F or %file],
-link:#PatternLocation[%l or %location], link:#PatternLine[%L or %line],
-link:#PatternMethod[%M or %method], Log4j will take a snapshot of the
-stack, and walk the stack trace to find the location information.
-
-This is an expensive operation: 1.3 - 5 times slower for synchronous
-loggers. Synchronous loggers wait as long as possible before they take
-this stack snapshot. If no location is required, the snapshot will never
-be taken.
-
-However, asynchronous loggers need to make this decision before passing
-the log message to another thread; the location information will be lost
-after that point. The
-xref:manual/performance.adoc#asyncLoggingWithLocation[performance impact] of
-taking a stack trace snapshot is even higher for asynchronous loggers:
-logging with location is 30-100 times slower than without location. For
-this reason, asynchronous loggers and asynchronous appenders do not
-include location information by default.
-
-You can override the default behaviour in your logger or asynchronous
-appender configuration by specifying `includeLocation="true"`.
+== Location information
+
+include::partial$manual/layouts-location-information.adoc[]
diff --git a/src/site/antora/modules/ROOT/pages/manual/performance.adoc 
b/src/site/antora/modules/ROOT/pages/manual/performance.adoc
index 02993d899d..cbadfab36c 100644
--- a/src/site/antora/modules/ROOT/pages/manual/performance.adoc
+++ b/src/site/antora/modules/ROOT/pages/manual/performance.adoc
@@ -16,3 +16,169 @@
 ////
 
 = Performance
+
+In this page we will guide you through certain steps that will show how to 
improve the performance of your Log4j configuration to serve your particular 
use case best.
+
+The act of logging is an interplay between the logging API (i.e., Log4j API) 
where the programmer publishes logs and a logging implementation (i.e., Log4j 
Core) where published logs get consumed; filtered, enriched, encoded, and 
written to files, databases, network sockets, etc.
+Both parties can have dramatic impact on performance.
+Hence, we will discuss the performance optimization of both individually:
+
+. xref:#api[Using Log4j API efficiently]
+. xref:#core[Tuning Log4j Core for performance]
+
+[#api]
+== Using Log4j API efficiently
+
+Log4j API bundles a rich set of features to either totally avoid or minimize 
expensive computations whenever possible.
+We will walk you through these features with examples.
+
+[TIP]
+====
+Remember that a logging API and a logging implementation are two different 
things.
+You can use Log4j API in combination with a logging implementation other than 
Log4j Core (e.g., Logback).
+**The tips shared in this section are logging implementation agnostic.**
+====
+
+[#api-concat]
+=== Don't use string concatenation
+
+If you are using `String` concatenation while logging, you are doing something 
very wrong and dangerous!
+
+* [ ] Don't use `String` concatenation to format arguments!
+This circumvents the handling of arguments by message type and layout.
+More importantly, **this approach is prone to attacks!**
+Imagine `userId` being provided by the user with the following content:
+`placeholders for non-existing args to trigger failure: {} {} 
\{dangerousLookup}`
++
+[source,java]
+----
+/* BAD! */ LOGGER.info("failed for user ID: " + userId);
+----
+
+* [x] Use message parameters.
+Parameterized messages allow safe encoding of parameters and avoid formatting 
totally if the message is filtered.
+For instance, if the associated level for the logger is discarded, no 
formatting will take place.
++
+[source,java]
+----
+/* GOOD */ LOGGER.info("failed for user ID `{}`", userId);
+----
+
+[#api-supplier]
+=== Use ``Supplier``s to pass computationally expensive arguments
+
+If one or more arguments of the log statement are computationally expensive, 
it is not wise to evaluate them knowing that their results can be discarded.
+Consider the following example:
+
+[source,java]
+----
+/* BAD! */ LOGGER.info("failed for user ID `{}` and role `{}`", userId, 
db.findUserRoleById(userId));
+----
+
+The database query (i.e., `db.findUserNameById(userId)`) can be a significant 
bottleneck if the created the log event will be discarded anyway – maybe the 
`INFO` level or the associated xref:manual/markers.adoc[marker] is not accepted 
for this package, or due to some other filtering.
+
+* [ ] The old-school way of solving this problem is to level-guard the log 
statement:
++
+[source,java]
+----
+/* BAD! */ if (LOGGER.isInfoEnabled()) { LOGGER.info(...); }
+----
++
+While this would work for cases where the message can be dropped due to 
insufficient level, this approach is still prone to other filtering cases; 
e.g., maybe the associated xref:manual/markers.adoc[marker] is not accepted.
+
+* [x] Use ``Supplier``s to pass arguments:
++
+[source,java]
+----
+/* GOOD */ LOGGER.info("failed for user ID `{}` and role `{}`", () -> userId, 
() -> db.findUserRoleById(userId));
+----
+
+* [x] Use a `Supplier` to pass the message:
++
+[source,java]
+----
+/* GOOD */ LOGGER.info(() -> new ParameterizedMessage("failed for user ID `{}` 
and role `{}`", userId, db.findUserRoleById(userId)));
+----
+
+[#core]
+== Tuning Log4j Core for performance
+
+Below sections walk you through a set of features that can have significant 
impact on the performance of Log4j Core.
+
+[IMPORTANT]
+====
+Extra tuning of any application will deviate you away from defaults and add up 
to the maintenance load.
+You are strongly advised to measure your application's overall performance and 
then, if Log4j is found to be an important bottleneck factor, tune it 
accordingly.
+When this happens, we also recommend you to evaluate your assumptions on a 
regular basis to check if they still hold.
+Remember, 
https://en.wikipedia.org/wiki/Program_optimization#When_to_optimize[premature 
optimization is the root of all evil].
+====
+
+[TIP]
+====
+Remember that a logging API and a logging implementation are two different 
things.
+You can use Log4j Core in combination with a logging API other than Log4j API 
(e.g., SLF4J).
+**The tips shared in this section are logging API agnostic.**
+====
+
+[#layouts]
+=== Layouts
+
+xref:manual/layouts.adoc[Layouts] are responsible for encoding a log event in 
a certain format (human-readable text, JSON, etc.) and they can have 
significant impact in your overall logging performance.
+
+[#layouts-location]
+==== Location information
+
+include::partial$manual/layouts-location-information.adoc[]
+
+[#layouts-efficiency]
+==== Layout efficiency
+
+Not all layouts are designed with the same performance considerations in mind.
+Following layouts are known to be well-optimized for performance-sensitive 
workloads:
+
+xref:manual/json-template-layout.adoc[JSON Template Layout]::
+It encodes log events into JSON according to the structure described by the 
template provided.
+Its output can safely be ingested into several log storage solutions: 
Elasticsearch, Google Cloud Logging, Graylog, Logstash, etc.
+
+xref:manual/layouts.adoc#PatternLayout[Pattern Layout]::
+It encodes log events into human-readable text according to the pattern 
provided.
+
+[#async]
+=== Asynchronous logging
+
+Asynchronous logging is useful to deal with bursts of events.
+How this works is that a minimum amount of work is done by the application 
thread to capture all required information in a log event, and this log event 
is then put on a queue for later processing by a background thread.
+As long as the queue is sized large enough, the application threads should be 
able to spend very little time on the logging call and return to the business 
logic very quickly.
+
+[#async-drawbacks]
+==== Asynchronous logging drawbacks
+
+.There are certain drawbacks associated with asynchronous logging:
+[%collapsible]
+====
+include::partial$manual/async-drawbacks.adoc[]
+====
+
+[#async-strategies]
+==== Asynchronous logging strategies
+
+Log4j provides following strategies users can choose from to do asynchronous 
logging:
+
+[#async-logger]
+===== Asynchronous logger
+
+xref:manual/async.adoc[Asynchronous loggers] use {lmax-disruptor-url}[LMAX 
Disruptor] messaging library to consume log events.
+Their aim is to return from a `log()` call to the application as soon as 
possible.
+
+[#async-appender]
+===== Asynchronous appender
+
+The xref:manual/appenders.adoc#AsyncAppender[asynchronous appender] accepts 
references to other appenders and causes log events to be written to them on a 
separate thread.
+The backing queue uses `ArrayBlockingQueue` by default, though it can be 
replaced with a better performing one suitable for your use case.
+
+[#gcfree]
+=== Garbage-free logging
+
+include::partial$manual/garbagefree-intro.adoc[]
+
+See xref:manual/garbagefree.adoc[] for details.
diff --git a/src/site/antora/modules/ROOT/partials/manual/async-drawbacks.adoc 
b/src/site/antora/modules/ROOT/partials/manual/async-drawbacks.adoc
new file mode 100644
index 0000000000..e840334a73
--- /dev/null
+++ b/src/site/antora/modules/ROOT/partials/manual/async-drawbacks.adoc
@@ -0,0 +1,36 @@
+////
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+////
+
+Error handling::
+If a problem happens during the logging process and an exception is thrown, it 
is less easy for an asynchronous setting to signal this problem to the 
application.
+This can partly be alleviated by configuring an exception handler, but this 
may still not cover all cases.
+
+Stateful messages::
+In some rare cases, care must be taken with mutable messages.
+Most of the time you don't need to worry about this.
+Log4 will ensure that log messages like `logger.debug("My object is {}", 
myObject)` will use the state of the `myObject` parameter at the time of the 
call to `logger.debug()`.
+The log message will not change even if `myObject` is modified later.
+It is safe to asynchronously log mutable objects because most 
xref:manual/messages.adoc[built-in `Message` implementations] take a snapshot 
of the parameters.
+There are some exceptions however: 
xref:manual/messages.adoc#MapMessage[`MapMessage`] and 
xref:manual/messages.adoc#StructuredDataMessage[`StructuredDataMessage`] are 
mutable by design, fields can be added to these messages after the message 
object was created.
+These messages should not be modified after they are logged with asynchronous 
loggers or asynchronous appenders; you may or may not see the modifications in 
the resulting log output.
+Similarly, custom `Message` implementations should be designed with 
asynchronous use in mind, and either take a snapshot of their parameters at 
construction time, or document their thread-safety characteristics.
+
+Computational overhead::
+If your application is running in an environment where CPU resources are 
scarce, like a machine with one CPU with a single core, starting another thread 
is not likely to give better performance.
+
+Appender performance::
+If the sustained rate at which your application is logging messages is faster 
than the maximum sustained throughput of the underlying appender, the queue 
will fill up and the application will end up logging at the speed of the 
slowest appender. If this happens, consider selecting a faster appender, or 
logging less. If neither of these is an option, you may get better throughput 
and fewer latency spikes by logging synchronously.
diff --git a/src/site/antora/modules/ROOT/pages/manual/performance.adoc 
b/src/site/antora/modules/ROOT/partials/manual/garbagefree-intro.adoc
similarity index 63%
copy from src/site/antora/modules/ROOT/pages/manual/performance.adoc
copy to src/site/antora/modules/ROOT/partials/manual/garbagefree-intro.adoc
index 02993d899d..db3bc30852 100644
--- a/src/site/antora/modules/ROOT/pages/manual/performance.adoc
+++ b/src/site/antora/modules/ROOT/partials/manual/garbagefree-intro.adoc
@@ -15,4 +15,7 @@
     limitations under the License.
 ////
 
-= Performance
+Garbage collection pauses are a common cause of latency spikes and for many 
systems significant effort is spent on controlling these pauses.
+Log4j allocates temporary `LogEvent`, `String`, `char[]`, `byte[]`, etc. 
objects during steady state logging.
+This contributes to pressure on the garbage collector and increases the 
frequency with which garbage collection pauses occur.
+In _garbage-free mode_, Log4j buffers and reuses objects to lessen this 
pressure.
diff --git 
a/src/site/antora/modules/ROOT/partials/manual/layouts-location-information.adoc
 
b/src/site/antora/modules/ROOT/partials/manual/layouts-location-information.adoc
new file mode 100644
index 0000000000..ae6a3bb004
--- /dev/null
+++ 
b/src/site/antora/modules/ROOT/partials/manual/layouts-location-information.adoc
@@ -0,0 +1,49 @@
+////
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+////
+
+Several layouts offer directives to include the _location information_: the 
caller class, method, file, and line.
+Log4j takes a snapshot of the stack, and walks the stack trace to find the 
location information.
+**This is an expensive operation** and should be avoided in 
performance-sensitive setups.
+
+[TIP]
+====
+Note that _the caller class_ of the location information and _the logger name_ 
are two different things.
+In most setups just using _the logger name_ – which doesn't incur any overhead 
to obtain while logging! – is a sufficient and **zero-cost substitute for the 
caller class**.
+
+[source,java]
+----
+package com.mycompany;
+
+public class PaymentService {
+
+    // Logger name: `com.mycompany.PaymentService`
+    private static final Logger LOGGER = LogManager.getLogger();
+
+    private static final class PaymentTransaction {
+        void doPayment() {
+            // Caller class: `com.mycompany.PaymentService$PaymentTransaction`
+            LOGGER.trace("...");
+        }
+    }
+
+}
+----
+====
+
+Asynchronous loggers need to capture the location information before passing 
the log message to another thread; otherwise the location information will be 
lost after that point.
+Due to associated performance implications, 
xref:manual/async.adoc[asynchronous loggers] and 
xref:manual/appenders.adoc#AsyncAppender[asynchronous appenders] do **not** 
include location information by default.
+You can override the default behaviour in your asynchronous logger or 
asynchronous appender configuration.

(logging-log4j2) 02/13: Rewrite `performance.adoc`

Reply via email to