Re: JDBC optimizations in 3.x

2023-10-17 Thread Piotr P. Karwasz
Hi Gary,

On Mon, 16 Oct 2023 at 23:45, Gary Gregory  wrote:
> > 4. The appender itself is quite wasteful in the number of connections
> > it uses (one per log message). IIRC JDBC connections are not
> > thread-safe, but can be used from multiple threads if synchronization
> > is provided. `AbstractDatabaseManager#write` provides such a
> > synchronization. With the current connection usage pattern, the
> > "DriverManager" connection source is basically useless.
> > 5. We should consider integrating the JDBC pooling features with PR
> > 1401: https://github.com/apache/logging-log4j2/pull/1401.
>
> Pooling is already supported by the jdbc-dbcp2 module.
> What am I missing?

Regarding 4: I think that the current solution is suboptimal for both
DriverManager and DBCP2 users.

DBCP2 users have a wonderful connection pool, but due to double
synchronization (in `AbstractDatabaseAppender#append` and
`AbstractDatabaseManager#write`), they only use a single connection at
each time.

DriverManager could be good choice for an asynchronous
appender/logger: it just needs to create a connection at startup and
release it when the application is shut down. However, due to the way
connections are used right now each log event or batch of events
incurs in the overhead of creating a DB connection.

Regarding 5: I haven't figured out how `StringBuilder` caching/pooling
relates to `Connection` caching/pooling, but I think it is a direction
worth exploring. I am aware that DBCP2 is more than Commons Pool 2 +
DriverManager, so the JDBC appender itself will probably not profit
from the `Recycler` interface, but other database appenders might. I
have never user NoSQL databases, but I don't believe they all offer
thread-safe clients with an integrated connection pool.

6. Another improvement that we could add to the JDBC appender is
buffering the log events using `PreparedStatement#addBatch` instead of
taking memento's of events and appending them to a list.

Piotr


Re: JDBC optimizations in 3.x

2023-10-16 Thread Gary Gregory
Below...

On Mon, Oct 16, 2023 at 7:38 AM Piotr P. Karwasz
 wrote:
>
> Hi all,
>
> While checking out the module descriptors of `3.x` I have some
> propositions of improvements to `log4j-jdbc` that I would like to run
> by you before creating the appropriate Github issues:
>
> 1. The `log4j-jdbc` module depends on the optional presence of
> `log4j-jndi`. Maybe we should split JNDI support into a
> `log4j-jdbc-jndi` artifact. This way users that do not use JNDI can be
> 100% certain that no JNDI code is present. Users that require JNDI
> have a single dependency to add (`log4j-jdbc-jndi`),

Fine with me but in general I like to only split out this kind of
dependency to avoid optional jar dependencies in a pom.xml.
I'll go with the consensus on this one.


> 2. There are two ways to map event data to columns: ColumnConfig and
> ColumnMapping. It is unclear to me which way is the recommended one.
> Since we can perform breaking changes in 3.x, could we remove one of
> these methods?

I would not REMOVE one, I would merge the functionality of both into
one, probably into ColumnMapping since it is at the "db" level instead
of the lower "jdbc" level.
Either way, merging is the way to go, at least without taking a deep
dive back into it. I can tell you that I rely on the JDBC and DBCP
variant module.

> 3. Literal values are inserted AS-IS into the prepared statement. Java
> 9 introduced `Statement#enquoteLiteral`, maybe we should use it by
> default, so users are not required to quote the value themselves. We
> can provide an additional `quoteLiteral` attribute with a default of
> true, to allow users to add whatever they want to the prepared
> statement unquoted,

I can't say without trying the API and seeing how it works in
practice. I would do 3 above first to avoid duplicating work.

> 4. The appender itself is quite wasteful in the number of connections
> it uses (one per log message). IIRC JDBC connections are not
> thread-safe, but can be used from multiple threads if synchronization
> is provided. `AbstractDatabaseManager#write` provides such a
> synchronization. With the current connection usage pattern, the
> "DriverManager" connection source is basically useless.
> 5. We should consider integrating the JDBC pooling features with PR
> 1401: https://github.com/apache/logging-log4j2/pull/1401.

Pooling is already supported by the jdbc-dbcp2 module.
What am I missing?

Gary

>
> Piotr


Re: JDBC optimizations in 3.x

2023-10-16 Thread Matt Sicker

> On Oct 16, 2023, at 6:38 AM, Piotr P. Karwasz  wrote:
> 
> 2. There are two ways to map event data to columns: ColumnConfig and
> ColumnMapping. It is unclear to me which way is the recommended one.
> Since we can perform breaking changes in 3.x, could we remove one of
> these methods?

I added ColumnMapping in 2.8 originally for the Cassandra appender, but that 
was later integrated into the other database appenders. I suggest we stick with 
ColumnMapping as the more flexible approach.

Re: JDBC optimizations in 3.x

2023-10-16 Thread Gary Gregory
Great questions and I will address each one, hopefully later today.

Gary

On Mon, Oct 16, 2023, 7:38 AM Piotr P. Karwasz 
wrote:

> Hi all,
>
> While checking out the module descriptors of `3.x` I have some
> propositions of improvements to `log4j-jdbc` that I would like to run
> by you before creating the appropriate Github issues:
>
> 1. The `log4j-jdbc` module depends on the optional presence of
> `log4j-jndi`. Maybe we should split JNDI support into a
> `log4j-jdbc-jndi` artifact. This way users that do not use JNDI can be
> 100% certain that no JNDI code is present. Users that require JNDI
> have a single dependency to add (`log4j-jdbc-jndi`),
> 2. There are two ways to map event data to columns: ColumnConfig and
> ColumnMapping. It is unclear to me which way is the recommended one.
> Since we can perform breaking changes in 3.x, could we remove one of
> these methods?
> 3. Literal values are inserted AS-IS into the prepared statement. Java
> 9 introduced `Statement#enquoteLiteral`, maybe we should use it by
> default, so users are not required to quote the value themselves. We
> can provide an additional `quoteLiteral` attribute with a default of
> true, to allow users to add whatever they want to the prepared
> statement unquoted,
> 4. The appender itself is quite wasteful in the number of connections
> it uses (one per log message). IIRC JDBC connections are not
> thread-safe, but can be used from multiple threads if synchronization
> is provided. `AbstractDatabaseManager#write` provides such a
> synchronization. With the current connection usage pattern, the
> "DriverManager" connection source is basically useless.
> 5. We should consider integrating the JDBC pooling features with PR
> 1401: https://github.com/apache/logging-log4j2/pull/1401.
>
> Piotr
>