Re: JDBC optimizations in 3.x
Hi Gary, On Mon, 16 Oct 2023 at 23:45, Gary Gregory wrote: > > 4. The appender itself is quite wasteful in the number of connections > > it uses (one per log message). IIRC JDBC connections are not > > thread-safe, but can be used from multiple threads if synchronization > > is provided. `AbstractDatabaseManager#write` provides such a > > synchronization. With the current connection usage pattern, the > > "DriverManager" connection source is basically useless. > > 5. We should consider integrating the JDBC pooling features with PR > > 1401: https://github.com/apache/logging-log4j2/pull/1401. > > Pooling is already supported by the jdbc-dbcp2 module. > What am I missing? Regarding 4: I think that the current solution is suboptimal for both DriverManager and DBCP2 users. DBCP2 users have a wonderful connection pool, but due to double synchronization (in `AbstractDatabaseAppender#append` and `AbstractDatabaseManager#write`), they only use a single connection at each time. DriverManager could be good choice for an asynchronous appender/logger: it just needs to create a connection at startup and release it when the application is shut down. However, due to the way connections are used right now each log event or batch of events incurs in the overhead of creating a DB connection. Regarding 5: I haven't figured out how `StringBuilder` caching/pooling relates to `Connection` caching/pooling, but I think it is a direction worth exploring. I am aware that DBCP2 is more than Commons Pool 2 + DriverManager, so the JDBC appender itself will probably not profit from the `Recycler` interface, but other database appenders might. I have never user NoSQL databases, but I don't believe they all offer thread-safe clients with an integrated connection pool. 6. Another improvement that we could add to the JDBC appender is buffering the log events using `PreparedStatement#addBatch` instead of taking memento's of events and appending them to a list. Piotr
Re: JDBC optimizations in 3.x
Below... On Mon, Oct 16, 2023 at 7:38 AM Piotr P. Karwasz wrote: > > Hi all, > > While checking out the module descriptors of `3.x` I have some > propositions of improvements to `log4j-jdbc` that I would like to run > by you before creating the appropriate Github issues: > > 1. The `log4j-jdbc` module depends on the optional presence of > `log4j-jndi`. Maybe we should split JNDI support into a > `log4j-jdbc-jndi` artifact. This way users that do not use JNDI can be > 100% certain that no JNDI code is present. Users that require JNDI > have a single dependency to add (`log4j-jdbc-jndi`), Fine with me but in general I like to only split out this kind of dependency to avoid optional jar dependencies in a pom.xml. I'll go with the consensus on this one. > 2. There are two ways to map event data to columns: ColumnConfig and > ColumnMapping. It is unclear to me which way is the recommended one. > Since we can perform breaking changes in 3.x, could we remove one of > these methods? I would not REMOVE one, I would merge the functionality of both into one, probably into ColumnMapping since it is at the "db" level instead of the lower "jdbc" level. Either way, merging is the way to go, at least without taking a deep dive back into it. I can tell you that I rely on the JDBC and DBCP variant module. > 3. Literal values are inserted AS-IS into the prepared statement. Java > 9 introduced `Statement#enquoteLiteral`, maybe we should use it by > default, so users are not required to quote the value themselves. We > can provide an additional `quoteLiteral` attribute with a default of > true, to allow users to add whatever they want to the prepared > statement unquoted, I can't say without trying the API and seeing how it works in practice. I would do 3 above first to avoid duplicating work. > 4. The appender itself is quite wasteful in the number of connections > it uses (one per log message). IIRC JDBC connections are not > thread-safe, but can be used from multiple threads if synchronization > is provided. `AbstractDatabaseManager#write` provides such a > synchronization. With the current connection usage pattern, the > "DriverManager" connection source is basically useless. > 5. We should consider integrating the JDBC pooling features with PR > 1401: https://github.com/apache/logging-log4j2/pull/1401. Pooling is already supported by the jdbc-dbcp2 module. What am I missing? Gary > > Piotr
Re: JDBC optimizations in 3.x
> On Oct 16, 2023, at 6:38 AM, Piotr P. Karwasz wrote: > > 2. There are two ways to map event data to columns: ColumnConfig and > ColumnMapping. It is unclear to me which way is the recommended one. > Since we can perform breaking changes in 3.x, could we remove one of > these methods? I added ColumnMapping in 2.8 originally for the Cassandra appender, but that was later integrated into the other database appenders. I suggest we stick with ColumnMapping as the more flexible approach.
Re: JDBC optimizations in 3.x
Great questions and I will address each one, hopefully later today. Gary On Mon, Oct 16, 2023, 7:38 AM Piotr P. Karwasz wrote: > Hi all, > > While checking out the module descriptors of `3.x` I have some > propositions of improvements to `log4j-jdbc` that I would like to run > by you before creating the appropriate Github issues: > > 1. The `log4j-jdbc` module depends on the optional presence of > `log4j-jndi`. Maybe we should split JNDI support into a > `log4j-jdbc-jndi` artifact. This way users that do not use JNDI can be > 100% certain that no JNDI code is present. Users that require JNDI > have a single dependency to add (`log4j-jdbc-jndi`), > 2. There are two ways to map event data to columns: ColumnConfig and > ColumnMapping. It is unclear to me which way is the recommended one. > Since we can perform breaking changes in 3.x, could we remove one of > these methods? > 3. Literal values are inserted AS-IS into the prepared statement. Java > 9 introduced `Statement#enquoteLiteral`, maybe we should use it by > default, so users are not required to quote the value themselves. We > can provide an additional `quoteLiteral` attribute with a default of > true, to allow users to add whatever they want to the prepared > statement unquoted, > 4. The appender itself is quite wasteful in the number of connections > it uses (one per log message). IIRC JDBC connections are not > thread-safe, but can be used from multiple threads if synchronization > is provided. `AbstractDatabaseManager#write` provides such a > synchronization. With the current connection usage pattern, the > "DriverManager" connection source is basically useless. > 5. We should consider integrating the JDBC pooling features with PR > 1401: https://github.com/apache/logging-log4j2/pull/1401. > > Piotr >