Ceki et al.,
First, my apologies for being terse in my message, I'm just short on
time this month.
Problems:
addBatch is an optional method in Statement. It is not guaranteed to
be implemented by all drivers (in fact it is not implemented in the
MySQL driver I use).
Your logEvents table definition is not valid on all databases. In
fact it's not valid on -most- databases. And any table we come up
with will lack something a user may want, until we add so much that
it's overly bloated for 95% of the users. Then we would likely need to
write several different versions of the table-creation script to
support the different databases.
Note that your prepared statements were in fact slower than simple
statements, and not all databases truly support prepared statements
(MySQL is the only slacker that I know of, but there may be others).
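The addBatch problem, at least, can be guarded against at runtime: a
driver reports through its metadata whether it implements JDBC 2.0
batching. A minimal sketch of that guard (the table name and the way
the Connection is obtained are assumptions for illustration, not part
of any current appender):

```java
import java.sql.*;
import java.util.List;

public class PortableBatchInsert {

    // addBatch() is optional in JDBC, so ask the driver first.
    // A missing or failing DatabaseMetaData counts as "no batch support".
    public static boolean canBatch(DatabaseMetaData meta) {
        if (meta == null) {
            return false;
        }
        try {
            return meta.supportsBatchUpdates();
        } catch (SQLException e) {
            return false;
        }
    }

    public static void insertEvents(Connection conn, List<String> messages)
            throws SQLException {
        boolean batch = canBatch(conn.getMetaData());
        PreparedStatement stmt = conn.prepareStatement(
            "INSERT INTO logEvents (message) VALUES (?)");
        try {
            for (String m : messages) {
                stmt.setString(1, m);
                if (batch) {
                    stmt.addBatch();       // one round trip for the whole batch
                } else {
                    stmt.executeUpdate();  // portable fallback, one trip per row
                }
            }
            if (batch) {
                stmt.executeBatch();
            }
        } finally {
            stmt.close();
        }
    }
}
```

This keeps the fast path for drivers like PostgreSQL's while degrading
gracefully on drivers that skip the optional method.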
Suggestions/Solutions/Previous design considerations:
When I first wrote the JDBCAppender I knew that it would be used on a
variety of databases in different circumstances for users with different
logging needs. The base level JDBCAppender needs to support that level
of generality. The only way to support that variety is to leave the sql
and the table definition to the user. The current JDBCAppender supports
this in an easy and straightforward way. In particular, if users want
prepared statements they can set the sql to call a stored procedure of
their choice (in essence what the prepared statement does). The
advantage here is that the users' table can be matched to their
particular needs -- for example, storing Thread but not NDC, or Class &
line number, or an Event extension... It also allows them to log to an
existing table in a legacy installation.
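For illustration, user-supplied SQL boils down to substituting event
fields into a configured statement. The %TOKEN% placeholder syntax and
class name below are made up for this sketch and are not the
JDBCAppender's actual option names; the point is only that any table
shape, including a legacy one, can be targeted this way:

```java
import java.util.*;

public class SqlPatternFormatter {

    // Doubles single quotes so event text cannot break the statement.
    public static String escape(String s) {
        return s == null ? "" : s.replace("'", "''");
    }

    // Substitutes %TOKEN% placeholders in the user-supplied SQL
    // with escaped event field values.
    public static String format(String sqlPattern, Map<String, String> fields) {
        String out = sqlPattern;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            out = out.replace("%" + e.getKey() + "%", escape(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<>();
        fields.put("MESSAGE", "it's alive");
        fields.put("LEVEL", "DEBUG");
        // prints: INSERT INTO myLegacyTable (prio, msg) VALUES ('DEBUG', 'it''s alive')
        System.out.println(format(
            "INSERT INTO myLegacyTable (prio, msg) VALUES ('%LEVEL%', '%MESSAGE%')",
            fields));
    }
}
```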
Due to the problems above and the design goals in the previous
paragraph, I am strongly opposed to tying the basic JDBCAppender to a
particular table definition.
There clearly is value to having other components able to query the
database for log events. Chainsaw is the obvious example. Someone just
posted to the user list that Chainsaw can already do this. It's my
opinion that the database reader bears the burden of being configurable to
the proper database.
Alternatively, log4j could have a -preferred- table definition. A few
scripts and default configuration files could be included with the
distribution for this preferred definition. Then log4j would integrate
on that 'out of the box', but would easily be configurable to something
else. Richard Post's points about views and item-mapping are
particularly relevant here.
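To make the "preferred definition" idea concrete, the distribution
could ship a script along these lines. The column types are
illustrative and would need per-database variants, and the legacy
table and column names in the view are hypothetical:

```sql
-- Preferred out-of-the-box definition (types vary by database).
CREATE TABLE logEvents (
  loggerName  VARCHAR(254),
  timestamp   BIGINT,        -- some databases need NUMERIC instead
  levelName   VARCHAR(32),
  message     VARCHAR(1024),
  NDC         VARCHAR(1024)
);

-- A site with an existing table can expose the preferred shape as a
-- view, mapping its own columns onto the preferred names.
CREATE VIEW logEvents AS
  SELECT cat  AS loggerName,
         ts   AS timestamp,
         prio AS levelName,
         msg  AS message,
         ctx  AS NDC
  FROM legacy_log;  -- hypothetical legacy table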
Along those same lines, making the JDBCAppender asynchronous via the
AsyncAppender is clearly of value to users who can't afford to wait
for the insert. An example config file in the distribution should solve this
issue very neatly.
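A sketch of what such an example config might look like, in log4j's
XML configuration format. The JDBCAppender class name, parameter
names, and the %m substitution are assumptions here and would need to
match the real appender's options:

```xml
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

  <appender name="JDBC" class="org.apache.log4j.jdbc.JDBCAppender">
    <param name="driver" value="org.postgresql.Driver"/>
    <param name="URL" value="jdbc:postgresql://somehost/someDatabaseName"/>
    <param name="user" value="logger"/>
    <param name="password" value="secret"/>
    <param name="sql"
           value="INSERT INTO logEvents (message) VALUES ('%m')"/>
  </appender>

  <!-- AsyncAppender hands events to a background thread, so the
       logging caller never waits on the INSERT. -->
  <appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
    <appender-ref ref="JDBC"/>
  </appender>

  <root>
    <level value="debug"/>
    <appender-ref ref="ASYNC"/>
  </root>

</log4j:configuration>
```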
Ceki Gülcü wrote:
>
> Kevin and others,
>
> Given the recent remarks on JDBCAppender performance, one possibility
> for improvement is to use prepared statements and batch processing.
>
> I have created a table called logEvents for testing purposes:
>
> CREATE TABLE logEvents (
> loggerName varchar(200),
> timestamp bigint,
> levelName varChar(32),
> message varchar(1024),
> NDC varchar(1024)
> );
>
> (I am using PostgreSQL 7.1.3).
>
> Here is a some JDBC code to exercise the table.
>
> import java.sql.*;
> import org.apache.log4j.*;
> import org.apache.log4j.spi.*;
>
> public class JDBCTest {
>
> public static void main(String[] args) throws Exception {
>
> Logger root = Logger.getRootLogger();
>
> Connection conn = null;
> String driver = "org.postgresql.Driver";
>
> Class.forName(driver).newInstance();
>
> conn = DriverManager.getConnection(args[0], args[1], args[2]);
>
> double start;
> double end;
> int LOOP_LEN = 100;
> int counter = 0;
>
> // -------------------------- Normal statement:
> start = System.currentTimeMillis();
> Statement s = conn.createStatement(); // created once and reused
>
> for(int i = 0; i < LOOP_LEN; i++) {
> NDC.push("counter "+(counter++));
> LoggingEvent event = new LoggingEvent(Category.class.getName(),
> root, Level.DEBUG,
> "message " + i,
> null);
>
> s.executeUpdate("INSERT INTO logEvents (loggerName, timestamp, "
> + "levelName, message, NDC) "
> + "VALUES ('"+ event.logger.getName()+ "', "
> + event.timeStamp + ", '"
> + event.level.toString() + "', '"
> + event.getRenderedMessage() + "', '"
> + event.getNDC() + "')");
> NDC.pop();
> }
> s.close();
> end = System.currentTimeMillis();
> System.out.println("Overall (simple statement) : "+(end-start));
> System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN + " in microsecs.");
>
> PreparedStatement stmt;
>
> // Prepared statement
> start = System.currentTimeMillis();
> stmt = conn.prepareStatement(
>     "INSERT INTO logEvents (loggerName, timestamp, "
>     + "levelName, message, NDC) "
>     + "VALUES (?, ?, ?, ?, ?)");
>
> for(int i = 0; i < LOOP_LEN; i++) {
> NDC.push("counter "+(counter++));
> MDC.put("hello", "x");
> LoggingEvent event = new LoggingEvent(Category.class.getName(),
> root, Level.DEBUG,
> "message " + i,
> null);
>
> stmt.setString(1, event.logger.getName());
> stmt.setLong(2, event.timeStamp);
> stmt.setString(3, event.level.toString());
> stmt.setString(4, event.getRenderedMessage());
>
> stmt.setString(5, event.getNDC());
> NDC.pop();
> stmt.executeUpdate();
> }
> stmt.close();
> end = System.currentTimeMillis();
> System.out.println("Overall (prepared statement) : "+(end-start));
> System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN + " in microsecs.");
>
> // --- Batch mode -----------------------
> start = System.currentTimeMillis();
> stmt = conn.prepareStatement(
>     "INSERT INTO logEvents (loggerName, timestamp, "
>     + "levelName, message, NDC) "
>     + "VALUES (?, ?, ?, ?, ?)");
>
> for(int i = 0; i < LOOP_LEN; i++) {
> NDC.push("counter "+(counter++));
> LoggingEvent event = new LoggingEvent(Category.class.getName(),
> root, Level.DEBUG,
> "message " + i,
> null);
>
> stmt.setString(1, event.logger.getName());
> stmt.setLong(2, event.timeStamp);
> stmt.setString(3, event.level.toString());
> stmt.setString(4, event.getRenderedMessage());
>
> stmt.setString(5, event.getNDC());
> NDC.pop();
> stmt.addBatch();
> }
> stmt.executeBatch();
> stmt.close();
> end = System.currentTimeMillis();
> System.out.println("Overall (batched prepared statement) : "+(end-start));
> System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN + " in microsecs.");
>
> conn.close();
>
> }
> }
>
>
> Running this test code gives:
>
> ~/ >java JDBCTest jdbc:postgresql://somehost/someDatabaseName ceki ****
>
> Overall (simple statement) : 411.0
> Average: 4110.0 in microsecs.
>
> Overall (prepared statement) : 421.0
> Average: 4210.0 in microsecs.
>
> Overall (batched prepared statement) : 150.0
> Average: 1500.0 in microsecs.
>
>
> As you can see prepared batch statements are significantly faster (3x)
> than prepared statements, whereas prepared statements are only
> marginally faster than simple statements. These results depend on the
> database and your mileage may vary.
>
> Notice that I did not insert the MDC or the throwable string
> representation. IMHO, these fields are best represented as BLOBs,
> which PostgreSQL version 7.1.3 does not support, although 7.2 does
> reportedly support them.
>
> (The only alternative I see to BLOBs are bit fields or arrays.)
>
> Once we settle on the best representation of the table, I think a
> table structure (table name and column names) should be fixed once and
> for all. This would allow other components to query the database and
> present the results to the user in a convenient form. This cannot be
> done if the underlying table name and columns are not fixed.
>
> That is it for the moment.
>
>
> --
> Ceki
>
>
--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>