Ceki et al.,

First, my apologies for being terse in my message, I'm just short on 
time this month.

Problems:
   addBatch is an optional method in Statement.  It is not guaranteed to 
be implemented by all drivers (in fact it is not implemented in the 
MySQL driver I use). 

   Your logEvents table definition is not valid on all databases.  In 
fact, it's not valid on -most- databases.  Any table we come up 
with will lack something a user may want, until we add so much that 
it's overly bloated for 95% of the users.  Then we would likely need to 
write several different versions of the table-creation script to support 
the different databases.

   Note that your prepared statements were in fact slower than simple 
statements, and not all databases truly support prepared statements 
(MySQL is the only slacker that I know of, but there may be others).
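To make the addBatch concern concrete: an appender can probe for batch support and fall back to plain executeUpdate when the driver refuses.  Below is a minimal sketch of that guard.  The class name, helper names, and the proxy-based fake driver are purely illustrative -- the proxy only exists so the fallback path can be exercised without a real database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.SQLException;
import java.sql.Statement;

public class BatchFallback {

    // Try the optional addBatch(); if the driver rejects or simply omits
    // the method, fall back to an immediate executeUpdate().  Returns true
    // when the row was queued in a batch, false when it ran directly.
    static boolean queueOrExecute(Statement s, String sql) throws SQLException {
        try {
            s.addBatch(sql);
            return true;
        } catch (SQLException e) {        // driver reports "not supported"
            s.executeUpdate(sql);
            return false;
        } catch (AbstractMethodError e) { // driver predates the method entirely
            s.executeUpdate(sql);
            return false;
        }
    }

    // A stand-in Statement built with a dynamic proxy whose addBatch()
    // always refuses -- it simulates a driver without batch support.
    static Statement fakeStatement() {
        InvocationHandler h = (proxy, method, args) -> {
            if (method.getName().equals("addBatch"))
                throw new SQLException("addBatch not supported");
            if (method.getName().equals("executeUpdate"))
                return Integer.valueOf(1); // pretend one row was inserted
            return null;
        };
        return (Statement) Proxy.newProxyInstance(
            Statement.class.getClassLoader(),
            new Class[] { Statement.class }, h);
    }

    public static void main(String[] args) throws Exception {
        boolean batched = queueOrExecute(fakeStatement(),
            "INSERT INTO logEvents (message) VALUES ('hi')");
        System.out.println(batched ? "batched" : "fell back to executeUpdate");
    }
}
```

In a real appender the boolean result could also be cached after the first failure, so the exception cost is paid only once per connection.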

Suggestions/Solutions/Previous design considerations:

   When I first wrote the JDBCAppender I knew that it would be used on a 
variety of databases in different circumstances for users with different 
logging needs.  The base level JDBCAppender needs to support that level 
of generality.  The only way to support that variety is to leave the sql 
and the table definition to the user.  The current JDBCAppender supports 
this in an easy and straightforward way.  In particular, if users want 
prepared statements they can set the sql to call a stored procedure of 
their choice (in essence what the prepared statement does).  The 
advantage here is that the users' table can be matched to their 
particular needs -- for example, storing Thread but not NDC, or Class & 
line number, or an Event extension...  It also allows them to log to an 
existing table in a legacy installation.
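As a sketch of what "leaving the sql to the user" looks like in practice, here is a properties-file fragment.  The option names (driver, URL, sql) follow the org.apache.log4j.jdbc variant and may be spelled differently in other JDBCAppender builds; the table name myAppLog and its columns are purely illustrative:

```properties
log4j.appender.DB=org.apache.log4j.jdbc.JDBCAppender
log4j.appender.DB.driver=org.postgresql.Driver
log4j.appender.DB.URL=jdbc:postgresql://somehost/someDatabaseName
log4j.appender.DB.user=someUser
log4j.appender.DB.password=somePassword
# The user owns the SQL and the table: it can target a legacy table,
# drop columns they don't need, or call a stored procedure.
log4j.appender.DB.sql=INSERT INTO myAppLog (thread, logger, message) VALUES ('%t', '%c', '%m')
```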
   Due to the problems above and the design goals in the previous 
paragraph, I am strongly opposed to tying the basic JDBCAppender to a 
particular table definition.

   There clearly is value to having other components able to query the 
database for log events.  Chainsaw is the obvious example.  Someone just 
posted to the user list that Chainsaw can already do this.  It's my 
opinion that the database reader bears the burden of being configurable 
to the database actually in use. 
   Alternatively, log4j could have a -preferred- table definition.  A few 
scripts and default configuration files could be included with the 
distribution for this preferred definition.  Then log4j would integrate 
with that 'out of the box', but would easily be configurable to something 
else.  Richard Post's points about views and item-mapping are 
particularly relevant here. 

   Along the same lines, making the JDBCAppender asynchronous via the 
AsyncAppender is clearly of value to users who cannot afford to block on 
the insert.  An example config file in the distribution should solve this 
issue very neatly.
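Since AsyncAppender is wired up through appender-refs, such an example would use the XML configurator.  A sketch, with illustrative appender names and the JDBCAppender's own options elided:

```xml
<appender name="DB" class="org.apache.log4j.jdbc.JDBCAppender">
  <!-- driver, URL, user, password, sql options go here -->
</appender>

<appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
  <appender-ref ref="DB"/>
</appender>

<root>
  <appender-ref ref="ASYNC"/>
</root>
```

Logging calls then return as soon as the event is queued; the insert happens on the AsyncAppender's dispatcher thread.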

Ceki Gülcü wrote:

> 
> Kevin and others,
> 
> Given the recent remarks on JDBCAppender performance, one possibility
> for improvement is to use prepared statements and batch processing.
> 
> I have created a table called logEvents for testing purposes:
> 
> CREATE TABLE logEvents (
>    loggerName varchar(200),
>    timestamp bigint,
>    levelName varChar(32),
>    message varchar(1024),
>    NDC varchar(1024)
> );
> 
> (I am using PostgreSQL 7.1.3.)
> 
> Here is a some JDBC code to exercise the table.
> 
> import java.sql.*;
> import org.apache.log4j.*;
> import org.apache.log4j.spi.*;
> 
> public class JDBCTest {
> 
>   public static void main(String[] args) throws Exception {
> 
>     Logger root = Logger.getRootLogger();
> 
>     Connection conn = null;
>     String driver = "org.postgresql.Driver";
> 
>     Class.forName(driver).newInstance();
> 
>     conn = DriverManager.getConnection(args[0], args[1], args[2]);
> 
>     double start;
>     double end;
>     int LOOP_LEN = 100;
>     int counter = 0;
> 
>     // -------------------------- Normal statement:
>     start = System.currentTimeMillis();
>     Statement s = null;
> 
>     for(int i = 0; i <  LOOP_LEN; i++) {
>       s = conn.createStatement();
>       NDC.push("counter "+(counter++));
>       LoggingEvent event = new LoggingEvent(Category.class.getName(),
>                                             root, Level.DEBUG,
>                                             "message " + i,
>                                             null);
> 
>       s.executeUpdate("INSERT INTO logEvents (loggerName, timestamp, "
>                                             + "levelName, message, NDC)"
>       + "VALUES ('"+ event.logger.getName()+ "', "
>       + event.timeStamp + ", '"
>       + event.level.toString() + "', '"
>       + event.getRenderedMessage() + "', '"
>       + event.getNDC() + "')");
>       NDC.pop();
>     }
>     s.close();
>     end = System.currentTimeMillis();
>     System.out.println("Overall (simple statement) : "+(end-start));
>     System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN
>                        + " in microsecs.");
> 
>     PreparedStatement stmt;
> 
>     // Prepared statement
>     start = System.currentTimeMillis();
>     stmt = conn.prepareStatement("INSERT INTO logEvents (loggerName, "
>                                  + "timestamp, levelName, message, NDC) "
>                                  + "VALUES (?, ?, ?, ?, ?)");
> 
>     for(int i = 0; i <  LOOP_LEN; i++) {
>       NDC.push("counter "+(counter++));
>       MDC.put("hello", "x");
>       LoggingEvent event = new LoggingEvent(Category.class.getName(),
>                                             root, Level.DEBUG,
>                                             "message " + i,
>                                             null);
> 
>       stmt.setString(1, event.logger.getName());
>       stmt.setLong(2, event.timeStamp);
>       stmt.setString(3, event.level.toString());
>       stmt.setString(4, event.getRenderedMessage());
> 
>       stmt.setString(5, event.getNDC());
>       NDC.pop();
>       stmt.executeUpdate();
>     }
>     stmt.close();
>     end = System.currentTimeMillis();
>     System.out.println("Overall (prepared statement) : "+(end-start));
>     System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN
>                        + " in microsecs.");
> 
>     // --- Batch mode -----------------------
>     start = System.currentTimeMillis();
>     stmt = conn.prepareStatement("INSERT INTO logEvents (loggerName, "
>                                  + "timestamp, levelName, message, NDC) "
>                                  + "VALUES (?, ?, ?, ?, ?)");
> 
>     for(int i = 0; i <  LOOP_LEN; i++) {
>       NDC.push("counter "+(counter++));
>       LoggingEvent event = new LoggingEvent(Category.class.getName(),
>                                             root, Level.DEBUG,
>                                             "message" + i,
>                                             null);
> 
>       stmt.setString(1, event.logger.getName());
>       stmt.setLong(2, event.timeStamp);
>       stmt.setString(3, event.level.toString());
>       stmt.setString(4, event.getRenderedMessage());
> 
>       stmt.setString(5, event.getNDC());
>       NDC.pop();
>       stmt.addBatch();
>     }
>     stmt.executeBatch();
>     stmt.close();
>     end = System.currentTimeMillis();
>     System.out.println("Overall (batch statement) : "+(end-start));
>     System.out.println("Average: " + ((end-start)*1000)/LOOP_LEN
>                        + " in microsecs.");
> 
>     conn.close();
> 
>   }
> }
> 
> 
> Running this test code gives:
> 
> ~/ >java JDBCTest  jdbc:postgresql://somehost/someDatabaseName ceki ****
> 
> Overall (simple statement) : 411.0
> Average: 4110.0 in microsecs.
> 
> Overall (prepared statement) : 421.0
> Average: 4210.0 in microsecs.
> 
> Overall (batch statement) : 150.0
> Average: 1500.0 in microsecs.
> 
> 
> As you can see prepared batch statements are significantly faster (3x)
> than prepared statements, whereas prepared statements are only
> marginally faster than simple statements. These results depend on the
> database and your mileage may vary.
> 
> Notice that I did not insert the MDC or the throwable string
> representation. IMHO, these fields are best represented as BLOBs,
> which PostgreSQL version 7.1.3 does not support, although 7.2
> reportedly does.
> 
> (The only alternative I see to BLOBs are bit fields or arrays.)
> 
> Once we settle on the best representation of the table, I think the
> table structure (table name and column names) should be fixed once and
> for all. This would allow other components to query the database and
> present the results to the user in a convenient form. This cannot be
> done if the underlying table name and columns are not fixed.
> 
> That is it for the moment.
> 
> 
> -- 
> Ceki
> 
> 

