On Fri, 17 Apr 2009, Luis Fernando Muñoz Mejías wrote:

> Hi,
>
>> I am thinking on how to enhance the engine so that fastest-possible
>> database writes (actually, any output) are possible. However, I come
>> across a couple of points. I would like to do so in the most generic
>> way. Let me quote those message parts that I have specific questions
>> on (out of sequence, thus I preserve the full message below - if you
>> need more context).
>>
>> > I made a small Python prototype to do something similar to what you
>> > propose, with no batches, but committing each 1000 entries. The
>> > speedup I got by introducing batches was about a factor 50. And the
>> > statement was already prepared.
>>
>> Could you check what actually brings most of the speedup - the batches
>> or the prepared statement. I am thinking along the lines of using
>> batches but not prepared statements, as in this sample
>>
>> begin
>> insert ...
>> insert ...
>> insert ...
>> insert ...
>> end
>
> I'll do, but please note that
>
> begin
> execute(unprepared_insert_statement)
> execute(unprepared_insert_statement)
> execute(unprepared_insert_statement)
> execute(unprepared_insert_statement)
> commit
>
> Needs 4 message exchanges with the server. OTOH:
>
> <client>
> push (@batch, $item);
> push (@batch, $item);
> push (@batch, $item);
> push (@batch, $item);
> <send to server>
> begin
> execute_many (insert_statement, @batch)
> commit
>
> Requires only one, so the network overhead is *way* smaller. This is
> true not only of Oracle, but also of PostgreSQL, and I suppose MySQL
> provides similar API.

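As a rough illustration of the two patterns quoted above, here is a minimal
Python sketch using the standard-library sqlite3 module. It is only a
stand-in: the table name `log` and the helper functions are hypothetical,
and an in-memory SQLite database has no network round trips, so this shows
the shape of the two calls (one execute() per row versus a single
executemany() over the whole batch), not the speedup discussed in the thread.

```python
import sqlite3


def make_db():
    # In-memory database; the "log" table is a hypothetical example schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (msg TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    # One execute() call per row, all inside a single transaction.
    # Against a networked server this would typically mean one message
    # exchange per statement.
    with conn:  # commits on success, rolls back on an exception
        for row in rows:
            conn.execute("INSERT INTO log (msg) VALUES (?)", row)


def insert_batched(conn, rows):
    # Collect the rows on the client first, then hand the whole batch
    # to the driver in one executemany() call inside one transaction.
    with conn:
        conn.executemany("INSERT INTO log (msg) VALUES (?)", rows)


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]


rows = [("message %d" % i,) for i in range(1000)]

a = make_db()
insert_one_by_one(a, rows)

b = make_db()
insert_batched(b, rows)

print(count(a), count(b))
```

Both variants end up with the same rows; which one is faster against a real
server, and by how much, depends on the driver and on where the bottleneck
actually is, which is the open question in this thread.
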
No disagreement that it's less network overhead, but in my experience
simple inserts aren't bottlenecked on the network overhead; they are
bottlenecked on transaction overhead (including fsync overhead).

> I'll try to verify where the hottest spot is, anyways.

Thanks.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com

