On Wed, 2006-06-07 at 00:03 -0500, David Wollmann wrote:
> Mark Drago wrote:
> > Hello,
> >
> > I'm writing a web cache and I want to use SQLite to store the log of all
> > of the accesses made through the web cache.  The idea is to install this
> > web cache in large institutions (1000-5000 workstations).  The log
> > database can grow in size very quickly and can reach in to the gigabytes
> > after just a few days.
> >
> > Writing to the database is speedy enough that I haven't seen much of a
> > problem.  I collect the data for 1000 web requests and then insert them
> > all in a single transaction using a prepared statement.  This works
> > rather well.
> >
> > The problem that I'm encountering has to do with generating reports on
> > the data in the log database.  SQLite is showing good performance on
> > some simple queries, but that is not the case once something more
> > advanced is involved, like an aggregate function for example.  More
> > over, once the SQLite file is cached in memory it is really quick.
> > However, I can't count on this file being cached at all when a user goes
> > to run the report.  So, I've been clearing my file cache before running
> > a test, and then running the same test again now that everything has
> > been loaded in to the cache.  Like I said, for most cases SQLite is
> > fine, but here is one example where it doesn't fare as well.
> >   
> [snip]
> 
> Our solution to a very similar problem was to generate new tables with
> daily subtotals (since our reports deal with daily usage), doing as much
> processing as possible in advance. This has worked very well, allowing
> all our reports to run quickly enough that they almost feel as if the
> results are cached. I'm not a database guru by any means, this approach
> was suggested to me by a friend who writes code to manage CDRs for a
> telecom billing system. I owe him a beer. Ok, several beers.

We tried doing this with some of our reports and it does help with the
speed.  But, we wanted to support a broader range of reporting options
than just daily snapshots.  We wanted to allow the user to view
arbitrary date ranges and to slice up the data in other ways, so
creating daily reports that were mostly static wasn't something we
really wanted to do.  We also wanted the reports to contain information
that was relatively recent (last 15 minutes or so).  We didn't want to
make the user wait a full day for their report to be available.

Thanks for the reply,
Mark.

Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to