After posting my question, I found the discussion of how aggregate
operations are performed in the VDBE Tutorial; that implies that memory
usage will correspond with the number of unique keys encountered by the
query, but I appreciate having it stated explicitly.

   How difficult would it be, in concept, to change the storage of the
hash buckets from in-memory to on-disk?  The VDBE Tutorial makes it
sound like it would be a matter of changing the AggFocus, AggNext, and
maybe a few other operations to store/retrieve buckets from disk before
operating on them in memory.  How dramatically am I oversimplifying
this? :)

   Thanks
   -Tom

> -----Original Message-----
> From: D. Richard Hipp [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, March 24, 2005 10:32 AM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Memory usage for queries containing a 
> GROUP BY clause
> 
> On Thu, 2005-03-24 at 10:09 -0500, Thomas Briggs wrote:
> 
> >    I have a 1GB database containing a single table.  Simple queries
> > against this table (SELECT COUNT(*), etc.) run without 
> using more than a
> > few MBs of memory; the amount used seems to correspond 
> directly with the
> > size of the page cache, as I expected.  If I execute an 
> aggregate query
> > that contains a GROUP BY clause, however, the memory usage 
> seems to jump
> > quite a bit - the memory usage seems to correlate with the number of
> > columns in the GROUP BY.  Grouping by three columns using a couple
> > hundred megs of memory; grouping by eight columns uses more 
> than 1.3GB!.
> > 
> 
> The entire results set of an aggregate query is held in
> memory.  The only way to limit the memory usage of an
> aggregate query is to construct the query in such a way
> that the number of rows in the result set is limited.
> If you are using 1.3GB of memory to do an aggregate query,
> my guess is that there are a very large number of rows in
> the result set.
> 
> It is, in theory, possible to reduce the memory requirements
> for an aggregate query if the GROUP BY clause specifies columns
> that are indexed.  But SQLite does not currently implement
> that optimization.  You can implement the optimization manually,
> though, by turning the GROUP BY clause into an ORDER BY clause,
> dropping the aggregate functions, then doing the aggregation
> manually as the rows come out of the database in order.
> 
> -- 
> D. Richard Hipp <[EMAIL PROTECTED]>
> 
> 

Reply via email to