Perrin, Lionel wrote: > I plan to use sqlite to implement an 'aggregation tool'. Basically, > the design would be the following: > > 1. I implement a virtual table CFL(contract, amount) which may provide > up to 1 billion unsorted rows. > > 2. The aggregation phasis will be defined at run time and may consists > in something as simple as 'select contract, sum(amount) from cfl > group by contract'. > > 3. Since there will be only ~1000 different contracts, I expect SQLite > to aggregate 'on the fly' the rows from CFL. > > 4. Unfortunately, as shown by the explain plan, it looks that SQLite > first select all rows from CFL. When all rows are retrieved (which > implies something like a 8Go temporary file), the rows are sorted > and aggregated. > > [0, 0, 0, "SCAN TABLE CFL VIRTUAL TABLE INDEX 0: (~0 rows)"] > [0, 0, 0, "USE TEMP B-TREE FOR GROUP BY"] > > Is there a way to force rows to be dispatched (using the group by) and > aggregated on the fly instead of being stored, sorted and then > aggregated?
SQLite can optimize GROUP BY this way only if it can prove that the table is already sorted on the grouping column. Otherwise, it creates a temporary index for all records, and then groups on those. In theory, it would be possible to create a temporary index that stores the current aggregation _result_ for each group seen so far, and that is updated dynamically. However, this is not implemented in SQLite. If you have only 1000 groups, you can just do a "SELECT x,y FROM cfl" and do the aggregation in your code, in memory. > The information contained in this e-mail message, and any attachment > thereto, is confidential and may not be disclosed [...] *** DISCLAIMER *** This e-mail contains public information intended for any subscriber of this mailing list and for anybody else who bothers to read it; it will be copied, disclosed and distributed to the public. If you think you are not the intended recipient, please commit suicide immediately. These terms apply also to any e-mails quoted in, referenced from, or answering this e-mail, and supersede any disclaimers in those e-mails. Additionally, disclaimers in those e-mails will cause a legal processing fee of $42 per line; you have agreed to this by reading this disclaimer. *** END OF DISCLAIMER *** Regards, Clemens _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users