Perrin, Lionel wrote:
> I plan to use sqlite to implement an 'aggregation tool'. Basically,
> the design would be the following:
>
> 1. I implement a virtual table CFL(contract, amount) which may provide
>    up to 1 billion unsorted rows.
>
> 2. The aggregation phasis will be defined at run time and may consists
>    in something as simple as 'select contract, sum(amount) from cfl
>    group by contract'.
>
> 3. Since there will be only ~1000 different contracts, I expect SQLite
>    to aggregate 'on the fly' the rows from CFL.
>
> 4. Unfortunately, as shown by the explain plan, it looks that SQLite
>    first select all rows from CFL. When all rows are retrieved (which
>    implies something like a 8Go temporary file), the rows are sorted
>    and aggregated.
>
>    [0, 0, 0, "SCAN TABLE CFL VIRTUAL TABLE INDEX 0: (~0 rows)"]
>    [0, 0, 0, "USE TEMP B-TREE FOR GROUP BY"]
>
> Is there a way to force rows to be dispatched (using the group by) and
> aggregated on the fly instead of being stored, sorted and then
> aggregated?

SQLite can optimize GROUP BY this way only if it can prove that the
table is already sorted on the grouping column.  Otherwise, it creates
a temporary index for all records, and then groups on those.

In theory, it would be possible to create a temporary index that stores
the current aggregation _result_ for each group seen so far, and that is
updated dynamically.  However, this is not implemented in SQLite.


If you have only 1000 groups, you can just do a "SELECT x,y FROM cfl"
and do the aggregation in your code, in memory.


> The information contained in this e-mail message, and any attachment
> thereto, is confidential and may not be disclosed [...]

*** DISCLAIMER ***
This e-mail contains public information intended for any subscriber of
this mailing list and for anybody else who bothers to read it; it will
be copied, disclosed and distributed to the public.  If you think you
are not the intended recipient, please commit suicide immediately.
These terms apply also to any e-mails quoted in, referenced from, or
answering this e-mail, and supersede any disclaimers in those e-mails.
Additionally, disclaimers in those e-mails will cause a legal
processing fee of $42 per line; you have agreed to this by reading
this disclaimer.
*** END OF DISCLAIMER ***


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to