On Fri, May 25, 2012 at 1:38 PM, Udi Karni <uka...@gmail.com> wrote: > I am running the following query - > > CREATE TABLE XYZ AS > SELECT ID, MIN (DATE) > FROM SOURCE-TABLE > WHERE CRITERIA > GROUP BY ID ; > > SOURCE-TABLE has 600 million rows, 2 million meet the WHERE criteria, and > get grouped to 100,000 distinct IDs. There are no indexes as there is too > much variety in the WHERE clauses and index creation on this table takes > many hours. Temp_store=memory. When executed - the source table is read > very slowly, and runtime is 1/2 hour. > > I split this SQL into 2 steps - the first just extracting - the second just > grouping. The extract read the table extremely fast and finished in 3 > minutes. The GROUP BY ran in a few seconds. So a 10X improvement - which is > pretty incredible speed.
Meaning? Something like this: CREATE TABLE FOO AS SELECT ID, DATE FROM SOURCE_TABLE; CREATE TABLE BAR AS SELECT ID, MIN(DATE) FROM FOO GROUP BY ID; ? Are you sure that there are no indexes? Maybe there are indexes implied by UNIQUE constraints and maybe SQLite3 is incorrectly deciding to scan the index instead of the table? My guess is that you have a UNIQUE constraint on ID in SOURCE_TABLE and SQLite3 is scanning the index and then fetching the DATE from the table. But this makes no sense because then it makes no sense to select the min() of DATE as there would be a single row per-ID. My guess has to be wrong. Maybe there's a UNIQUE constraint on several columns including ID? That would not be incompatible with the semantics of your query. Can you post EXPLAIN QUERY PLAN output for your statements? And/or your schema. Nico -- _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users