On 2016/05/27 4:17 AM, Michael Hari wrote:
I have a database that contains 2014,2015 and 2016’s worth of data broken down 
by year and quarter. In total, this db was 2.36 GB. Because of a replication 
requirement where the DB has to be under 2 GB, I’ve split the DB into 3 smaller 
databases by year (2014.db,2015.db,2016.db).

I would need to access at most two databases in one query for a year over year 
calculation over 6 quarters. Do I have the same performance as having one table 
with all the data if I write my sql this way?

Attach 2014.db as 2014
Attach 2015.db as 2015

(..Calculation..) FROM (SELECT * FROM 2014.datatable  UNION ALL SELECT * FROM 
2015.datatable)

You can assume the separate databases have the same indexing as the single db, 
where it’s indexed by year and quarter.

This will work definitely, but I am weary of the performance of selecting two whole tables into a sub-query. If your calculation will filter on dates (for instance) you could probably do better by doing the filtering in the sub-Query. So basically what I am suggesting is that instead of doing:

(..Calculation..)
  FROM (SELECT * FROM 2014.datatable  UNION ALL SELECT * FROM 2015.datatable)
 WHERE dtCol BETWEEN '2014-10-01' AND '2015-03-31'

You could rather do (with I think quicker results):


(..Calculation..)
  FROM (
      SELECT * FROM 2014.datatable WHERE dtCol BETWEEN '2014-10-01' AND 
'2015-03-31'
    UNION ALL
      SELECT * FROM 2015.datatable WHERE dtCol BETWEEN '2014-10-01' AND 
'2015-03-31'
  )

Note that SQLite features several optimisations that might make the above moot depending on your schema specifics; best is to experiment a bit.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to