On 2016/05/27 4:17 AM, Michael Hari wrote:
I have a database that contains 2014,2015 and 2016’s worth of data broken down
by year and quarter. In total, this db was 2.36 GB. Because of a replication
requirement where the DB has to be under 2 GB, I’ve split the DB into 3 smaller
databases by year (2014.db,2015.db,2016.db).
I would need to access at most two databases in one query for a year over year
calculation over 6 quarters. Do I have the same performance as having one table
with all the data if I write my sql this way?
Attach 2014.db as 2014
Attach 2015.db as 2015
(..Calculation..) FROM (SELECT * FROM 2014.datatable UNION ALL SELECT * FROM
2015.datatable)
You can assume the separate databases have the same indexing as the single db,
where it’s indexed by year and quarter.
This will work definitely, but I am weary of the performance of
selecting two whole tables into a sub-query. If your calculation will
filter on dates (for instance) you could probably do better by doing the
filtering in the sub-Query. So basically what I am suggesting is that
instead of doing:
(..Calculation..)
FROM (SELECT * FROM 2014.datatable UNION ALL SELECT * FROM 2015.datatable)
WHERE dtCol BETWEEN '2014-10-01' AND '2015-03-31'
You could rather do (with I think quicker results):
(..Calculation..)
FROM (
SELECT * FROM 2014.datatable WHERE dtCol BETWEEN '2014-10-01' AND
'2015-03-31'
UNION ALL
SELECT * FROM 2015.datatable WHERE dtCol BETWEEN '2014-10-01' AND
'2015-03-31'
)
Note that SQLite features several optimisations that might make the
above moot depending on your schema specifics; best is to experiment a bit.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users