Thanks for the hints so far.
Here is my existing Mysql setup and what I've tried with Sqlite....
*Mysql Table structure:*
CREATE TABLE `mydb`.`mytable` (
`c1`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NOT NULL,
`c2`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NOT NULL,
`c3`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NOT NULL,
`c4`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NOT NULL,
`c5`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NOT NULL,
`c6`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NULL,
`c7`
enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
NULL,
`price` smallint(4) NOT NULL, KEY `c1` (`c1`), KEY `c2` (`c2`), KEY
`c3` (`c3`), KEY `c4` (`c4`), KEY `c5` (`c5`), KEY `c6` (`c6`), KEY
`c7` (`c7`), KEY `price` (`price`) ) ENGINE=MyISAM DEFAULT
CHARSET=latin1 PACK_KEYS=1 ROW_FORMAT=COMPRESSED;
*CSV Import Data e.g.:*
1,A,P,0,0,X,X,300
1,A,P,0,0,X,P,9999
A,A,P,0,0,,,2000
B,3,Y,0,1,X,,300
........ approx 30 million row, 500MB csv, text file
If I compress the table it is approximately 900MB in size with an index
of approximately 550MB, which i can load into memory via LOAD INDEX INTO
CACHE
*Mysql Query example:*
SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
('S','Z')) UNION ALL
SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
'S' AND c6 IS NULL) UNION ALL
SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
'S' AND c7 IS NULL) UNION ALL
SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
IN ('2','5') ) UNION ALL
SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
IS NULL ) UNION ALL
SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
('C','E','F') AND c5 = 'S' AND c7 IS NULL)
ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
*
The above query takes 37 seconds and produces approx 200,000 results and
uses 550MB RAM*
*So, in Sqlite3 I created a similar table structure like this:*
CREATE TABLE [mytable] ([c1] TEXT NOT NULL,[c2] TEXT NOT NULL,[c3]
TEXT NOT NULL,[c4] TEXT NOT NULL,[c5] TEXT NOT NULL,[c6] TEXT
NULL,[c7] TEXT NULL,[price] INTEGER NOT NULL)
CREATE INDEX [c1] ON [mytable]([c1] ASC)
CREATE INDEX [c2] ON [mytable]([c2] ASC)
CREATE INDEX [c3] ON [mytable]([c3] ASC)
CREATE INDEX [c4] ON [mytable]([c4] ASC)
CREATE INDEX [c5] ON [mytable]([c5] ASC)
CREATE INDEX [c6] ON [mytable]([c6] ASC)
CREATE INDEX [c7] ON [mytable]([c7] ASC)
CREATE INDEX [price] ON [mytable]([price] ASC)
*Then I imported the same csv data using:*
.separator ","
.import mycsv.csv mytable
*Then fixed the NULL values in the last two columns with:*
UPDATE mytable SET c6 = NULL where c6 = '';
UPDATE mytable SET c7 = NULL where c7 = '';
Then Vacuumed - took 6 hours!
This leaves me with a 4GB Sqlite table
*Then queried the Sqlite3 table with:*
PRAGMA cache_size = 20000000; */uses up 1.5GB RAM regardless*/
PRAGMA page_size = 20000000; /*this doesn't make any difference*/
SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
('S','Z')) UNION ALL
SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
'S' AND c6 IS NULL) UNION ALL
SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
'S' AND c7 IS NULL) UNION ALL
SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
IN ('2','5') ) UNION ALL
SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
IS NULL ) UNION ALL
SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
('C','E','F') AND c5 = 'S' AND c7 IS NULL)
ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
*The above query takes 57 seconds, 20 seconds slower than Mysql but
produces the same results - but uses up 1.5GB RAM!*
Also if I try some more complex queries it always runs out steam at 2GB
RAM and crashes. Maybe that is a limitation of my PC or XP though.
I have also tried...
1. Sqlite table with no indexes - very slow!
2. Sqlite table with one unique index using all columns - still slower
than seperate indexes.
So I've had a fair crack of the whip with Sqlite. I was hoping it would
be faster and use less memory, no luck though.
Unless anyone has some good ideas I might have to give up on Sqlite.
Cheers.
Kim
Kim Boulton wrote:
> Hello,
>
> I'm trying out Sqlite3 with an eye to improving the performance of
> queries on an existing MySQL database.
>
> I've imported the data into sqlite which is approx. 30 million rows of
> part numbers each with a price.
>
> So far, it's approx. four times slower than the MySQL version, and the
> size of the sqlite database is too big to fit in memory (several GB)
> whereas I can get the MySQL data down to 900MB if it's compressed and
> read only.
>
> I would appreciate some tips or pointers on getting sqlite3 performance
> up and the data size down. I googled but couldn't find much.
>
> I don't need concurrency or inserts, it's single user, read only.
>
> TIA
>
> kimb
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users