Why do you need the 7 single-column indexes?  Do you ever do a
lookup on a single column?  Bear in mind that only 1 index is used per
query, so having seven separate indexes on seven separate columns
means that six are always unused.

   I'm curious why the UNION is faster than the OR'ed version... can
you post the EXPLAINs of the two?  (Feel free to send them to me
personally... I dunno if the rest of the list cares :)

   -T

On Mon, Feb 23, 2009 at 1:43 AM, Kim Boulton <k...@jesk.co.uk> wrote:
>
> Sorry, I forgot to mention that I was running the query twice, first to
> cache it, second to measure the speed.
>
> Yeah it's a horrible query but produces the fastest results in Mysql.
> maybe using OR instead works better on Sqlite, working on that one.
>
> Problem I have with indexes in sqlite is that the table balloons 400MB
> in size with every index i add. At the moment it's 4GB with one on each
> column, if i add more indexes across other columns i'm thinking it'll
> get too big to cache up.
>
> thanks
>
> Thomas Briggs wrote:
>>    For starters, I think that loading the index into the cache in
>> MySQL is biasing your performance measures.  SQLite will automatically
>> load pages of any necessary indexes into memory as part of executing
>> the query, but doing so takes time.  By preloading the index on MySQL,
>> you're removing that time from the amount of time MySQL spends
>> executing that query, so you're not really comparing apples to apples.
>>  Said another way: how long does the query take in MySQL if you don't
>> pre-load the index?
>>
>>    Secondly, the page size needs to be set before the database is
>> created, i.e. before you execute the CREATE TABLE statement.
>>
>>    Thirdly, that's a pretty horrible query. :)  Collapsing the
>> different queries into fewer queries - even if it isn't a single query
>> - will speed things up.  Each SELECT is a separate query that requires
>> that the table data be read; reducing the number of SELECTs will thus
>> reduce query time.
>>
>>    Lastly, your indexes need to reflect how the data is accessed.  For
>> example, if you're going to do "WHERE c5 = x AND c6 = y AND c7 = 7"
>> then you need one index on (c5, c6, c7) - not three individual indexes
>> on c5, c6, and c7.
>>
>>    Performance tuning rule #1 - problems are almost always in your use
>> of the database, not the database itself. :)
>>
>>    -Tom
>>
>> On Sun, Feb 22, 2009 at 3:48 AM, Kim Boulton <k...@jesk.co.uk> wrote:
>>
>>> Thanks for the hints so far.
>>>
>>> Here is my existing Mysql setup and what I've tried with Sqlite....
>>>
>>> *Mysql Table structure:*
>>> CREATE TABLE  `mydb`.`mytable` (
>>>  `c1`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NOT NULL,
>>>  `c2`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NOT NULL,
>>>  `c3`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NOT NULL,
>>>  `c4`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NOT NULL,
>>>  `c5`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NOT NULL,
>>>  `c6`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NULL,
>>>  `c7`
>>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>>> NULL,
>>>  `price` smallint(4) NOT NULL,  KEY `c1` (`c1`),  KEY `c2` (`c2`),  KEY
>>> `c3` (`c3`),  KEY `c4` (`c4`),  KEY `c5` (`c5`),  KEY `c6` (`c6`),  KEY
>>> `c7` (`c7`),  KEY `price` (`price`) ) ENGINE=MyISAM DEFAULT
>>> CHARSET=latin1 PACK_KEYS=1 ROW_FORMAT=COMPRESSED;
>>>
>>> *CSV Import Data e.g.:*
>>> 1,A,P,0,0,X,X,300
>>> 1,A,P,0,0,X,P,9999
>>> A,A,P,0,0,,,2000
>>> B,3,Y,0,1,X,,300
>>> ........ approx 30 million row, 500MB csv, text file
>>>
>>> If I compress the table it is approximately 900MB in size with an index
>>> of approximately 550MB, which i can load into memory via LOAD INDEX INTO
>>> CACHE
>>>
>>> *Mysql Query example:*
>>> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
>>> ('S','Z')) UNION ALL
>>> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
>>> 'S' AND c6 IS NULL) UNION ALL
>>> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
>>> 'S' AND c7 IS NULL) UNION ALL
>>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
>>> IN ('2','5') ) UNION ALL
>>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
>>> IS NULL ) UNION ALL
>>> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
>>> ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
>>> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>>> *
>>> The above query takes 37 seconds and produces approx 200,000 results and
>>> uses 550MB RAM*
>>>
>>> *So, in Sqlite3 I created a similar table structure like this:*
>>> CREATE TABLE [mytable] ([c1] TEXT  NOT NULL,[c2] TEXT  NOT NULL,[c3]
>>> TEXT  NOT NULL,[c4] TEXT  NOT NULL,[c5] TEXT  NOT NULL,[c6] TEXT
>>> NULL,[c7] TEXT  NULL,[price] INTEGER  NOT NULL)
>>> CREATE INDEX [c1] ON [mytable]([c1]  ASC)
>>> CREATE INDEX [c2] ON [mytable]([c2]  ASC)
>>> CREATE INDEX [c3] ON [mytable]([c3]  ASC)
>>> CREATE INDEX [c4] ON [mytable]([c4]  ASC)
>>> CREATE INDEX [c5] ON [mytable]([c5]  ASC)
>>> CREATE INDEX [c6] ON [mytable]([c6]  ASC)
>>> CREATE INDEX [c7] ON [mytable]([c7]  ASC)
>>> CREATE INDEX [price] ON [mytable]([price]  ASC)
>>>
>>> *Then I imported the same csv data using:*
>>> .separator ","
>>> .import mycsv.csv mytable
>>>
>>> *Then fixed the NULL values in the last two columns with:*
>>> UPDATE mytable SET c6 = NULL where c6 = '';
>>> UPDATE mytable SET c7 = NULL where c7 = '';
>>>
>>> Then Vacuumed - took 6 hours!
>>>
>>> This leaves me with a 4GB Sqlite table
>>>
>>> *Then queried the Sqlite3 table with:*
>>> PRAGMA cache_size = 20000000; */uses up 1.5GB RAM regardless*/
>>> PRAGMA page_size = 20000000; /*this doesn't make any difference*/
>>> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
>>> ('S','Z')) UNION ALL
>>> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
>>> 'S' AND c6 IS NULL) UNION ALL
>>> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
>>> 'S' AND c7 IS NULL) UNION ALL
>>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
>>> IN ('2','5') ) UNION ALL
>>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
>>> IS NULL ) UNION ALL
>>> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
>>> ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
>>> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>>>
>>> *The above query takes 57 seconds, 20 seconds slower than Mysql but
>>> produces the same results - but uses up 1.5GB RAM!*
>>>
>>> Also if I try some more complex queries it always runs out steam at 2GB
>>> RAM and crashes. Maybe that is a limitation of my PC or XP though.
>>>
>>> I have also tried...
>>> 1. Sqlite table with no indexes - very slow!
>>> 2. Sqlite table with one unique index using all columns - still slower
>>> than seperate indexes.
>>>
>>> So I've had a fair crack of the whip with Sqlite. I was hoping it would
>>> be faster and use less memory, no luck though.
>>>
>>> Unless anyone has some good ideas I might have to give up on Sqlite.
>>>
>>> Cheers.
>>>
>>> Kim
>>>
>>> Kim Boulton wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm trying out Sqlite3 with an eye to improving the performance of
>>>> queries on an existing MySQL database.
>>>>
>>>> I've imported the data into sqlite which is approx. 30 million rows of
>>>> part numbers each with a price.
>>>>
>>>> So far, it's approx. four times slower than the MySQL version, and the
>>>> size of the sqlite database is too big to fit in memory (several GB)
>>>> whereas I can get the MySQL data down to 900MB if it's compressed and
>>>> read only.
>>>>
>>>> I would appreciate some tips or pointers on getting sqlite3 performance
>>>> up and the data size down. I googled but couldn't find much.
>>>>
>>>> I don't need concurrency or inserts, it's single user, read only.
>>>>
>>>> TIA
>>>>
>>>> kimb
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>>
>>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>>
>>
>>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to