[EMAIL PROTECTED] wrote:


Since you are already splitting your data into "fulltext columns" and "other data" I would keep that design. Others on the list have greatly improved their performance by running their tables in this same fashion. If by some chance your "non-fulltext" columns are all fixed width, there is another speed boost just by changing your design.

Well, is there any possibility to make text column fixed width I don't know about ??? ;-)
I would prefer to have one, since 12Kb is the top limit for my application.



I think you are on the right track to an optimal storage solution. However, I would not duplicate all of your fields in your "fulltext" table, only the Primary key and the actual FT data needs to be there. The rest of the data is already on your non-FT table so no need to double up on your storage.


Of course, I could suggest more concrete examples if you actually post the table designs you have now (SHOW CREATE TABLE tablename\G) so that I could refer to your tables by their real names and columns.

I haven't decided yet the table structure, but here is what I'm playing with:


// title and description are stemmed title and description
// accordingly
DROP TABLE IF EXISTS search;
CREATE TABLE search (
   lot_id        int unsigned not null primary key,

   category_id   int unsigned not null,

   current_price decimal(16,2) unsigned not null,
   end_time      int unsigned not null,
   bid_count     int unsigned not null,

   title         char(50)     not null,
   description   text         not null,

   FULLTEXT ttitle (title),
   FULLTEXT ttitle_description (title, description)
) TYPE = MyISAM COMMENT = "Search table";

// ntitle and ndescription are stemmed title and description
// accordingly
DROP TABLE IF EXISTS lot;
CREATE TABLE lot (
   id                   int unsigned           not null primary key,
   owner_id             int unsigned           not null default 0,

   title                char(50)               not null,
   description          text                   not null,

   ntitle                char(50)               not null,
   ndescription          text                   not null,

   current_price        decimal(16,2) unsigned not null,
   reserve_price        decimal(16,2) unsigned not null,
   buy_price            decimal(16,2) unsigned not null,

   quantity             int           unsigned not null,
   sold                 int           unsigned not null,

   bid_count            int           unsigned not null,

   start_time           int           unsigned not null,
   end_time             int           unsigned not null,

city char(35) not null,
type_id int unsigned not null,
category_id int unsigned not null,
currency_id int unsigned not null,
country_id int unsigned not null,
state_id int unsigned not null,
delivery_id int unsigned not null,


   enabled              bool         default 1 not null,

   INDEX title         (title),

   INDEX current_price (current_price),
   INDEX bid_count     (bid_count),

   INDEX end_time      (end_time),

   INDEX enabled       (enabled),

   FULLTEXT ttitle (ntitle),
   FULLTEXT ttitle_description (ntitle, ndescription)

) TYPE = MyISAM COMMENT = "Lots";


// this is a helper table // to query all subcategories of a category // in one query DROP TABLE IF EXISTS category_map; CREATE TABLE category_map ( parent_id int unsigned not null, child_id int unsigned not null, INDEX parent_id (parent_id), INDEX child_id (child_id) ) TYPE = MyISAM COMMENT = "Categories Map";



Here are the queries I'm playing with (search is performed against 50 000 rows).
The table is filled in with random data generated via use of wordlists.
The length of description field is not more of 12Kb but really close


1.
SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title
FROM search
INNER JOIN lot
ON lot.id = search.lot_id
AND lot.enabled = 1
WHERE
MATCH(search.title, search.description) AGAINST ('someword' IN BOOLEAN MODE)
ORDER BY search.end_time
LIMIT 0, 50
...
50 rows in set (4.22 sec)
mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title FROM search INNER JOIN lot ON lot.id = search.lot_id AND lot.enabled = 1 WHERE MATCH(search.title, search.description) AGAINST ('someword' IN BOOLEAN MODE) ORDER BY search.end_time LIMIT 0, 50;
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+-----------------------------+
| 1 | SIMPLE | search | fulltext | PRIMARY,lot_id,ttitle_description | ttitle_description | 0 | | 1 | Using where; Using filesort |
| 1 | SIMPLE | lot | eq_ref | PRIMARY,enabled | PRIMARY | 4 | tauction.search.lot_id | 1 | Using where |
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+-----------------------------+
2 rows in set (0.00 sec)



4.22 sec is unacceptable for 86 rows result set


2. I've tried the same as (1) query, but ORDER BY lot.end_time
mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title FROM search INNER JOIN lot ON lot.id = search.lot_id AND lot.enabled = 1 WHERE MATCH(search.title, search.description) AGAINST ('someword' IN BOOLEAN MODE) ORDER BY lot.end_time LIMIT 0, 50;
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+----------------------------------------------+
| 1 | SIMPLE | search | fulltext | PRIMARY,lot_id,ttitle_description | ttitle_description | 0 | | 1 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | lot | eq_ref | PRIMARY,enabled | PRIMARY | 4 | tauction.search.lot_id | 1 | Using where |
+----+-------------+--------+----------+-----------------------------------+--------------------+---------+------------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)


Which adds timeporary table, and takes
4.21 sec for 76 rows result set


3.

SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title
FROM search
INNER JOIN category_map
ON category_map.parent_id = 990
AND search.category_id = category_map.child_id
INNER JOIN lot
ON lot.id = search.lot_id
AND lot.enabled = 1
WHERE
MATCH(search.title, search.description) AGAINST ('someword' IN BOOLEAN MODE)
ORDER BY search.end_time
LIMIT 0, 50;
...
3 rows in set (0.67 sec)


mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title FROM search INNER JOIN category_map ON category_map.parent_id = 990 AND search.category_id = category_map.child_id INNER JOIN lot ON lot.id = search.lot_id AND lot.enabled = 1 WHERE MATCH(search.title, search.description) AGAINST ('someword' IN BOOLEAN MODE) ORDER BY search.end_time LIMIT 0, 50;
+----+-------------+--------------+----------+-----------------------------------+--------------------+---------+-----------------------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+----------+-----------------------------------+--------------------+---------+-----------------------------+------+-----------------------------+
| 1 | SIMPLE | search | fulltext | PRIMARY,lot_id,ttitle_description | ttitle_description | 0 | | 1 | Using where; Using filesort |
| 1 | SIMPLE | category_map | ref | parent_id,child_id | child_id | 4 | tauction.search.category_id | 5 | Using where |
| 1 | SIMPLE | lot | eq_ref | PRIMARY,enabled | PRIMARY | 4 | tauction.search.lot_id | 1 | Using where |
+----+-------------+--------------+----------+-----------------------------------+--------------------+---------+-----------------------------+------+-----------------------------+
3 rows in set (0.00 sec)


0.67 sec is unacceptable for 3 rows result set


So, as far as I see, the query (1) is the right one (the fastest one), but I can't get rid of end_time in the
search table since I will get 'temporary table' overhead if to use lot.end_time. I also have recieved in
my previous experiments the idea, that having a separate table for each sorting criteria looks like the
fastest way of performing searches,
eg for end_time:


DROP TABLE IF EXISTS search_end_time;
CREATE TABLE search_end_time (
   lot_id        int unsigned not null primary key,

   category_id   int unsigned not null,
   end_time      int unsigned not null,
   title         char(50)     not null,
   description   text         not null,

   FULLTEXT ttitle (title),
   FULLTEXT ttitle_description (title, description)
) TYPE = MyISAM COMMENT = "Search helper";

( Though I can't give you results right now, since I need to fill
it in with the data, that takes a bit of time)

I'm going to drop ntitle, ndescription columns and related indexes on
lot table and check again, hope that will improve.

But hey, search above a second isn't a good option
Any ways to improve ?

Sincerely,
Aleksandr

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to