Hi All, this was already posted on mysql forum "preformance",
but forums are really slow, so sorry for the crosspost :)


My objective is to implement quick, really quick complex fulltext search with 'order by' (< 2 seconds).
The actual table I'd like to search is `lot`. I've created 2 helper tables : 1. `search`, which contains normalized (or stemmed) title and description with fulltext search index 2. `category_map` which covers all relations of categories, and is especially useful when I need to perform search on all children categories of current category.


The total count of records in lot, and correspondingly in `search`, tables is expected to be 50 000, each around 2048 (stemmed description) + 50 (stemmed title) bytes.

I need 1) fulltextsearch within all records, 2) category listing, 3) fulltextsearch within a category.
Also, the result set should be *always* ordered either by lot.end_time, lot.current_price * exchange_rate, lot.title or lot.bid_count. The fulltextsearch relevance is not important, I'm ready to ignore it in favor of performance.


Some time ago I've tried to put fulltext right inside the `lot` table, but it seemed to produce much longer lasting queries than the current, in plus in this case i was unable to LOAD INDEX INTO CACHE because of different key length of FULLTEXT and other table's indexes. That's why I've separated fields which I'd like to index fulltext into `search` table. But now I'm stuck with performance degradation of ORDER BY. I've tried to play with the related sort_buffer_size, myisam_sort_buffer_size and other params described in the docs related ORDER BY and filesort, but with no significant performance influence. I do not mind to create as many helper tables as needed, but I cant figure out what else can I improve. Another disappointing problem is that it is rather slow to generate 50 000 records for different table structures (around 2 hours) just to play with them for 10-20 mins.

Here is what I came up with at the moment:

SELECT lot.id, search.title FROM search LEFT JOIN lot ON lot.id = search.lot_id, category_map, exchange_rate WHERE MATCH(search.title, search.description) AGAINST ('some query' IN BOOLEAN MODE) AND category_map.child_id = lot.category_id AND category_map.parent_id = 10 AND exchange_rate.currency_id = lot.currency_id ORDER BY lot.current_price * exchange_rate.exchange_rate LIMIT 0, 50

Which produces almost acceptable results (which I'd like to improve further) if I omit ORDER BY statement. I do understand that the slowdown occurs mostly during filesort, and partially due to temporary table.

The question is
1) Which is a desirable tables structure which will allow to perform fulltextsearch + ordering
2) Which is preferable LEFT JOIN .. ON, or WHERE ? I ahaven't noticed any significant difference trying both of them


EXPLAIN:

+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+


| 1 | SIMPLE | search | fulltext | title_description | title_description | 0 | | 1 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | lot | eq_ref | PRIMARY,category_id | PRIMARY | 4 | tauction.search.lot_id | 1 | Using where |
| 1 | SIMPLE | exchange_rate | ref | currency_id | currency_id | 4 | tauction.lot.currency_id | 2 | Using index |
| 1 | SIMPLE | category_map | ref | parent_id,child_id | child_id | 4 | tauction.lot.category_id | 12 | Using where |
+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+


4 rows in set (0.00 sec)


Table structure as follows.

DROP TABLE IF EXISTS lot;
CREATE TABLE lot (
id int unsigned not null primary key auto_increment,
owner_id int unsigned not null default 0,

title char(50) not null,
description text not null,

current_price decimal(16,2) unsigned not null,
quantity int unsigned not null,
bid_count int unsigned not null,

start_time int unsigned not null,
end_time int unsigned not null,

city char(35) not null,

category_id int unsigned not null,
currency_id int unsigned not null,
country_id int unsigned not null,
state_id int unsigned not null,
delivery_id int unsigned not null,

enabled bool default 1 not null,

INDEX title (title),

INDEX current_price (current_price),
INDEX bid_count (bid_count),

INDEX end_time (end_time),

INDEX category_id (category_id)

) TYPE = MyISAM COMMENT = "Lots";

DROP TABLE IF EXISTS search;
CREATE TABLE search (
lot_id int unsigned not null,

title char(50) not null,
description text not null,

FULLTEXT title (title),
FULLTEXT title_description (title, description)
) TYPE = MyISAM COMMENT = "Search helper";

DROP TABLE IF EXISTS category_map;
CREATE TABLE category_map (
parent_id int unsigned not null,
child_id int unsigned not null,
INDEX parent_id (parent_id),
INDEX child_id (child_id)
) TYPE = MyISAM COMMENT = "Categories Map";



Sincerely,
Aleks


-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to