Re: Fulltext + order by strategy

2004-12-22 Thread Aleksandr Guidrevitch
[EMAIL PROTECTED] wrote:
Since you are already splitting your data into "fulltext columns" and 
"other data" I would keep that design. Others on the list have greatly 
improved their performance by running their tables in this same 
fashion. If by some chance your "non-fulltext" columns are all fixed 
width, there is another speed boost just by changing your design.
Well, is there any possibility to make text column fixed width I don't 
know about ??? ;-)
I would prefer to have one, since 12Kb is the top limit for my application.

I think you are on the right track to an optimal storage solution. 
However, I would not duplicate all of your fields in your "fulltext" 
table, only the Primary key and the actual FT data needs to be there. 
The rest of the data is already on your non-FT table so no need to 
double up on your storage.

Of course, I could suggest more concrete examples if you actually post 
the table designs you have now (SHOW CREATE TABLE tablename\G) so that 
I could refer to your tables by their real names and columns. 
I haven't decided yet the table structure, but here is what I'm playing 
with:

// title and description are stemmed title and description
// accordingly
DROP TABLE IF EXISTS search;
CREATE TABLE search (
   lot_idint unsigned not null primary key,
   category_id   int unsigned not null,
   current_price decimal(16,2) unsigned not null,
   end_time  int unsigned not null,
   bid_count int unsigned not null,
   title char(50) not null,
   description   text not null,
   FULLTEXT ttitle (title),
   FULLTEXT ttitle_description (title, description)
) TYPE = MyISAM COMMENT = "Search table";
// ntitle and ndescription are stemmed title and description
// accordingly
DROP TABLE IF EXISTS lot;
CREATE TABLE lot (
   id   int unsigned   not null primary key,
   owner_id int unsigned   not null default 0,
   titlechar(50)   not null,
   description  text   not null,
   ntitlechar(50)   not null,
   ndescription  text   not null,
   current_pricedecimal(16,2) unsigned not null,
   reserve_pricedecimal(16,2) unsigned not null,
   buy_pricedecimal(16,2) unsigned not null,
   quantity int   unsigned not null,
   sold int   unsigned not null,
   bid_countint   unsigned not null,
   start_time   int   unsigned not null,
   end_time int   unsigned not null,
   city char(35)   not null,
 
   type_id  int   unsigned not null,
   category_id  int   unsigned not null,
   currency_id  int   unsigned not null,
   country_id   int   unsigned not null,
   state_id int   unsigned not null,
   delivery_id  int   unsigned not null,

   enabled  bool default 1 not null,
   INDEX title (title),
   INDEX current_price (current_price),
   INDEX bid_count (bid_count),
   INDEX end_time  (end_time),
   INDEX enabled   (enabled),
   FULLTEXT ttitle (ntitle),
   FULLTEXT ttitle_description (ntitle, ndescription)
) TYPE = MyISAM COMMENT = "Lots";
// this is a helper table
// to query all subcategories of a category
// in one query
DROP TABLE IF EXISTS category_map;
CREATE TABLE category_map (
   parent_id  int unsigned not null,
   child_id   int unsigned not null,
   INDEX parent_id (parent_id),
   INDEX child_id  (child_id)
) TYPE = MyISAM COMMENT = "Categories Map";

Here are the queries I'm playing with (search is performed against 50 
000 rows).
The table is filled in with random data generated via use of wordlists.
The length of description field is not more of 12Kb but really close

1.
SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title
FROM search
   INNER JOIN lot
   ON lot.id = search.lot_id
   AND lot.enabled = 1
WHERE
   MATCH(search.title, search.description) AGAINST ('someword' IN 
BOOLEAN MODE)
ORDER BY search.end_time
LIMIT 0, 50
...
50 rows in set (4.22 sec)
mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS lot.id, lot.title FROM 
search INNER JOIN lot ON lot.id = search.lot_id 
AND lot.enabled = 1 WHERE MATCH(search.title, search.description) 
AGAINST ('someword' IN BOOLEAN MODE) ORDER BY search.end_time LIMIT 0, 50;
++-++--+---++-++--+-+
| id | select_type | table  | type | 
possible_keys | key| key_len | 
ref| rows | Extra   |
++-++--+---++-++

Re: Fulltext + order by strategy

2004-12-22 Thread SGreen
Since you are already splitting your data into "fulltext columns" and 
"other data" I would keep that design. Others on the list have greatly 
improved their performance by running their tables in this same fashion. 
If by some chance your "non-fulltext" columns are all fixed width, there 
is another speed boost just by changing your design.

I think you are on the right track to an optimal storage solution. 
However, I would not duplicate all of your fields in your "fulltext" 
table, only the Primary key and the actual FT data needs to be there. The 
rest of the data is already on your non-FT table so no need to double up 
on your storage.

Of course, I could suggest more concrete examples if you actually post the 
table designs you have now (SHOW CREATE TABLE tablename\G) so that I could 
refer to your tables by their real names and columns.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Aleksandr Guidrevitch <[EMAIL PROTECTED]> wrote on 12/22/2004 03:47:39 PM:

> Hi there,
> 
> Is there any common strategy (of using indexes)
> to avoid filesorts while sorting searches by a field ?
> 
> Eg. I want fulltext search, then sort results by
> some other table field.
> 
> For example .:
> create table article (
>   id int not null primary key auto_increment,
> 
>   sort1 int not null,
>   sort2 int not null,
> 
>   description text not null,
> 
>   FULLTEXT description (description)
> )
> 
> explain select id from article where match (description) against 
> (''in boolean mode) order by sort1;
> mysql> mysql> explain select id from article where match (description) 
> against ('' in boolean mode) order by sort1;
> ++-+-+--+---
> +-+-+--+--+-+
> | id | select_type | table   | type | possible_keys | key | 
> key_len | ref  | rows | Extra   |
> ++-+-+--+---
> +-+-+--+--+-+
> |  1 | SIMPLE  | article | fulltext | description   | description 
> |   0 |  |1 | Using where; Using filesort |
> ++-+-+--+---
> +-+-+--+--+-+
> 
> 
> It doesn't matter whether I have index on sort1, filesort is always 
used.
> Furthermore, I've noted that adding an index slows the query down
> with each index added (suppose becuase of table index becomes bigger).
> 
> I come to a solution where I create a separate table (exact copy) with 
only
> fulltext indexes, then query it with order by LEFT JOIN with original 
table,
> eg:
> SELECT article.id, article.description FROM search
> LEFT join article
>ON search.id = article.id
> WHERE
> MATCH (description) AGAINST ('some' IN BOOLEAN MODE)
> ORDER BY search.sort1
> LIMIT 0, 50;
> 
> Also, I intensively use table article for listing (w/o search)
> items sorted by sort1 or sort2 fields, so in article table
> indexes for sort1 and sort2 are absolutely necessary.
> 
> So currently I see only 2 solutions
> 1. add to the article table ndescription
> field (with stemmed description), and fulltext index on it
> 
> 2. create a separate table for this purpose.
> 
> I hope there is a 3rd variant, a good idea on how to
> avoid filesort
> 
> 
> Sincerely,
> Aleksandr
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>