Re: Maximize Performance by Improving Search Program Logic - Help?

2002-10-10 Thread Eric Persson

David Jourard wrote:
 It consists of 2 tables: word and content.
 
 content consists of 2 fields:
 
rec_id int unsigned not null,
description varchar(200) not null
 
 
 word consists of 2 fields:
  word varchar(50) not null,
  rec_id int unsigned not null
 
 word is created from content.  Each actual word from the field
 content.description is created as a record in the table word with its 
 rec_id.
I have done a similiar thing, but the programming is done in php, but 
that shouldnt make any difference. To speed up full word searches I have 
added a int(11) column where I store a crc32(www.php.net/crc32) of the 
word. There is probably an equivalent of the php crc32 function 
availible in perl, but I dont know.
 From my experience of this I would say that mysql searches faster in an 
int column than in a varchar. And then add an index to it and its real 
fast. I have saved the varchar column for some like searches to, but it 
should be possible to remove it, but it doesnt make much difference I guess.


 I then loop over the rec_ids stored in the array @codes.
 
   my $SqlCmd = 'select description from content where rec_id=?';
   my $sth = $dbh-prepare( $SqlCmd );
   $sth-{'mysql_use_result'} = 1; # doesn't store entire result
 
   while ( $continue ) {
 
  if ($njk  $#codes) { last; }
 
  $cd = $codes[$njk++];
  $sth-execute($cd);
  @row = $sth-fetchrow_array;
 
  # Collect results for display etc ...
 
}
$sth-finish;
This must take a while, even if its fast since you have to execute a 
query for each rec_id you found. I have solved this by doing 1 query to 
get all of it.
For example, transform the rec_ids you get into something like this:
select description from content where rec_id in (1,45,3,4,19);
This will enable you to just make one query. Currently I'm not aware if 
there is some limitation in the number of values to the IN (...) but it 
might be.

To create a several word search, you could do step 2 for each word and 
then go through them again in an array or something to see which 
contents that have all of the words desired.

Well, hope you got at least a bit wiser by this.. :)

Best regards,
Eric


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Maximize Performance by Improving Search Program Logic - Help?

2002-10-10 Thread Roger Baklund

* David Jourard
 I need to improve the performance of a 1-word search engine
 which I've created.
[...]
 The search consists of 2 SQL queries

 Query 1: select rec_id from word where word='book';

 All the rec_ids are captured into an array.

 Query2: select description from content where rec_id=?

 I then loop over the rec_ids stored in the array @codes.

You could probably improve on the speed by using a join:

select description
  from content,word
  where
word.word='book' and
word.rec_id = content.rec_id

[...]
 Question: Is there a better way of doing this with respect to:
 A. My definition of fields and their types.

varchar(50) for word... this column should be indexed, but you should not
index the full 50 characters. 5-10 is probably enough, depending on the
amount and distribution of your data. (If you have 100.000 words beginning
with 'computer-', you need more letters in the index.)

 B. My queries  - it seems that Query 2 could perhaps be made
faster.

You only need one query, and if word.word and content.rec_id are indexed, it
should be fast.

 I thank you in advance.  It is my hope to increase to 2, 3 and 4
 word searches once I've maximized the above search logic.

You can join the same table multiple times:

select description
  from content,word w1,word w2
  where
w1.word='book' and
w2.word='computer' and
w1.rec_id = content.rec_id and
w2.rec_id = content.rec_id

HTH,

--
Roger


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Maximize Performance by Improving Search Program Logic - Help?

2002-10-09 Thread David Jourard

Hi,


I need to improve the performance of a 1-word search engine
which I've created.  Currently the RAM is 1 GIG and the key_buffer is set at
512 MB which has improved the search results but I think perhaps I can improve
my method of search with your help perhaps.  The program is coded to run in 
mod_perl or just perl.

It consists of 2 tables: word and content.

content consists of 2 fields:

rec_id int unsigned not null,
description varchar(200) not null


word consists of 2 fields:
  word varchar(50) not null,
  rec_id int unsigned not null

word is created from content.  Each actual word from the field
content.description is created as a record in the table word with its rec_id.

Both tables are indexed on rec_id and word respectively.

A keyword is entered such as book.

The search consists of 2 SQL queries

Query 1: select rec_id from word where word='book';

All the rec_ids are captured into an array.

Query2: select description from content where rec_id=?

I then loop over the rec_ids stored in the array @codes.

   my $SqlCmd = 'select description from content where rec_id=?';
   my $sth = $dbh-prepare( $SqlCmd );
   $sth-{'mysql_use_result'} = 1; # doesn't store entire result

   while ( $continue ) {

  if ($njk  $#codes) { last; }

  $cd = $codes[$njk++];
  $sth-execute($cd);
  @row = $sth-fetchrow_array;

  # Collect results for display etc ...

}
$sth-finish;

Test Results:
   Query 1   Query 2
pottery (915 records)   0 sec 1 sec
book (12456 records)0 sec13 sec
computer (21999 records)1 sec21 sec


Question: Is there a better way of doing this with respect to:
A. My definition of fields and their types.
B. My queries  - it seems that Query 2 could perhaps be made
   faster.

I thank you in advance.  It is my hope to increase to 2, 3 and 4 word searches
once I've maximized the above search logic.

regards
David Jourard



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php