On Wednesday 27 February 2002 07:47 am, Jeff Kilbride wrote: > --------------------------------- > I'm going to be implementing a keyword search pretty soon myself, so I'd > like to see you solve this. :)
Me too! :-) > What about doing some sort of prefix indexing, instead of indexing the > entire word? So, for every word you put in your indexed "word" table, you > chop off the first few letters and index that: Forgot to mention, I'm already doing that. In the fulltext version, I use only first 6 chars of each word. (In Italian, that's short). But you're right, it's worth trying with 3 or even 1, to see if things go faster. If only it didn't take 3 days to generate a new try! :-( I am hoping that trying mysql 4.0.1 will at least speed that up. > > I suppose you could even take this one step further and create char(1) > fields for the first N chars of your words, and then index across those. > That should cut down the size of the left most part of the index and maybe > make it easier/quicker to find records. Also, it might be quicker if you > keep your columns static (char instead of varchar), if you're not already. I can't do that, (too big), but I've specified that the fulltext index search only the first 40 chars of the field, so at least that part is static. > Do you have enough memory to put this into a HEAP table -- or at least make > your temp tables HEAP tables? Good question. To the first, no way, the phone book .MYD table is over 3 GB, and the index about the same. For the temp tables, maybe, I'll give it a try if I can figure out how... Thanks enormously for the help, Jeff, I hope others contribute too! Steve ------------ > > > Problem: > > 1. in a phone book, a single field may contain several words, e.g. > > District 52 Police Station. > > 2. The query for this record may contain some or all of these words, but > > not > > > necessarily in that order, e.g. Police district 52. > > 3. Searching for all of the query words can be done in many ways, most of > > them > > > slower than getting truthful admissions from Enron executives. > > > > So here are some methods I've come up with: > > > > 1. Make a fulltext index of the field, and do fulltext searching on all > > the > > > query words. > > (In mysql 3.23.48, this is often over 30 seconds, which is not fast > > enough) > > > 2. Make an inverted index as a separate table, with each occurring word > > and > > > all of its record numbers in the main table. (Sounds slow but the 10 > > hours it > > > takes beats the 45 hours Mysql takes to create a FULLTEXT index on that > > table!) > > 2a) search all the query words in the inverted index, and the main table > > in one big join. (10 minutes typical!) > > 2b) find the least-occurring query word, make a temp table with it, (2 > > min) > > > now join to the main table and search with an unanchored LIKE on > > other > > > query words. (another 2 min) (Total 4 min) > > 2c) find the 2 least-occurring query words and make 2 temp tables, do a > > 3-way > > > join with the temp tables and the main table (1 min, 1 min, 2 min) > > (Total 5 min) > > 2d) same as 2c but index each of the tmp tables as you create it > > (1 min, 1 min, 15 sec) (Total 2:15) > > > > Okay, it's getting better but still not very optimal. > > > > The hardware is adequate (dual-CPU 700Mhz each, 1GB RAM, mysql configured > > for > > > big RAM, etc) > > > > I'm getting desperate because I'm sure this should not be so slow. The > > problem isn't the 30-second wait per se, it's the fact that during those > > 30 > > > seconds the server is more loaded: 1 or 2 new requests come in every > > second, > > > and once there are 50 of these in the queue, the server becomes > > unusable... > > > Things I'm trying next: > > Upgrading to 4.0.1 for faster FULLTEXT. > > Examining join syntax to find ways to help Mysql optimize the search > > (beats me!) > > Asking the list for advice... > > > > Steve. > > > > P.S. I haven't included my sample queries but I will if asked... > > > > > > > > > > > > > > --------------------------------------------------------------------- > > Before posting, please check: > > http://www.mysql.com/manual.php (the manual) > > http://lists.mysql.com/ (the list archive) > > > > To request this thread, e-mail <[EMAIL PROTECTED]> > > To unsubscribe, e-mail > > <[EMAIL PROTECTED]> > > > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php > > --------------------------------------------------------------------- > Before posting, please check: > http://www.mysql.com/manual.php (the manual) > http://lists.mysql.com/ (the list archive) > > To request this thread, e-mail <[EMAIL PROTECTED]> > To unsubscribe, e-mail > <[EMAIL PROTECTED]> Trouble > unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php