Hi!

On Nov 27, Gordan Bobic wrote:
>
> I take it the "IN BOOLEAN MODE" part of the AGAINST() is going to be new to
> 4.0.1.

Yes. And as it's in the manual now, changes to the syntax are unlikely.

> Incidentally, how are the WHERE clauses handled when MATCH/AGAINST is used
> for FTS? Given that I am seeing a fairly linear increase in query time with
> the increase in number of matched terms, I would guess that the FTS is
> performed first. Especially since limiting other constraints in the WHERE
> clause produces no noticeable reduction in query time. This seems to be
> wasteful.

Yes, it is performed first. By design, the natural language FTS engine has
to build a list of all the matched documents (as the relevance value depends
on some global statistics), so it cannot take any other constraints into
account. The boolean FTS engine does not need these statistics and will
benefit from other constraints, if possible. I'm talking about 4.0.1 here;
in 4.0.0 boolean search was built on top of the nl fulltext search code.

> Considering that FTS is likely the slowest part of the query, it would
> probably be beneficial in terms of performance to have it execute last, with
> all other "simpler" constraints being satisfied first, so fewer records need
> to be searched.
>
> Another question - is there a way to acquire a list of words in the FTS
> index? Something like
>
> SELECT   Word,
>          count(*) AS Frequency
> FROM     FTSIndex
> GROUP BY Word
> ORDER BY FREQUENCY ASC
> LIMIT    100;

There's the myisam/ft_dump utility, which can dump a fulltext index out of
an MYI file. Adding such functionality to mysqld would not be that hard.

> This would allow for easier overview of what "dead" words are being indexed,
> and therefore allow for easier isolation of new "stop words", and reduction
> in unnecessary searching that FTS would have to perform, thus increasing
> performance. Considering that I'm really after SELECT speed, would more
> careful tuning of stop words be likely to yield significant performance
> improvements?

Most probably, yes. But

 - strictly speaking, it would not be in line with the boolean search
   approach. In 4.0.1 boolean FTS is still subject to stopword filtering,
   but I hope that will be changed soon.
 - it would often require different stopword lists for different indexes,
   even on the same table! We do not plan to add such a feature in the
   near future.
 - it would often require a lot of manual work - periodic updating of the
   stopword list according to the recent table data.

So, there are some ideas on how to avoid all these drawbacks by automated
stopword list creation based on live data (and making the lists applicable
only to nl-fts queries). It's most probably the way fulltext search in MySQL
will be developed. It can result in a _huge_ speedup, if properly
implemented.

> It would also be REALLY nice to have a "dynamic" list of stop words. I know
> you said that this is definitely planned, but it would be nice to know how
> soon...

Well, we plan to have plain-text .frm files this year. Making the stopword
list "dynamic" relies on this feature.

> Another thing - it would probably be useful to gather some statistics about
> FTS queries performed.
...
> Has any of this been at least thought about? I've just checked the TODO, and
> it doesn't appear to be there...

These are nice ideas. They weren't thought about, but they definitely will
be - I promise :-)

> Looking forward to 4.0.1.

It should be out in a few days.

> BTW, will the file formats be compatible? Or will it require a dump + restore
> of the database, when going from 4.0.0 to 4.0.1?

For now, there's one bit changed, and one has to rebuild the table. The
easiest way is 'ALTER TABLE ... TYPE=MYISAM', though dump+restore will work
too, of course. Still, I'd like to make the file formats fully compatible,
so you'd better take a look at the ChangeLog section of the manual included
in the 4.0.1 distribution.
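
To make the two practical points above concrete, here is a rough sketch.
The table `articles`, its columns, and the search words are made-up names
for illustration only, and it assumes a FULLTEXT index on (title, body):

```sql
-- Hypothetical table: articles(id, category, title, body),
-- with FULLTEXT(title, body). All names here are illustrative.

-- Natural language search: relevance depends on global statistics,
-- so the full list of matching documents is built first and the
-- extra WHERE condition does not reduce the fulltext work.
SELECT id, title
FROM   articles
WHERE  MATCH (title, body) AGAINST ('replication')
  AND  category = 'howto';

-- Boolean search (4.0.1 syntax): no global statistics are needed,
-- so the engine may benefit from the other constraint where possible.
SELECT id, title
FROM   articles
WHERE  MATCH (title, body) AGAINST ('+replication -oracle' IN BOOLEAN MODE)
  AND  category = 'howto';

-- Rebuilding a table after upgrading from 4.0.0 to 4.0.1.
ALTER TABLE articles TYPE=MYISAM;
```

Whether the second query actually runs faster depends on the data and on
the other indexes available, so treat it as something to measure rather
than a guarantee.
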
Regards,
Sergei

--
MySQL Development Team
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <[EMAIL PROTECTED]>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
       <___/