Hi!

On Feb 18, xing wrote:
> Peter,
>
> Here are my suggestions based on past and current experience.
Unfortunately, they are mostly wrong :(

> 1) set-variable = sort_buffer=8M
>
> Since you have a lot of records there you will no doubt be doing quite
> a lot of sorting. Increase the sort_buffer. This would also speed up
> new index creation.

This variable sets the maximum memory allocated for the filesort
operation, which is used for ORDER BY and GROUP BY when EXPLAIN shows
"filesort" in the last column. It does not apply to new index creation -
myisam_sort_buffer_size is used there instead.

> 2)
> set-variable = ft_min_word_len=3 (mysql defaults to 4)
> set-variable = ft_max_word_len=100 (mysql defaults to 256)
> set-variable = ft_max_word_len_for_sort=10 (mysql defaults to 20 I
> think)
>
> In real life, people want to be able to find "Tom" and "Cat", so you
> want fulltext to index 3-letter words as well, at the cost of index
> size. However, by decreasing the max_word_len and len_for_sort by over
> half, you have effectively decreased the full-text index size by 50%.
> This would speed up your searches big time.

Decreasing the index size - yes, that helps, but not by decreasing
ft_max_word_len and ft_max_word_len_for_sort. Words longer than 100
letters are *extremely*, *extremely* rare - less than 0.001% of the
index. Furthermore, even those will probably not be valid words but just
repeated sequences of characters like ___________ or
xxxxxxxxxxxxxxxxxxxxx, etc. Nobody will ever search for them. Removing
them will not make the index noticeably smaller, and will have no impact
on search speed whatsoever.

As for ft_max_word_len_for_sort - it is used only during index creation.
Reducing it, as you propose, means MySQL will be able to put twice as
many words in each sort bucket - which is good - but all the words
longer than 10 will be ignored there, and MySQL will insert them after
repair_by_sort is done, one word at a time - which is the slow way.
Whether it gives a speedup or not depends on the word length
distribution in the data.
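As a rough way to estimate that distribution outside MySQL, a short script like the sketch below can be used - assuming the indexed column data has been dumped to plain text; the `\w+` tokenizer here is a crude stand-in for MySQL's word parser, not its actual behavior:

```python
import re
from collections import Counter

def word_length_coverage(text, cutoff=20):
    """Return (length histogram, fraction of words with length <= cutoff).

    cutoff plays the role of ft_max_word_len_for_sort: words longer
    than it would be excluded from the sort buckets and inserted
    one at a time afterwards - the slow path.
    """
    words = re.findall(r"\w+", text)
    lengths = Counter(len(w) for w in words)
    total = sum(lengths.values())
    covered = sum(n for length, n in lengths.items() if length <= cutoff)
    return lengths, covered / total if total else 1.0

# Example with the proposed cutoff of 10:
hist, frac = word_length_coverage("some dumped column text here", cutoff=10)
```

Here `frac` is the share of words that avoids the slow insertion path; if it comes out well below the ~98% that the default of 20 usually covers, reducing ft_max_word_len_for_sort may not pay off.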
Usually, 20 covers more than 98% of the words. There is a utility
program, ft_dump, that comes only with the MySQL source distribution.
It can show you the word length or frequency distribution.

Regards,
Sergei

--
MySQL Development Team
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <[EMAIL PROTECTED]>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
       <___/