Hi Aaron, > >> > But I also have a simpler idea that will still speed things up. It > >> > involves using a regexp SQL fulltext search. While it's definitely slow > >> > compared to a changed storage (the db won't be able to use indexes), > >> > it's still faster because it does not involve any parsing, and uses the > >> > well-tuned SQL server code to do fulltext searching. > > > > That will still be awfully slow because the database has to read every > > single page, lot of expensive I/O. > > Not any slower than requesting, parsing, and searching every message like > we are now! I think it's an excellent idea: first, REGEXP for messages > that might match, then parse those to see if they match in the right place > (specific header, body, etc.).
Yes of course that will be a lot faster than what we have now and is probably good enough for small systems, but on a large system 10 users searching something probably block everything - and searching mails is something that happens quite often. > > For PostgreSQL there is Tsearch2 [1], Oracle has built in full text > > search as far as I know, what about MySQL? I think it would be better to > > use special functions if they are available. > > MySQL's MyISAM tables can do FULLTEXT, but not InnoDB. Bummer. The database layer has to decide how to search, so MySQL could use RegExps (and PostgreSQL if TSearch2 isn't available). Then it's up to the admin to decide if MySQL is a good idea for a large system. Thomas -- http://www.tmueller.com for pgp key (95702B3B)