Jarek Zgoda <[EMAIL PROTECTED]> writes:
> > I don't know what Sphinx is.
> http://www.sphinxsearch.com/
Thanks, looks interesting; maybe not so good for what I'm doing, but worth looking into. There is also Xapian, which again I haven't looked at much, but which has fancier PIR (probabilistic information retrieval) capabilities than Lucene or the version of Nucular that I looked at.

The main thing killing most of the search apps I'm involved with is disk latency. If Aaron is listening, I might suggest offering a config option to redundantly record the stored search fields with every search term in the index. That will bloat the indexes by a nontrivial constant factor (maybe 5x-10x), but even terabyte disks are dirt cheap these days, so you could still index a lot of data and present large result sets without having to do a disk seek for every result in the set. I've been meaning to crunch some numbers to see whether this actually makes sense.

Unfortunately, the concept of the large add-on memory card seems to have vanished. It would be very useful to have a cheap x86 box with a buttload of RAM (say 64 GB), using commodity desktop memory and extra modules for ECC. It would be OK if it went over some slow interface so that it was 10x slower than regular RAM. That's still 100x faster than a flash disk and 1000x faster than a hard disk.

--
http://mail.python.org/mailman/listinfo/python-list
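To make the "redundant stored fields" idea concrete, here's a toy sketch (all names hypothetical, not Nucular's actual API; a real index would keep postings on disk, not in a dict). Each posting carries its own copy of the document's stored fields, so rendering a result set reads straight from the posting list, with no per-result seek into a separate document store:

```python
from collections import defaultdict

class RedundantIndex:
    """Toy inverted index that duplicates stored fields into every posting."""

    def __init__(self):
        # term -> list of (doc_id, stored_fields) postings
        self.postings = defaultdict(list)

    def add(self, doc_id, terms, stored_fields):
        # The stored fields are copied into every posting; this is the
        # constant-factor bloat (maybe 5x-10x) traded for seek-free results.
        for term in terms:
            self.postings[term].append((doc_id, dict(stored_fields)))

    def search(self, term):
        # Display fields come straight out of the posting list;
        # no second lookup per result.
        return self.postings.get(term, [])

idx = RedundantIndex()
idx.add(1, ["python", "search"], {"title": "Indexing in Python"})
idx.add(2, ["python", "disk"], {"title": "Disk latency notes"})
print(idx.search("python"))
```

The same trade-off applies on disk: the posting list grows by roughly the size of the stored fields times the number of terms per document, but a large result page becomes one sequential read instead of one seek per hit.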