On Apr 14, 11:18 pm, Carl Banks <[EMAIL PROTECTED]> wrote:
> However, that is for the OP to decide. The reason I don't like the
> sort of question I posed is it's presumptuous--maybe the OP already
> considered and rejected this, and has taken steps to ensure the in
> memory data structure won't be swapped--but a database solution should
> at least be considered here.

Yes, you are right, especially if the index structure will be needed many times over a long period of time. Even so, these days you can go pretty far by loading everything into core (streaming it from disk) and dumping everything back out when you are done, if needed (ahem, using the preferred way to do this from Python for speed and safety: marshal ;) ).

Even with B-trees, performance can be awful if you jump around in the tree, because each jump can turn into a random disk seek. This is why Nucular, for example, is designed to stream results sequentially from disk whenever possible. The one place where it doesn't do this very well (proximity searches) is also where it shows the most performance problems, under bad circumstances such as searching for two common words in proximity.

-- Aaron Watters
===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=joys
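
P.S. A minimal sketch of the "load everything into core, dump it out when done" idea above. The index layout (word -> list of line numbers) and the names build_index, dump_index and load_index are made up for illustration; this is not how Nucular stores anything. Note that marshal only handles core built-in types (dicts, lists, strings, numbers and so on), and its file format is not guaranteed to stay the same across Python versions, so treat the dump as a cache rather than an archive.

```python
import marshal
import os
import tempfile


def build_index(lines):
    """Build a hypothetical in-memory index: word -> list of line numbers."""
    index = {}
    for lineno, line in enumerate(lines):
        # set() so a word repeated on one line is recorded only once
        for word in set(line.lower().split()):
            index.setdefault(word, []).append(lineno)
    return index


def dump_index(index, path):
    """Write the whole index to disk in one sequential pass."""
    with open(path, "wb") as f:
        marshal.dump(index, f)


def load_index(path):
    """Read the whole index back into core in one sequential pass."""
    with open(path, "rb") as f:
        return marshal.load(f)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "index.marshal")
    dump_index(build_index(["the quick brown fox", "the lazy dog"]), path)
    print(load_index(path)["the"])
```

Both the dump and the load touch the file strictly front to back, so there is no seeking around on disk; all the jumping around happens in memory afterwards.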