Robert Brewer wrote:
Martin MOKREJŠ wrote:

 I have sets.Set() objects having up to 20E20 items,
each composed of up to 20 characters. Keeping
them in memory on a 1GB machine quickly puts me into swap.
I don't want to use the dictionary approach, as I see no sense
in storing None as a value. The items in a set are unique.

How can I write them efficiently to disk?


got shelve*?

I know about shelve, but doesn't it work like a dictionary? Why should I use shelve for this? In that case I'd guess it's faster to use bsddb directly, with the string as the key and None as the value.
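
A minimal sketch of that key-only idea, assuming a current Python where the standard dbm module stands in for bsddb. The helper names add_items and contains are hypothetical, and the values are empty byte strings rather than None, because dbm values must be strings or bytes.

```python
import dbm
import os
import tempfile

def add_items(path, items):
    """Store each item as a key with an empty value (hypothetical helper)."""
    with dbm.open(path, "c") as db:
        for item in items:
            db[item.encode("ascii")] = b""  # the value is never used

def contains(path, item):
    """Membership test against the on-disk key set (hypothetical helper)."""
    with dbm.open(path, "r") as db:
        return item.encode("ascii") in db

# Example data: two made-up 11-character items, as in _set11.
path = os.path.join(tempfile.mkdtemp(), "set11")
add_items(path, ["ACDEFGHIKLM", "MLKIHGFEDCA"])
```

Adding the same item twice simply overwrites the key, so uniqueness comes for free.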

Even with bsddb, note that for the data contained in _set11,
the index should (or at least could) be optimized for a key size of 11.
There are no other record sizes.

Similarly, _set15 has only keys of size 15. In the docs for bsddb, anydbm
and the other modules, I don't see how to optimize for that. Without
this optimization, I think it would be even slower. And shelve
gives me exactly that: an unoptimized, general index on a dictionary.
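
One way to exploit the fixed key size without any dbm module, sketched under assumptions: keep each set as a sorted file of fixed-width records and binary-search it by byte offset. The names write_fixed and contains_fixed are hypothetical, and the sort is done in memory only for brevity; a data set too large for RAM would need an external sort.

```python
import os
import tempfile

def write_fixed(path, items, width):
    """Write unique items as sorted, fixed-width records with no separators."""
    records = sorted({item.encode("ascii") for item in items})
    if any(len(r) != width for r in records):
        raise ValueError("every item must be exactly %d bytes" % width)
    with open(path, "wb") as f:
        f.write(b"".join(records))

def contains_fixed(path, item, width):
    """Binary-search the file; record i starts at byte offset i * width."""
    key = item.encode("ascii")
    if len(key) != width:
        return False
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // width
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * width)
            record = f.read(width)
            if record == key:
                return True
            if record < key:
                lo = mid + 1
            else:
                hi = mid
    return False

# Example data: made-up 11-character items, one of them duplicated.
path = os.path.join(tempfile.mkdtemp(), "set11.dat")
write_fixed(path, ["MLKIHGFEDCA", "ACDEFGHIKLM", "ACDEFGHIKLM"], 11)
```

Each lookup costs about log2(n) seeks and the file stores nothing but the keys themselves, but adding an item means rewriting or merging the file, so this suits sets that are built once and then queried.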

Maybe I'm wrong, I'm just a beginner here.
Thanks
M.
--
http://mail.python.org/mailman/listinfo/python-list
