On Mon, 10 Jan 2005 17:11:09 +0100, Martin MOKREJŠ <[EMAIL PROTECTED]> wrote:
>Hi,
>  I have sets.Set() objects having up to 20E20 items,

What notation are you using when you write 20E20?
IOW, ISTM 1E9 is a billion. So 20E20 would be 2000 billion billion.
Please clarify ;-)

>each is composed of up to 20 characters. Keeping
>them in memory on 1GB machine puts me quickly into swap.
>I don't want to use dictionary approach, as I don't see a sense
>to store None as a value. The items in a set are unique.
>
>  How can I write them efficiently to disk? To be more exact,
>I have 20 sets. _set1 has 1E20 keys of size 1 character.
>
>alphabet = ('G', 'A', 'V', 'L', 'I', 'P', 'S', 'T', 'C', 'M', 'A', 'Q', 'F',
>'Y', 'W', 'K', 'R', 'H', 'D', 'E')
>for aa1 in alphabet:
>    # l = [aa1]
>    #_set1.add(aa1)
>    for aa2 in alphabet:
>        # l.append(aa2)
>        #_set2.add(''.join(l))
>[cut]
>
>  The reason I went for sets instead of lists is the speed,
>availability of unique, common and other methods.
>What would you propose as an elegant solution?
>Actually, even those nested for loops take ages. :(

If you will explain a little what you are doing with these set "items"
perhaps someone will think of another way to represent and use your data.

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list
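
For a sense of scale, here is a minimal sketch of the arithmetic behind the notation question. It is not from the thread and rests on assumptions: it reads the quoted sizes as "20 letters to the power of the string length", it uses itertools.product (available in current Python, not in the 2005 releases), and the names ALPHABET, count_strings and iter_strings are hypothetical. The quoted tuple lists 'A' twice, which may be a typo for 'N'; the sketch uses 'N'.

```python
from itertools import islice, product

# Assumption: the 20 one-letter amino-acid codes, with 'N' in place of the
# second 'A' that appears in the quoted tuple.
ALPHABET = "GAVLIPSTCMNQFYWKRHDE"


def count_strings(length, alphabet=ALPHABET):
    """Return how many distinct strings of exactly `length` letters exist."""
    return len(set(alphabet)) ** length


def iter_strings(length, alphabet=ALPHABET):
    """Yield every string of `length` letters lazily, one at a time."""
    for letters in product(alphabet, repeat=length):
        yield "".join(letters)


if __name__ == "__main__":
    # Counts grow as 20**n; the float literal 20E20 is 2e21, a different number.
    for n in (1, 2, 5, 10, 20):
        print(n, count_strings(n))
    # Items can be produced on demand instead of being stored in a set.
    print(list(islice(iter_strings(2), 5)))
```

Under this reading, the length-1 set holds 20 items and the length-20 set would hold 20**20 items (roughly 1.05e26), which fits neither in memory nor on disk. Since every item is implied by the alphabet and the length, generating items on demand may be an alternative to storing them, depending on what is done with them.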