On Nov 24, 5:44 am, Licheng Fang <[EMAIL PROTECTED]> wrote:
> Yes, millions. In my natural language processing tasks, I almost
> always need to define patterns, identify their occurrences in a huge
> data, and count them. [...] So I end up with unnecessary
> duplicates of keys. And this can be a great waste of memory with huge
> input data.
Create a dict that maps each key string to itself, then use the values of
that dict as your keys. Equal strings then all collapse to a single shared
object:

>>> store = {}
>>> def atom(s):
...     # return the one stored copy of any string equal to s
...     if s not in store:
...         store[s] = s
...     return store[s]
...
>>> a = 'this is confusing'
>>> b = 'this is confusing'
>>> a == b
True
>>> a is b
False
>>> atom(a) is atom(b)
True
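
For the counting use case in the quoted message, here is a minimal sketch of
how atom() might be plugged into a frequency count so the dict ends up
holding only one copy of each distinct key string (count_tokens and the
sample token list are invented for illustration, not from the original
post):

store = {}

def atom(s):
    # hand back the single shared copy of any string equal to s
    if s not in store:
        store[s] = s
    return store[s]

def count_tokens(tokens):
    # tokens read from a large corpus are usually distinct string
    # objects even when equal; atom() collapses them before they
    # become dict keys
    counts = {}
    for tok in tokens:
        key = atom(tok)
        counts[key] = counts.get(key, 0) + 1
    return counts

print(count_tokens(['dog', 'cat', 'dog']))

Only the interned copies survive as keys, so memory use grows with the
number of distinct patterns rather than the number of occurrences.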