On Nov 24, 5:44 am, Licheng Fang <[EMAIL PROTECTED]> wrote:
> Yes, millions. In my natural language processing tasks, I almost
> always need to define patterns, identify their occurrences in huge
> data, and count them. [...] So I end up with unnecessary
> duplicates of keys. And this can be a great waste of memory with huge
> input data.

Create a dict that maps each key to itself, then use the values of
that dict as your keys:

>>> store = {}
>>> def atom(s):
...     # return the one shared copy of s, storing it the first time it's seen
...     if s not in store:
...         store[s] = s
...     return store[s]
...

>>> a='this is confusing'
>>> b='this is confusing'
>>> a == b
True
>>> a is b
False
>>> atom(a) is atom(b)
True
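
Applied to the counting use case, it might look roughly like this (just
a sketch; the hard-coded token list stands in for whatever pattern
stream you actually extract):

>>> counts = {}
>>> for tok in ['the', 'cat', 'the', 'dog', 'the']:
...     key = atom(tok)     # every repeat of a token maps to one shared string
...     counts[key] = counts.get(key, 0) + 1
...
>>> counts['the']
3

Only one copy of each distinct key ends up stored, no matter how many
times it occurs in the input.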