Hi David,

On my machine, with a simple pathological example (every item is a
duplicate), checking for duplicates is significantly faster, by roughly
a factor of 2.

However, it depends on how many duplicates you expect. In the second
example, with only one duplicated item, it is faster not to check.

The winner in both cases, however, is to use dict() directly.
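
For reference, a minimal sketch of what "use dict() directly" looks like on a
list of (name, value) pairs. The sample data and variable names are made up
for illustration. One thing to keep in mind: dict() and the unchecked loop
keep the last value seen for a key, while the checked loop keeps the first,
so the three approaches only agree when duplicate keys carry the same value
(as they do in your data).

```python
# Hypothetical sample data: duplicate keys that always carry the same value.
pairs = [
    ("ItemOne", 10), ("ItemOne", 10),
    ("ItemTwo", 20),
    ("ItemThree", 30), ("ItemThree", 30),
]

# dict() accepts any iterable of (key, value) pairs and collapses duplicates.
lookup = dict(pairs)
print(lookup)

# If duplicates disagree, the approaches differ in which value survives.
conflicting = [("a", 1), ("a", 2)]

last_wins = dict(conflicting)  # same result as the unchecked loop

first_wins = {}
for key, value in conflicting:
    if key not in first_wins:  # the checked loop keeps the first value
        first_wins[key] = value

print(last_wins, first_wins)
```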

Cheers,
Josh


IPython session:

In [1]: l = [('a', 1)] * 10000

In [2]: def f(l):
   ...:     d = {}
   ...:     for k, v in l:
   ...:         d[k] = v
   ...:     return d
   ...:

In [3]: def g(l):
   ...:     d = {}
   ...:     for k, v in l:
   ...:         if k not in d:
   ...:             d[k] = v
   ...:     return d
   ...:

In [4]: %timeit f(l)
1000 loops, best of 3: 1.34 ms per loop

In [5]: %timeit g(l)
100 loops, best of 3: 773 µs per loop

In [6]: %timeit dict(l)
1000 loops, best of 3: 371 µs per loop

In [7]: m = [(str(x), x) for x in range(10000)] + [('0', 0)]

In [8]: %timeit f(m)
1000 loops, best of 3: 1.99 ms per loop

In [9]: %timeit g(m)
100 loops, best of 3: 2.48 ms per loop

In [10]: %timeit dict(m)
100 loops, best of 3: 943 µs per loop
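
If you want to repeat the comparison outside IPython, something like the
following should work as a rough sketch using the standard timeit module.
The function names, the bench() helper and the loop counts are my own choices
for illustration, not part of the session above, and the absolute numbers
will differ from machine to machine.

```python
import timeit


def build_unchecked(pairs):
    """Assign every pair, overwriting duplicates (f in the session)."""
    d = {}
    for k, v in pairs:
        d[k] = v
    return d


def build_checked(pairs):
    """Assign only keys not already present (g in the session)."""
    d = {}
    for k, v in pairs:
        if k not in d:
            d[k] = v
    return d


def bench(func, pairs, number=200, repeat=3):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(pairs))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Same two inputs as in the session: all duplicates, and a single duplicate.
many_dupes = [("a", 1)] * 10000
one_dupe = [(str(x), x) for x in range(10000)] + [("0", 0)]

if __name__ == "__main__":
    for label, data in (("many duplicates", many_dupes), ("one duplicate", one_dupe)):
        for func in (build_unchecked, build_checked, dict):
            seconds = bench(func, data)
            print("%-16s %-16s %8.1f us per call" % (label, func.__name__, seconds * 1e6))
```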



On 21 February 2014 09:31, David Crisp <david.cr...@gmail.com> wrote:

> The following question is more along the lines of "good practice" rather
> than "how do you do it".
>
> If I have a name:value list of values that I want to read into a dict for
> ease of lookup, let's define them as:
>
> name  | value
> ==========
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30
>
> Now, obviously there are duplicates in that list. If I use a simple
> loop such as the following:
> for name, value in pairs:
>      item[name] = value
>
> Then I will get a dict with three pairs in it:
> ItemOne : 10
> ItemTwo : 20
> ItemThree : 30
>
> Which is what I want.
>
> Now, MY question is: is there any harm in creating the dict that way,
> looping through all those values multiple times and constantly re-defining
> the values (to the same thing)? Or should I put a check in there,
> such as:
>
> if name not in item:
>    # add name and its pair value to dict
>
>
> In my case I HAVE added the check, but I was wondering whether it is really
> needed, given that in either case the resulting dict would be
> the same.
>
> Regards,
> David
>
> _______________________________________________
> melbourne-pug mailing list
> melbourne-pug@python.org
> https://mail.python.org/mailman/listinfo/melbourne-pug
>
>
