Re: unique-ifying a list

2009-08-09 Thread Simon Forman
On Aug 7, 4:53 pm, kj no.em...@please.post wrote:
 Suppose that x is some list.  To produce a version of the list with
 duplicate elements removed one could, I suppose, do this:

     x = list(set(x))

 but I expect that this will not preserve the original order of
 elements.

 I suppose that I could write something like

 def uniquify(items):
     seen = set()
     ret = []
     for i in items:
         if not i in seen:
             ret.append(i)
             seen.add(i)
     return ret

 But this seems to me like such a commonly needed operation that I
 find it hard to believe one would need to resort to such self-rolled
 solutions.  Isn't there some more standard (and hopefully more
 efficient, as in C-coded/built-in) approach?

 TIA!

 kynn


Unique items in a list, in the same order as in the list:

x = sorted(set(x), key=x.index)

;]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unique-ifying a list

2009-08-08 Thread Elias Fotinis (eliasf)

kj wrote:

I suppose that I could write something like

def uniquify(items):
   seen = set()
   ret = []
   for i in items:
   if not i in seen:
   ret.append(i)
   seen.add(i)
   return ret

But this seems to me like such a commonly needed operation that I
find it hard to believe one would need to resort to such self-rolled
solutions.  Isn't there some more standard (and hopefully more
efficient, as in C-coded/built-in) approach?


The most standard way is a recipe from the itertools docs (I'd give a 
link, but python.org is down at the moment):


   def unique_everseen(iterable, key=None):
   List unique elements, preserving order. Remember all elements ever 
seen.

   # unique_everseen('BBBCCDAABBB') -- A B C D
   # unique_everseen('ABBCcAD', str.lower) -- A B C D
   seen = set()
   seen_add = seen.add
   if key is None:
   for element in iterable:
   if element not in seen:
   seen_add(element)
   yield element
   else:
   for element in iterable:
   k = key(element)
   if k not in seen:
   seen_add(k)
   yield element

All the recipes mentioned there are pretty handy, so I've made a module 
(iterutil.py) out of them.


--
http://mail.python.org/mailman/listinfo/python-list


Re: unique-ifying a list

2009-08-08 Thread ryles
On Aug 7, 4:53 pm, kj no.em...@please.post wrote:
 Suppose that x is some list.  To produce a version of the list with
 duplicate elements removed one could, I suppose, do this:

     x = list(set(x))

 but I expect that this will not preserve the original order of
 elements.


OrderedSet is most likely on the way, but for now Python 3.1 and 2.7
have OrderedDict. For 3.0 and 2.6 the recipe is here:

http://code.activestate.com/recipes/576669

With OrderedDict you can do:

OrderedDict.fromkeys(x).keys()  # Returns an iterator in 3.x, a list
in 2.x.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unique-ifying a list

2009-08-08 Thread Paul Rubin
Dennis Lee Bieber wlfr...@ix.netcom.com writes:
   Why bother with seen?

The version with seen runs in linear time because of the O(1) set
lookup.  Your version runs in quadratic time.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unique-ifying a list

2009-08-07 Thread Jonathan Gardner
On Aug 7, 1:53 pm, kj no.em...@please.post wrote:
 Suppose that x is some list.  To produce a version of the list with
 duplicate elements removed one could, I suppose, do this:

     x = list(set(x))

 but I expect that this will not preserve the original order of
 elements.

 I suppose that I could write something like

 def uniquify(items):
     seen = set()
     ret = []
     for i in items:
         if not i in seen:
             ret.append(i)
             seen.add(i)
     return ret

 But this seems to me like such a commonly needed operation that I
 find it hard to believe one would need to resort to such self-rolled
 solutions.  Isn't there some more standard (and hopefully more
 efficient, as in C-coded/built-in) approach?


Honestly, doing unique operations is pretty rare in the application
level. Unless you're writing some kind of database, I don't see why
you'd do it. (My recommendation is not to write databases.)

If you want unique elements, use a set. If you want to order them,
sort a list of the items in the set.

If you want to preserve the order, then using a dict may be even
better.

orig_order = dict(reversed([reversed(i) for i in enumerate
(items)])
unique_ordered = sorted(orig_order.keys(), key=lambda k: orig_order
[k])

Hints to understanding:
* enumerate generates (index, item) pairs.
* We reverse each pair so that we get an item - index mapping.
* We reverse it so that the first ones appear last. Later pairs
override earlier ones in dict().
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unique-ifying a list

2009-08-07 Thread Gabriel Genellina

En Fri, 07 Aug 2009 17:53:10 -0300, kj no.em...@please.post escribió:


Suppose that x is some list.  To produce a version of the list with
duplicate elements removed one could, I suppose, do this:

x = list(set(x))

but I expect that this will not preserve the original order of
elements.

I suppose that I could write something like

def uniquify(items):
seen = set()
ret = []
for i in items:
if not i in seen:
ret.append(i)
seen.add(i)
return ret


Assuming the elements are hashable, yes, that's the fastest way (minus
some microoptimizations like using local names for ret.append and
seen.add, or the 'not in' operator).

See bearophile's recipe [1], another one [2] by Tim Peters (quite old but
worths reading the comment section), and this thread [3]

[1] http://code.activestate.com/recipes/438599/
[2] http://code.activestate.com/recipes/52560/
[3] http://groups.google.com/group/comp.lang.python/t/40c6c455f4fd5154/

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list