Re: [Tutor] lazily decorated sort
On Fri, Sep 28, 2012 at 3:34 PM, Peter Otten <__pete...@web.de> wrote: > >> Also, as far as I can see in the code, implementing "total_ordering" >> is unnecessary for sorting. One only needs to implement __lt__. It's >> used by binarysort() to test the pivot via the IFLT/ISLT macros: >> >> http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023 > > I smell danger. I really don't like having a class lying around that > partially implements ordering and may fail where you least expect it. I don't know; they're just utility objects used for sorting, which only cares about __lt__. On the other hand, Python 3 enforces a sanity check if you implement __eq__ without __hash__. So now you have objects that can't be used in sets or as dict keys. Not that this matters. ;) Anyway, it's not hurting. I just thought I'd make the suggestion to spare you the [small] effort of implementing __eq__ on something like this. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] lazily decorated sort
eryksun wrote: > On Fri, Sep 28, 2012 at 8:17 AM, Peter Otten <__pete...@web.de> wrote: >> >> def make_key(keys): >> @total_ordering >> class Key(object): >> def __init__(self, value): >> self._keys = keys(value) >> self._cached = [] > > > Using a generator/iterator to pump the key values into a cache is a > great idea. But I think it would generally be nicer to let "keys" be a > sequence of functions. Then define the generator in make_key() before > you define the class: I should have mentioned that make_key() may be used as a decorator. So it is @make_key def key(value): yield value yield f(value) yield value.attrib items.sorted(key=key) against (assuming the signature make_key(*keyfuncs)) items.sort(key=make_key( lambda value: value, f, operator.itemgetter("attrib")) ) I think the first version looks a bit cleaner, but that's a matter of taste. > Also, as far as I can see in the code, implementing "total_ordering" > is unnecessary for sorting. One only needs to implement __lt__. It's > used by binarysort() to test the pivot via the IFLT/ISLT macros: > > http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023 I smell danger. I really don't like having a class lying around that partially implements ordering and may fail where you least expect it. >> if a == b: >> pass >> else: >> return a < b > Or test for "!=": Yes, that was odd indeed. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] lazily decorated sort
On Fri, Sep 28, 2012 at 8:17 AM, Peter Otten <__pete...@web.de> wrote: > > def make_key(keys): > @total_ordering > class Key(object): > def __init__(self, value): > self._keys = keys(value) > self._cached = [] Using a generator/iterator to pump the key values into a cache is a great idea. But I think it would generally be nicer to let "keys" be a sequence of functions. Then define the generator in make_key() before you define the class: def make_key(keys): def _keys(value): for key in keys: yield key(value) @total_ordering class Key(object): def __init__(self, value): self._keys = _keys(value) self._cached = [] Also, as far as I can see in the code, implementing "total_ordering" is unnecessary for sorting. One only needs to implement __lt__. It's used by binarysort() to test the pivot via the IFLT/ISLT macros: http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023 > def __lt__(self, other): > for a, b in izip(self.keys(), other.keys()): > if a == b: > pass > else: > return a < b > return False Or test for "!=": def __lt__(self, other): for a, b in izip(self.keys(), other.keys()): if a != b: return a < b return False ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] lazily decorated sort
Chris Smith wrote: > I'm wondering if anyone has seen or knows of a good way to do a lazily > decorated sort. I was reading about how good the DSU (decorate, sort, > undecorate) approach is but the problem that we are running into in > SymPy is that we want to get by with a fast hash sort if possible, and > only decorate to break ties *if necessary*. It's a pity to decorate > with an expensive function if you don't need it but I don't know how > to only decorate when there are ties. Do you have any ideas how to do > the following better: Here's an implementation that uses the key argument that is supported by list.sort() and the built-in sorted(). A generator function (keys(value)) is used to calculate the partial keys as necessary. import time import random from contextlib import contextmanager from functools import total_ordering try: from itertools import izip except ImportError: # python 3 izip = zip def make_key(keys): @total_ordering class Key(object): def __init__(self, value): self._keys = keys(value) self._cached = [] def keys(self): for k in self._cached: yield k for k in self._keys: self._cached.append(k) yield k def __eq__(self, other): return all(a == b for a, b in izip(self.keys(), other.keys())) def __lt__(self, other): for a, b in izip(self.keys(), other.keys()): if a == b: pass else: return a < b return False return Key @contextmanager def bench(description): print("starting...") start = time.time() yield stop = time.time() print(description.format(stop - start)) if __name__ == "__main__": N = 10 def keys(value): """user defined lazy key""" yield value time.sleep(.1) yield random.random() data = list(range(N)) + [N, N] wanted = list(data) random.shuffle(data) with bench("lazy key: {:.1f}s"): got = sorted(data, key=make_key(keys)) assert got == wanted with bench("eager key: {:.1f}s"): got = sorted(data, key=lambda value: tuple(keys(value))) assert got == wanted ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] lazily decorated sort
On 11/09/12 21:44, Chris Smith wrote: Hi all, I'm wondering if anyone has seen or knows of a good way to do a lazily decorated sort. I was reading about how good the DSU (decorate, sort, undecorate) approach is but the problem that we are running into in SymPy is that we want to get by with a fast hash sort if possible, and only decorate to break ties *if necessary*. It's a pity to decorate with an expensive function if you don't need it but I don't know how to only decorate when there are ties. Firstly, unless you intend supporting Python 2.3 or older, there is (probably) no need for an explicit DSU approach. Instead, use the key argument to sorted and list.sort. I'm not sure I completely understand your question. If you know ahead of time that you can avoid DSU, you can do this: if condition: x = sorted(something) else: x = sorted(something, key=func) But I imagine that's not what you mean. My guess is that you want the sort to be sufficiently clever that instead of doing this: decorate every item compare decorated items, and sort as needed undecorate every item you want it to do this: compare items, sorting as needed if they are equal, compare decorated items (which will never tie???) Am I close? I think you could arrange that two ways: 1) A two-pass sort; first, sort undecorated, then scan looking for ties, if you find any, sort again with a key function; (This is safe, since sorting in Python is now guaranteed to be stable.) 2) or use a key function that does something like this: class SmartComparator(object): def __init__(self, item): self.item = item def __cmp__(self, other): x = cmp(self.item, other.item) if x == 0: return cmp(self.decorate(self.item), self.decorate(other.item)) return x def decorate(self, value): # expensive magic goes in here... sorted(items, key=SmartComparator) I think the second version is more promising, since it avoids a linear search for ties. You will need to check the documentation to see whether sorting relies on __cmp__ or rich comparisons (__gt__, __lt__, __eq__, etc.). If you need any help, don't hesitate to ask. P.S. I just discovered sympy recently, it is awesome. -- Steven ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] lazily decorated sort
On 11/09/2012 12:44, Chris Smith wrote: Hi all, I'm wondering if anyone has seen or knows of a good way to do a lazily decorated sort. I was reading about how good the DSU (decorate, sort, undecorate) approach is but the problem that we are running into in SymPy is that we want to get by with a fast hash sort if possible, and only decorate to break ties *if necessary*. It's a pity to decorate with an expensive function if you don't need it but I don't know how to only decorate when there are ties. Do you have any ideas how to do the following better: def f(): """delay for 2 seconds""" from time import time from random import random t=time() while time()-t<1: pass return random class foo(object): """an object that calls the delay function when comparing""" def __eq__(self, other): return f() == f() def __lt__(self, other): return f() < f() def lazyDSU(seq, keys=[]): """Return sorted seq, breaking ties by lazily applying keys successively as needed from the list of provided keys.""" if not keys: seq = sorted(seq) else: d = defaultdict(list) f = keys.pop(0) for a in seq: d[f(a)].append(a) seq = [] for k in sorted(d.keys()): if len(d[k]) > 1 and keys: seq.extend(lazyDSU(d[k], keys=keys[1:])) else: seq.extend(d[k]) return tuple(seq) lazyDSU(range(5)) # fast (0, 1, 2, 3, 4) [i[0] for i in lazyDSU(zip(range(5), [foo()]*5))] # fast, no ties [0, 1, 2, 3, 4] [i[0] for i in lazyDSU([(0, foo())] + list(zip(range(5), [foo()]*5)))] # slower [0, 0, 1, 2, 3, 4] The last run takes 4 seconds (but not 12 seconds) because only two had to have ties broken. In the examples, no keys were passed but the discretionary decoration was demonstrated. /Chris ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor It's my understanding that DSU is unneccessary in modern versions of Python so I suggest you read http://docs.python.org/howto/sorting.html http://wiki.python.org/moin/PythonSpeed/PerformanceTips#Sorting these before you go any further, if you haven't already done so already that is. -- Cheers. Mark Lawrence. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor