Re: [Tutor] lazily decorated sort

2012-09-28 Thread eryksun
On Fri, Sep 28, 2012 at 3:34 PM, Peter Otten <__pete...@web.de> wrote:
>
>> Also, as far as I can see in the code, implementing "total_ordering"
>> is unnecessary for sorting. One only needs to implement __lt__. It's
>> used by binarysort() to test the pivot via the IFLT/ISLT macros:
>>
>> http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023
>
> I smell danger. I really don't like having a class lying around that
> partially implements ordering and may fail where you least expect it.

I don't know; they're just utility objects used for sorting, which
only cares about __lt__. On the other hand, Python 3 enforces a sanity
check if you implement __eq__ without __hash__. So now you have
objects that can't be used in sets or as dict keys. Not that this
matters. ;)  Anyway, it's not hurting. I just thought I'd make the
suggestion to spare you the [small] effort of implementing __eq__ on
something like this.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] lazily decorated sort

2012-09-28 Thread Peter Otten
eryksun wrote:

> On Fri, Sep 28, 2012 at 8:17 AM, Peter Otten <__pete...@web.de> wrote:
>>
>> def make_key(keys):
>> @total_ordering
>> class Key(object):
>> def __init__(self, value):
>> self._keys = keys(value)
>> self._cached = []
> 
> 
> Using a generator/iterator to pump the key values into a cache is a
> great idea. But I think it would generally be nicer to let "keys" be a
> sequence of functions. Then define the generator in make_key() before
> you define the class:

I should have mentioned that make_key() may be used as a decorator. So it is

@make_key
def key(value):
yield value
yield f(value)
yield value.attrib
items.sorted(key=key)

against (assuming the signature make_key(*keyfuncs))

items.sort(key=make_key(
lambda value: value,
f,
operator.itemgetter("attrib"))
)

I think the first version looks a bit cleaner, but that's a matter of taste.

> Also, as far as I can see in the code, implementing "total_ordering"
> is unnecessary for sorting. One only needs to implement __lt__. It's
> used by binarysort() to test the pivot via the IFLT/ISLT macros:
> 
> http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023

I smell danger. I really don't like having a class lying around that 
partially implements ordering and may fail where you least expect it.

>>  if a == b:
>>  pass
>>  else:
>>  return a < b

> Or test for "!=":

Yes, that was odd indeed.


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] lazily decorated sort

2012-09-28 Thread eryksun
On Fri, Sep 28, 2012 at 8:17 AM, Peter Otten <__pete...@web.de> wrote:
>
> def make_key(keys):
> @total_ordering
> class Key(object):
> def __init__(self, value):
> self._keys = keys(value)
> self._cached = []


Using a generator/iterator to pump the key values into a cache is a
great idea. But I think it would generally be nicer to let "keys" be a
sequence of functions. Then define the generator in make_key() before
you define the class:


def make_key(keys):

def _keys(value):
for key in keys:
yield key(value)

@total_ordering
class Key(object):

def __init__(self, value):
self._keys = _keys(value)
self._cached = []


Also, as far as I can see in the code, implementing "total_ordering"
is unnecessary for sorting. One only needs to implement __lt__. It's
used by binarysort() to test the pivot via the IFLT/ISLT macros:

http://hg.python.org/cpython/file/70274d53c1dd/Objects/listobject.c#l1023


> def __lt__(self, other):
> for a, b in izip(self.keys(), other.keys()):
>  if a == b:
>  pass
>  else:
>  return a < b
> return False


Or test for "!=":

def __lt__(self, other):
for a, b in izip(self.keys(), other.keys()):
 if a != b:
 return a < b
return False
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] lazily decorated sort

2012-09-28 Thread Peter Otten
Chris Smith wrote:

> I'm wondering if anyone has seen or knows of a good way to do a lazily
> decorated sort. I was reading about how good the DSU (decorate, sort,
> undecorate) approach is but the problem that we are running into in
> SymPy is that we want to get by with a fast hash sort if possible, and
> only decorate to break ties *if necessary*. It's a pity to decorate
> with an expensive function if you don't need it but I don't know how
> to only decorate when there are ties. Do you have any ideas how to do
> the following better:

Here's an implementation that uses the key argument that is supported by 
list.sort() and the built-in sorted(). A generator function (keys(value)) is 
used to calculate the partial keys as necessary.

import time
import random

from contextlib import contextmanager
from functools import total_ordering

try:
from itertools import izip
except ImportError:
# python 3
izip = zip 

def make_key(keys):
@total_ordering
class Key(object):
def __init__(self, value):
self._keys = keys(value)
self._cached = []
def keys(self):
for k in self._cached:
yield k
for k in self._keys:
self._cached.append(k)
yield k
def __eq__(self, other):
return all(a == b for a, b in izip(self.keys(), other.keys()))
def __lt__(self, other):
for a, b in izip(self.keys(), other.keys()):
 if a == b:
 pass
 else:
 return a < b
return False
 
return Key

@contextmanager
def bench(description):
print("starting...")
start = time.time()
yield
stop = time.time()
print(description.format(stop - start))


if __name__ == "__main__":
N = 10
def keys(value):
"""user defined lazy key"""
yield value
time.sleep(.1)
yield random.random()

data = list(range(N)) + [N, N]
wanted = list(data)
random.shuffle(data)

with bench("lazy key: {:.1f}s"):
got = sorted(data, key=make_key(keys))
assert got == wanted

with bench("eager key: {:.1f}s"):
got = sorted(data, key=lambda value: tuple(keys(value)))
assert got == wanted


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] lazily decorated sort

2012-09-28 Thread Steven D'Aprano

On 11/09/12 21:44, Chris Smith wrote:

Hi all,

I'm wondering if anyone has seen or knows of a good way to do a lazily
decorated sort. I was reading about how good the DSU (decorate, sort,
undecorate) approach is but the problem that we are running into in
SymPy is that we want to get by with a fast hash sort if possible, and
only decorate to break ties *if necessary*. It's a pity to decorate
with an expensive function if you don't need it but I don't know how
to only decorate when there are ties.


Firstly, unless you intend supporting Python 2.3 or older, there is
(probably) no need for an explicit DSU approach. Instead, use the key
argument to sorted and list.sort.

I'm not sure I completely understand your question. If you know ahead of
time that you can avoid DSU, you can do this:

if condition:
x = sorted(something)
else:
x = sorted(something, key=func)


But I imagine that's not what you mean. My guess is that you want the
sort to be sufficiently clever that instead of doing this:

decorate every item
compare decorated items, and sort as needed
undecorate every item

you want it to do this:

compare items, sorting as needed
if they are equal, compare decorated items (which will never tie???)

Am I close?

I think you could arrange that two ways:

1) A two-pass sort; first, sort undecorated, then scan looking for ties,
if you find any, sort again with a key function;

(This is safe, since sorting in Python is now guaranteed to be stable.)


2) or use a key function that does something like this:

class SmartComparator(object):
def __init__(self, item):
self.item = item
def __cmp__(self, other):
x = cmp(self.item, other.item)
if x == 0:
return cmp(self.decorate(self.item), self.decorate(other.item))
return x
def decorate(self, value):
# expensive magic goes in here...

sorted(items, key=SmartComparator)


I think the second version is more promising, since it avoids a linear search
for ties.

You will need to check the documentation to see whether sorting relies on
__cmp__ or rich comparisons (__gt__, __lt__, __eq__, etc.).

If you need any help, don't hesitate to ask.



P.S. I just discovered sympy recently, it is awesome.


--
Steven
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] lazily decorated sort

2012-09-28 Thread Mark Lawrence

On 11/09/2012 12:44, Chris Smith wrote:

Hi all,

I'm wondering if anyone has seen or knows of a good way to do a lazily
decorated sort. I was reading about how good the DSU (decorate, sort,
undecorate) approach is but the problem that we are running into in
SymPy is that we want to get by with a fast hash sort if possible, and
only decorate to break ties *if necessary*. It's a pity to decorate
with an expensive function if you don't need it but I don't know how
to only decorate when there are ties. Do you have any ideas how to do
the following better:


def f():
   """delay for 2 seconds"""
   from time import time
   from random import random
   t=time()
   while time()-t<1:
 pass
   return random

class foo(object):
   """an object that calls the delay function when comparing"""
   def __eq__(self, other):
 return f() == f()
   def __lt__(self, other):
 return f() < f()

def lazyDSU(seq, keys=[]):
 """Return sorted seq, breaking ties by lazily applying keys successively
 as needed from the list of provided keys."""
 if not keys:
 seq = sorted(seq)
 else:
 d = defaultdict(list)
 f = keys.pop(0)
 for a in seq:
 d[f(a)].append(a)
 seq = []
 for k in sorted(d.keys()):
   if len(d[k]) > 1 and keys:
   seq.extend(lazyDSU(d[k], keys=keys[1:]))
   else:
   seq.extend(d[k])
 return tuple(seq)


lazyDSU(range(5)) # fast

(0, 1, 2, 3, 4)

[i[0] for i in lazyDSU(zip(range(5), [foo()]*5))] # fast, no ties

[0, 1, 2, 3, 4]

[i[0] for i in lazyDSU([(0, foo())] + list(zip(range(5), [foo()]*5)))] # slower

[0, 0, 1, 2, 3, 4]

The last run takes 4 seconds (but not 12 seconds) because only two had
to have ties broken.

In the examples, no keys were passed but the discretionary decoration
was demonstrated.

/Chris
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor



It's my understanding that DSU is unneccessary in modern versions of 
Python so I suggest you read  http://docs.python.org/howto/sorting.html 
http://wiki.python.org/moin/PythonSpeed/PerformanceTips#Sorting these 
before you go any further, if you haven't already done so already that is.


--
Cheers.

Mark Lawrence.

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor