Re: Throw the cat among the pigeons

2015-05-07 Thread Chris Angelico
On Thu, May 7, 2015 at 7:14 PM, Alain Ketterlin
 wrote:
> Dave Angel  writes:
>
>> On 05/06/2015 11:36 AM, Alain Ketterlin wrote:
>>> Yes, plus the time for memory allocation. Since the code uses "r *=
>>> ...", space is reallocated when the result doesn't fit. The new size is
>>> probably proportional to the current (insufficient) size. This means
>>> that overall, you'll need fewer reallocations, because allocations are
>>> made in bigger chunks.
>>
>> That sounds plausible, but  a=5; a*=4  does not update in place. It
>> calculates and creates a new object.  Updating lists can work as you
>> say, but an int is immutable.
>
> Ah, okay. Even for big ints? If that is the case, my suggestion doesn't
> explain anything. Anyway, with so many allocations for so little
> arithmetic, the difference is probably due to the behavior of the
> allocator (which maybe always finds blocks big enough, since one was
> released after the previous multiplication, or something like that). The
> only way to know would be to profile the VM.

Yes, all integers are immutable. This is true regardless of the size
of the integer, because:

x = some_big_long_calculation()
y = x
y += 1

should never change the value of x.
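
A minimal sketch (mine, illustrative only; CPython 3 assumed) that makes the rebinding visible:

x = 10 ** 50             # a big int, well outside the small-int cache
before = id(x)
x *= 2                   # computes a new int object and rebinds x to it
print(id(x) != before)   # True: the original object is never modified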

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-07 Thread Alain Ketterlin
Dave Angel  writes:

> On 05/06/2015 11:36 AM, Alain Ketterlin wrote:
>> Yes, plus the time for memory allocation. Since the code uses "r *=
>> ...", space is reallocated when the result doesn't fit. The new size is
>> probably proportional to the current (insufficient) size. This means
>> that overall, you'll need fewer reallocations, because allocations are
>> made in bigger chunks.
>
> That sounds plausible, but  a=5; a*=4  does not update in place. It
> calculates and creates a new object.  Updating lists can work as you
> say, but an int is immutable.

Ah, okay. Even for big ints? If that is the case, my suggestion doesn't
explain anything. Anyway, with so many allocations for so little
arithmetic, the difference is probably due to the behavior of the
allocator (which maybe always finds blocks big enough, since one was
released after the previous multiplication, or something like that). The
only way to know would be to profile the VM.

> It's an optimization that might be applied if the code generator were
> a lot smarter, (and if the ref count is exactly 1), but it would then
> be confusing to anyone who used id().

"Abandon all hope, ye [optimizer] who enter here."

Thanks for the clarification.

-- Alain.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dan Sommers
On Wed, 06 May 2015 09:12:05 -0400, Dave Angel wrote:

> Remember the days when you knew how many cycles each assembly
> instruction took, and could simply add them up to compare algorithms?

I do!  I do!  :-)

And then the MC68020 came out, and instruction execution overlapped in
weird (but predictable) ways, and interrupts would have different
(unpredictable) latency depending on how much internal state of the CPU
had to be saved and restored.

Man, am I *old*.

Dan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Mark Lawrence

On 06/05/2015 17:17, Ian Kelly wrote:

On Wed, May 6, 2015 at 1:08 AM, Steven D'Aprano


Besides, "typical load" is a myth -- there is no such thing. A high-end
Windows web server getting ten thousand hits a minute, a virtual machine
starved for RAM, a Mac laptop, a Linux server idling away with a load of 0.1
all day... any of those machines could run your code. How can you *possibly*
say what is typical? The very idea is absurd.


Agreed.



I must disagree with this, on the grounds that a "typical load" of old 
cobblers is frequently spouted on this list.  Practicality beats purity 
any day of the week.  But no, this '=' is misused; it should have been 
':='.  Who really cares? If you don't like the language, pick another 
one; there are thousands of them to choose from.  Ah, another chance for 
me to plug the virtues of CORAL 66 and CORAL 250.  Capability violation 
anybody?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 11:36 AM, Alain Ketterlin wrote:

Yes, plus the time for memory allocation. Since the code uses "r *=
...", space is reallocated when the result doesn't fit. The new size is
probably proportional to the current (insufficient) size. This means
that overall, you'll need fewer reallocations, because allocations are
made in bigger chunks.


That sounds plausible, but  a=5; a*=4  does not update in place. It 
calculates and creates a new object.  Updating lists can work as you 
say, but an int is immutable.


It's an optimization that might be applied if the code generator were a 
lot smarter, (and if the ref count is exactly 1), but it would then be 
confusing to anyone who used id().


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Paul Rubin
Ian Kelly  writes:
> That was my initial thought as well, but the problem is that this
> actually predicts the *opposite* of what is being reported: upward
> should be less expensive, not more.

Wait, what?  Hmm, you're right.  Needed coffee, will think about it more
later.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Ian Kelly
On Wed, May 6, 2015 at 1:08 AM, Steven D'Aprano
 wrote:
> On Wednesday 06 May 2015 15:58, Ian Kelly wrote:
>
>> On Tue, May 5, 2015 at 7:27 PM, Steven D'Aprano
>>  wrote:
>>> Only the minimum is statistically useful.
>>
>> I disagree. The minimum tells you how fast the code *can* run, under
>> optimal circumstances. The mean tells you how fast it *realistically*
>> runs, under typical load. Both can be useful to measure.
>
> Er, not even close. Running code using timeit is in no way the same as
> running code "for real" under realistic circumstances. The fact that you are
> running the function or code snippet in isolation, in its own scope, via
> exec, rather than as part of some larger script or application, should be a
> hint. timeit itself has overhead, so you cannot measure the time taken by
> the operation alone, you can only measure the time taken by the operation
> within the timeit environment. We have no reason to think that the
> distribution of noise under timeit will be even vaguely similar to the noise
> when running in production.

You also can't be sure that the base time taken by the operation in
your development environment will be comparable to the time taken in
production; different system architectures may produce different
results, and what is faster on your workstation may be slower on a
server.

Also, different algorithms may react to load differently. For example,
an algorithm that goes to different parts of memory frequently may
start thrashing sooner than an algorithm with better spatial locality
if the system is paging a lot. I'll grant that just computing the
means on a workstation that is not under a controlled load is not the
best way to measure this -- but a difference in mean that is not
simply proportional to the difference in min is still potentially
useful information.

> The purpose of timeit is to compare individual algorithms, in as close as
> possible to an equal footing with as little noise as possible. If you want
> to profile code used in a realistic application, use a profiler, not timeit.
> And even that doesn't tell you how fast the code would be alone, because the
> profiler adds overhead.
>
> Besides, "typical load" is a myth -- there is no such thing. A high-end
> Windows web server getting ten thousand hits a minute, a virtual machine
> starved for RAM, a Mac laptop, a Linux server idling away with a load of 0.1
> all day... any of those machines could run your code. How can you *possibly*
> say what is typical? The very idea is absurd.

Agreed.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Ian Kelly
On Wed, May 6, 2015 at 9:12 AM, Paul Rubin  wrote:
> Steven D'Aprano  writes:
>> Multiplying upwards seems to be more expensive than multiplying
>> downwards... I can only guess that it has something to do with the way
>> multiplication is implemented, or perhaps the memory management
>> involved, or something. Who the hell knows?
>
> It seems pretty natural if multiplication uses the obvious
> quadratic-time pencil and paper algorithm.  The cost of multiplying m*n
> is basically w(m)*w(n) where w(x) is the width of x in machine words.
> So for factorial where m is the counter and n is the running product,
> w(m) is always 1 while w(n) is basically log2(n!).  From
>
> from math import log
> def xfac(seq):
>     cost = logfac = 0.0
>     for i in seq:
>         logfac += log(i,2)
>         cost += logfac
>     return cost
>
> def upward(n): return xfac(xrange(1,n+1))
> def downward(n): return xfac(xrange(n,1,-1))
>
> print upward(4),downward(4)
>
> I get: 10499542692.6 11652843833.5
>
> A lower number for upward than downward.  The difference isn't as large
> as your timings, but I think it still gives some explanation of the
> effect.

That was my initial thought as well, but the problem is that this
actually predicts the *opposite* of what is being reported: upward
should be less expensive, not more.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Alain Ketterlin
Paul Rubin  writes:

> Steven D'Aprano  writes:
>> Multiplying upwards seems to be more expensive than multiplying
>> downwards... I can only guess that it has something to do with the way
>> multiplication is implemented, or perhaps the memory management
>> involved, or something. Who the hell knows?
>
> It seems pretty natural if multiplication uses the obvious
> quadratic-time pencil and paper algorithm.  The cost of multiplying m*n
> is basically w(m)*w(n) where w(x) is the width of x in machine words.
> So for factorial where m is the counter and n is the running product,
> w(m) is always 1 while w(n) is basically log2(n!).  From
>
> from math import log
> def xfac(seq):
>     cost = logfac = 0.0
>     for i in seq:
>         logfac += log(i,2)
>         cost += logfac
>     return cost
>
> def upward(n): return xfac(xrange(1,n+1))
> def downward(n): return xfac(xrange(n,1,-1))
>
> print upward(4),downward(4)
>
> I get: 10499542692.6 11652843833.5
>
> A lower number for upward than downward.  The difference isn't as large
> as your timings, but I think it still gives some explanation of the
> effect.

Yes, plus the time for memory allocation. Since the code uses "r *=
...", space is reallocated when the result doesn't fit. The new size is
probably proportional to the current (insufficient) size. This means
that overall, you'll need fewer reallocations, because allocations are
made in bigger chunks.

-- Alain.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Paul Rubin
Steven D'Aprano  writes:
> Multiplying upwards seems to be more expensive than multiplying
> downwards... I can only guess that it has something to do with the way
> multiplication is implemented, or perhaps the memory management
> involved, or something. Who the hell knows?

It seems pretty natural if multiplication uses the obvious
quadratic-time pencil and paper algorithm.  The cost of multiplying m*n
is basically w(m)*w(n) where w(x) is the width of x in machine words.
So for factorial where m is the counter and n is the running product,
w(m) is always 1 while w(n) is basically log2(n!).  From

from math import log
def xfac(seq):
    cost = logfac = 0.0
    for i in seq:
        logfac += log(i,2)
        cost += logfac
    return cost

def upward(n): return xfac(xrange(1,n+1))
def downward(n): return xfac(xrange(n,1,-1))

print upward(4),downward(4)

I get: 10499542692.6 11652843833.5

A lower number for upward than downward.  The difference isn't as large
as your timings, but I think it still gives some explanation of the
effect.
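
A rough way to check the prediction empirically on Python 3 (my own sketch, not from the post; the value of n is arbitrary) is to time both multiplication orders directly:

from timeit import timeit

def fact_up(n):
    r = 1
    for i in range(2, n + 1):
        r *= i
    return r

def fact_down(n):
    r = 1
    for i in range(n, 1, -1):
        r *= i
    return r

n = 20000   # arbitrary size, large enough that bignum cost dominates
print("upward:  ", timeit(lambda: fact_up(n), number=5))
print("downward:", timeit(lambda: fact_down(n), number=5))
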
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 09:55 AM, Chris Angelico wrote:

On Wed, May 6, 2015 at 11:12 PM, Dave Angel  wrote:

I had guessed that the order of multiplication would make a big difference,
once the product started getting bigger than the machine word size.

Reason I thought that is that if you multiply starting at the top value (and
end with multiplying by 2) you're spending more of the time multiplying
big-ints.

That's why I made sure that both Cecil's and my implementations were
counting up, so that wouldn't be a distinction.

I'm still puzzled, as it seems your results imply that big-int*int is faster
than int*int where the product is also int.


Are you using Python 2 or Python 3 for your testing? In Py3, there's
no type difference, and barely any performance difference as you cross
the word-size boundary. (Bigger numbers are a bit slower to work with,
but not hugely.)



Both Cecil and I are using 3.x; I'm using 3.4 in particular.  And I know 
int covers both big-int and int32.  That's why I called it big-int, 
rather than long.


I was, however, mistaken.  It's not that threshold that we're crossing 
here, but another one, for MUCH larger numbers.  Factorial of 10 and 
of 20 have 456473 and 97350 digits, respectively.  In binary, that 
would be about 190k bytes and 404k bytes, respectively.


I was seeing factorial of 20 taking about 4.5 times as long as 
factorial of 10.  All the other increments seemed fairly proportional.


I'll bet the difference is something like the memory allocator using a 
different algorithm for blocks above 256k.  Or the cache logic hitting a 
threshold.


If it's caching, of course the threshold will differ wildly between 
machine architectures.


If it's the memory allocator, that could easily vary between Python 
versions as well.





--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Chris Angelico
On Wed, May 6, 2015 at 11:12 PM, Dave Angel  wrote:
> I had guessed that the order of multiplication would make a big difference,
> once the product started getting bigger than the machine word size.
>
> Reason I thought that is that if you multiply starting at the top value (and
> end with multiplying by 2) you're spending more of the time multiplying
> big-ints.
>
> That's why I made sure that both Cecil's and my implementations were
> counting up, so that wouldn't be a distinction.
>
> I'm still puzzled, as it seems your results imply that big-int*int is faster
> than int*int where the product is also int.

Are you using Python 2 or Python 3 for your testing? In Py3, there's
no type difference, and barely any performance difference as you cross
the word-size boundary. (Bigger numbers are a bit slower to work with,
but not hugely.)
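
A quick check of that claim (a sketch, assuming Python 3):

import sys
n = sys.maxsize              # the largest word-sized value on this build
print(type(n), type(n + 1))  # both <class 'int'>: no separate long type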

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 02:26 AM, Steven D'Aprano wrote:

On Wednesday 06 May 2015 14:05, Steven D'Aprano wrote:



My interpretation of this is that the difference has something to do with
the cost of multiplications. Multiplying upwards seems to be more expensive
than multiplying downwards, a result I never would have predicted, but
that's what I'm seeing. I can only guess that it has something to do with
the way multiplication is implemented, or perhaps the memory management
involved, or something. Who the hell knows?



I had guessed that the order of multiplication would make a big 
difference, once the product started getting bigger than the machine 
word size.


Reason I thought that is that if you multiply starting at the top value 
(and end with multiplying by 2) you're spending more of the time 
multiplying big-ints.


That's why I made sure that both Cecil's and my implementations were 
counting up, so that wouldn't be a distinction.


I'm still puzzled, as it seems your results imply that big-int*int is 
faster than int*int where the product is also int.


That could use some more testing, though.

I still say a cutoff of about 10% is where we should draw the line in an 
interpreted system.  Below that, you're frequently measuring noise and 
coincidence.


Remember the days when you knew how many cycles each assembly 
instruction took, and could simply add them up to compare algorithms?



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Steven D'Aprano
On Wednesday 06 May 2015 15:58, Ian Kelly wrote:

> On Tue, May 5, 2015 at 7:27 PM, Steven D'Aprano
>  wrote:
>> Only the minimum is statistically useful.
> 
> I disagree. The minimum tells you how fast the code *can* run, under
> optimal circumstances. The mean tells you how fast it *realistically*
> runs, under typical load. Both can be useful to measure.

Er, not even close. Running code using timeit is in no way the same as 
running code "for real" under realistic circumstances. The fact that you are 
running the function or code snippet in isolation, in its own scope, via 
exec, rather than as part of some larger script or application, should be a 
hint. timeit itself has overhead, so you cannot measure the time taken by 
the operation alone, you can only measure the time taken by the operation 
within the timeit environment. We have no reason to think that the 
distribution of noise under timeit will be even vaguely similar to the noise 
when running in production.

The purpose of timeit is to compare individual algorithms, in as close as 
possible to an equal footing with as little noise as possible. If you want 
to profile code used in a realistic application, use a profiler, not timeit. 
And even that doesn't tell you how fast the code would be alone, because the 
profiler adds overhead.

Besides, "typical load" is a myth -- there is no such thing. A high-end 
Windows web server getting ten thousand hits a minute, a virtual machine 
starved for RAM, a Mac laptop, a Linux server idling away with a load of 0.1 
all day... any of those machines could run your code. How can you *possibly* 
say what is typical? The very idea is absurd.



-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Steven D'Aprano
On Wednesday 06 May 2015 14:05, Steven D'Aprano wrote:

[...]

Here are those anomalous timing results again:

> code = "fact(5)"
> t1 = Timer(code, setup="from __main__ import factorial_while as fact")
> t2 = Timer(code, setup="from __main__ import factorial_reduce as fact")
> t3 = Timer(code, setup="from __main__ import factorial_forloop1 as fact")
> t4 = Timer(code, setup="from __main__ import factorial_forloop2 as fact")
> for t in (t1, t2, t3, t4):
>     print(min(t.repeat(repeat=3, number=2)))
> 
> 
> which takes about two minutes on my computer, and prints:
> 
> 8.604736926034093  # while loop
> 10.786483339965343  # reduce
> 10.91099695302546  # for loop
> 10.821452282369137  # silly version of the for loop
[...]
> What is surprising is that for very large input like this, the while loop
> is significantly faster than reduce or either of the for-loops. I cannot
> explain that result.

I re-ran the results on a different computer, and got similar results:

7.364120149984956
9.26512472704053
9.141491871327162
9.16900822892785

The while loop is surprisingly faster for calculating fact(5). A thought 
came to mind: the while loop is different from the other three versions, in 
that it performs the multiplications from n down to 2, rather than 2 up to 
n. Maybe that has something to do with it?

Introducing version five of the factorial:

def factorial_reduce2(n):
    assert n >= 0
    return reduce(mul, range(n, 1, -1), 1)

t5 = Timer(code, setup="from __main__ import factorial_reduce2 as fact")
for t in (t1, t2, t3, t4, t5):
    print(min(t.repeat(repeat=3, number=2)))



And results:

7.36792928725481  # while loop
9.271950023248792  # reduce, counting up
9.14769447594881  # for loop
9.154150342568755  # silly for loop
7.430811045691371  # reduce, counting down


My interpretation of this is that the difference has something to do with 
the cost of multiplications. Multiplying upwards seems to be more expensive 
than multiplying downwards, a result I never would have predicted, but 
that's what I'm seeing. I can only guess that it has something to do with 
the way multiplication is implemented, or perhaps the memory management 
involved, or something. Who the hell knows?


Just to be sure:

def factorial_forloop3(n):
    assert n >= 0
    result = 1
    for i in range(n, 1, -1):
        result *= i
    return result

t6 = Timer(code, setup="from __main__ import factorial_forloop3 as fact")
print(min(t6.repeat(repeat=3, number=2)))


which prints 7.36256698705256.


Lesson to be learned: what you think is important may not be, and what you 
think is irrelevant may be.



Historical fact: for a period of about 60 years, long after the trick to 
curing and preventing scurvy was known, the British navy suffered greatly 
from scurvy again (although not to the same degree as they had been in the 
17th and 18th centuries). The problem was due to confusion between *lemons* 
and *limes*. The navy started by using the term "lime" for both lemons and 
sweet limes from the Mediterranean, as well as West Indian limes: three 
different citrus fruit. To cut costs and encourage British commerce, the 
navy gradually stopped buying lemons from Spain and swapped over almost 
entirely to West Indian limes. Unfortunately limes, and especially West 
Indian limes, have significantly less Vitamin C than lemons, and the 
sailors' daily allowance was about 75% less effective on ships that used 
limes than for those that used Spanish lemons. Scurvy, which had practically 
been eradicated in the 1850s and 60s, returned, and was a significant factor 
in World War One. (Especially for the French and Russian armies.)

This simple confusion wasn't sorted out until the 1920s, long after Vitamin 
C was isolated and named. Apparently nobody noticed, or cared, that the two 
different fruit tasted different and looked different, or concluded that 
perhaps they had different medicinal properties.

But then, for many decades, vinegar and dilute sulphuric acid were 
recommended as cures for scurvy *solely* on the basis that they were acidic 
just like the proven cures: oranges, lemons and sauerkraut.



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Ian Kelly
On Tue, May 5, 2015 at 7:27 PM, Steven D'Aprano
 wrote:
> Only the minimum is statistically useful.

I disagree. The minimum tells you how fast the code *can* run, under
optimal circumstances. The mean tells you how fast it *realistically*
runs, under typical load. Both can be useful to measure.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Steven D'Aprano
On Sun, 3 May 2015 12:20 am, Cecil Westerhof wrote:

[...]
> But to my surprise tail recursion could even be more efficient. I
> wrote two different versions of factorial with self implemented tail
> recursion. For bigger values both are more efficient. And I expect
> that if the tail recursion is done by the compiler instead of by hand,
> it will be a little faster still.


I decided to do some experimentation.

Firstly, when timing code, one should minimise the amount of "scaffolding"
used around the code you care about. Using a single function like this:

def factorial_iterative(x, flag):
    if flag:
        # first implementation
    else:
        # second implementation


is a bad idea, because you are not just timing the implementations, but also
the scaffolding that selects between them. Also, you increase the size of
the function, which increases the chance of cache misses and other
processor effects. So first step is to split the implementations into
separate functions. Here are my four implementations, the first two are
taken from your code:


def factorial_forloop1(n):
    assert n >= 0
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def factorial_forloop2(n):
    # Version with a silly extra variable.
    assert n >= 0
    result = 1
    j=2
    for i in range(2, n + 1):
        result *= j
        j += 1
    return result

from operator import mul
try:
    reduce
except NameError:
    from functools import reduce

def factorial_reduce(n):
    assert n >= 0
    return reduce(mul, range(2, n+1), 1)

def factorial_while(n):
    assert n >= 0
    result = 1
    while n > 1:
        result *= n
        n -= 1
    return result



A quick test to verify that they return the same results:

py> factorial_while(10)
3628800
py> factorial_reduce(10)
3628800
py> factorial_forloop1(10)
3628800
py> factorial_forloop2(10)
3628800



There's no point in optimising code that does the wrong thing!

Now, let's do some timing tests. It is best to use a well-tested timing
framework rather than invent your own dodgy one, so I use the timeit
module. Read the comments in the module to see why rolling your own is a
bad idea.

I'll start with some relative small input sizes. Remember, the inputs may be
small, but the outputs will be very large.


from timeit import Timer
code = "fact(10); fact(20); fact(30)"
t1 = Timer(code, setup="from __main__ import factorial_while as fact")
t2 = Timer(code, setup="from __main__ import factorial_reduce as fact")
t3 = Timer(code, setup="from __main__ import factorial_forloop1 as fact")
t4 = Timer(code, setup="from __main__ import factorial_forloop2 as fact")


I'm in a bit of a hurry, so I may be a bit more slap-dash than normal.
Normally, I would pick a larger number of trials, and a larger number of
iterations per trial, but here I'm going to use just the best of three trials,
each of 1 iterations:


for t in (t1, t2, t3, t4):
    print(min(t.repeat(repeat=3, number=1)))


which prints:

0.22797810286283493  # while
0.17145151272416115  # reduce
0.16230526939034462  # for-loop
0.22819281555712223  # silly for-loop


(Comments added by me.) See my earlier post for why the minimum is the only
statistic worth looking at. These results show that:

- the version using while is significantly slower than the more
straightforward iterative versions using either reduce or a for loop.

- adding an extra, unnecessary, variable to the for-loop, and incrementing
it by hand, carries as much overhead as using a while loop;

- reduce is slightly slower than a pure Python for-loop (at least according
to this simple trial, on my computer -- your results may vary);

- the obvious winner is the straightforward iterative version with a
for-loop.



Now I'm going to test it with a larger input:

py> big = factorial_forloop1(5)
py> sys.getsizeof(big)  # number of bytes
94460
py> len(str(big))  # number of digits
213237


Recreate the timer objects, and (again, because I'm in something of a hurry)
do a best-of-three with just 2 iterations per trial.

code = "fact(5)"
t1 = Timer(code, setup="from __main__ import factorial_while as fact")
t2 = Timer(code, setup="from __main__ import factorial_reduce as fact")
t3 = Timer(code, setup="from __main__ import factorial_forloop1 as fact")
t4 = Timer(code, setup="from __main__ import factorial_forloop2 as fact")
for t in (t1, t2, t3, t4):
    print(min(t.repeat(repeat=3, number=2)))


which takes about two minutes on my computer, and prints:

8.604736926034093  # while loop
10.786483339965343  # reduce
10.91099695302546  # for loop
10.821452282369137  # silly version of the for loop

(Again, annotations are by me.)

These results are fascinating, and rather surprising. I think that they
demonstrate that, at this size of input argument, the time is dominated by
processing the BigNum int objects, not the iteration: whether we use
reduce, a straight-forward for-loop, or a for-loop with an extra variable
makes very little difference.

Re: Throw the cat among the pigeons

2015-05-05 Thread Chris Angelico
On Wed, May 6, 2015 at 12:57 PM, Steven D'Aprano
 wrote:
> On Wed, 6 May 2015 05:42 am, Cecil Westerhof wrote:
>
>> I would say that a variable that is filled by a range is different from
>> a normal variable. Do not ask me why. ;-)
>
>
> I would say that you are wrong. If I have understood you correctly, that
> cannot possibly be the case in Python, all Python variables work the same
> way[1], and none of them can remember where they came from.

My reading of the code was that one had two variables and the other
had just one. Could be the noise in the numbers came from cache
locality differences.

No difference between the variables, just a peculiarity of CPUs and
how they use memory.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Steven D'Aprano
On Wed, 6 May 2015 05:42 am, Cecil Westerhof wrote:

> I would say that a variable that is filled by a range is different from
> a normal variable. Do not ask me why. ;-)


I would say that you are wrong. If I have understood you correctly, that
cannot possibly be the case in Python, all Python variables work the same
way[1], and none of them can remember where they came from.






[1] Well, technically local variables and global variables are implemented
differently, locals inside a function use a C array and globals use a dict,
but apart from a few implementation details like that, any two variables in
the same scope[2] operate identically.

[2] Anyone who raises the issue of exec() or import * inside a function body
in Python 2 will be slapped with a halibut.
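
Footnote [1] is visible directly in the bytecode; a small illustrative sketch (mine):

import dis

g = 1
def f():
    x = 1
    return x + g   # x uses LOAD_FAST (array slot), g uses LOAD_GLOBAL (dict lookup)

dis.dis(f)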

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Steven D'Aprano
On Wed, 6 May 2015 02:18 am, Cecil Westerhof wrote:

> Well, I did not write many tail recursive functions. But what surprised
> me was that for large values the ‘tail recursive’ version was more
> efficient than the iterative version. And that was with myself
> implementing the tail recursion. I expect the code to be more
> efficient when the compiler implements the tail recursion.


You cannot know that. Python is not C or Java, and your intuition as to what
will be fast and what will be slow will not be accurate in Python. Python
has its own fast paths, and slow paths, and they are not the same as C's or
Java's.

I have been using Python for over 15 years, and I would not want to guess
whether a hypothetical compiler-generated tail-recursion-elimination
function would be faster or slower than manually unrolling the recursion
into a while loop.

I am surprised that you find a manually unrolled version with a while loop
is faster than an iterative version. I assume the iterative version uses a
for-loop. In my experience, a for-loop is faster than a while loop.

But perhaps that depends on the exact version and platform.
 


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Steven D'Aprano
On Wed, 6 May 2015 07:23 am, Ian Kelly wrote:

> On Tue, May 5, 2015 at 3:00 PM, Dave Angel  wrote:
>> def loop(func, funcname, arg):
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg, True)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg,
>>                                          time.time()-start))
>>
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg,
>>                                          time.time()-start))
> 
> Note that you're explicitly passing True in one case but leaving the
> default in the other. I don't know whether that might be responsible
> for the difference you're seeing.

It will be responsible for *some* difference; it is certainly faster to pass
one argument than two, but likely by an entirely trivial amount compared to
the time to iterate some large number of times.


> Also, it's best to use the timeit module for timing code, e.g.:
> 
 from timeit import Timer
 t1 = Timer("factorial_iterative(10, False)", "from __main__ import
 factorial_iterative") t1.repeat(10, number=1)
> [3.8517245299881324, 3.7571076710009947, 3.7780062559759244,
> 3.848508063936606, 3.7627131739864126, 3.8278848479967564,
> 3.776115525048226, 3.83024005102925, 3.8322679550619796,
> 3.8195601429324597]
 min(_), sum(_) / len(_)
> (3.7571076710009947, 3.8084128216956743)

Only the minimum is statistically useful.

The reason is that every timing measurement has an error, due to random
fluctuations of the OS, CPU pipelining, caches, etc, but that error is
always positive. The noise always makes the code take longer to run, not
faster!

So we can imagine every measurement is made up of the "true" value T, plus
an error, where T = the perfectly repeatable timing that the function would
take to run if we could somehow eliminate every possible source of noise
and error in the system. We can't, of course, but we would like to estimate
T as closely as possible.

Without loss of generality, suppose we collect five timings:

times = [T+ε1, T+ε2, T+ε3, T+ε4, T+ε5]

where the epsilons ε are unknown errors due to noise. We don't know the
distribution of the errors, except that no epsilon can be less than zero.

We want an estimate for T, something which comes as close as possible to T.
It should be obvious that the following relationships hold:

mean(times) = T + mean([ε1, ε2, ε3, ε4, ε5])

min(times) = T + min([ε1, ε2, ε3, ε4, ε5])


in other words, both the mean of the timings and the minimum of the timings
are equivalent to the "true" timing, plus an error. It should also be
obvious that the mean of the epsilons must be larger than the smallest of
the errors (except in the miraculous case where all the epsilons are
exactly the same).

So the statistic with the minimum error is the minimum. Any other stat,
whether you use the mean, geometric mean, harmonic mean, mode, median, or
anything else, will have a larger error and be a worse estimate for
the "true" value T.

All because the errors are one-sided: they always make the timing take
longer, never less time.
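
In practice that means reporting min() of timeit.repeat() and treating the mean as a curiosity at best; a minimal sketch (the snippet being timed is arbitrary):

from timeit import repeat

times = repeat("sum(range(1000))", repeat=7, number=10000)
print("min :", min(times))               # closest estimate of T
print("mean:", sum(times) / len(times))  # T plus the average noise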



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Terry Reedy

On 5/5/2015 5:12 PM, Cecil Westerhof wrote:

Op Tuesday 5 May 2015 22:46 CEST schreef Terry Reedy:


Well, I did not write many tail recursive functions. But what
surprised me was that for large values the ‘tail recursive’ version
was more efficient than the iterative version.


In your first thread, what you mislabelled 'tail recursive version'
was an iterative while loop version


That is because Python has no tail recursion,


Yes it does. Please stop using 'tail recursion' to refer to the absence 
of recursion or the process of removing recursion. I wrote a tail 
recursive function, and I believe you did too.


What Python does not have is automatic tail call optimization, or 
specifically, tail recursion *elimination*. Both possibilities, the 
general and specific, have been considered and rejected for excellent 
reasons. I hinted at some of them in my post.


See https://en.wikipedia.org/wiki/Tail_call
for 'tail recursion' and 'tail call elimination'

What I believe you showed is that a while loop can be faster than a more 
or less equivalent for loop that produces the same result by a slightly 
different method.  That is not surprising.  Relative timings for CPython 
vary a few percent between different combinations of Python version, C 
compiler and settings, operating system, and cpu and other hardware.


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 05:39 PM, Ian Kelly wrote:

On Tue, May 5, 2015 at 3:23 PM, Ian Kelly  wrote:

On Tue, May 5, 2015 at 3:00 PM, Dave Angel  wrote:

def loop(func, funcname, arg):
    start = time.time()
    for i in range(repeats):
        func(arg, True)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

    start = time.time()
    for i in range(repeats):
        func(arg)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))


Note that you're explicitly passing True in one case but leaving the
default in the other. I don't know whether that might be responsible
for the difference you're seeing.


I don't think that's the cause, but I do think that it has something
to do with the way the timing is being run. When I run your loop
function, I do observe the difference. If I reverse the order so that
the False case is tested first, I observe the opposite. That is, the
slower case is consistently the one that is timed *first* in the loop
function, regardless of which case that is.



I created two functions and called them with timeit(), and the 
difference is now below 3%.


And when I take your lead and double the loop() function so it runs each 
test twice, I get steadily decreasing numbers.


I think all of this has been noise caused by the caching of objects 
including function objects.  I was surprised by this, as the loops are 
small enough I'd figure the function object would be fully cached the 
first time it was called.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Ian Kelly
On Tue, May 5, 2015 at 3:23 PM, Ian Kelly  wrote:
> On Tue, May 5, 2015 at 3:00 PM, Dave Angel  wrote:
>> def loop(func, funcname, arg):
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg, True)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))
>>
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))
>
> Note that you're explicitly passing True in one case but leaving the
> default in the other. I don't know whether that might be responsible
> for the difference you're seeing.

I don't think that's the cause, but I do think that it has something
to do with the way the timing is being run. When I run your loop
function, I do observe the difference. If I reverse the order so that
the False case is tested first, I observe the opposite. That is, the
slower case is consistently the one that is timed *first* in the loop
function, regardless of which case that is.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Cecil Westerhof
Op Tuesday 5 May 2015 22:46 CEST schreef Terry Reedy:

>> Well, I did not write many tail recursive functions. But what
>> surprised me was that for large values the ‘tail recursive’ version
>> was more efficient than the iterative version.
>
> In your first thread, what you mislabelled 'tail recursive version'
> was an iterative while loop version

That is because Python has no tail recursion, so I needed to program
the tail recursion myself. Tail recursion would do under the hood what
I did there manually.


> while the 'iterative version'
> was an iterative for loop version. In this thread, you just posted
> timings without code. I will not believe your claim until I see one
> file that I can run myself with an actual tail recursive function,
> as above, that beats the equivalent while or for loop version.

https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Ian Kelly
On Tue, May 5, 2015 at 3:00 PM, Dave Angel  wrote:
> def loop(func, funcname, arg):
>     start = time.time()
>     for i in range(repeats):
>         func(arg, True)
>     print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))
>
>     start = time.time()
>     for i in range(repeats):
>         func(arg)
>     print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

Note that you're explicitly passing True in one case but leaving the
default in the other. I don't know whether that might be responsible
for the difference you're seeing.

Also, it's best to use the timeit module for timing code, e.g.:

>>> from timeit import Timer
>>> t1 = Timer("factorial_iterative(10, False)", "from __main__ import factorial_iterative")
>>> t1.repeat(10, number=1)
[3.8517245299881324, 3.7571076710009947, 3.7780062559759244,
3.848508063936606, 3.7627131739864126, 3.8278848479967564,
3.776115525048226, 3.83024005102925, 3.8322679550619796,
3.8195601429324597]
>>> min(_), sum(_) / len(_)
(3.7571076710009947, 3.8084128216956743)
>>> t2 = Timer("factorial_iterative(10, True)", "from __main__ import factorial_iterative")
>>> t2.repeat(10, number=1)
[3.8363616950809956, 3.753201302024536, 3.7838632150087506,
3.7670978900277987, 3.805312803015113, 3.7682680500438437,
3.856655619922094, 3.796431727008894, 3.8224815409630537,
3.765664782957174]
>>> min(_), sum(_) / len(_)
(3.753201302024536, 3.7955338626052253)

As you can see, in my testing the True case was actually marginally
(probably not significantly) faster in both the min and the average.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 04:30 PM, Ian Kelly wrote:

On Tue, May 5, 2015 at 12:45 PM, Dave Angel  wrote:

When the "simple" is True, the function takes noticeably and consistently
longer.  For example, it might take 116 instead of 109 seconds.  For the
same counts, your code took 111.


I can't replicate this. What version of Python is it, and what value
of x are you testing with?


I've looked at dis.dis(factorial_iterative), and can see no explicit reason
for the difference.


My first thought is that maybe it's a result of the branch. Have you
tried swapping the branches, or reimplementing as separate functions
and comparing?



Logic is quite simple:


def factorial_iterative(x, simple=False):
    assert x >= 0
    result = 1
    j=2
    if not simple:
        for i in range(2, x + 1):
            #print("range value is of type", type(i), "and value", i)
            #print("ordinary value is of type", type(j), "and value", j)
            result *= i
            j += 1
    else:
        for i in range(2, x + 1):
            result *= j
            j += 1

    return result

def loop(func, funcname, arg):
    start = time.time()
    for i in range(repeats):
        func(arg, True)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

    start = time.time()
    for i in range(repeats):
        func(arg)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

repeats = 1

and arg is 10**4
loop(factorial_iterative,  "factorial_iterative  ", arg)

My actual program does the same thing with other versions of the 
function, including Cecil's factorial_tail_recursion, and my optimized 
version of that.



Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux

factorial_iterative  (10) took   3.807
factorial_iterative  (10) took   3.664

factorial_iterative  (20) took   17.07
factorial_iterative  (20) took    15.3

factorial_iterative  (30) took   38.93
factorial_iterative  (30) took   36.01


Note that I test them in the opposite order of where they appear in the 
function.  That's because I was originally using the simple flag to test 
an empty loop.  The empty loop is much quicker either way, so it's not 
the issue.  (But if it were, the for-range version is much quicker).


I think I'll take your last suggestion and write separate functions.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Terry Reedy

On 5/5/2015 12:18 PM, Cecil Westerhof wrote:

Op Tuesday 5 May 2015 17:47 CEST schreef Paul Moore:


On Sunday, 3 May 2015 16:23:59 UTC+1, Cecil Westerhof  wrote:

By the way: I think that even if the recursion does not go further
than 500, it is still a good idea to use tail recursion. Why use
stack space when it is not necessary?


I pushed the example to GitHub:
https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py


You already know this, as your code shows, but tail call recursion
elimination is only possible when you have a *direct* tail call (one


An 'indirect tail call' would not be a tail call.


with the result of the tail call returned immediately to the
caller). Even the standard trivial factorial example doesn't have a
direct tail call,  without rewriting to use an accumulator variable.
Which is a non-intuitive transformation to anyone who's not familiar
with recursive functional languages and their idioms.

If you're rewriting your recursive function *anyway*, it's not that
much harder in many (most?) cases to rewrite it iteratively.


For count functions, the main change between tail recursion and while 
iteration is replacing 'if' with 'while' and converting the tail call to 
assignment.  (One may have to reverse the if-else first to put the tail 
call in the if branch.)


from math import factorial as fac
print(fac(0), fac(1), fac(2), fac(6))

def fac_rt(n, i=2, res=1):
    if i <= n:
        return fac_rt(n, i+1, res*i)
    else:
        return res

def fac_iw(n):
    i = 2
    res = 1
    while i <= n:
        i, res = i+1, res*i
    return res

for i in (0, 1, 2, 6):
    print(fac(i) == fac_rt(i) == fac_iw(i))

>>>
1 1 2 720
True
True
True
True

For collection functions that process each item once, 'for item in 
collection: ...' is nearly always easier to write in the first place.



An example of a function that naturally uses direct tail call
recursion, but which doesn't have a simple iterative rewrite, would
be more compelling.


Simple, easily converted functions like the above, with one recursive 
call in one branch of an if-else, are the most common.  Something with 
multiple mutually exclusive tail calls, like the following


def f_rt1(*args):
  if nonbase1:
    return f_rt1(*revised_args1)
  elif base_condition:
    return base(args)
  else:
    return f_rt1(*revised_args2)

must be converted to if-else with all tail calls in the if branch.

def f_rt2(*args):
  if not base_condition:
    if nonbase1:
      return f_rt2(*revised_args1)
    else:
      return f_rt2(*revised_args2)
  else:
    return base(args)

Conversion would then be simple.  The complication is that the 
'base_condition' in the original version might not really be the base 
condition due to a dependence on nonbase1 being false.  This is a 
generic problem with reordering if-elif statements.
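
A concrete, runnable analogue (my sketch, not Terry's example) with two mutually exclusive tail calls, and its while-loop conversion:

def collatz_rt(n, steps=0):
    if n == 1:
        return steps
    elif n % 2 == 0:
        return collatz_rt(n // 2, steps + 1)
    else:
        return collatz_rt(3 * n + 1, steps + 1)

def collatz_iw(n):
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps

for n in (6, 7, 27):
    print(collatz_rt(n) == collatz_iw(n))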


For non-linear (branching) recursion, in which multiple recursive calls 
may be made for one function call, the last recursive call may be a tail 
call.  An example is in-place quick sort.  Eliminating just the tail 
call may not be simple, but it also does not eliminate the possibility 
of blowing the call stack.  To do that, one must eliminate all recursive 
calls by adding explicit stacks.



Well, I did not write many tail recursive functions. But what surprised
me was that for large values the ‘tail recursive’ version was more
efficient than the iterative version.


In your first thread, what you mislabelled 'tail recursive version' was 
an iterative while loop version while the 'iterative version' was an 
iterative for loop version.  In this thread, you just posted timings 
without code. I will not believe your claim until I see one file that I 
can run myself with an actual tail recursive function, as above, that 
beats the equivalent while or for loop version.


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Ian Kelly
On Tue, May 5, 2015 at 12:45 PM, Dave Angel  wrote:
> When the "simple" is True, the function takes noticeably and consistently
> longer.  For example, it might take 116 instead of 109 seconds.  For the
> same counts, your code took 111.

I can't replicate this. What version of Python is it, and what value
of x are you testing with?

> I've looked at dis.dis(factorial_iterative), and can see no explicit reason
> for the difference.

My first thought is that maybe it's a result of the branch. Have you
tried swapping the branches, or reimplementing as separate functions
and comparing?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Cecil Westerhof
Op Tuesday 5 May 2015 20:45 CEST schreef Dave Angel:

> On 05/05/2015 12:18 PM, Cecil Westerhof wrote:
>
>>
>> Well, I did not write many tail recursive functions. But what
>> surprised me was that for large values the ‘tail recursive’ version
>> was more efficient than the iterative version. And that was with
>> myself implementing the tail recursion. I expect the code to be
>> more efficient when the compiler implements the tail recursion.
>>
>
>
> You've said that repeatedly, so I finally took a look at your
> webpage
>
> https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py
>
> I didn't have your framework to call the code, so I just extracted
> some functions and did some testing.

I definitely need to take care of documentation.

It can be called with:
python3 mathDecebal.py --factorial

The problem is that it will do both a correctness test and a speed test.
I have to split those into two different things, and use a different
file for both.

Maybe make a directory test and put a correctness_.py and a
speed_.py.



> I do see some differences,
> where the so-called tail_recursive functions are sometimes faster,
> but I did some investigating to try to determine why.
>
>
> I came up with the conclusion that sometimes the multiply operation
> takes longer than other times.  And in particular, i can see more
> variation between the two following loops than between your two
> functions.
>
>
> def factorial_iterative(x, simple=False):
>     assert x >= 0
>     result = 1
>     j=2
>     if not simple:
>         for i in range(2, x + 1):
>             result *= i
>             j += 1
>     else:
>         for i in range(2, x + 1):
>             result *= j
>             j += 1
>             pass
>
>     return result
>
> When the "simple" is True, the function takes noticeably and
> consistently longer.  For example, it might take 116 instead of 109
> seconds.  For the same counts, your code took 111.
>
> I've looked at dis.dis(factorial_iterative), and can see no explicit
> reason for the difference.

I would say that a variable that is filled by a range is different from
a normal variable. Do not ask me why. ;-)

Even if you (general not personal you) think that the tail recursion
is a waste of time, this is an interesting result I think.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 12:18 PM, Cecil Westerhof wrote:



Well, I did not write many tail recursive functions. But what surprised
me was that for large values the ‘tail recursive’ version was more
efficient than the iterative version. And that was with myself
implementing the tail recursion. I expect the code to be more
efficient when the compiler implements the tail recursion.




You've said that repeatedly, so I finally took a look at your webpage

https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py

I didn't have your framework to call the code, so I just extracted some 
functions and did some testing.  I do see some differences, where the 
so-called tail_recursive functions are sometimes faster, but I did some 
investigating to try to determine why.



I came up with the conclusion that sometimes the multiply operation 
takes longer than other times.  And in particular, i can see more 
variation between the two following loops than between your two functions.



def factorial_iterative(x, simple=False):
    assert x >= 0
    result = 1
    j=2
    if not simple:
        for i in range(2, x + 1):
            result *= i
            j += 1
    else:
        for i in range(2, x + 1):
            result *= j
            j += 1
            pass

    return result

When the "simple" is True, the function takes noticeably and 
consistently longer.  For example, it might take 116 instead of 109 
seconds.  For the same counts, your code took 111.


I've looked at dis.dis(factorial_iterative), and can see no explicit 
reason for the difference.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Cecil Westerhof
Op Tuesday 5 May 2015 17:47 CEST schreef Paul Moore:

> On Sunday, 3 May 2015 16:23:59 UTC+1, Cecil Westerhof  wrote:
>>> By the way: I think that even if the recursion does not go further
>>> than 500, it is still a good idea to use tail recursion. Why use
>>> stack space when it is not necessary?
>>
>> I pushed the example to GitHub:
>> https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py
>
> You already know this, as your code shows, but tail call recursion
> elimination is only possible when you have a *direct* tail call (one
> with the result of the tail call returned immediately to the
> caller). Even the standard trivial factorial example doesn't have a
> direct tail call, without rewriting to use an accumulator variable.
> Which is a non-intuitive transformation to anyone who's not familiar
> with recursive functional languages and their idioms.
>
> If you're rewriting your recursive function *anyway*, it's not that
> much harder in many (most?) cases to rewrite it iteratively.
>
> An example of a function that naturally uses direct tail call
> recursion, but which doesn't have a simple iterative rewrite, would
> be more compelling. Not particularly compelling (to me, anyway) even
> so, but still better than factorial or fibonnaci examples.

Well, I did not write many tail recursive functions. But what surprised
me was that for large values the ‘tail recursive’ version was more
efficient than the iterative version. And that was with myself
implementing the tail recursion. I expect the code to be more
efficient when the compiler implements the tail recursion.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Paul Moore
On Sunday, 3 May 2015 16:23:59 UTC+1, Cecil Westerhof  wrote:
> > By the way: I think that even if the recursion does not go further
> > than 500, it is still a good idea to use tail recursion. Why use stack
> > space when it is not necessary?
> 
> I pushed the example to GitHub:
> https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py

You already know this, as your code shows, but tail call recursion elimination 
is only possible when you have a *direct* tail call (one with the result of the 
tail call returned immediately to the caller). Even the standard trivial 
factorial example doesn't have a direct tail call, without rewriting to use an 
accumulator variable. Which is a non-intuitive transformation to anyone who's 
not familiar with recursive functional languages and their idioms.
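
For the record, a hedged sketch of that transformation (mine, not Paul's code): the plain recursive factorial multiplies *after* the recursive call returns, so it has no direct tail call; threading an accumulator through makes the call a direct tail call.

def fact_naive(n):
    if n <= 1:
        return 1
    return n * fact_naive(n - 1)      # not a tail call: the multiply happens afterwards

def fact_acc(n, acc=1):
    if n <= 1:
        return acc
    return fact_acc(n - 1, acc * n)   # direct tail call

print(fact_naive(10) == fact_acc(10) == 3628800)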

If you're rewriting your recursive function *anyway*, it's not that much harder 
in many (most?) cases to rewrite it iteratively.

An example of a function that naturally uses direct tail call recursion, but 
which doesn't have a simple iterative rewrite, would be more compelling. Not 
particularly compelling (to me, anyway) even so, but still better than 
factorial or fibonnaci examples.

Paul
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-03 Thread Cecil Westerhof
Op Saturday 2 May 2015 16:20 CEST schreef Cecil Westerhof:

> I am throwing the cat among the pigeons. ;-)
>
> In another thread I mentioned that I would like to have tail recursion in
> Python. To be clear: not automatic, but asked for.
>
> Looking at the replies I did hit a nerve. But I still want to
> continue.
>
> Some things are better expressed recursively for the people reading
> the code. But there are two problems with that:
> - You can run out of stack space
> - It is less efficient
>
> Most of the time the first problem is the most important.
>
> When I write factorial (I know it is already written, but I use it
> as an example to show a point), the recursive variant can not be
> called with 1.000 without tail recursion. So for functions that
> could go very deep, tail recursion would be a blessing.
>
> By the way: I think that even if the recursion does not go further
> than 500, it is still a good idea to use tail recursion. Why use stack
> space when it is not necessary?

I pushed the example to GitHub:
https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list