Re: [Tutor] Memory error - how to manage large data sets?

2008-07-31 Thread Kepala Pening
Since the question is less than 200, I used
b  limit.
However, to have limit = 2, perhaps I should do
while b = limit.

Thanks Alan for pointing it out.

- Original Message -
From: Alan Gauld [EMAIL PROTECTED]
To: tutor@python.org
Date: Thu, 31 Jul 2008 06:39:32 +0100
Subject: Re: [Tutor] Memory error - how to manage large data sets?

 
 Kepala Pening [EMAIL PROTECTED] wrote
 
  def sumEvenFibonacci( limit ):
  a, b = 1, 1  # don't waste with a = 0
  sum = 0
  while b  limit:
  if b%2 == 0: sum += b
  a, b = b, a + b
  return sum
  
  print sumEvenFibonacci( 200 )
 
 Does it work for limit = 2?
 
 Alan G.
 
 
 
 
 
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-31 Thread bob gailer

Kepala Pening wrote:

def sumEvenFibonacci( limit ):
a, b = 1, 1  # don't waste with a = 0
sum = 0
while b  limit:
if b%2 == 0: sum += b
a, b = b, a + b
return sum

print sumEvenFibonacci( 200 )


  
Every 3rd element in the Fibonacci series is an even number. So one 
could economize slightly:


def sumEvenFibonacci(limit):
   a, b = 1, 1  # don't waste with a = 0
   sum = 0
   while b  limit:
a, b = b, a + b
sum += b
a, b = b, a + b
a, b = b, a + b
   return sum


--
Bob Gailer
919-636-4239 Chapel Hill, NC

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-31 Thread Alan Gauld


Kepala Pening [EMAIL PROTECTED] wrote


However, to have limit = 2, perhaps I should do
while b = limit.

Thanks Alan for pointing it out.


No probs, forgetting to test both ends of the range is a common 
mistake.


In fact good testing practice says that for any range of values
you should test

- the lowest legal value - correct result
- one lower than the lowest - fails gracefully
- much lower than lowest - fails gracefully
- a mid range value - correct result
- the highest legal value - correct value
 (Of course testing the highest possible value would be tricky
 in your case! :-)
- one higher than the highest - fails gracefully
- much higher than the legal limit - fails gracefully
- an invalid value - fails gracefully (eg in your case limit = 'four')

So that's at least 8 tests for every range parameter/variable
in your code.
Automated testing is A Good Thing :-)

Alan G.




- Original Message -
From: Alan Gauld [EMAIL PROTECTED]
To: tutor@python.org
Date: Thu, 31 Jul 2008 06:39:32 +0100
Subject: Re: [Tutor] Memory error - how to manage large data sets?



Kepala Pening [EMAIL PROTECTED] wrote

 def sumEvenFibonacci( limit ):
 a, b = 1, 1  # don't waste with a = 0
 sum = 0
 while b  limit:
 if b%2 == 0: sum += b
 a, b = b, a + b
 return sum

 print sumEvenFibonacci( 200 )

Does it work for limit = 2?

Alan G.






___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor




___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-29 Thread Chris Fuller

The original post was a little ambiguous: I need to find the sum of all 
numbers at even positions in the Fibonacci series upto 2 million.

But the project euler page 
(http://projecteuler.net/index.php?section=problemsid=2) is clear: Find the 
sum of all the even-valued terms in the sequence which do not exceed four 
million.

Depending on which value you are after, it may not be necessary to enumerate a 
million terms of the fibonacci sequence.  In fact, only about thirty are 
needed.

Now if you want to do the harder problem, I suggest using GMP or some other 
Numerical extension, and not Python :)

Or play some interesting math tricks.  The wikipedia page on fibonaccis lists 
a lot of identities that could be exploited in intriguing ways.

Cheers
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Alan Gauld


Karthik [EMAIL PROTECTED] wrote


I am new to Python programming, I was trying to work out a few 
problems in
order to grasp the knowledge gained after going through the basic 
chapters

on Python programming. I got stuck with a memory error.


Always show us the full error text, it contains a lot of useful data.

It is usually helpful to show us the code too if not too long.
Alternatively post in on a web site such as pastebin.


Following is what I did,
1. I need to find the sum of all numbers at even positions in 
the

Fibonacci series upto 2 million.


Shouldn't be a problem.


2. I have used lists to achieve this.


One list should suffice. (and you could do it without any lists...)

3. My program works good with smaller ranges. Say till 10,000 or 
even
100,000. However when I compute the sum for bigger ranges it gives 
me the

memory error.

4. Also could someone tell me how to get the result in the form 
of an

exponent. For instance, I would prefer 10^5 rather  10.


You can use string formatting using the %e or %g markers.
See my tutorial topic Simple Sequences for more info on format strings
or search the documentation.

--
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Karthik
Forgot to include the following information,

 

Platform - win32

Version - 2.5.1

 

Error message:

 

Traceback (most recent call last):

  File C:\Python25\programs\fibo.py, line 10, in module

if i % 2 == 0:

MemoryError

 

Code:

 

fib = []

even = []

def fibonacci(x,y):

return x+y

for i in range (0,100):

if i  2:

fib.append(i)

else:

i = fib[i-1] + fib[i-2]

if i % 2 == 0:

fib.append(i)

even.append(i)

else:

fib.append(i)

total = reduce(fibonacci,even)

print total

 

Any pointers would be of great help to me.

 

Regards,

Karthik

 

From: Karthik [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 28, 2008 9:27 PM
To: 'tutor@python.org'
Subject: Memory error - how to manage large data sets?

 

Hi,

 

I am new to Python programming, I was trying to work out a few problems in
order to grasp the knowledge gained after going through the basic chapters
on Python programming. I got stuck with a memory error.

 

Following is what I did,

 

1. I need to find the sum of all numbers at even positions in the
Fibonacci series upto 2 million.

2. I have used lists to achieve this.

3. My program works good with smaller ranges. Say till 10,000 or even
100,000. However when I compute the sum for bigger ranges it gives me the
memory error.

4. Also could someone tell me how to get the result in the form of an
exponent. For instance, I would prefer 10^5 rather  10.

 

Thanks in advance,

Karthik

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Chris Fuller
On Monday 28 July 2008 10:56, Karthik wrote:
 Hi,



 I am new to Python programming, I was trying to work out a few problems in
 order to grasp the knowledge gained after going through the basic chapters
 on Python programming. I got stuck with a memory error.



 Following is what I did,



 1. I need to find the sum of all numbers at even positions in the
 Fibonacci series upto 2 million.

 2. I have used lists to achieve this.

 3. My program works good with smaller ranges. Say till 10,000 or even
 100,000. However when I compute the sum for bigger ranges it gives me the
 memory error.

 4. Also could someone tell me how to get the result in the form of an
 exponent. For instance, I would prefer 10^5 rather  10.



 Thanks in advance,

 Karthik

It sounds like you are storing all the fibonacci numbers as you generate them.  
Why?  You only need the previous two to find the next in the sequence.  The 
sum is a single number that you can add every other element in the sequence 
to.  You only need to store three numbers in memory.  Storing millions is 
wasteful, and doesn't scale very well.

To find an exponent, use the ** operator.  For instance, 2**3 is 8, and 3**2 
is 9.

Cheers
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Daniel Sarmiento
Hi

I tried to run your code and checked (with top) the memory ussage and
it uses more than 2 Gb of memory.

I tried to modify the code a little bit to use less memory and came up
with this:

fib = {0:0,1:1}
even = []

def fibonacci(x,y):
   return x+y

for j in xrange (2,100):
i = fib[j-1] + fib[j-2]
if i % 2 == 0:
even.append(i)
fib = {j-1:fib[j-1], j:i}

total = reduce(fibonacci,even)
print total

First, I replaced range with xrange.
I figured that you only need the last two values in the fibonnaci
series to calculate the next value, so I replaced the fib list with a
dictionary to only store the last two values instead of the whole
series.

It looks like the progam still hangs and I did not notice any memory
imrovements when running it with 1 000 000

Am I wrong thinking that the modifications I made help use less memory?

 Date: Mon, 28 Jul 2008 22:44:08 +0530
 From: Karthik [EMAIL PROTECTED]
 Subject: Re: [Tutor] Memory error - how to manage large data sets?
 To: tutor@python.org
 Message-ID: [EMAIL PROTECTED]
 Content-Type: text/plain; charset=us-ascii

 Forgot to include the following information,



 Platform - win32

 Version - 2.5.1



 Error message:



 Traceback (most recent call last):

  File C:\Python25\programs\fibo.py, line 10, in module

if i % 2 == 0:

 MemoryError



 Code:



 fib = []

 even = []

 def fibonacci(x,y):

return x+y

 for i in range (0,100):

if i  2:

fib.append(i)

else:

i = fib[i-1] + fib[i-2]

if i % 2 == 0:

fib.append(i)

even.append(i)

else:

fib.append(i)

 total = reduce(fibonacci,even)

 print total



 Any pointers would be of great help to me.



 Regards,

 Karthik
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Alan Gauld
Karthik [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

Forgot to include the following information,
Platform - win32
Version - 2.5.1
Error message:
Traceback (most recent call last):
 File C:\Python25\programs\fibo.py, line 10, in module
   if i % 2 == 0:
MemoryError


OK, It does look like you bust your memory allocation.


Code:



fib = []
even = []


You don't really need two lists.


def fibonacci(x,y):
   return x+y


And this isn't really the fibonacci function its an addition
which you don;t need(see below)


for i in range (0,100):


This creates a 3rd list of 1 million numbers, so thats
3 million * the storage of one number. Still not enough
to trouble most PCs nowadays...


   if i  2:
   fib.append(i)
   else:
   i = fib[i-1] + fib[i-2]
   if i % 2 == 0:
   fib.append(i)


You want to append to fib on every number not just on evens?
So just take this outside the if.


   even.append(i)


You don't need this because we can extract the even numbers
when we do the summation later. All its doing it gobbling up space.


   else:
   fib.append(i)


dispense with the else and just put the fib append outside.
ie your code becomes

for i in xrange (0,100):  # xranmge should avoid the extra list
   if i = 2:
   i = fib[i-1] + fib[i-2]
   fib.append(i)


total = reduce(fibonacci,even)


You could use the sum() function in Python 2.5 but for
your purpose I'd propose using the fib list and using slicing
to select every second element:


L = range(10)
L[::2]

[0, 2, 4, 6, 8]




So you could use

total = sum(fib[::2])

and dispense with evens entirely


Any pointers would be of great help to me.


Frankly I'm puzzled why its using so much RAM.
I'd expect resource usage to be around 20-30MB
tops. hmmm, OK a bit of experimentation shows
that the fibonacci series generates very big numbers
very quickly so each entry is a long number - a very
long number!

21000 characters for the 100,000th entry!

No wonder it blows up.

You have two choices - use floats, which might still
not be big enough! ... It isn't,  98500 out of 10 entries
were infinite using floats! So you need to calculate the
total as you go without saving the values

Or buy a much, much bigger computer! :-)

PS I did all of that investigation using the  prompt.
Its a very useful tool for trying things out, I just generated
the list (using 100,000 entries) and examined the contents
using various list comprehensions and len() etc

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread John Fouhy
On 29/07/2008, Daniel Sarmiento [EMAIL PROTECTED] wrote:
  I tried to run your code and checked (with top) the memory ussage and
  it uses more than 2 Gb of memory.

  I tried to modify the code a little bit to use less memory and came up
  with this:

  fib = {0:0,1:1}

 even = []

  def fibonacci(x,y):
return x+y

 for j in xrange (2,100):
 i = fib[j-1] + fib[j-2]
 if i % 2 == 0:
 even.append(i)
 fib = {j-1:fib[j-1], j:i}


  total = reduce(fibonacci,even)
  print total

  It looks like the progam still hangs and I did not notice any memory
  imrovements when running it with 1 000 000

Well, let's see.  You're still storing all the even fibonacci numbers
that you compute.  By the looks of things, one third of fibonacci
numbers are even, so that's about 333,333 numbers that we're storing.

How big do these numbers get?  There's a closed-form expression for
Fibonacci numbers; it's: fib(n) = (phi**n - (1-phi)**n)/sqrt(5), where
phi is the golden ratio (about 1.6).  1-phi is -0.6, so when n is
large, (1-phi)**n is practically zero.  So fib(n) is roughly
phi**n/sqrt(5).  These numbers will quickly get beyond the size of
normal integers, and into long integers.  I don't know how many bytes
a long integer takes up, but I guess we can estimate it by looking at
the log to base 2.

So let's consider the millionth Fibonacci number.  fib(100) ~=
phi**100/sqrt(5).
So log(fib(100)) ~= log(phi**100/sqrt(5)) = 100*log(phi) -
log(sqrt(5)).

Taking logs to base 2, we get:

 100*log(phi, 2) - log(sqrt(5), 2)
694240.75266657001

In other words, the best possible representation of the millionth
Fibonacci number will take almost 700,000 bits, or around 85
kilobytes.  I don't know how Python long integers actually work; it
may take up more space than that.

Of course, that's just the biggest one we're working out.  Let's try
to work out how much space the first million will take up.  This is:

sum_{i=1}^{100} i*log(phi, 2) - log(sqrt(5), 2)

== -100*log(sqrt(5), 2) + log(phi, 2)*sum_{i=1}^{100} i

Remembering the formula for summing integers (n(n+1)/2), this is:

 -100*log(sqrt(5), 2) + log(phi, 2)*(100*101/2)
347120142972.21808

This is a number of bits; divide by 8 to get bytes; divide by 8*1024
to get kilobytes, etc:

 _/(8*1024*1024*1024)
40.410103156722393

So ... that's about 40 gigabytes worth of numbers in this list you're
trying to build.  Well, actually you're only storing a third of the
Fibonacci numbers (the even ones), so we can cut that down to thirteen
gigabytes.  Assuming my maths is correct and there's not too much
overhead in Python long integers :-)

(the solution, of course, is to avoid storing all those numbers in the
first place)

-- 
John.
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Chris Fuller

There's no need to keep any lists.  The sum can be done on the fly, which is 
perhaps a bit slower, but takes a constant amount of ram.  Even storing every 
other element (or every third, which is what he's trying to do: the elements 
that are even numbers, not every other element.. See his example code, or 
Project Euler, problem two, which seems to be the original source) will still 
take a lot of ram.

Something like this:

a = 0
b = 1

total = 0

while True:
   c = a+b
   a = b
   b = c

  if c1e6:
break

  if c%2 == 0:
total += c

print total

By the way,  working through those problems will really exercise your 
programming and math skills.  It's a great way to get started, if you enjoy 
those kind of puzzles.


Can you see why every third element must be even?
Cheers
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Alan Gauld
Alan Gauld [EMAIL PROTECTED] wrote

 were infinite using floats! So you need to calculate the
 total as you go without saving the values
 

I got curious so wrote the following function:

 def fibtot(N):
...   f0,f1,tot = 0,1,1
...   for n in range(N):
... f = f0 + f1
... f0,f1 = f1,f
... if n % 2 == 0: tot += f
...   return tot

and here are the results...

 len(str(fibtot(1)))
2090
 len(str(fibtot(10)))
20899
 len(str(fibtot(20)))
41798
 len(str(fibtot(40)))
83595
 len(str(fibtot(50)))
104494
 len(str(fibtot(60)))
125393
 len(str(fibtot(80)))
167190
 len(str(fibtot(100)))
208988


So the final total contains nearly 209 thousand digits!
Thats a very, very big number... and it took 5 minutes 
to compute on my PC (Dual core 2.8GHz with 2G RAM)

Now I really must go to bed :-)

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Memory error - how to manage large data sets?

2008-07-28 Thread Daniel Sarmiento
(the solution, of course, is to avoid storing all those numbers in the
first place)

I tried this:

fib = {0:0,1:1}
sum = 0

for j in xrange (2,100):
i = fib[j-1] + fib[j-2]
if i % 2 == 0:
sum += i
fib = {j-1:fib[j-1], j:i}

print sum

I guess it should come up with the right answer (i compared sum to the
total value for small values in the range and they were the same)

Just storing the value of the sum doesn't use that much memory and
prints the sum for (2, 1 million) but it takes a long time to compute.
I know there has to be a way better solution.
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor