Re: [Tutor] Memory error - how to manage large data sets?
Since the question is less than 200, I used b limit. However, to have limit = 2, perhaps I should do while b = limit. Thanks Alan for pointing it out. - Original Message - From: Alan Gauld [EMAIL PROTECTED] To: tutor@python.org Date: Thu, 31 Jul 2008 06:39:32 +0100 Subject: Re: [Tutor] Memory error - how to manage large data sets? Kepala Pening [EMAIL PROTECTED] wrote def sumEvenFibonacci( limit ): a, b = 1, 1 # don't waste with a = 0 sum = 0 while b limit: if b%2 == 0: sum += b a, b = b, a + b return sum print sumEvenFibonacci( 200 ) Does it work for limit = 2? Alan G. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Kepala Pening wrote: def sumEvenFibonacci( limit ): a, b = 1, 1 # don't waste with a = 0 sum = 0 while b limit: if b%2 == 0: sum += b a, b = b, a + b return sum print sumEvenFibonacci( 200 ) Every 3rd element in the Fibonacci series is an even number. So one could economize slightly: def sumEvenFibonacci(limit): a, b = 1, 1 # don't waste with a = 0 sum = 0 while b limit: a, b = b, a + b sum += b a, b = b, a + b a, b = b, a + b return sum -- Bob Gailer 919-636-4239 Chapel Hill, NC ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Kepala Pening [EMAIL PROTECTED] wrote However, to have limit = 2, perhaps I should do while b = limit. Thanks Alan for pointing it out. No probs, forgetting to test both ends of the range is a common mistake. In fact good testing practice says that for any range of values you should test - the lowest legal value - correct result - one lower than the lowest - fails gracefully - much lower than lowest - fails gracefully - a mid range value - correct result - the highest legal value - correct value (Of course testing the highest possible value would be tricky in your case! :-) - one higher than the highest - fails gracefully - much higher than the legal limit - fails gracefully - an invalid value - fails gracefully (eg in your case limit = 'four') So that's at least 8 tests for every range parameter/variable in your code. Automated testing is A Good Thing :-) Alan G. - Original Message - From: Alan Gauld [EMAIL PROTECTED] To: tutor@python.org Date: Thu, 31 Jul 2008 06:39:32 +0100 Subject: Re: [Tutor] Memory error - how to manage large data sets? Kepala Pening [EMAIL PROTECTED] wrote def sumEvenFibonacci( limit ): a, b = 1, 1 # don't waste with a = 0 sum = 0 while b limit: if b%2 == 0: sum += b a, b = b, a + b return sum print sumEvenFibonacci( 200 ) Does it work for limit = 2? Alan G. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
The original post was a little ambiguous: I need to find the sum of all numbers at even positions in the Fibonacci series upto 2 million. But the project euler page (http://projecteuler.net/index.php?section=problemsid=2) is clear: Find the sum of all the even-valued terms in the sequence which do not exceed four million. Depending on which value you are after, it may not be necessary to enumerate a million terms of the fibonacci sequence. In fact, only about thirty are needed. Now if you want to do the harder problem, I suggest using GMP or some other Numerical extension, and not Python :) Or play some interesting math tricks. The wikipedia page on fibonaccis lists a lot of identities that could be exploited in intriguing ways. Cheers ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Karthik [EMAIL PROTECTED] wrote I am new to Python programming, I was trying to work out a few problems in order to grasp the knowledge gained after going through the basic chapters on Python programming. I got stuck with a memory error. Always show us the full error text, it contains a lot of useful data. It is usually helpful to show us the code too if not too long. Alternatively post in on a web site such as pastebin. Following is what I did, 1. I need to find the sum of all numbers at even positions in the Fibonacci series upto 2 million. Shouldn't be a problem. 2. I have used lists to achieve this. One list should suffice. (and you could do it without any lists...) 3. My program works good with smaller ranges. Say till 10,000 or even 100,000. However when I compute the sum for bigger ranges it gives me the memory error. 4. Also could someone tell me how to get the result in the form of an exponent. For instance, I would prefer 10^5 rather 10. You can use string formatting using the %e or %g markers. See my tutorial topic Simple Sequences for more info on format strings or search the documentation. -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Forgot to include the following information, Platform - win32 Version - 2.5.1 Error message: Traceback (most recent call last): File C:\Python25\programs\fibo.py, line 10, in module if i % 2 == 0: MemoryError Code: fib = [] even = [] def fibonacci(x,y): return x+y for i in range (0,100): if i 2: fib.append(i) else: i = fib[i-1] + fib[i-2] if i % 2 == 0: fib.append(i) even.append(i) else: fib.append(i) total = reduce(fibonacci,even) print total Any pointers would be of great help to me. Regards, Karthik From: Karthik [mailto:[EMAIL PROTECTED] Sent: Monday, July 28, 2008 9:27 PM To: 'tutor@python.org' Subject: Memory error - how to manage large data sets? Hi, I am new to Python programming, I was trying to work out a few problems in order to grasp the knowledge gained after going through the basic chapters on Python programming. I got stuck with a memory error. Following is what I did, 1. I need to find the sum of all numbers at even positions in the Fibonacci series upto 2 million. 2. I have used lists to achieve this. 3. My program works good with smaller ranges. Say till 10,000 or even 100,000. However when I compute the sum for bigger ranges it gives me the memory error. 4. Also could someone tell me how to get the result in the form of an exponent. For instance, I would prefer 10^5 rather 10. Thanks in advance, Karthik ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
On Monday 28 July 2008 10:56, Karthik wrote: Hi, I am new to Python programming, I was trying to work out a few problems in order to grasp the knowledge gained after going through the basic chapters on Python programming. I got stuck with a memory error. Following is what I did, 1. I need to find the sum of all numbers at even positions in the Fibonacci series upto 2 million. 2. I have used lists to achieve this. 3. My program works good with smaller ranges. Say till 10,000 or even 100,000. However when I compute the sum for bigger ranges it gives me the memory error. 4. Also could someone tell me how to get the result in the form of an exponent. For instance, I would prefer 10^5 rather 10. Thanks in advance, Karthik It sounds like you are storing all the fibonacci numbers as you generate them. Why? You only need the previous two to find the next in the sequence. The sum is a single number that you can add every other element in the sequence to. You only need to store three numbers in memory. Storing millions is wasteful, and doesn't scale very well. To find an exponent, use the ** operator. For instance, 2**3 is 8, and 3**2 is 9. Cheers ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Hi I tried to run your code and checked (with top) the memory ussage and it uses more than 2 Gb of memory. I tried to modify the code a little bit to use less memory and came up with this: fib = {0:0,1:1} even = [] def fibonacci(x,y): return x+y for j in xrange (2,100): i = fib[j-1] + fib[j-2] if i % 2 == 0: even.append(i) fib = {j-1:fib[j-1], j:i} total = reduce(fibonacci,even) print total First, I replaced range with xrange. I figured that you only need the last two values in the fibonnaci series to calculate the next value, so I replaced the fib list with a dictionary to only store the last two values instead of the whole series. It looks like the progam still hangs and I did not notice any memory imrovements when running it with 1 000 000 Am I wrong thinking that the modifications I made help use less memory? Date: Mon, 28 Jul 2008 22:44:08 +0530 From: Karthik [EMAIL PROTECTED] Subject: Re: [Tutor] Memory error - how to manage large data sets? To: tutor@python.org Message-ID: [EMAIL PROTECTED] Content-Type: text/plain; charset=us-ascii Forgot to include the following information, Platform - win32 Version - 2.5.1 Error message: Traceback (most recent call last): File C:\Python25\programs\fibo.py, line 10, in module if i % 2 == 0: MemoryError Code: fib = [] even = [] def fibonacci(x,y): return x+y for i in range (0,100): if i 2: fib.append(i) else: i = fib[i-1] + fib[i-2] if i % 2 == 0: fib.append(i) even.append(i) else: fib.append(i) total = reduce(fibonacci,even) print total Any pointers would be of great help to me. Regards, Karthik ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Karthik [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Forgot to include the following information, Platform - win32 Version - 2.5.1 Error message: Traceback (most recent call last): File C:\Python25\programs\fibo.py, line 10, in module if i % 2 == 0: MemoryError OK, It does look like you bust your memory allocation. Code: fib = [] even = [] You don't really need two lists. def fibonacci(x,y): return x+y And this isn't really the fibonacci function its an addition which you don;t need(see below) for i in range (0,100): This creates a 3rd list of 1 million numbers, so thats 3 million * the storage of one number. Still not enough to trouble most PCs nowadays... if i 2: fib.append(i) else: i = fib[i-1] + fib[i-2] if i % 2 == 0: fib.append(i) You want to append to fib on every number not just on evens? So just take this outside the if. even.append(i) You don't need this because we can extract the even numbers when we do the summation later. All its doing it gobbling up space. else: fib.append(i) dispense with the else and just put the fib append outside. ie your code becomes for i in xrange (0,100): # xranmge should avoid the extra list if i = 2: i = fib[i-1] + fib[i-2] fib.append(i) total = reduce(fibonacci,even) You could use the sum() function in Python 2.5 but for your purpose I'd propose using the fib list and using slicing to select every second element: L = range(10) L[::2] [0, 2, 4, 6, 8] So you could use total = sum(fib[::2]) and dispense with evens entirely Any pointers would be of great help to me. Frankly I'm puzzled why its using so much RAM. I'd expect resource usage to be around 20-30MB tops. hmmm, OK a bit of experimentation shows that the fibonacci series generates very big numbers very quickly so each entry is a long number - a very long number! 21000 characters for the 100,000th entry! No wonder it blows up. You have two choices - use floats, which might still not be big enough! ... It isn't, 98500 out of 10 entries were infinite using floats! So you need to calculate the total as you go without saving the values Or buy a much, much bigger computer! :-) PS I did all of that investigation using the prompt. Its a very useful tool for trying things out, I just generated the list (using 100,000 entries) and examined the contents using various list comprehensions and len() etc HTH, -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
On 29/07/2008, Daniel Sarmiento [EMAIL PROTECTED] wrote: I tried to run your code and checked (with top) the memory ussage and it uses more than 2 Gb of memory. I tried to modify the code a little bit to use less memory and came up with this: fib = {0:0,1:1} even = [] def fibonacci(x,y): return x+y for j in xrange (2,100): i = fib[j-1] + fib[j-2] if i % 2 == 0: even.append(i) fib = {j-1:fib[j-1], j:i} total = reduce(fibonacci,even) print total It looks like the progam still hangs and I did not notice any memory imrovements when running it with 1 000 000 Well, let's see. You're still storing all the even fibonacci numbers that you compute. By the looks of things, one third of fibonacci numbers are even, so that's about 333,333 numbers that we're storing. How big do these numbers get? There's a closed-form expression for Fibonacci numbers; it's: fib(n) = (phi**n - (1-phi)**n)/sqrt(5), where phi is the golden ratio (about 1.6). 1-phi is -0.6, so when n is large, (1-phi)**n is practically zero. So fib(n) is roughly phi**n/sqrt(5). These numbers will quickly get beyond the size of normal integers, and into long integers. I don't know how many bytes a long integer takes up, but I guess we can estimate it by looking at the log to base 2. So let's consider the millionth Fibonacci number. fib(100) ~= phi**100/sqrt(5). So log(fib(100)) ~= log(phi**100/sqrt(5)) = 100*log(phi) - log(sqrt(5)). Taking logs to base 2, we get: 100*log(phi, 2) - log(sqrt(5), 2) 694240.75266657001 In other words, the best possible representation of the millionth Fibonacci number will take almost 700,000 bits, or around 85 kilobytes. I don't know how Python long integers actually work; it may take up more space than that. Of course, that's just the biggest one we're working out. Let's try to work out how much space the first million will take up. This is: sum_{i=1}^{100} i*log(phi, 2) - log(sqrt(5), 2) == -100*log(sqrt(5), 2) + log(phi, 2)*sum_{i=1}^{100} i Remembering the formula for summing integers (n(n+1)/2), this is: -100*log(sqrt(5), 2) + log(phi, 2)*(100*101/2) 347120142972.21808 This is a number of bits; divide by 8 to get bytes; divide by 8*1024 to get kilobytes, etc: _/(8*1024*1024*1024) 40.410103156722393 So ... that's about 40 gigabytes worth of numbers in this list you're trying to build. Well, actually you're only storing a third of the Fibonacci numbers (the even ones), so we can cut that down to thirteen gigabytes. Assuming my maths is correct and there's not too much overhead in Python long integers :-) (the solution, of course, is to avoid storing all those numbers in the first place) -- John. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
There's no need to keep any lists. The sum can be done on the fly, which is perhaps a bit slower, but takes a constant amount of ram. Even storing every other element (or every third, which is what he's trying to do: the elements that are even numbers, not every other element.. See his example code, or Project Euler, problem two, which seems to be the original source) will still take a lot of ram. Something like this: a = 0 b = 1 total = 0 while True: c = a+b a = b b = c if c1e6: break if c%2 == 0: total += c print total By the way, working through those problems will really exercise your programming and math skills. It's a great way to get started, if you enjoy those kind of puzzles. Can you see why every third element must be even? Cheers ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
Alan Gauld [EMAIL PROTECTED] wrote were infinite using floats! So you need to calculate the total as you go without saving the values I got curious so wrote the following function: def fibtot(N): ... f0,f1,tot = 0,1,1 ... for n in range(N): ... f = f0 + f1 ... f0,f1 = f1,f ... if n % 2 == 0: tot += f ... return tot and here are the results... len(str(fibtot(1))) 2090 len(str(fibtot(10))) 20899 len(str(fibtot(20))) 41798 len(str(fibtot(40))) 83595 len(str(fibtot(50))) 104494 len(str(fibtot(60))) 125393 len(str(fibtot(80))) 167190 len(str(fibtot(100))) 208988 So the final total contains nearly 209 thousand digits! Thats a very, very big number... and it took 5 minutes to compute on my PC (Dual core 2.8GHz with 2G RAM) Now I really must go to bed :-) -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Memory error - how to manage large data sets?
(the solution, of course, is to avoid storing all those numbers in the first place) I tried this: fib = {0:0,1:1} sum = 0 for j in xrange (2,100): i = fib[j-1] + fib[j-2] if i % 2 == 0: sum += i fib = {j-1:fib[j-1], j:i} print sum I guess it should come up with the right answer (i compared sum to the total value for small values in the range and they were the same) Just storing the value of the sum doesn't use that much memory and prints the sum for (2, 1 million) but it takes a long time to compute. I know there has to be a way better solution. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor