Re: How can I make this piece of code even faster?
pablobarhamal...@gmail.com wrote:

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. At the moment, they use a quite simple feed-forward
> neural network, which can change size over time. Each brain tick is
> performed by the following function (inside the Brain class):
>
>     def tick(self):
>         input_num = self.input_num
>         hidden_num = self.hidden_num
>         output_num = self.output_num
>
>         hidden = [0]*hidden_num
>         output = [0]*output_num
>
>         inputs = self.input
>         h_weight = self.h_weight
>         o_weight = self.o_weight
>
>         e = math.e
>
>         count = -1
>         for x in range(hidden_num):
>             temp = 0
>             for y in range(input_num):
>                 count += 1
>                 temp += inputs[y] * h_weight[count]
>             hidden[x] = 1/(1+e**(-temp))
>
>         count = -1
>         for x in range(output_num):
>             temp = 0
>             for y in range(hidden_num):
>                 count += 1
>                 temp += hidden[y] * o_weight[count]
>             output[x] = 1/(1+e**(-temp))
>
>         self.output = output
>
> The function is actually quite fast (~0.040 seconds per 200 calls,
> using 10 input, 20 hidden and 3 output neurons), and used to be much
> slower until I fiddled about with it a bit to make it faster. However,
> it is still somewhat slow for what I need it. My question to you is if
> you can see any obvious (or not so obvious) way of making this faster.
> I've heard about numpy and have been reading about it, but I really
> can't see how it could be implemented here. Cheers!

Assuming every list is replaced with a numpy.array, with

    h_weight.shape == (hidden_num, input_num)
    o_weight.shape == (output_num, hidden_num)

and as untested as it gets:

    def tick(self):
        temp = numpy.dot(self.inputs, self.h_weight)
        hidden = 1/(1+numpy.exp(-temp))
        temp = numpy.dot(hidden, self.o_weight)
        self.output = 1/(1+numpy.exp(-temp))

My prediction: this is probably wrong, but if you can fix the code it
will be stinkin' fast ;)

--
http://mail.python.org/mailman/listinfo/python-list
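The numpy rewrite suggested above can be checked against the original nested loops with a small self-contained script. Note this is an editor's sketch, not code from the thread: the weight shapes and the row-major layout of the flat weight lists are assumptions, since the post never states them, and the dot products are written as `np.dot(weights, inputs)` (matrix times vector) so the shapes actually line up:

```python
import math

import numpy as np


def sigmoid(x):
    # Logistic activation, vectorized over numpy arrays.
    return 1.0 / (1.0 + np.exp(-x))


def tick(inputs, h_weight, o_weight):
    """Vectorized brain tick (assumed shapes, not given in the post):

    inputs:   (input_num,)
    h_weight: (hidden_num, input_num)
    o_weight: (output_num, hidden_num)
    """
    hidden = sigmoid(np.dot(h_weight, inputs))
    return sigmoid(np.dot(o_weight, hidden))


def tick_loops(inputs, h_weight_flat, o_weight_flat, hidden_num, output_num):
    # Reference: the original nested-loop version with flat weight lists,
    # assuming the flat list stores all weights of neuron 0, then neuron 1, ...
    input_num = len(inputs)
    hidden = [0] * hidden_num
    output = [0] * output_num
    count = -1
    for x in range(hidden_num):
        temp = 0
        for y in range(input_num):
            count += 1
            temp += inputs[y] * h_weight_flat[count]
        hidden[x] = 1 / (1 + math.e ** (-temp))
    count = -1
    for x in range(output_num):
        temp = 0
        for y in range(hidden_num):
            count += 1
            temp += hidden[y] * o_weight_flat[count]
        output[x] = 1 / (1 + math.e ** (-temp))
    return output
```

If the two versions agree on random data, the one-time conversion of the flat lists to matrices is safe, and the per-tick work drops to two compiled matrix-vector products.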
Re: How can I make this piece of code even faster?
On 20.07.13 23:22, pablobarhamal...@gmail.com wrote:

>         e = math.e
>         count = -1
>         for x in range(hidden_num):
>             temp = 0
>             for y in range(input_num):
>                 count += 1
>                 temp += inputs[y] * h_weight[count]
>             hidden[x] = 1/(1+e**(-temp))
> [...]
> My question to you is if you can see any obvious (or not so obvious)
> way of making this faster.

1. Use math.exp() instead of math.e**.

2. I'm not sure that it will be faster, but try to use sum():

    temp = sum(inputs[y] * h_weight[count + y] for y in range(input_num))
    count += input_num

or

    temp = sum(map(operator.mul, inputs, h_weight[count:count+input_num]))
    count += input_num
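Both suggestions above are easy to sanity-check in isolation. A minimal sketch (editor's code, not from the thread; the `start` offset plays the role of the post's running `count`) showing that the `sum(map(operator.mul, ...))` slice form computes the same inner product as the explicit loop, and that `math.exp()` agrees with `math.e**`:

```python
import math
import operator


def sigmoid_pow(t):
    # Original style: exponentiation via math.e ** -t.
    return 1 / (1 + math.e ** (-t))


def sigmoid_exp(t):
    # Suggested replacement: math.exp() is a single C-level call.
    return 1 / (1 + math.exp(-t))


def dot_slice(inputs, weights, start):
    # Suggested inner loop: one sum() over a zipped weight slice instead
    # of a Python-level accumulation with an explicit counter.
    return sum(map(operator.mul, inputs, weights[start:start + len(inputs)]))
```

Whether the sum() form is actually faster has to be measured; it trades per-iteration bytecode for a slice copy and a map object, which can go either way at these sizes.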
Re: How can I make this piece of code even faster?
Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes:

> On Sat, 20 Jul 2013 13:22:03 -0700, pablobarhamalzas asked:
>
>> How can I make this piece of code even faster?
>
> - Use a faster computer.
> - Put in more memory.
> - If using Unix or Linux, decrease the nice priority of the process.
>
> I mention these because sometimes people forget that if you have a
> choice between spending 10 hours at $70 per hour to optimize code, and
> spending $200 to put more memory in, putting more memory in may be
> more cost effective.

Sure - but it's helpful if programmers understand a little bit about
the computational complexity of algorithms. If it's just a question of
making each basic step of your algorithm a bit faster, then it may well
be better to spend money on better hardware than on squeezing more out
of your code. OTOH if you've got an O(n^2) implementation and there's
actually an O(n log n) solution available, then you should probably
re-code. Of course, if what you've got is actually adequate for your
use-case, it may be that you don't need to do anything at all...
Re: How can I make this piece of code even faster?
How about using numpy?

On 20.07.13 22:22, pablobarhamal...@gmail.com wrote:

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. At the moment, they use a quite simple feed-forward
> neural network, which can change size over time. Each brain tick is
> performed by the following function (inside the Brain class):
>
>         count = -1
>         for x in range(hidden_num):
>             temp = 0
>             for y in range(input_num):
>                 count += 1
>                 temp += inputs[y] * h_weight[count]
>             hidden[x] = 1/(1+e**(-temp))

I don't really understand this loop, but it looks to me like a
matrix-vector multiplication of the matrix of weights (indexed with a
single continuous index) with the inputs. Given that you reshape the
weight array correctly into a hidden_num-by-input_num array, this would
result in:

    import numpy as np
    hidden = 1.0/(1.0 + np.exp(-np.dot(h_weights, inputs)))

The matrix-vector product is then executed by a compiled loop. You can
reshape your existing lists into that form by:

    inputs = np.array(inputs)
    h_weight = np.reshape(h_weight, (hidden_num, input_num))

... but of course you should use this only to check whether you get the
correct result. You don't want to do that in the loop; instead, store
the weights always in matrix form.

Christian
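The one-time reshape step described above can be demonstrated on toy data (editor's sketch; the sizes and weight values are made up, and the row-major layout of the flat list is an assumption about the original code):

```python
import numpy as np

# Toy sizes standing in for the post's 10 inputs / 20 hidden neurons.
input_num, hidden_num = 3, 2

# Flat weight list as the original code stores it: all weights for
# hidden neuron 0, then all weights for hidden neuron 1, ...
h_weight_flat = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
inputs = [1.0, 2.0, 3.0]

# One-time conversion to matrix form -- do this when the weights are
# created or mutated, not on every tick.
h_weight = np.reshape(h_weight_flat, (hidden_num, input_num))

# The whole hidden layer in one compiled matrix-vector product.
hidden = 1.0 / (1.0 + np.exp(-np.dot(h_weight, np.asarray(inputs))))
```

Each row of the reshaped matrix holds one neuron's weights, so row x of the dot product reproduces exactly the `temp` the inner loop accumulates for hidden neuron x.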
Re: How can I make this piece of code even faster?
On Sun, Jul 21, 2013 at 5:11 PM, Paul Rudin <paul.nos...@rudin.co.uk> wrote:

> Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes:
>
>> On Sat, 20 Jul 2013 13:22:03 -0700, pablobarhamalzas asked:
>>
>>> How can I make this piece of code even faster?
>>
>> - Use a faster computer.
>> - Put in more memory.
>> - If using Unix or Linux, decrease the nice priority of the process.
>
> Sure - but it's helpful if programmers understand a little bit about
> the computational complexity of algorithms. [...] OTOH if you've got
> an O(n^2) implementation and there's actually an O(n log n) solution
> available then you should probably re-code.

I haven't analyzed every suggestion in this thread in detail, but I
don't think any of them affects the algorithmic complexity of the code.
They're all incremental changes.

ChrisA
Re: How can I make this piece of code even faster?
Thanks for all the replies! I've tried some of the improvements you
suggested (using math.exp() and sum() or math.fsum()). None of that
made the code faster, because they are functions that get called lots
of times, and function calls are quite expensive (in the same way that
x**(1/2) is faster than math.sqrt(x)).

I'm going to try to convert to numpy now (I have no idea how to do it
at the moment). Thanks again to everyone.
Re: How can I make this piece of code even faster?
On Sun, 21 Jul 2013 03:19:24 -0700, pablobarhamalzas wrote:

> Thanks for all the replies! I've tried some of the improvements you
> suggested (using math.exp() and sum() or math.fsum()). None of that
> made the code faster, because they are functions that get called lots
> of times, and function calls are quite expensive (in the same way that
> x**(1/2) is faster than math.sqrt(x)).

You are *badly* mistaken. Not only is sqrt more accurate, but it is
also much faster.

    [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" "x**0.5"
    100 loops, best of 3: 0.319 usec per loop
    [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" -s "from math import sqrt" "sqrt(x)"
    1000 loops, best of 3: 0.172 usec per loop

How exactly are you timing the code?

--
Steven
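The same comparison can be run from inside a script with the timeit module rather than from the shell. This is an editor's sketch, not the poster's commands; absolute numbers depend on the machine and Python version, but the setup strings mirror the shell invocations above:

```python
import math
import timeit

# Time x**0.5 versus math.sqrt(x), each over the same fixed value.
t_pow = timeit.timeit("x**0.5", setup="x = 2.357e7", number=100_000)
t_sqrt = timeit.timeit(
    "sqrt(x)",
    setup="from math import sqrt; x = 2.357e7",
    number=100_000,
)
print(f"x**0.5: {t_pow:.4f}s   sqrt(x): {t_sqrt:.4f}s")

# The two expressions agree numerically for non-negative input.
value_pow = (2.357e7) ** 0.5
value_sqrt = math.sqrt(2.357e7)
```

timeit runs the statement in a clean namespace with the garbage collector disabled, which is exactly the kind of measurement a whole-program profiler does not give you.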
Re: How can I make this piece of code even faster?
On Sun, Jul 21, 2013 at 8:31 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:

> You are *badly* mistaken. Not only is sqrt more accurate, but it is
> also much faster.
>
>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" "x**0.5"
>     100 loops, best of 3: 0.319 usec per loop
>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" -s "from math import sqrt" "sqrt(x)"
>     1000 loops, best of 3: 0.172 usec per loop

Don't forget the cost of attribute lookup, which adds 50% to the
sqrt() figure. Still faster than exponentiation. (Figures from Python
3.4 alpha, but unlikely to be materially different.)

    rosuav@sikorsky:~$ python3 -m timeit -s "x = 2.357e7" "x**0.5"
    100 loops, best of 3: 0.239 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s "x = 2.357e7" -s "from math import sqrt" "sqrt(x)"
    1000 loops, best of 3: 0.102 usec per loop
    rosuav@sikorsky:~$ python3 -m timeit -s "x = 2.357e7" -s "import math" "math.sqrt(x)"
    1000 loops, best of 3: 0.155 usec per loop

ChrisA
Re: How can I make this piece of code even faster?
On Sunday, 21 July 2013 12:31:42 UTC+2, Steven D'Aprano wrote:

> You are *badly* mistaken. Not only is sqrt more accurate, but it is
> also much faster.
>
>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" "x**0.5"
>     100 loops, best of 3: 0.319 usec per loop
>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" -s "from math import sqrt" "sqrt(x)"
>     1000 loops, best of 3: 0.172 usec per loop
>
> How exactly are you timing the code?

I'm timing the whole program with cProfile. Removing math.sqrt() from a
function and using **(1/2) instead cut the execution time by a
significant amount (~0.035 to ~0.020). I can't see another explanation
for the speed increase...
Re: How can I make this piece of code even faster?
On 20 July 2013 21:22, pablobarhamal...@gmail.com wrote:

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. At the moment, they use a quite simple feed-forward
> neural network, which can change size over time. Each brain tick is
> performed by the following function (inside the Brain class):
>
> [CODE]
>
> The function is actually quite fast (~0.040 seconds per 200 calls,
> using 10 input, 20 hidden and 3 output neurons), and used to be much
> slower until I fiddled about with it a bit to make it faster. However,
> it is still somewhat slow for what I need it. My question to you is if
> you can see any obvious (or not so obvious) way of making this faster.
> I've heard about numpy and have been reading about it, but I really
> can't see how it could be implemented here.

Currently we're just guessing; if you gave us an appropriate stand-in
for "self" (so that we can call the function) we could be helpful much
more easily.
Re: How can I make this piece of code even faster?
pablobarhamal...@gmail.com, 21.07.2013 12:48:

> On Sunday, 21 July 2013 12:31:42 UTC+2, Steven D'Aprano wrote:
>
>>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" "x**0.5"
>>     100 loops, best of 3: 0.319 usec per loop
>>     [steve@ando ~]$ python3.3 -m timeit -s "x = 2.357e7" -s "from math import sqrt" "sqrt(x)"
>>     1000 loops, best of 3: 0.172 usec per loop
>>
>> How exactly are you timing the code?
>
> I'm timing the whole program with cProfile. Removing math.sqrt() from
> a function and using **(1/2) instead cut the execution time by a
> significant amount (~0.035 to ~0.020).

With or without the profiler running? Note that profiling will slow
down your code (especially function calls), often significantly, and
sometimes even in such an unbalanced way that it visibly changes its
execution profile. Always make sure you validate your code changes with
benchmarks, outside of the profiler.

Stefan
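The profiler distortion described above is easy to see directly: time the same function once with time.perf_counter() and once under cProfile, and compare. This is an editor's sketch with a made-up hot_loop function; the point is the methodology, not the specific numbers:

```python
import cProfile
import io
import pstats
import time


def hot_loop(n=100_000):
    # A call-free numeric loop standing in for the brain tick.
    total = 0.0
    for i in range(n):
        total += i ** 0.5
    return total


# Wall-clock time without the profiler: this is the number that matters.
start = time.perf_counter()
hot_loop()
plain = time.perf_counter() - start

# The same call under cProfile. The reported times include tracing
# overhead, which inflates function-call-heavy code the most, so two
# variants of the same code can rank differently here than in reality.
profiler = cProfile.Profile()
profiler.enable()
hot_loop()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).print_stats()
report = stream.getvalue()
```

cProfile is the right tool for finding *where* time goes; perf_counter() or timeit is the right tool for deciding *whether a change helped*.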
Re: How can I make this piece of code even faster?
On 07/21/2013 04:19 AM, pablobarhamal...@gmail.com wrote:

> Thanks for all the replies! I've tried some of the improvements you
> suggested (using math.exp() and sum() or math.fsum()). None of that
> made the code faster, because they are functions that get called lots
> of times, and function calls are quite expensive.
>
> I'm going to try to convert to numpy now (I have no idea how to do it
> at the moment). Thanks again to everyone.

Perhaps you'd have better results if you'd post a runnable piece of
code. Otherwise we're just guessing, since no one has the ability to
actually run your code.
Re: How can I make this piece of code even faster?
On Sat, Jul 20, 2013 at 5:22 PM, pablobarhamal...@gmail.com wrote:

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. [...] My question to you is if you can see any
> obvious (or not so obvious) way of making this faster. I've heard
> about numpy and have been reading about it, but I really can't see
> how it could be implemented here. Cheers!

Low-level optimizations: if you're in Python 2.x (and not 3), you
should use xrange() instead of range(), or maybe even create a local
variable, increment it and check its value within a while loop (that
way you can save a few instructions on method invocations from
xrange/range).

Anyway, if that's not fast enough, just port it to C/C++ (or use one of
the alternatives to speed it up while still in Python: numba, cython,
shedskin). Or (if you can), try PyPy and see if you get more speed
without doing anything.

Cheers,

Fabio
Re: How can I make this piece of code even faster?
In article <6bf4d298-b425-4357-9c1a-192e6e6cd...@googlegroups.com>,
pablobarhamal...@gmail.com wrote:

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. [...] My question to you is if you can see any
> obvious (or not so obvious) way of making this faster. I've heard
> about numpy and have been reading about it, but I really can't see
> how it could be implemented here.

First thing, I would add some instrumentation to see where the most
time is being spent. My guess is in the first set of nested loops,
where the inner loop gets executed hidden_num * input_num (i.e.
10 * 20 = 200) times. But timing data is better than my guess.

Assuming I'm right, though, you do compute range(input_num) 20 times.
You don't need to do that. You might try xrange(), or you might just
factor out creating the list outside the outer loop. But none of that
seems like it should make much difference.

What possible values can temp take? If it can only take certain
discrete values and you can enumerate them beforehand, you might want
to build a dict mapping temp -> 1/(1+e**(-temp)), and then all that
math becomes just a table lookup.
Re: How can I make this piece of code even faster?
Hi there. I'm using Python 3, where xrange doesn't exist any more
(range is now equivalent). And temp doesn't have any fixed discrete
values it always takes.

I have tried cython, but it doesn't seem to work well (maybe I'm using
it wrong?). Any other ideas?
Re: How can I make this piece of code even faster?
On Sun, Jul 21, 2013 at 6:22 AM, pablobarhamal...@gmail.com wrote:

>             temp = 0
>             for y in range(input_num):
>                 count += 1
>                 temp += inputs[y] * h_weight[count]
>             hidden[x] = 1/(1+e**(-temp))

It's a micro-optimization that'll probably have negligible effect, but
it can't hurt: instead of adding to temp and raising e to -temp, carry
the value of temp as a negative number:

    temp -= inputs[y] * h_weight[count]
    hidden[x] = 1/(1+e**temp)

Ditto in the second loop.

Not sure which way performance would go, but would it be more readable
to take an iterator for h_weight and o_weight? Something like this:

    # Slot this into your existing structure
    inputs = self.input
    h_weight = iter(self.h_weight)
    o_weight = iter(self.o_weight)
    e = math.e

    for x in range(hidden_num):
        temp = 0
        for y in inputs:
            temp += y * next(h_weight)
        hidden[x] = 1/(1+e**(-temp))

    for x in range(output_num):
        temp = 0
        for y in hidden:
            temp += y * next(o_weight)
        output[x] = 1/(1+e**(-temp))
    # End.

If that looks better, the next change I'd look to make is replacing the
'for y' loops with sum() calls on generators:

    temp = sum(y * next(o_weight) for y in hidden)

And finally replace the entire 'for x' loops with list comps... which
makes for two sizeable one-liners, which I like and many people detest:

    def tick(self):
        inputs = self.inputs
        h_weight = iter(self.h_weight)
        o_weight = iter(self.o_weight)
        e = math.e
        hidden = [1/(1+e**sum(-y * next(h_weight) for y in inputs))
                  for _ in range(hidden_num)]
        self.output = [1/(1+e**sum(-y * next(o_weight) for y in hidden))
                       for _ in range(output_num)]

Up to you to decide whether you find that version more readable, or at
least sufficiently readable, and then to test performance :) But it's
shorter by quite a margin, which I personally like. Oh, and I'm relying
on you to make sure I've made the translation correctly, which I can't
confirm without a pile of input data to test it on. All I can say is
that it's syntactically correct.

ChrisA
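The iterator translation above can in fact be verified without the rest of the Brain class. Below is an editor's standalone version of the final list-comprehension form (argument names assumed; the class attributes become parameters), which can be checked against a hand-computed forward pass:

```python
import math


def tick_sum(inputs, h_weight, o_weight, hidden_num, output_num):
    # Standalone version of the sum()-over-iterator formulation: each
    # next() call consumes the flat weight lists in the same order the
    # original counter-based loops do, and the sign is carried inside
    # the sum so e**(...) replaces e**(-temp).
    hw = iter(h_weight)
    ow = iter(o_weight)
    e = math.e
    hidden = [1 / (1 + e ** sum(-y * next(hw) for y in inputs))
              for _ in range(hidden_num)]
    return [1 / (1 + e ** sum(-y * next(ow) for y in hidden))
            for _ in range(output_num)]
```

This confirms the translation is faithful on one example; per the follow-up in the thread, faithfulness did not make it fast.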
Re: How can I make this piece of code even faster?
Hi there Chris. Unfortunately, using iterators was about twice as slow
as the original implementation, so that's not the solution. Thanks
anyway.
Re: How can I make this piece of code even faster?
On Sun, Jul 21, 2013 at 9:24 AM, pablobarhamal...@gmail.com wrote:

> Hi there Chris. Unfortunately, using iterators was about twice as slow
> as the original implementation, so that's not the solution. Thanks
> anyway.

Fascinating! Well, it was worth a try anyhow. But that's a very
surprising result.

ChrisA
Re: How can I make this piece of code even faster?
On Sat, 20 Jul 2013 13:22:03 -0700, pablobarhamalzas asked:

> How can I make this piece of code even faster?

- Use a faster computer.
- Put in more memory.
- If using Unix or Linux, decrease the nice priority of the process.

I mention these because sometimes people forget that if you have a
choice between spending 10 hours at $70 per hour to optimize code, and
spending $200 to put more memory in, putting more memory in may be more
cost effective.

Other than that, what you describe sounds like it could be a good
candidate for PyPy to speed the code up, although PyPy is still
(mostly) Python 2. You could take this question to the pypy mailing
list and ask there:

http://mail.python.org/mailman/listinfo/pypy-dev

You also might like to try Cython or Numba.

As far as pure-Python optimizations go, once you have a decent
algorithm, there's probably not a lot of room for major speed-ups. But
a couple of thoughts and a possible optimized version come to mind...

1) In general, it is better/faster to iterate over lists directly than
indirectly by index number:

    for item in sequence:
        process(item)

rather than:

    for i in range(len(sequence)):
        item = sequence[i]
        process(item)

If you need both the index and the value:

    for i, item in enumerate(sequence):
        print(i, process(item))

In your specific case, if I have understood your code's logic, you can
just iterate directly over the appropriate lists, once each.

2) You perform an exponentiation using math.e**(-temp). You will
probably find that math.exp(-temp) is both faster and more accurate.

3) If you need to add numbers, it is better to call sum() or
math.fsum() than add them by hand. sum() may be a tiny bit faster, or
maybe not, but fsum() is more accurate for floats.

See below for my suggestion on an optimized version.

> Ok, I'm working on a predator/prey simulation, which evolves using
> genetic algorithms. [...] My question to you is if you can see any
> obvious (or not so obvious) way of making this faster. I've heard
> about numpy and have been reading about it, but I really can't see
> how it could be implemented here.

Here's my suggestion:

    def tick(self):
        exp = math.exp
        fsum = math.fsum  # more accurate than builtin sum
        # This assumes that both inputs and h_weight have exactly
        # self.input_num values.
        temp = fsum(i*w for (i, w) in zip(self.inputs, self.h_weight))
        hidden = [1/(1+exp(-temp))]*self.hidden_num
        # This assumes that both outputs and o_weight have exactly
        # self.output_num values.
        temp = fsum(o*w for (o, w) in zip(self.outputs, self.o_weight))
        self.output = [1/(1+exp(-temp))]*self.output_num

I have neither tested that this works the same as your code (or even
works at all!) nor that it is faster, but I would expect that it will
be faster.

Good luck!

--
Steven
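As the author of the suggestion above notes, it is untested; in fact it computes a single temp shared by the whole layer, whereas in the original loop each neuron consumes its own slice of the flat weight list. Below is an editor's corrected variant of the same zip/fsum idea (written as a standalone function with assumed argument names, and assuming the flat lists store one neuron's weights after another):

```python
import math


def tick(inputs, h_weight, o_weight, hidden_num, output_num):
    # One fsum per neuron, over that neuron's own weight slice, rather
    # than a single fsum shared by the whole layer.
    fsum = math.fsum
    exp = math.exp
    input_num = len(inputs)

    hidden = []
    for x in range(hidden_num):
        row = h_weight[x * input_num:(x + 1) * input_num]
        temp = fsum(i * w for i, w in zip(inputs, row))
        hidden.append(1 / (1 + exp(-temp)))

    output = []
    for x in range(output_num):
        row = o_weight[x * hidden_num:(x + 1) * hidden_num]
        temp = fsum(h * w for h, w in zip(hidden, row))
        output.append(1 / (1 + exp(-temp)))
    return output
```

This keeps the accuracy benefit of fsum() and drops the explicit counter, though per the rest of the thread, only the numpy matrix form removes the per-element Python-level work entirely.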