Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
To John:
> Did you try larger arrays/tuples? I would guess that makes a significant difference.

No I didn't, because these values are coordinates in 3D (x, y, z). In fact I work with a list/array/tuple of arrays with 10 to 1M elements or more. What I need to do is calculate the distance of each of these elements (coordinates) to a given coordinate and filter for the nearest. The brute-force method would look like this:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
from math import sqrt
from operator import itemgetter

def bruteForceSearch(points, point):
    minpt = min([(vec2Norm(pt, point), pt, i)
                 for i, pt in enumerate(points)], key=itemgetter(0))
    return sqrt(minpt[0]), minpt[1], minpt[2]

def vec2Norm(pt1, pt2):
    xDis = pt1[0] - pt2[0]
    yDis = pt1[1] - pt2[1]
    zDis = pt1[2] - pt2[2]
    return xDis*xDis + yDis*yDis + zDis*zDis
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I have a more clever method, but it still spends a lot of time in the vec2Norm function. If you like I can attach a running example.

To Ben:
> Don't know how much of an impact it would have, but those timeit statements for array creation include the import process, which are going to be different for each module and are probably not indicative of the speed of array creation.

No, timeit counts only the time for the statement in the first argument; the import in the setup argument isn't included in the measured time.

Thomas

This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged. If you are not one of the named recipients or have received this email in error, (i) you should not read, disclose, or copy it, (ii) please notify sender of your receipt by reply email and delete this email and all attachments, (iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email. For other languages, go to http://www.3ds.com/terms/email-disclaimer

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Hey,

On Mon, 2011-01-10 at 08:09 +, EMMEL Thomas wrote:
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> def bruteForceSearch(points, point):
>     minpt = min([(vec2Norm(pt, point), pt, i)
>                  for i, pt in enumerate(points)], key=itemgetter(0))
>     return sqrt(minpt[0]), minpt[1], minpt[2]
>
> def vec2Norm(pt1, pt2):
>     xDis = pt1[0] - pt2[0]
>     yDis = pt1[1] - pt2[1]
>     zDis = pt1[2] - pt2[2]
>     return xDis*xDis + yDis*yDis + zDis*zDis
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> I have a more clever method but it still takes a lot of time in the
> vec2Norm function. If you like I can attach a running example.

If you use the vec2Norm function as you wrote it there, this code is not vectorized at all, and as such numpy would of course be slowest: it has the most overhead and no advantages for non-vectorized code. You simply can't write Python code like that and expect it to be fast for this kind of calculation. Your function should look more like this:

from math import sqrt
import numpy as np

def bruteForceSearch(points, point):
    dists = points - point  # may need point[None, :] or such for broadcasting to work
    dists *= dists
    dists = dists.sum(1)
    I = np.argmin(dists)
    return sqrt(dists[I]), points[I], I

If points is small, this may not help much (though compared to this exact code my guess is it probably would); if points is larger it should speed things up tremendously (unless you run into RAM problems). It may be that you need to fiddle around with axes; I did not check the code. If this is not good enough for you, you will need to port it (and maybe the next outer loop as well) to Cython, or write it in C/C++ and make sure the compiler can optimize things right. Also, I think somewhere in scipy there were some distance tools that may already be in C and nicely fast, but I am not sure.

I hope I got this right and it helps,

Sebastian
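Sebastian's sketch can be exercised end-to-end like this (a minimal runnable version; the sample coordinates are my own illustration, and `from math import sqrt` is assumed since the original snippet does not show its imports):

```python
from math import sqrt
import numpy as np

def bruteForceSearch(points, point):
    # Vectorized nearest-neighbour search as sketched above
    dists = points - point   # broadcasting: (N, 3) - (3,) -> (N, 3)
    dists *= dists
    dists = dists.sum(1)     # squared distance per point
    I = np.argmin(dists)
    return sqrt(dists[I]), points[I], I

points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0],
                   [2.0, 2.0, 2.0]])
dist, nearest, idx = bruteForceSearch(points, np.array([0.9, 1.0, 1.1]))
print(idx, dist)
```

Because `point` is a 1-D array of length 3, the plain `points - point` already broadcasts against the `(N, 3)` array, so the `point[None, :]` reshape Sebastian hedges about is not needed in this layout.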
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Hey back...

> If you use the vec2Norm function as you wrote it there, this code is not
> vectorized at all, and as such numpy would of course be slowest: it has
> the most overhead and no advantages for non-vectorized code. You simply
> can't write Python code like that and expect it to be fast for this kind
> of calculation. Your function should look more like this:
>
>     import numpy as np
>
>     def bruteForceSearch(points, point):
>         dists = points - point  # may need point[None, :] or such for broadcasting to work
>         dists *= dists
>         dists = dists.sum(1)
>         I = np.argmin(dists)
>         return sqrt(dists[I]), points[I], I
>
> If points is small, this may not help much (though compared to this exact
> code my guess is it probably would); if points is larger it should speed
> things up tremendously (unless you run into RAM problems).

I see the point, and it was very helpful for understanding the behavior of the arrays a bit better. Your approach improved the bruteForceSearch, which is now up to 6 times faster. But in the case of a leaf in a kd-tree you end up with 50, 20, 10 or fewer points, where the speed-up is reversed.

In this particular case, 34000 runs take 90 s with your method and 50 s with mine (not the brute force). I see now the limits of the arrays, but of course I also see the opportunities, and - coming back to my original question - it seems that Numeric arrays were faster for my kind of application, though they might be slower for larger amounts of data.

Regards

Thomas
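The crossover Thomas describes (numpy wins on big arrays, plain Python wins on 10-50 point kd-tree leaves) suggests switching strategy on input size. This is only my illustration of that idea, not code from the thread, and the `SMALL` threshold is a made-up placeholder that would have to be measured on real data:

```python
from math import sqrt
import numpy as np

SMALL = 50  # hypothetical crossover point; measure before trusting it

def nearest(points, point):
    """Return (distance, nearest point, index), picking a strategy by size."""
    if len(points) < SMALL:
        # Plain Python loop: lowest per-call overhead for tiny inputs
        best_i, best_d2 = 0, float('inf')
        for i, pt in enumerate(points):
            d2 = ((pt[0] - point[0]) ** 2 +
                  (pt[1] - point[1]) ** 2 +
                  (pt[2] - point[2]) ** 2)
            if d2 < best_d2:
                best_i, best_d2 = i, d2
        return sqrt(best_d2), points[best_i], best_i
    # Vectorized path for larger inputs
    d = np.asarray(points) - point
    d2 = (d * d).sum(1)
    i = int(np.argmin(d2))
    return sqrt(d2[i]), points[i], i
```

Both branches compute the same result, so the switch is purely a performance decision and can be tuned without changing callers.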
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Hi,

On 01/10/2011 09:09 AM, EMMEL Thomas wrote:
> No I didn't, due to the fact that these values are coordinates in 3D
> (x, y, z). In fact I work with a list/array/tuple of arrays with 10 to 1M
> elements or more. What I need to do is to calculate the distance of each
> of these elements (coordinates) to a given coordinate and filter for the
> nearest. The brute force method would look like this:
>
> def bruteForceSearch(points, point):
>     minpt = min([(vec2Norm(pt, point), pt, i)
>                  for i, pt in enumerate(points)], key=itemgetter(0))
>     return sqrt(minpt[0]), minpt[1], minpt[2]
>
> def vec2Norm(pt1, pt2):
>     xDis = pt1[0] - pt2[0]
>     yDis = pt1[1] - pt2[1]
>     zDis = pt1[2] - pt2[2]
>     return xDis*xDis + yDis*yDis + zDis*zDis

I am not sure I understood the problem properly, but here is what I would use to calculate distances from horizontally stacked vectors (big):

import numpy

ref = numpy.array([0.1, 0.2, 0.3])
big = numpy.random.randn(100, 3)
big = numpy.add(big, -ref)
distsquared = numpy.sum(big**2, axis=1)

Pascal
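Pascal's squared-distance array is one `argmin` away from the full nearest-point lookup Thomas asked for. A small sketch, using fixed sample data in place of the random input so the result is reproducible:

```python
import numpy

ref = numpy.array([0.1, 0.2, 0.3])
big = numpy.array([[1.0, 1.0, 1.0],
                   [0.0, 0.0, 0.0],
                   [2.0, 2.0, 2.0]])
diff = numpy.add(big, -ref)                # same as big - ref (broadcast)
distsquared = numpy.sum(diff ** 2, axis=1) # one squared distance per row
i = numpy.argmin(distsquared)              # index of the nearest point
nearest = big[i]
dist = numpy.sqrt(distsquared[i])
print(i, dist)
```

The `sqrt` is taken only once, on the winning element, which matches the squared-distance optimization mentioned in the next message.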
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Hi,

Spatial hashes are the common solution here. Another common optimization is using the squared distance for collision detection, since that avoids the expensive sqrt in this calculation.

cu.

On Mon, Jan 10, 2011 at 3:25 PM, Pascal pascal...@parois.net wrote:
> I am not sure I understood the problem properly, but here is what I would
> use to calculate distances from horizontally stacked vectors (big):
>
> ref = numpy.array([0.1, 0.2, 0.3])
> big = numpy.random.randn(100, 3)
> big = numpy.add(big, -ref)
> distsquared = numpy.sum(big**2, axis=1)
>
> Pascal
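A minimal spatial-hash sketch of the idea above (every name and the cell size are my own illustration, not from the thread): points are bucketed into integer grid cells, and a query compares squared distances only within the 3x3x3 block of cells around the query point.

```python
from collections import defaultdict
from math import floor, sqrt

CELL = 1.0  # grid cell size; must be tuned to the typical query radius

def build_hash(points):
    """Bucket point indices by their integer grid cell."""
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        grid[(floor(x / CELL), floor(y / CELL), floor(z / CELL))].append(i)
    return grid

def nearest_in_hash(grid, points, point):
    """Search only the 27 cells around the query point.

    Returns (index, distance); returns (None, None) if those cells are
    empty - a full implementation would widen the search ring instead.
    """
    cx, cy, cz = (floor(c / CELL) for c in point)
    best_i, best_d2 = None, float('inf')
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for i in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    p = points[i]
                    # squared distance: no sqrt needed just to compare
                    d2 = sum((a - b) ** 2 for a, b in zip(p, point))
                    if d2 < best_d2:
                        best_i, best_d2 = i, d2
    return best_i, (sqrt(best_d2) if best_i is not None else None)
```

With 1M points spread over many cells, each query touches only a handful of candidates instead of the whole set, which is the same locality a kd-tree exploits.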
[Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Hi,

There are some discussions on the speed of numpy compared to Numeric on this list, however there is a topic I don't understand in detail, maybe someone can enlighten me...

I use Python 2.6 on a SuSE installation and test this:

#Python 2.6 (r26:66714, Mar 30 2010, 00:29:28)
#[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2

import timeit

# Creation of arrays and tuples (timeit number=1000000 by default)
timeit.Timer('a((1.,2.,3.))', 'from numpy import array as a').timeit()
#8.2061841487884521
timeit.Timer('a((1.,2.,3.))', 'from Numeric import array as a').timeit()
#9.6958281993865967
timeit.Timer('a((1.,2.,3.))', 'a=tuple').timeit()
#0.13814711570739746
# Result: tuples - of course - are much faster than arrays, and numpy is
# a bit faster than Numeric at creating arrays.

# Working with arrays
timeit.Timer('d=x1-x2;sum(d*d)', 'from Numeric import array as a; x1=a((1.,2.,3.)); x2=a((2.,4.,6.))').timeit()
#3.263314962387085
timeit.Timer('d=x1-x2;sum(d*d)', 'from numpy import array as a; x1=a((1.,2.,3.)); x2=a((2.,4.,6.))').timeit()
#9.7236979007720947
# Result: Numeric is three times faster than numpy! Why?

# Working with components
timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2', 'a=tuple; x1=a((1.,2.,3.)); x2=a((2.,4.,6.))').timeit()
#0.64785194396972656
timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2', 'from numpy import array as a; x1=a((1.,2.,3.)); x2=a((2.,4.,6.))').timeit()
#3.4181499481201172
timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2', 'from Numeric import array as a; x1=a((1.,2.,3.)); x2=a((2.,4.,6.))').timeit()
#0.97426199913024902

Result: tuples are again the fastest variant, Numeric is faster than numpy, and both are faster than the variant above using the high-level functions. Why?

For various reasons I need to use numpy in the future where I used Numeric before. Is there any better solution in numpy that I missed?

Kind regards and thanks in advance

Thomas
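The benchmarks above all use 3-element arrays, where per-call overhead dominates. The same `timeit` pattern extends to larger arrays where vectorization pays off (a sketch with the numpy side only, since Numeric does not install on modern Pythons; `(d * d).sum()` is the ndarray method rather than the builtin `sum` used above, and no timings are claimed because they depend on the machine):

```python
import timeit

# Same statement, two setups: a 3-element array vs a 1000-element array.
setup_small = ('from numpy import array as a; '
               'x1 = a((1., 2., 3.)); x2 = a((2., 4., 6.))')
setup_big = ('from numpy import arange; '
             'x1 = arange(1000.0); x2 = 2.0 * x1')
stmt = 'd = x1 - x2; (d * d).sum()'

t_small = timeit.Timer(stmt, setup_small).timeit(number=10000)
t_big = timeit.Timer(stmt, setup_big).timeit(number=10000)
print(t_small, t_big)
```

The interesting ratio is t_big / t_small: the array is ~330x larger, but the run time grows far less than that, because the fixed Python-level dispatch cost is paid once per statement regardless of array size.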
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Did you try larger arrays/tuples? I would guess that makes a significant difference.

On Fri, Jan 7, 2011 at 7:58 AM, EMMEL Thomas thomas.em...@3ds.com wrote:
> Hi,
>
> There are some discussions on the speed of numpy compared to Numeric on
> this list, however there is a topic I don't understand in detail, maybe
> someone can enlighten me...
>
> [snip benchmark session quoted in full]
>
> Result: tuples are again the fastest variant, Numeric is faster than
> numpy, and both are faster than the variant using the high-level
> functions. Why?
>
> For various reasons I need to use numpy in the future where I used
> Numeric before. Is there any better solution in numpy I missed?
>
> Kind regards and thanks in advance
>
> Thomas
Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
On Fri, Jan 7, 2011 at 9:58 AM, EMMEL Thomas thomas.em...@3ds.com wrote:
> Hi,
>
> There are some discussions on the speed of numpy compared to Numeric on
> this list, however there is a topic I don't understand in detail, maybe
> someone can enlighten me...
>
> [snip benchmark session quoted in full]
>
> For various reasons I need to use numpy in the future where I used
> Numeric before. Is there any better solution in numpy I missed?

Don't know how much of an impact it would have, but those timeit statements for array creation include the import process, which is going to be different for each module and is probably not indicative of the speed of array creation.

Ben Root
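Whether `timeit` charges the setup statement to the measured time can be checked directly (a small sketch; the sleep length is an arbitrary sentinel):

```python
import timeit

# The setup statement (second argument) runs before the timed loop but is
# NOT included in the reported time, so a slow setup cannot inflate it.
t = timeit.Timer('pass', 'import time; time.sleep(0.2)')
elapsed = t.timeit(number=1000)
print(elapsed)  # far below 0.2 s: the sleep in setup was not counted
```

If the 0.2 s sleep showed up in `elapsed`, Ben's concern would be justified; since it does not, the creation benchmarks in the original message measure only the array-construction statement itself.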