Re: [Numpy-discussion] Closing some tickets.
Hi,

Is there official support for MSVC 2005? Last time I tried to compile Python with it, it couldn't build extensions. If MSVC 2005 is not officially supported, at least by Python itself, I'm not sure Numpy can support it either.

Matthieu

2008/5/22 Charles R Harris [EMAIL PROTECTED]:
> All,
>
> Can we close ticket #117 and add Pearu's comment to the FAQ?
> http://projects.scipy.org/scipy/numpy/ticket/117
>
> Can someone with MSVC 2005 check if we can close ticket #164?
> http://projects.scipy.org/scipy/numpy/ticket/164
>
> Chuck

--
French PhD student
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Re: [Numpy-discussion] Closing some tickets.
Matthieu Brucher wrote:
> Hi, Is there official support for MSVC 2005? Last time I tried to compile
> Python with it, it couldn't build extensions. If MSVC 2005 is not
> officially supported, at least by Python itself, I'm not sure Numpy can
> support it either.

Python 2.5 used 2003 (VS 7? I am always confused by their version numbering scheme), and Python 2.6/3.0 will use 2008 (VS 9?) AFAIK. I don't think it makes sense to support a compiler which is not used for official binaries and is already superseded by a newer version.

cheers,

David
Re: [Numpy-discussion] first recarray steps
Anne Archibald wrote:
> 2008/5/21 Vincent Schut [EMAIL PROTECTED]:
>> Christopher Barker wrote:
>>> Also, if your image data is rgb, usually that's a (width, height, 3)
>>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height)
>>> array, then that's rrr... Some image libs may give you that, I'm not
>>> sure.
>>
>> My data is. In fact, this is a simplification of my situation; I'm
>> processing satellite data, which usually has more (and other) bands
>> than just rgb. But the data is definitely in shape (bands, y, x).
>
> You may find your life becomes easier if you transpose the data in
> memory. This can make a big difference to efficiency. Years ago I was
> working with enormous (by the standards of the day) MATLAB files on
> disk, storing complex data. The way (that version of) MATLAB represented
> complex data was the way you describe: matrix of real parts, matrix of
> imaginary parts. This meant that to draw a single pixel, the disk needed
> to seek twice... Depending on what sort of operations you're doing,
> transposing your data so that each pixel is all in one place may improve
> cache coherency as well as making the use of record arrays possible.
>
> Anne

Anne, thanks for the thoughts. In most cases you'll probably be right. In this case, however, it won't give me much (if any) speedup, maybe even a slowdown. Satellite images are often stored on disk in a band-sequential manner. The library I use for IO is GDAL, a highly optimized C library for reading/writing almost any kind of satellite data format, which also features an internal caching mechanism. And it gives me my data as (y, x, bands).

I'm not reading single pixels anyway. The amounts of data I have to process (enormous, even by the standards of today ;-)) require me to do this in chunks, in parallel, even on different cores/CPUs/computers. Every chunk usually is (chunkYSize, chunkXSize, allBands), with xsize and ysize not so small (think 64^2 to 1024^2), so that pretty much eliminates any performance issues regarding the data on disk. Furthermore, having to process on multiple computers forces me to keep my data on networked storage. The latency and transfer rate of the network will probably eliminate any small speedup from my drive doing fewer seeks...

Now for the recarray part: that would indeed ease my life a bit :) However, having to transpose the data in memory on every read and write does not sound very attractive. It would waste cycles and memory, and be asking for bugs. I can live without recarrays, for sure. I only hoped they might make my life a bit easier and my code a bit more readable, without too much effort. Well, apparently they won't... I'll just go on like I did before this little exercise.

Thanks all for the input.

Cheers,
Vincent.
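For reference, a minimal sketch of the transpose-plus-recarray idea discussed above, assuming four hypothetical float32 bands (the field names are invented; the record view only works once each pixel's values are adjacent in memory):

    import numpy as np

    bands, ny, nx = 4, 512, 512
    data = np.random.rand(bands, ny, nx).astype(np.float32)   # (bands, y, x)

    # Reorder so each pixel's band values sit next to each other in memory.
    pixels = np.ascontiguousarray(data.transpose(1, 2, 0))    # (y, x, bands)

    # View each pixel as a single record, one field per band.
    rec = pixels.view(dtype=[('b1', 'f4'), ('b2', 'f4'),
                             ('b3', 'f4'), ('b4', 'f4')]).reshape(ny, nx)
    print(rec['b2'].shape)   # (512, 512)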
Re: [Numpy-discussion] distance_matrix: how to speed up?
Emanuele Olivetti wrote:
> snip
> This solution is super-fast, stable and uses little memory. It is based
> on the fact that:
>
>     (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
>
> For size1=size2=dimensions=1000 it requires ~0.6 sec. to compute on my
> dual core duo. It is two orders of magnitude faster than my previous
> solution, but 1-2 orders of magnitude slower than using C with
> weave.inline. Definitely good enough for me.
>
> Emanuele

Reading this thread, I remembered having tried scipy's sandbox.rbf (radial basis function) to interpolate a pretty large, multidimensional dataset, to fill in the missing data points. This, however, soon failed with out-of-memory errors which, if I remember correctly, came from the pretty straightforward distance calculation between the different data points that is used in this package. Being no math wonder, I assumed that there simply was no simple way to calculate distances without using much memory, and ended my rbf experiments.

To make a long story short: correct me if I am wrong, but might it be an idea to use the above solution in scipy.sandbox.rbf?

Vincent.
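For reference, here is one way the identity Emanuele quotes could be written with NumPy broadcasting; the function name and argument layout are my own, not scipy's:

    import numpy as np

    def weighted_sq_distance(x, y, w):
        """Pairwise weighted squared distances between the rows of
        x (n, d) and y (m, d), using
            (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
        so that no (n, m, d) intermediate array is ever built."""
        xw = x * w                                        # (n, d)
        d2 = (xw * x).sum(axis=1)[:, np.newaxis] \
             + (y * y * w).sum(axis=1)[np.newaxis, :] \
             - 2.0 * np.dot(xw, y.T)                      # (n, m)
        return np.maximum(d2, 0.0)   # clip tiny negatives from round-off

    d2 = weighted_sq_distance(np.random.rand(1000, 3),
                              np.random.rand(2000, 3),
                              np.ones(3))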
Re: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008
On Wed, 2008-05-21 at 10:08 +0200, Stéfan van der Walt wrote:
[clip]
> This will parse better (as the line with the semicolon is bold, the next
> lines are not). Also, would it be possible to put function and
> next_function in double back-ticks, so that they are referenced, like
> modules? That way they might be clickable in an html version of the
> documentation.

When generating the reference guide, I parse all the numpy docstrings and re-generate a document enhanced with Sphinx markup. In this document, functions in the See Also clause are clickable. I have support for two formats:

    See Also
    --------
    function_a, function_b, function_c
    function_d : relation to current function

Don't worry if it doesn't look perfect on the wiki; the reference guide will be rendered correctly.

Should the function names in the See Also section also include the namespace prefix, i.e.

    numpy.function_a
    numpy.function_b

Or should we assume "from numpy import *" or "import numpy as np"? I think it'd be useful to clarify this in the documentation standard and in example.py, also for the Examples section. (Btw, Docstrings/Example appears to be out-of-date wrt. this.)

Pauli
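As an illustration of the two See Also forms, here is a made-up docstring fragment (the function and its relatives are invented for the example):

    def clip_lower(a, amin):
        """Clip the values of `a` from below by `amin`.

        See Also
        --------
        clip, minimum, maximum
        clip_upper : hypothetical counterpart that clips from above
        """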
Re: [Numpy-discussion] Numpy and scipy icons ?
Stéfan van der Walt wrote:
> Hi David
>
> The icons are attached to the scipy.org main page as .png's. Travis
> Vaught drew the vector versions, IIRC.

I can find logos for the scipy conference, the bug icon, etc., but no plain scipy icon, nor any numpy icon. Would it be possible to put the icon (vector version) somewhere in subversion? It would be useful for, say, installers, etc.

cheers,

David
[Numpy-discussion] Numpy and scipy icons ?
Hi,

Where can I find the numpy and scipy icons, preferably in a vector format?

cheers,

David
Re: [Numpy-discussion] Numpy and scipy icons ?
Hi David

The icons are attached to the scipy.org main page as .png's. Travis Vaught drew the vector versions, IIRC.

Regards
Stéfan

2008/5/22 David Cournapeau [EMAIL PROTECTED]:
> Hi,
>
> Where can I find the numpy and scipy icons, preferably in a vector
> format?
>
> cheers,
>
> David
Re: [Numpy-discussion] distance_matrix: how to speed up?
On May 22, 2008, at 9:45 AM, Vincent Schut wrote:
> Reading this thread, I remembered having tried scipy's sandbox.rbf
> (radial basis function) to interpolate a pretty large, multidimensional
> dataset, to fill in the missing data points. This, however, soon failed
> with out-of-memory errors which, if I remember correctly, came from the
> pretty straightforward distance calculation between the different data
> points that is used in this package.
>
> To make a long story short: correct me if I am wrong, but might it be an
> idea to use the above solution in scipy.sandbox.rbf?

Yes, this would be a very good substitution. Not only does it use less memory, but in my quick tests it is about as fast or faster. Really, though, both are pretty quick. There will still be memory limitations, but you only need to store a matrix of (N, M) instead of (NDIM, N, M), so for many dimensions there will be big memory improvements -- probably only small improvements for 3 dimensions or less.

I'm not sure where rbf lives anymore -- it's not in scikits. I have my own version (parts of which got folded into the old scipy.sandbox version) that I would be willing to share if there is interest.

Really, though, the rbf toolbox will not be limited by the memory of the distance matrix. Later on, you need to do a large linear algebra 'solve', like this:

    r = norm(x, x)   # the distances between all of the ND points to each other
    A = psi(r)       # where psi is some divergent function, often the
                     # multiquadratic function: sqrt((self.epsilon*r)**2 + 1)
    coefs = linalg.solve(A, data)  # where data has the length of x, one data
                                   # point for each spatial point

    # to find the interpolated data points at xi
    ri = norm(xi, x)
    Ai = psi(ri)
    di = dot(Ai, coefs)

All in all, it is the 'linalg.solve' that kills you.

-Rob

Rob Hetland, Associate Professor
Dept. of Oceanography, Texas A&M University
http://pong.tamu.edu/~rob
phone: 979-458-0096, fax: 979-845-6331
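A self-contained sketch of the steps Rob describes, using the multiquadric psi and a memory-friendly distance computation like the one quoted earlier in the thread (the helper names are illustrative, not scipy's API):

    import numpy as np

    def norm(a, b):
        """Pairwise Euclidean distances between rows of a (n, d) and b (m, d)."""
        d2 = (a * a).sum(1)[:, None] + (b * b).sum(1)[None, :] \
             - 2.0 * np.dot(a, b.T)
        return np.sqrt(np.maximum(d2, 0.0))

    def psi(r, epsilon=1.0):
        """Multiquadric radial basis function."""
        return np.sqrt((epsilon * r) ** 2 + 1.0)

    x = np.random.rand(200, 3)     # known point locations
    data = np.random.rand(200)     # one value per known point
    xi = np.random.rand(50, 3)     # points to interpolate to

    A = psi(norm(x, x))                    # (200, 200)
    coefs = np.linalg.solve(A, data)       # this solve dominates the cost
    di = np.dot(psi(norm(xi, x)), coefs)   # interpolated values at xi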
Re: [Numpy-discussion] distance_matrix: how to speed up?
Rob Hetland wrote:
> On May 22, 2008, at 9:45 AM, Vincent Schut wrote:
> snip
> Really, though, the rbf toolbox will not be limited by the memory of the
> distance matrix. Later on, you need to do a large linear algebra
> 'solve' [...]
>
> All in all, it is the 'linalg.solve' that kills you.

Ah, indeed, my memory was faulty, I'm afraid. It was in this phase that it halted, not in the distance calculations.

Vincent.
Re: [Numpy-discussion] Numpy and scipy icons ?
On Thu, May 22, 2008 at 4:14 AM, David Cournapeau [EMAIL PROTECTED] wrote:
> Stéfan van der Walt wrote:
>> Hi David
>>
>> The icons are attached to the scipy.org main page as .png's. Travis
>> Vaught drew the vector versions, IIRC.
>
> I can find logos for the scipy conference, the bug icon, etc., but no
> plain scipy icon, nor any numpy icon. Would it be possible to put the
> icon (vector version) somewhere in subversion? It would be useful for,
> say, installers, etc.

Travis Vaught is the man: http://article.gmane.org/gmane.comp.python.numeric.general/18495

Chuck
[Numpy-discussion] Multiple Boolean Operations
Hi All,

I am building some 3D grids for visualization starting from a much bigger grid. I build these grids by satisfying certain conditions on the x, y, z coordinates of their cells: up to now I was using VTK to perform this operation, but VTK is slow as a turtle, so I thought to use numpy to get the cells I am interested in.

Basically, for every cell I have the coordinates of its center point (centroid), named xCent, yCent and zCent. These values are stored in numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, yCent and zCent with 10,000 values in them). What I'd like to do is:

    # Filter cells which do not satisfy Z requirements:
    zReq = zMin <= zCent <= zMax

    # After that, filter cells which do not satisfy Y requirements,
    # but apply this filter only on cells which satisfy the above condition:
    yReq = yMin <= yCent <= yMax

    # After that, filter cells which do not satisfy X requirements,
    # but apply this filter only on cells which satisfy the 2 above conditions:
    xReq = xMin <= xCent <= xMax

I'd like to end up with a vector of indices which tells me which are the cells in the original grid that satisfy all 3 conditions. I know that something like this:

    zReq = zMin <= zCent <= zMax

cannot be done directly in numpy, as the first comparison executed returns a vector of booleans. Also, if I do something like:

    zReq1 = numpy.nonzero(zCent <= zMax)
    zReq2 = numpy.nonzero(zCent[zReq1] >= zMin)

I lose the original indices of the grid, as in the second statement zCent[zReq1] no longer has the size of the original grid; it has already been filtered down.

Is there anything I could try in numpy to get what I am looking for? Sorry if the description is not very clear :-D

Thank you very much for your suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] These values are stored in numpy arrays (i.e., if I
> have 10,000 cells, I have 3 vectors xCent, yCent and zCent with 10,000
> values in them).

You clearly have a large dataset, otherwise speed wouldn't have been a concern to you. You can do your operation in one pass over the data, and I'd suggest you try doing that with Cython or Ctypes. If you need an example of how to access data using those methods, let me know.

Of course, it *can* be done using NumPy (maybe not in one pass), but thinking in terms of for-loops is sometimes easier, and immediately takes you to a highly optimised execution time.

Cheers
Stéfan
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Stefan & All,

On Thu, May 22, 2008 at 12:29 PM, Stéfan van der Walt wrote:
> You clearly have a large dataset, otherwise speed wouldn't have been a
> concern to you. You can do your operation in one pass over the data, and
> I'd suggest you try doing that with Cython or Ctypes. If you need an
> example of how to access data using those methods, let me know.
>
> Of course, it *can* be done using NumPy (maybe not in one pass), but
> thinking in terms of for-loops is sometimes easier, and immediately
> takes you to a highly optimised execution time.

First of all, thank you for your answer. I know next to nothing about Cython and very little about Ctypes, but it would be nice to have an example of how to use them to speed up the operations.

Actually, I don't really know if my dataset counts as large: I normally work with xCent, yCent and zCent vectors of about 100,000-300,000 elements. However, all the other operations I do with numpy on these vectors are pretty fast (reshaping, re-casting, min(), max() and so on), so I believe a pure numpy solution might perform well enough for my needs. But I am really no expert in numpy, so please forgive any mistakes I'm making :-D.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Multiple Boolean Operations
On Thursday 22 May 2008, Andrea Gavana wrote:
> Hi All,
>
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] I'd like to end up with a vector of indices which
> tells me which are the cells in the original grid that satisfy all 3
> conditions.
>
> Is there anything I could try in numpy to get what I am looking for?
> Sorry if the description is not very clear :-D

I don't know if this is what you want, but you can get the boolean arrays separately, take their intersection, and finally get the interesting values (by using fancy indexing) or coordinates (by using .nonzero()). Here is an example:

    In [105]: a = numpy.arange(10,20)

    In [106]: c1 = (a >= 13) & (a <= 17)

    In [107]: c2 = (a >= 14) & (a <= 18)

    In [109]: all = c1 & c2

    In [110]: a[all]
    Out[110]: array([14, 15, 16, 17])   # the values

    In [111]: all.nonzero()
    Out[111]: (array([4, 5, 6, 7]),)    # the coordinates

Hope that helps,

--
Francesc Alted
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, 22 May 2008, Andrea Gavana apparently wrote:
> # Filter cells which do not satisfy Z requirements:
> zReq = zMin <= zCent <= zMax

This seems to raise a question: should numpy arrays support this standard Python idiom?

Cheers,
Alan Isaac
Re: [Numpy-discussion] Multiple Boolean Operations
2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
> Hi All,
>
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] I'd like to end up with a vector of indices which
> tells me which are the cells in the original grid that satisfy all 3
> conditions.
>
> Is there anything I could try in numpy to get what I am looking for?

How about (as a pure numpy solution):

    valid = (z >= zMin) & (z <= zMax)
    valid[valid] = (y[valid] >= yMin) & (y[valid] <= yMax)
    valid[valid] = (x[valid] >= xMin) & (x[valid] <= xMax)
    inds = valid.nonzero()

?

--
AJC McMorland, PhD candidate
Physiology, University of Auckland
(Nearly) post-doctoral research fellow
Neurobiology, University of Pittsburgh
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Francesc & All,

On Thu, May 22, 2008 at 1:04 PM, Francesc Alted wrote:
> I don't know if this is what you want, but you can get the boolean
> arrays separately, take their intersection, and finally get the
> interesting values (by using fancy indexing) or coordinates (by using
> .nonzero()).

Thank you for this suggestion! I had forgotten that this worked in numpy :-( . I have written a couple of small functions to test your method and mine (hopefully I did it correctly for both). On my computer (Toshiba notebook, 2.00 GHz, Windows XP SP2, 1 GB RAM, Python 2.5, numpy 1.0.3.1), your solution is about 30 times faster than mine (implemented back when I didn't know about multiple boolean operations in numpy).

This is my code:

    # Begin Code

    import numpy
    from timeit import Timer

    # Number of cells in my original grid
    nCells = 150000

    # Define some constraints for X, Y, Z
    xMin, xMax = 250.0, 700.0
    yMin, yMax = 1000.0, 1900.0
    zMin, zMax = 120.0, 300.0

    # Generate random centroids for the cells
    xCent = 1000.0*numpy.random.rand(nCells)
    yCent = 2500.0*numpy.random.rand(nCells)
    zCent = 400.0*numpy.random.rand(nCells)


    def MultipleBoolean1():
        """ Andrea's solution, slow :-( . """

        xReq_1 = numpy.nonzero(xCent >= xMin)
        xReq_2 = numpy.nonzero(xCent <= xMax)

        yReq_1 = numpy.nonzero(yCent >= yMin)
        yReq_2 = numpy.nonzero(yCent <= yMax)

        zReq_1 = numpy.nonzero(zCent >= zMin)
        zReq_2 = numpy.nonzero(zCent <= zMax)

        xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
        yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
        zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

        xyReq = numpy.intersect1d_nu(xReq, yReq)
        xyzReq = numpy.intersect1d_nu(xyReq, zReq)


    def MultipleBoolean2():
        """ Francesc's solution, much faster :-) . """

        xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
                 (yCent >= yMin) & (yCent <= yMax) & \
                 (zCent >= zMin) & (zCent <= zMax)

        xyzReq = numpy.nonzero(xyzReq)[0]


    if __name__ == "__main__":

        trial = 10

        t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
        print "\n\nAndrea's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
        print "Francesc's Solution: %0.8g Seconds/Trial\n" % (t.timeit(number=trial)/trial)

    # End Code

And I get this timing on my PC:

    Andrea's Solution:   0.34946193 Seconds/Trial
    Francesc's Solution: 0.011288139 Seconds/Trial

If I implemented everything correctly, this is an amazing improvement. Thank you to everyone who provided suggestions, and thanks to the list :-D

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008
Stéfan van der Walt wrote:
> It looks like there is significant interest in using np instead of numpy
> in the examples (i.e. we expect the user to do "import numpy as np"
> before trying code snippets). Would anybody who objects to using np
> raise it now, so that we can bury this issue?
>
> Regards
> Stéfan
>
> 2008/5/22 Rob Hetland [EMAIL PROTECTED]:
>> On May 22, 2008, at 11:37 AM, Pauli Virtanen wrote:
>>> Or should we assume "from numpy import *" or "import numpy as np"?
>>
>> Although a good case could probably be made for all three (*, np,
>> numpy), I think that if "import numpy as np" is to be put forward as
>> the standard coding style, the examples should use this as well.
>>
>> -Rob

Hi,

I prefer using 'import numpy' over 'import numpy as np'. But as long as each example has 'import numpy as np' included, then I have no objections. The main reason for this is that the block of code can then easily be copied and pasted to run as a complete entity. Also, this type of implicit assumption often gets missed, because such assumptions are often far from the example (missed in web searches) or overlooked because the reader doesn't realize that part was important.

Regards,
Bruce
[Numpy-discussion] Different attributes for NumPy types
Hi,

Is it a bug if different NumPy types have different attributes?

Based on prior discussion, 'complex', 'float' and 'int' are Python types and the others are NumPy types. Consequently 'complex', 'float' and 'int' do not inherit from NumPy. However, an element from an array created using dtype=numpy.float has the numpy.float64 type, so this is really more a documentation issue than an implementation issue.

Also, different NumPy types have different attributes; for example, 'float64' contains attributes (e.g. __coerce__) that are not present in 'float32' and 'float128' (these two have the same attributes). This can cause attribute errors in somewhat contrived examples that are probably unlikely to appear in practice, because of the casting involved in array creation. The 'uint' types all seem to have the same attributes, so they do not have these issues.

    import numpy
    len(dir(float))          # 47
    len(dir(numpy.float))    # 47
    len(dir(numpy.float32))  # 131
    len(dir(numpy.float64))  # 135
    len(dir(numpy.float128)) # 131

    len(dir(int))            # 54
    len(dir(numpy.int))      # 54
    len(dir(numpy.int0))     # 135
    len(dir(numpy.int16))    # 132
    len(dir(numpy.int32))    # 132
    len(dir(numpy.int64))    # 135
    len(dir(numpy.int8))     # 132

    print (numpy.float64(1234).size)  # 1
    print (numpy.float(1234).size)
    '''
    prints error:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'float' object has no attribute 'size'
    '''

Regards,
Bruce
[Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
Hi,

I was just looking around at the new numpy documentation and got an xhtml parsing error on this page (with Firefox):

http://mentat.za.net/numpy/refguide/random.xhtml#index-29351

The offending line contains

    $X pprox prod_{i=1}^{k}{x^{lpha_i-1}_i}$

in the docstring of the dirichlet distribution. The corresponding line in the source at
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/mtrand/mtrand.pyx
is

    .. math:: X \\approx \\prod_{i=1}^{k}{x^{\\alpha_i-1}_i}

(I have no idea why it seems not to parse \\a correctly.)

While looking for this, I found that the Dirichlet distribution is missing from the new Docstring Wiki,
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random

Then I saw that dirichlet is also missing from __all__ in
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py

As a consequence, numpy.lookfor does not find dirichlet:

    >>> numpy.lookfor('dirichlet')
    Search results for 'dirichlet'
    ------------------------------

With "import numpy.random", dir(numpy.random) contains dirichlet, but numpy.random.__all__ does not.

To me this seems to be a documentation bug.

Josef
Re: [Numpy-discussion] Different attributes for NumPy types
Bruce Southey wrote:
> Hi,
> Is it a bug if different NumPy types have different attributes?

I don't think so, other than perhaps we should not have the Python types in the numpy namespace. numpy.float is just __builtin__.float, which is a Python type, not a NumPy data-type object. numpy.float64 inherits from numpy.float, however.

-Travis
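A quick interactive check of the inheritance Travis describes (against a NumPy of this era; numpy.float was removed from the namespace much later):

    >>> import numpy
    >>> issubclass(numpy.float64, float)
    True
    >>> issubclass(numpy.float32, float)   # no Python base class here
    False
    >>> numpy.float is float               # just the builtin, re-exported
    True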
Re: [Numpy-discussion] Multiple Boolean Operations
Alan G Isaac wrote:
> On Thu, 22 May 2008, Andrea Gavana apparently wrote:
>> # Filter cells which do not satisfy Z requirements:
>> zReq = zMin <= zCent <= zMax
>
> This seems to raise a question: should numpy arrays support this
> standard Python idiom?

It would be nice, but alas it requires a significant change to Python first to give us the hooks to modify. (We need the 'and' and 'or' operations to return vectors instead of just numbers as they do now.) There is a PEP to allow this, but it has not received much TLC as of late. The difficulty in the implementation is supporting short-circuited evaluation.

-Travis
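A small illustration of the limitation: the chained form implicitly calls 'and' on two boolean arrays, whose truth value is ambiguous, so numpy raises an error; the element-wise & form used throughout this thread is the workaround (the sample values here are made up):

    import numpy as np

    zCent = np.array([50.0, 150.0, 250.0, 350.0])
    zMin, zMax = 120.0, 300.0

    try:
        zReq = zMin <= zCent <= zMax     # expands to 'and' between arrays
    except ValueError as err:
        print(err)                       # truth value of an array is ambiguous

    zReq = (zCent >= zMin) & (zCent <= zMax)   # element-wise instead
    print(np.nonzero(zReq)[0])                 # indices in range: [1 2]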
[Numpy-discussion] Fancier indexing
After poking around for a bit, I was wondering if there is a faster method for the following:

    # Array of index values 0..n
    items = numpy.array([0,3,2,1,4,2], dtype=int)

    # Count the number of occurrences of each index
    counts = numpy.zeros(5, dtype=int)
    for i in items:
        counts[i] += 1

In my real code, 'items' contains up to a million values and this loop will be in a performance-critical area of code. If there is no simple solution, I can trivially code this using the C API.

Thanks,
-Kevin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs [EMAIL PROTECTED] wrote:
> After poking around for a bit, I was wondering if there is a faster
> method for the following:
> [...]
> In my real code, 'items' contains up to a million values and this loop
> will be in a performance-critical area of code.

How big is n? If it is much smaller than a million, then loop over that instead.
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 12:08 PM, Keith Goodman [EMAIL PROTECTED] wrote:
> How big is n? If it is much smaller than a million, then loop over that
> instead.

n is always relatively small, but I'd rather not do:

    for i in range(n):
        counts[i] = (items == i).sum()

If that is the best alternative, I'd just bite the bullet and code this in C.

Thanks,
-Kevin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:08 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs [EMAIL PROTECTED] wrote:
>> In my real code, 'items' contains up to a million values and this loop
>> will be in a performance-critical area of code.
>
> How big is n? If it is much smaller than a million, then loop over that
> instead.

Or how about using a list instead:

    >>> items = [0,3,2,1,4,2]
    >>> uitems = frozenset(items)
    >>> count = [items.count(i) for i in uitems]
    >>> count
    [1, 1, 2, 1, 1]
Re: [Numpy-discussion] Fancier indexing
You're just trying to do this... correct?

    >>> import numpy
    >>> items = numpy.array([0,3,2,1,4,2], dtype=int)
    >>> unique = numpy.unique(items)
    >>> unique
    array([0, 1, 2, 3, 4])
    >>> counts = numpy.histogram(items, unique)
    >>> counts
    (array([1, 1, 2, 1, 1]), array([0, 1, 2, 3, 4]))
    >>> counts[0]
    array([1, 1, 2, 1, 1])

On Thu, May 22, 2008 at 9:08 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> How big is n? If it is much smaller than a million, then loop over that
> instead.
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:15 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> Or how about using a list instead:
>
>     >>> items = [0,3,2,1,4,2]
>     >>> uitems = frozenset(items)
>     >>> count = [items.count(i) for i in uitems]
>     >>> count
>     [1, 1, 2, 1, 1]

Oh, I see -- so uitems should be range(n).
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs [EMAIL PROTECTED] wrote:
> After poking around for a bit, I was wondering if there is a faster
> method for the following:
> [...]
> In my real code, 'items' contains up to a million values and this loop
> will be in a performance-critical area of code.

I would use bincount:

    count = bincount(items)

should be all you need:

    In [192]: items = [0,3,2,1,4,2]

    In [193]: bincount(items)
    Out[193]: array([1, 1, 2, 1, 1])

    In [194]: bincount?
    Type:           builtin_function_or_method
    Base Class:     <type 'builtin_function_or_method'>
    String Form:    <built-in function bincount>
    Namespace:      Interactive
    Docstring:
        bincount(x, weights=None)

        Return the number of occurrences of each value in x.

        x must be a list of non-negative integers. The output, b[i],
        represents the number of times that i is found in x. If weights
        is specified, every occurrence of i at a position p contributes
        weights[p] instead of 1.

        See also: histogram, digitize, unique.

Robin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:22 AM, Robin [EMAIL PROTECTED] wrote:
> I would use bincount:
>
>     count = bincount(items)
>
> should be all you need.

I guess bincount is a *little* faster:

    >> items = mp.random.randint(0, 100, (100,))
    >> timeit mp.bincount(items)
    100 loops, best of 3: 4.05 ms per loop

    >> items = items.tolist()
    >> timeit [items.count(i) for i in range(100)]
    10 loops, best of 3: 2.91 s per loop
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On May 22, 11:11 am, joep [EMAIL PROTECTED] wrote:
> Hi,
> When looking for this, I found that the Dirichlet distribution is
> missing from the new Docstring Wiki,
> http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random

Actually, a search on the wiki finds dirichlet at
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/dirichlet

I found random/mtrand only through the search; it doesn't seem to be linked from anywhere.

Is it intentional that functions that are imported inside numpy might have the same docstring assigned to several different wiki pages, and might get edited on different pages? Since all distributions (except for dirichlet) are included in numpy.random.__all__, these distributions show up on two different pages, e.g.

http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/poisson
and
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/poisson

So, except for the strange parsing of the dirichlet docstring, this is a problem with numpy: numpy.random.__all__ as defined in numpy/random/info.py does not expose dirichlet.

Josef
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
>> You clearly have a large dataset, otherwise speed wouldn't have been a
>> concern to you. You can do your operation in one pass over the data,
>> and I'd suggest you try doing that with Cython or Ctypes.
>
> First of all, thank you for your answer. I know next to nothing about
> Cython and very little about Ctypes, but it would be nice to have an
> example of how to use them to speed up the operations.
>
> Actually, I don't really know if my dataset counts as large: I normally
> work with xCent, yCent and zCent vectors of about 100,000-300,000
> elements.

Just to clarify things in my mind: is VTK *that* slow? I find that surprising, since it is written in C or C++.

Regards
Stéfan
[Numpy-discussion] osX leopard linker setting
Hi,

does anybody know the linker settings for Python C modules on OS X? I have the original Xcode 3 tools installed -- gcc 4.0.1. I use '-bundle -flat_namespace' for linking now, but I get:

    ld: can't insert lazy pointers, __dyld section not found for inferred architecture ppc

Does anybody know of flags that work for the linker?

Thank you,
Thomas
Re: [Numpy-discussion] osX leopard linker setting
By the way, what's a lazy pointer, anyway?

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Thomas Hrabe
Sent: Thu 22.05.2008 11:22
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] osX leopard linker setting

> Hi,
>
> does anybody know the linker settings for Python C modules on OS X?
> [...]
> Does anybody know of flags that work for the linker?
Re: [Numpy-discussion] osX leopard linker setting
On Thu, May 22, 2008 at 1:22 PM, Thomas Hrabe [EMAIL PROTECTED] wrote:
> Hi,
>
> does anybody know the linker settings for Python C modules on OS X? I
> have the original Xcode 3 tools installed -- gcc 4.0.1.

Just use distutils (or numpy.distutils). It will take care of the linker flags for you. If you really can't use distutils for some reason, take a look at the flags that are added for modules that do build with distutils.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On May 22, 1:30 pm, Pauli Virtanen [EMAIL PROTECTED] wrote:
> On Thu, 2008-05-22 at 09:51 -0700, joep wrote:
> It is not intentional. And for the majority of cases this does not
> happen, and I can fix this for numpy.random.mtrand. Thanks for
> reporting.

I was looking some more at the __all__ statements and trying to figure out the system/idea behind the imports and the exposure of functions at different places. I did not find any other full duplication like mtrand so far.

However, when I do a search on the DocWiki, for example for arccos (or log, log10, exp, tan, ...), I see it 9 times, and it is not clear which ones refer to the same docstring, where several imports of the same function are picked up separately, and which ones refer to actually different functions in the source.

numpy.lookfor('arccos') yields 3 results, with 3 different docstrings; the other 6 might be duplicates.

http://sd-2116.dedibox.fr/doc/Docstrings/numpy/lib/scimath/arccos has the most informative docstring. In numpy it is exposed as ``numpy.emath.arccos``.

A recommendation for docstring editing might be to check for duplicates, copy docstrings when the function is (almost) duplicated or triplicated in the numpy source, and possibly cross-link the different versions. When I start from the DocWiki front page, I seem to be able to follow links to only one version of any docstring, but any search leads to the multiple exposures of the same function.

Josef
[Numpy-discussion] workaround for searchsorted with strings?
Hello --

I see from this thread:

http://article.gmane.org/gmane.comp.python.numeric.general/18746/

that searchsorted does not work correctly with strings. Is there a workaround, though, that I can use with 1.0.4 until there is a new official numpy release that includes the fix mentioned in the reference above? Using the latest SVN version is not an option for me.

My understanding was that searchsorted works OK if the strings are all the same data type, but that does not appear to be the case:

    p x.searchsorted(y) with x=array(['0', '1', '2', '12'])
                        and  y=array(['0', '0', '2', '3', '123'])
    array([0, 0, 0, 2, 0])

    p x.astype(y.dtype).searchsorted(y)
    array([0, 0, 2, 4, 2])

I understand that the first call to searchsorted fails because y has type S3 and x has type S2. But it seems that changing the type of x still produces incorrect (albeit different) results. Is there something similar I can do to make this work for now?

Thanks very much.
-Lewis
Re: [Numpy-discussion] workaround for searchsorted with strings?
Oh sorry, my example was dumb; never mind. It looks like this way does work after all. Can someone please confirm for me, though, that the workaround I am using (just changing to the wider string type) is reliable?

Thanks, and sorry for the noise.
-Lewis
Re: [Numpy-discussion] workaround for searchsorted with strings?
On Thu, May 22, 2008 at 12:29 PM, Lewis Hyatt [EMAIL PROTECTED] wrote:
> My understanding was that searchsorted works OK if the strings are all
> the same data type, but that does not appear to be the case:
>
>     x = array(['0', '1', '2', '12'])
>     y = array(['0', '0', '2', '3', '123'])
>     x.searchsorted(y)
>     array([0, 0, 0, 2, 0])

The x array is not sorted. Try

    x = array(['0', '1', '12', '2'])
[Numpy-discussion] buglet: Dirichlet missing in numpy.random.__all__ as defined in numpy/random/info.py
The Dirichlet distribution is missing in __all__ in
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py

As a consequence, numpy.lookfor does not find dirichlet:

    >>> numpy.lookfor('dirichlet')
    Search results for 'dirichlet'
    ------------------------------

With "import numpy.random", dir(numpy.random) contains dirichlet, but numpy.random.__all__ does not.

Looks like a tiny bug.

Josef

(This is kind of a duplicate email, but I didn't want it to get lost in the DocWiki discussion.)
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, May 22, 2008 at 12:26 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> Just to clarify things in my mind: is VTK *that* slow? I find that
> surprising, since it is written in C or C++.

Performance can depend more on the design of the code than on the implementation language. There are several places in VTK which are slower than they strictly could be, because VTK exposes data primarily through abstract interfaces and only sometimes exposes the underlying data structures for faster processing. Quite sensibly, they implement the general form first.

It's much the same with parts of numpy. The iterator abstraction lets you work on arbitrarily strided arrays, but for contiguous arrays, just using the pointer lets you, and the compiler, optimize your code more.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
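For what it's worth, the strided/contiguous distinction Robert mentions is visible from Python (a sketch; the fast pointer path itself lives in C):

    import numpy as np

    a = np.arange(12.0).reshape(3, 4)   # C-contiguous float64
    print(a.strides)                    # (32, 8): bytes to step per axis
    b = a[:, ::2]                       # a view: same buffer, larger stride
    print(b.strides)                    # (32, 16)
    print(b.flags['C_CONTIGUOUS'])      # False: needs the general iterator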
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On Thu, 2008-05-22 at 11:28 -0700, joep wrote:
[clip]
> However, when I do a search on the DocWiki, for example for arccos (or
> log, log10, exp, tan, ...), I see it 9 times, and it is not clear which
> ones refer to the same docstring, where several imports of the same
> function are picked up separately, and which ones refer to actually
> different functions in the source.
[clip]
> A recommendation for docstring editing might be to check for duplicates
> and copy docstrings if the function is duplicated or triplicated in the
> numpy source, and possibly cross-link the different versions.

This is a problem with the tool's handling of extension objects and Pyrex-generated classes, and the editors shouldn't have to concern themselves with it. I'll fix it and remove any unedited duplicates from the wiki.

Pauli
Re: [Numpy-discussion] workaround for searchsorted with strings?
On Thu, May 22, 2008 at 12:36 PM, Lewis Hyatt [EMAIL PROTECTED] wrote:
> Oh sorry, my example was dumb; never mind. It looks like this way does
> work after all. Can someone please confirm for me, though, that the
> workaround I am using (just changing to the wider string type) is
> reliable?

You can still have problems, because the numpy strings will be padded out with zeros, and the string compare in 1.0.4 doesn't handle zeros correctly. This might cause some problems.

Chuck
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote:
> Bruce Southey wrote:
>> Hi,
>> Is it a bug if different NumPy types have different attributes?
>
> I don't think so, other than perhaps we should not have the Python types
> in the numpy namespace. numpy.float is just __builtin__.float, which is
> a Python type, not a NumPy data-type object. numpy.float64 inherits from
> numpy.float, however.

And I believe this is the cause of the difference between the attributes of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0 and int64 on your presumably 64-bit platform.

--
Robert Kern
[Numpy-discussion] C API
All,

I added a function to array_api_order.txt, and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should have only one list instead of the current two. This looks to require some mods to the build system. What do folks think?

Chuck
Re: [Numpy-discussion] Multiple Boolean Operations
Hi All,

On Thu, May 22, 2008 at 7:46 PM, Robert Kern wrote:
> Performance can depend more on the design of the code than on the
> implementation language. There are several places in VTK which are
> slower than they strictly could be, because VTK exposes data primarily
> through abstract interfaces and only sometimes exposes the underlying
> data structures for faster processing. Quite sensibly, they implement
> the general form first.

Yes, Robert is perfectly right. VTK is quite handy in most situations, but in this case I had to recursively apply 3 thresholds (one each for X, Y and Z), and the threshold construction (initialization) and execution were much slower than my (sloppy) numpy result. Compared to the solution Francesc posted, the VTK approach simply disappears.

By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized, as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10,800 sub-grids... and 10800*0.01 is close to 2 minutes :-(

Otherwise, I will try to implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension does not kill the execution time.

Thank you very much for your suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Different attributes for NumPy types
Hi,

Thanks very much for the confirmation.

Bruce

On Thu, May 22, 2008 at 2:09 PM, Robert Kern [EMAIL PROTECTED] wrote:
> And I believe this is the cause of the difference between the attributes
> of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0
> and int64 on your presumably 64-bit platform.
Re: [Numpy-discussion] Multiple Boolean Operations
Andrea Gavana wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? yep -- if I've got this right, the above creates 7 temporary arrays. Creating that many and pushing the data in and out of memory can be pretty slow for large arrays. In C, C++, Cython or Fortran, you can just do one loop, and one output array. It should be much faster for the big arrays. Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. nope, that's one function call for the whole thing, negligible. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception [EMAIL PROTECTED] ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
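One hedged way to approach the single-pass ideal without leaving NumPy (a sketch of my own, not code from this thread) is to route each comparison into a preallocated boolean buffer via the ufuncs' output argument, so only two scratch arrays ever exist; xCent, yCent, zCent and the bounds are assumed defined as above:

    import numpy

    n = len(xCent)
    mask = numpy.empty(n, dtype=bool)   # final result, accumulated in place
    tmp = numpy.empty(n, dtype=bool)    # reusable scratch buffer

    numpy.greater_equal(xCent, xMin, mask)   # mask = xCent >= xMin
    numpy.less_equal(xCent, xMax, tmp)
    mask &= tmp
    numpy.greater_equal(yCent, yMin, tmp)
    mask &= tmp
    numpy.less_equal(yCent, yMax, tmp)
    mask &= tmp
    numpy.greater_equal(zCent, zMin, tmp)
    mask &= tmp
    numpy.less_equal(zCent, zMax, tmp)
    mask &= tmp

This still makes six passes over the data, but allocates no fresh temporaries beyond the two buffers.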
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 1:32 PM, Bruce Southey [EMAIL PROTECTED] wrote: Hi, Thanks very much for the confirmation. Bruce On Thu, May 22, 2008 at 2:09 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Bruce Southey wrote: Hi, Is it a bug if different NumPy types have different attributes? I don't think so, other than perhaps we should not have the Python types in the numpy namespace. numpy.float is just __builtin__.float, which is a Python type, not a NumPy data-type object. numpy.float64 inherits from numpy.float, however. And I believe this is the cause of the difference between the attributes of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0 and int64 on your presumably 64-bit platform. It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
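For reference, a quick sketch of the two idioms Robert recommends (the array here is just an example):

    import numpy

    x = numpy.array([[1]])
    a = x.astype(numpy.float32)                # always returns an array of the requested dtype
    b = numpy.asarray(x, dtype=numpy.float64)  # same, and avoids a copy when the dtype already matches

Both forms preserve the array's shape, unlike calling the scalar type directly.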
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, May 22, 2008 at 2:16 PM, Andrea Gavana [EMAIL PROTECTED] wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

You could implement this with inplace operations to save memory:

    xyzReq = (xCent >= xMin)
    xyzReq &= (xCent <= xMax)
    xyzReq &= (yCent >= yMin)
    xyzReq &= (yCent <= yMax)
    xyzReq &= (zCent >= zMin)
    xyzReq &= (zCent <= zMax)

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster A C implementation would certainly be faster, perhaps 5x faster, due to short-circuiting the AND operations and the fact that you'd only pass over the data once. OTOH I'd be very surprised if this is the slowest part of your application. -- Nathan Bell [EMAIL PROTECTED] http://graphics.cs.uiuc.edu/~wnbell/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea 2008/5/22 Andrea Gavana [EMAIL PROTECTED]: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. I wrote a quick proof of concept (no guarantees). You can find it here (download using bzr, http://bazaar-vcs.org, or just grab the files with your web browser): https://code.launchpad.net/~stefanv/+junk/xyz

1. Install Cython if you haven't already
2. Run "python setup.py build_ext -i" to build the C extension
3. Use the code, e.g.,

    import xyz
    out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
                     array([2.0, 4.0, 6.0]), 2, 4,
                     array([-1.0, -2.0, -4.0]), -3, -2)

In the above case, out is [False, True, False]. Regards Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
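For readers who don't want to build the extension, here is a pure-Python reference for what xyz.filter appears to compute -- an inclusive bounds test on each coordinate, ANDed together. This is inferred from the usage example above, not taken from Stéfan's source:

    import numpy

    def filter_py(x, xmin, xmax, y, ymin, ymax, z, zmin, zmax):
        # True wherever every coordinate lies inside its [min, max] interval
        return ((x >= xmin) & (x <= xmax) &
                (y >= ymin) & (y <= ymax) &
                (z >= zmin) & (z <= zmax))

    out = filter_py(numpy.array([1.0, 2.0, 3.0]), 2, 5,
                    numpy.array([2.0, 4.0, 6.0]), 2, 4,
                    numpy.array([-1.0, -2.0, -4.0]), -3, -2)
    # out -> array([False,  True, False]), matching the example above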
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco So, should these return an error if the argument is an ndarray object, a list or similar? Otherwise, int, float and string arguments would be okay under the assumption that people would like variable-precision scalars. Bruce ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Stéfan van der Walt wrote: I wrote a quick proof of concept (no guarantees). Thanks for the example -- I like how Cython understands ndarrays! It looks like this code would break if x, y, and z are not C-contiguous -- should there be a check for that? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception [EMAIL PROTECTED] ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
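If contiguity is a concern, one defensive pattern (a sketch, not part of Stéfan's code) is to normalize the inputs before handing them to the extension; numpy.ascontiguousarray copies only when necessary:

    import numpy

    def as_c_contig(a):
        # returns a C-contiguous float64 version of a; no copy if already so
        return numpy.ascontiguousarray(a, dtype=numpy.float64)

    # hypothetical usage with the extension from this thread:
    # out = xyz.filter(as_c_contig(x), xmin, xmax, ...)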
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Chris and All, On Thu, May 22, 2008 at 8:40 PM, Christopher Barker wrote: Andrea Gavana wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? yep -- if I've got this right, the above creates 7 temporary arrays. Creating that many and pushing the data in and out of memory can be pretty slow for large arrays. In C, C++, Cython or Fortran, you can just do one loop, and one output array. It should be much faster for the big arrays. Well, I have implemented it in 2 ways in Fortran, and actually the Fortran solutions are slower than the numpy one (2 and 3 times slower respectively). I attach the source code of the timing code and the 5 implementations I have at the moment (I have included Nathan's implementation, which is as fast as Francesc's one but has the advantage of saving memory). The timings I get on my home PC are:

    Andrea's Solution:   0.42807561 Seconds/Trial
    Francesc's Solution: 0.018297884 Seconds/Trial
    Fortran Solution 1:  0.035862072 Seconds/Trial
    Fortran Solution 2:  0.029822338 Seconds/Trial
    Nathan's Solution:   0.018930507 Seconds/Trial

Maybe my fortran coding is sloppy, but I don't really know fortran well enough to implement it better... Thank you so much to everybody for your suggestions so far :-D Andrea. Imagination Is The Only Weapon In The War Against Reality. http://xoomer.alice.it/infinity77/

    # Begin Code

    import numpy
    from timeit import Timer

    # FORTRAN modules
    from MultipleBoolean3 import multipleboolean3
    from MultipleBoolean4 import multipleboolean4

    # Number of cells in my original grid
    nCells = 150000

    # Define some constraints for X, Y, Z
    xMin, xMax = 250.0, 700.0
    yMin, yMax = 1000.0, 1900.0
    zMin, zMax = 120.0, 300.0

    # Generate random centroids for the cells
    xCent = 1000.0*numpy.random.rand(nCells)
    yCent = 2500.0*numpy.random.rand(nCells)
    zCent = 400.0*numpy.random.rand(nCells)


    def MultipleBoolean1():
        """Andrea's solution, slow :-( ."""

        xReq_1 = numpy.nonzero(xCent >= xMin)
        xReq_2 = numpy.nonzero(xCent <= xMax)

        yReq_1 = numpy.nonzero(yCent >= yMin)
        yReq_2 = numpy.nonzero(yCent <= yMax)

        zReq_1 = numpy.nonzero(zCent >= zMin)
        zReq_2 = numpy.nonzero(zCent <= zMax)

        xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
        yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
        zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

        xyReq = numpy.intersect1d_nu(xReq, yReq)
        xyzReq = numpy.intersect1d_nu(xyReq, zReq)


    def MultipleBoolean2():
        """Francesc's solution, much faster :-) ."""

        xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
                 (yCent >= yMin) & (yCent <= yMax) & \
                 (zCent >= zMin) & (zCent <= zMax)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean3():
        xyzReq = multipleboolean3(xCent, yCent, zCent, xMin, xMax,
                                  yMin, yMax, zMin, zMax, nCells)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean4():
        xyzReq = multipleboolean4(xCent, yCent, zCent, xMin, xMax,
                                  yMin, yMax, zMin, zMax, nCells)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean5():
        xyzReq = (xCent >= xMin)
        xyzReq &= (xCent <= xMax)
        xyzReq &= (yCent >= yMin)
        xyzReq &= (yCent <= yMax)
        xyzReq &= (zCent >= zMin)
        xyzReq &= (zCent <= zMax)
        xyzReq = numpy.nonzero(xyzReq)[0]


    if __name__ == "__main__":

        trial = 10

        t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
        print "\n\nAndrea's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
        print "Francesc's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean3()", "from __main__ import MultipleBoolean3")
        print "Fortran Solution 1: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean4()", "from __main__ import MultipleBoolean4")
        print "Fortran Solution 2: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean5()", "from __main__ import MultipleBoolean5")
        print "Nathan's Solution: %0.8g Seconds/Trial\n" % (t.timeit(number=trial)/trial)

    # End Code

MultipleBoolean3.f90 Description: Binary data MultipleBoolean4.f90 Description: Binary data ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Stefan, On Thu, May 22, 2008 at 10:23 PM, Stéfan van der Walt wrote: Hi Andrea 2008/5/22 Andrea Gavana [EMAIL PROTECTED]: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. I wrote a quick proof of concept (no guarantees). You can find it here (download using bzr, http://bazaar-vcs.org, or just grab the files with your web browser): https://code.launchpad.net/~stefanv/+junk/xyz

1. Install Cython if you haven't already
2. Run "python setup.py build_ext -i" to build the C extension
3. Use the code, e.g.,

    import xyz
    out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
                     array([2.0, 4.0, 6.0]), 2, 4,
                     array([-1.0, -2.0, -4.0]), -3, -2)

In the above case, out is [False, True, False]. Thank you very much for this! I am going to try it and time it, comparing it with the other implementations. I think I need to study a bit your code as I know almost nothing about Cython :-D Thank you! Andrea. Imagination Is The Only Weapon In The War Against Reality. http://xoomer.alice.it/infinity77/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. Otherwise, int, float and string arguments would be okay under the assumption that people would like variable-precision scalars. Yes. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 5:07 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. I agree. Having too many ways to do things just makes for headaches. Should we schedule a deprecation for anything other than scalars and strings? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea 2008/5/23 Andrea Gavana [EMAIL PROTECTED]: Thank you very much for this! I am going to try it and time it, comparing it with the other implementations. I think I need to study a bit your code as I know almost nothing about Cython :-D That won't be necessary -- the Fortran implementation is guaranteed to win! Just to make sure, I timed it anyway (on somewhat larger arrays):

    Francesc's Solution: 0.062797403 Seconds/Trial
    Fortran Solution 1:  0.050316906 Seconds/Trial
    Fortran Solution 2:  0.052595496 Seconds/Trial
    Nathan's Solution:   0.055562282 Seconds/Trial
    Cython Solution:     0.06250751 Seconds/Trial

Nathan's version runs over the data 6 times, and still does better than the Pyrex version. I don't know why! But, hey, this algorithm is parallelisable! Wait, no, it's bedtime. Regards Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
Charles R Harris wrote: On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? That's fine with me. I can't remember why there were 2 separate lists. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
Charles R Harris wrote: On Thu, May 22, 2008 at 5:07 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. I agree. Having too many ways to do things just makes for headaches. Should we schedule a deprecation for anything other than scalars and strings? I don't have a strong opinion either way. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] triangular matrix fill
I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

Thanks Tom ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 6:36 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? That's fine with me. I can't remember why there were 2 separate lists. OK. Another question: why do __ufunc_api.h and __multiarray_api.h have double underscore prefixes? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] triangular matrix fill
On Thu, May 22, 2008 at 7:19 PM, Tom Waite [EMAIL PROTECTED] wrote: I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

You can use numpy.outer(V1,V2) and just ignore everything on and above the diagonal.

    In [1]: x = arange(3)

    In [2]: y = arange(3,6)

    In [3]: outer(x,y)
    Out[3]:
    array([[ 0,  0,  0],
           [ 3,  4,  5],
           [ 6,  8, 10]])

You can mask the upper part if you want:

    In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3))
    Out[16]:
    array([[0, 0, 0],
           [3, 0, 0],
           [6, 8, 0]])

Or you could use fromfunction directly. Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] triangular matrix fill
On Thu, May 22, 2008 at 9:07 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 7:19 PM, Tom Waite [EMAIL PROTECTED] wrote: I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

You can use numpy.outer(V1,V2) and just ignore everything on and above the diagonal.

    In [1]: x = arange(3)

    In [2]: y = arange(3,6)

    In [3]: outer(x,y)
    Out[3]:
    array([[ 0,  0,  0],
           [ 3,  4,  5],
           [ 6,  8, 10]])

You can mask the upper part if you want:

    In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3))
    Out[16]:
    array([[0, 0, 0],
           [3, 0, 0],
           [6, 8, 0]])

Or you could use fromfunction directly. Or numpy.tril(). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
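Combining the two suggestions, a sketch of Tom's strictly-lower-triangular fill (the k=-1 argument to numpy.tril keeps only the elements below the diagonal, matching the C loop's j < i condition; V1 and V2 here are just example vectors):

    import numpy

    V1 = numpy.arange(4.0)        # example data
    V2 = numpy.arange(4.0, 8.0)

    # Matrix[i, j] = V1[i] * V2[j] for j < i, and 0 elsewhere
    Matrix = numpy.tril(numpy.outer(V1, V2), -1)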