[Numpy-discussion] Masked Array Usage Problems
I am trying out masked arrays for the first time and having some problems. I have a 2-D image with dtype=numpy.int16. I create a mask of all False so that no pixels are masked out. I calculate the mean of the original image and it comes out at ~597. I calculate the mean of the masked array and it comes out differently, around -179. It produces the same negative mean value no matter what mask I try (all True, all False, etc.). Furthermore, there are no negative samples in the entire array. Any ideas on what I am doing wrong? Here is some sample code showing the behavior:

In [1]: img.dtype, img.shape
Out[1]: (dtype('int16'), (3200, 3456))
In [2]: mask = numpy.zeros(img.shape, dtype=bool)
In [3]: imgma = ma.masked_array(img, mask)
In [4]: img.mean()
Out[4]: 597.15437617549185
In [5]: imgma.mean()
Out[5]: -179.56858678747108
In [6]: imgma.min()
Out[6]: 25
In [7]: numpy.__version__
Out[7]: '1.3.0'
In [8]: numpy.ma.__version__
Out[8]: '1.0'

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Masked Array Usage Problems
On Sat, Apr 10, 2010 at 3:49 AM, Lane Brooks l...@brooks.nu wrote:
> I am trying out masked arrays for the first time and having some problems. [...] I calculate the mean of the original image and it comes out ~597. I calculate the mean of the masked array and it comes out differently, around -179. [...] Furthermore, there are no negative samples in the entire array. Any ideas on what I am doing wrong?

Just a guess until Pierre replies: it looks to me like an integer overflow bug. Can you try imgma.mean(dtype=float) to do the accumulation with floating points?

Josef
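Josef's overflow guess can be illustrated with a small sketch (the array contents below are made up for illustration; the point is that values which each fit in int16 can have a total that does not):

```python
import numpy as np

# Synthetic stand-in for the image: every value fits comfortably in int16,
# but the total of all values does not.
img = np.full((3200, 3456), 600, dtype=np.int16)

# Forcing an int16 accumulator reproduces the overflow symptom: the sum
# wraps around modulo 2**16 and is nowhere near the true total.
bad_sum = np.add.reduce(img.ravel(), dtype=np.int16)

# A float accumulator -- the suggested workaround -- gives the right answer.
good_mean = img.mean(dtype=float)
print(good_mean)  # 600.0
```

This is why forcing `dtype=float` in the reduction is the suggested workaround.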
Re: [Numpy-discussion] numpy build system questions for use in another project (fwrap)
Kurt Smith wrote:
> On Fri, Apr 9, 2010 at 2:25 AM, David da...@silveregg.co.jp wrote:
>> On 04/07/2010 11:52 AM, Kurt Smith wrote:
>>> Briefly, I'm encountering difficulties getting things working in numpy distutils for fwrap's build system. Here are the steps I want the build system to accomplish:
>>> 1) Compile a directory of Fortran 90 source code -- this works. The .mod files generated by this compilation step are put in the build directory.
>> This is difficult -- Fortran modules are a PITA from a build perspective. Many compilers don't seem to have a way to control exactly where to put the generated .mod files, so the only way I am aware of to control this is to cwd the process into the build directory...
> From what I can tell, numpy's distutils already takes care of putting the .mod files in the build directory (at least for gfortran and ifort) by manually moving them there -- see numpy.distutils.command.build_ext.build_ext.build_extension, after the line "if fmodule_source". This is fine for my purposes. All I need is the project's .mod files put in one place before the next step.
>> This was also a problem when I worked on fortran support for waf (see http://groups.google.com/group/waf-users/browse_thread/thread/889e2a5e5256e420/84ee939e93c9e30f?lnk=gstq=fortran+modules#84ee939e93c9e30f )
> Yes, I saw that thread and it dissuaded me from pursuing waf for now. I'd like to take a stab at it sometime down the road, though.
> My problem is in instantiating numpy.distutils.config such that it is appropriately configured with command line flags. I've tried the following with no success ('self' is a build_ext instance):
>     cfg = self.distribution.get_command_obj('config')
>     cfg.initialize_options()
>     cfg.finalize_options()  # doesn't do what I hoped it would do
> This creates a config object, but it doesn't use the command line flags (e.g. --fcompiler=gfortran doesn't affect the fortran compiler used).
>> Why don't you do the testing in config? That's how things are done.
> Fortran's .mod files are essentially compiler-generated header files; fwrap needs to use these 'headers' to get type information so it can figure out how the C types and Fortran types match up. Fwrap then generates the config files with this information and compiles the wrappers with the other source files.

I hope there will at least be an option to supply .mod files instead of using this mechanism. For wrapper projects which (presumably) already have a build system for the Fortran code set up, it seems more reasonable to me to just refer to the output of the existing build.

In particular, I don't like the situation today with Python wrappers around C code, where the C code files are often copied into the Python wrapper project. I hope the same won't happen with fwrap, i.e. that people don't copy the Fortran sources into the fwrap wrapper just to make things easy to build (but end up forking the project and not keeping it up to date). Of course, if one is not wrapping an existing module but rather developing an application with fwrap, then the situation is different, I suppose.

--
Dag Sverre
[Numpy-discussion] Proposal for new ufunc functionality
Hi,

I've been mulling over a couple of ideas for new ufunc methods plus a couple of numpy functions that I think will help implement group-by operations with NumPy arrays. I wanted to discuss them on this list before putting forward an actual proposal or patch, to get input from others.

The group-by operation is very common in relational algebra, and NumPy arrays (especially structured arrays) can often be seen as a database table. There are common and easy-to-implement approaches for select and other relational-algebra concepts, but group-by basically has to be implemented yourself.

Here are my suggested additions to NumPy:

ufunc methods:

* reduceby(array, by, sorted=1, axis=0)
  array is the array to reduce; by is the array that provides the grouping (can be a structured array or a list of arrays); if sorted is 1, then possibly a faster algorithm can be used.

* reducein(array, indices, axis=0)
  similar to reduceat, but the indices provide both the start and end points (rather than being fence-posts like reduceat).

numpy functions (or methods):

* segment(array)
  produce an array of integers identifying the different regions of an array: segment([10,20,10,20,30,30,10]) would produce [0,1,0,1,2,2,0].

* edges(array, at=True)
  produce an index array providing the edges, with either fence-post-like syntax for reduceat or both boundaries as for reducein.

Thoughts on the general idea?

-Travis

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliph...@enthought.com
Re: [Numpy-discussion] Proposal for new ufunc functionality
Sat, 2010-04-10 at 12:23 -0500, Travis Oliphant wrote:
[clip]
> Here are my suggested additions to NumPy: ufunc methods:
[clip]
> * reducein(array, indices, axis=0) similar to reduceat, but the indices provide both the start and end points (rather than being fence-posts like reduceat).

Is `reducein` important to have, as compared to `reduceat`?

[clip]
> numpy functions (or methods):

I'd prefer functions here. ndarray already has a huge number of methods.

> * segment(array) produce an array of integers identifying the different regions of an array: segment([10,20,10,20,30,30,10]) would produce [0,1,0,1,2,2,0].

Sounds like `np.digitize(x, bins=np.unique(x)) - 1`. What would the behavior be with structured arrays?

> * edges(array, at=True) produce an index array providing the edges, with either fence-post-like syntax for reduceat or both boundaries as for reducein.

This can probably easily be based on segment().

> Thoughts on the general idea?

One question is whether these should be stuffed into the main namespace or go under numpy.rec. Another addition to ufuncs that should be thought about is specifying the Python-side interface to generalized ufuncs.

--
Pauli Virtanen
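Pauli's `digitize` suggestion can be checked against Travis's `segment` example (a sketch; `segment` itself does not exist in NumPy):

```python
import numpy as np

x = np.array([10, 20, 10, 20, 30, 30, 10])

# digitize against the sorted unique values, as suggested above;
# subtracting 1 turns the 1-based bin indices into 0-based labels
seg = np.digitize(x, bins=np.unique(x)) - 1
print(seg)  # [0 1 0 1 2 2 0]

# np.unique's return_inverse gives the same labels directly
_, seg2 = np.unique(x, return_inverse=True)
```

Note that both variants label regions by *value rank*, matching the proposed segment() semantics in the example.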
Re: [Numpy-discussion] Proposal for new ufunc functionality
On Sat, Apr 10, 2010 at 1:23 PM, Travis Oliphant oliph...@enthought.com wrote:
> [...]
> * reduceby(array, by, sorted=1, axis=0)
>   array is the array to reduce; by is the array that provides the grouping (can be a structured array or a list of arrays); if sorted is 1, then possibly a faster algorithm can be used.

how is the grouping in by specified?

These functions would be very useful for statistics. One problem with the current bincount is that it doesn't allow multi-dimensional weight arrays (with an axis argument).

Josef
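For the 1-D case Josef mentions, a group-by reduction can already be sketched with existing primitives: `np.unique` for the labels and `np.bincount` with `weights=` for the per-group sums. The data here is made up for illustration:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
by = np.array(['a', 'b', 'a', 'b', 'b'])

# integer group label for each element
labels, inverse = np.unique(by, return_inverse=True)

# per-group sums and counts via bincount (1-D weights only, as noted above)
sums = np.bincount(inverse, weights=values)
counts = np.bincount(inverse)
means = sums / counts  # group 'a': (1+3)/2, group 'b': (2+4+5)/3
```

This only works for 1-D weights, which is exactly the limitation Josef points out; the proposed reduceby would generalize it.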
Re: [Numpy-discussion] Proposal for new ufunc functionality
On 10 April 2010 19:45, Pauli Virtanen p...@iki.fi wrote:
> Another addition to ufuncs that should be thought about is specifying the Python-side interface to generalized ufuncs.

This is an interesting idea; what do you have in mind?

Regards
Stéfan
[Numpy-discussion] INSTALLATION ON CYGWIN
There is too much out there, which is making me confused. I want to install NumPy and SciPy on Cygwin. Can somebody give me the steps? There are different versions of NumPy; which one do I need to download, and how do I check it after installing? I already have the full Cygwin version on my PC.

Aki
Re: [Numpy-discussion] Proposal for new ufunc functionality
On Sat, Apr 10, 2010 at 12:45, Pauli Virtanen p...@iki.fi wrote:
> Sat, 2010-04-10 at 12:23 -0500, Travis Oliphant wrote:
>> * reducein(array, indices, axis=0) similar to reduceat, but the indices provide both the start and end points (rather than being fence-posts like reduceat).
> Is `reducein` important to have, as compared to `reduceat`?

Yes, I think so. If there are some areas you want to ignore, that's difficult to do with reduceat().

--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
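Robert's point can be made concrete with `reduceat`, which exists today (the proposed `reducein` does not): reduceat's indices are fence-posts, so every element lands in some segment, and skipping a region means computing its reduction anyway and discarding it.

```python
import numpy as np

a = np.arange(10)

# fence-post indices: the segments are a[0:3], a[3:6], a[6:10]
sums = np.add.reduceat(a, [0, 3, 6])
print(sums)  # [ 3 12 30]

# To reduce over only [0:3] and [6:10], the middle segment must still be
# computed and then thrown away; a reducein taking explicit (start, end)
# pairs could skip it outright.
wanted = sums[[0, 2]]
print(wanted)  # [ 3 30]
```

The discarded middle segment is the inefficiency Robert is pointing at.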
[Numpy-discussion] Simple way to shift array elements
Hello,

Is there a simpler way to get c from a?

I[1]: a = np.arange(10)
I[2]: b = a[3:]
I[3]: b
O[3]: array([3, 4, 5, 6, 7, 8, 9])
I[4]: c = np.insert(b, [7]*3, 0)
I[5]: c
O[5]: array([3, 4, 5, 6, 7, 8, 9, 0, 0, 0])

a and c have to be the same length, and the left shift must be balanced with an equal number of 0's.

Thanks.

--
Gökhan
Re: [Numpy-discussion] Simple way to shift array elements
On Sat, Apr 10, 2010 at 5:17 PM, Gökhan Sever gokhanse...@gmail.com wrote:
> Hello, is there a simpler way to get c from a?
> [...]
> a and c have to be the same length, and the left shift must be balanced with an equal number of 0's.

Does this count as simpler?

c = np.zeros_like(a)
c[:-3] = a[3:]
Re: [Numpy-discussion] Simple way to shift array elements
On Sat, Apr 10, 2010 at 6:17 PM, Gökhan Sever gokhanse...@gmail.com wrote:
> Hello, is there a simpler way to get c from a?
> [...]

Maybe something like

In [1]: a = np.arange(10)
In [2]: b = np.zeros_like(a)
In [3]: b[:-3] = a[3:]
In [4]: b
Out[4]: array([3, 4, 5, 6, 7, 8, 9, 0, 0, 0])

Chuck
Re: [Numpy-discussion] Simple way to shift array elements
On Sat, Apr 10, 2010 at 7:31 PM, Charles R Harris charlesr.har...@gmail.com wrote:
> Maybe something like
> In [1]: a = np.arange(10)
> In [2]: b = np.zeros_like(a)
> In [3]: b[:-3] = a[3:]
> In [4]: b
> Out[4]: array([3, 4, 5, 6, 7, 8, 9, 0, 0, 0])

Thanks, your ways are more obvious than my first approach. With a bit more playing I get a one-liner:

c = np.append(a[3:], [0]*3)

--
Gökhan
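The recipes in this thread generalize to a small helper (hypothetical, not part of NumPy) that shifts a 1-D array left by n positions and zero-fills on the right:

```python
import numpy as np

def shift_left(a, n):
    """Shift 1-D array a left by n positions, zero-filling on the right."""
    out = np.zeros_like(a)
    if n == 0:
        out[:] = a
    elif n < len(a):
        out[:-n] = a[n:]
    # if n >= len(a), everything is shifted out and the result stays zero
    return out

a = np.arange(10)
print(shift_left(a, 3))  # [3 4 5 6 7 8 9 0 0 0]
```

Unlike np.roll, which wraps the shifted-out elements around, this keeps the length and pads with zeros as the original question requires.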
Re: [Numpy-discussion] numpy build system questions for use in another project (fwrap)
On Sat, Apr 10, 2010 at 9:53 AM, Dag Sverre Seljebotn da...@student.matnat.uio.no wrote:
>> Fortran's .mod files are essentially compiler-generated header files; fwrap needs to use these 'headers' to get type information so it can figure out how the C types and Fortran types match up. Fwrap then generates the config files with this information and compiles the wrappers with the other source files.
> I hope there will at least be an option to supply .mod files instead of using this mechanism. For wrapper projects which (presumably) already have a build system for the Fortran code set up, it seems more reasonable to me to just refer to the output of the existing build.

Yes -- I've been thinking about this usecase recently, and I was thinking along the same lines. Fwrap shouldn't have to have a fully general Fortran build system; that obviously isn't its intended purpose. (Build systems for mixed Fortran/C/C++/whatever are becoming quite an albatross [0], IMHO, and fwrap would do well to avoid the whole issue.) I've been assuming up until now that since f2py takes care of it, so should fwrap, but that's foolish. The scipy sources that f2py wraps are all fairly simple, and they're F77 source, which has no concept of interfaces, etc. F9X source is a different beast.

I'm realizing that there are many conflicting usecases out there, and that it is impossible for fwrap to handle them all well. The simple cases with a few Fortran source files are fine, but for a big project with a complicated, long build process it would be foolish for fwrap to try to become a fully general Fortran build system in addition to its intended purpose. It isn't always clear what compiler flags one needs to use to ensure that the Fortran code is compiled suitably for a Python extension module.

Here's what I'll probably do: Python has a 'python-config' command that reports the flags, include directories and link libraries to use when compiling extension modules. Fwrap could do something similar: the user supplies the compiled .o and .mod files, built with the configuration flags reported by fwrap. So you could do something like the following to get the compiler flags:

$ fwrap --get-cflags --fcompiler=intelem

And it would spit out the flags necessary when compiling the Fortran source files. The user is responsible for handing off the .o and .mod files to fwrap, and fwrap then does the rest. This is good -- I think we're converging on a solution, and it keeps fwrap focused on what it's supposed to do.

> In particular, I don't like the situation today with Python wrappers around C code, where the C code files are often copied into the Python wrapper project. I hope the same won't happen with fwrap, i.e. that people don't copy the Fortran sources into the fwrap wrapper just to make things easy to build (but end up forking the project and not keeping it up to date).

No -- I'll do everything in my power to keep anything like this from being necessary :-)

Thanks for the input.

Kurt

[0] http://en.wikipedia.org/wiki/Albatross_%28metaphor%29
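For comparison, this is what the existing `python-config` interface looks like; the `fwrap --get-cflags` flag in the message above is only a proposal, not an existing option, and the exact output below depends on the Python installation:

```shell
# python-config reports the flags extension modules need when compiling
# and linking against this Python; the proposed fwrap command would play
# the analogous role for the Fortran side.
python-config --includes   # e.g. -I/usr/include/python2.6
python-config --ldflags    # linker flags for extension modules

# hypothetical fwrap analogue sketched in the message above:
# fwrap --get-cflags --fcompiler=intelem
```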
Re: [Numpy-discussion] Masked Array Usage Problems
On Apr 10, 2010, at 5:17 AM, josef.p...@gmail.com wrote:
> Just a guess until Pierre replies: it looks to me like an integer overflow bug. Can you try imgma.mean(dtype=float) to do the accumulation with floating points?

Indeed, using dtype=float solved the problem. The numpy.mean docstring says the default accumulator type for all int types is float. Why is ma.mean different, especially since the ma.mean docstring says to see the numpy.mean docstring?

Thanks,
Lane
Re: [Numpy-discussion] Masked Array Usage Problems
On Sun, Apr 11, 2010 at 1:00 AM, Lane Brooks l...@brooks.nu wrote:
> Indeed, using dtype=float solved the problem. The numpy.mean docstring says the default accumulator type for all int types is float. Why is ma.mean different, especially since the ma.mean docstring says to see the numpy.mean docstring?

I think it is a bug in ma.mean, since the docstring clearly specifies the casting to float. Can you file a ticket? I think Pierre will look at it when he finds time.

Thanks,
Josef