Re: [Numpy-discussion] change the mask state of one element in a masked array

2012-02-18 Thread Olivier Delalleau
There may be a better way to do it, but you can first do:
a.mask = np.zeros_like(a)
then afterwards e.g. a.mask[0, 0] = True will work.
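
Putting it together, a quick sketch (using an explicit boolean array for the
mask, a minor variant of the zeros_like call above):

import numpy as np

a = np.ma.empty((2, 5))
a.mask = np.zeros(a.shape, dtype=bool)   # give the array a full boolean mask
a.mask[0, 0] = True                      # element (0, 0) is now masked

# alternatively, assigning the masked constant masks a single element directly:
a[0, 0] = np.ma.masked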

-=- Olivier

On 18 February 2012 at 10:52, Chao YUE wrote:

> Dear all,
>
> I built a new empty masked array:
>
> In [91]: a=np.ma.empty((2,5))
>
> In [92]: a
> Out[92]:
> masked_array(data =
>  [[  1.20569155e-312   3.34730819e-316   1.13580079e-316   1.11459945e-316
> 9.69610549e-317]
>  [  6.94900258e-310   8.48292532e-317   6.94900258e-310   9.76397825e-317
> 6.94900258e-310]],
>  mask =
>  False,
>fill_value = 1e+20)
>
>
> as you see, the mask for all the elements is False. So how can I set some
> elements to be masked (mask state True)?
> Let's say I want a[0,0] to be masked.
>
> thanks & cheers,
>
> Chao
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ndarray and lazy evaluation (was: Proposed Roadmap Overview)

2012-02-20 Thread Olivier Delalleau
Never mind. The link Francesc posted answered my question :)

-=- Olivier

On 20 February 2012 at 12:54, Olivier Delalleau wrote:

> On 20 February 2012 at 12:46, Dag Sverre Seljebotn wrote:
>
> On 02/20/2012 09:24 AM, Olivier Delalleau wrote:
>> > Hi Dag,
>> >
>> > Would you mind elaborating a bit on that example you mentioned at the
>> > end of your email? I don't quite understand what behavior you would like
>> > to achieve
>>
>> Sure, see below. I think we should continue discussion on numpy-discuss.
>>
>> I wrote:
>>
>> > You need at least a slightly different Python API to get anywhere, so
>> > numexpr/Theano is the right place to work on an implementation of this
>> > idea. Of course it would be nice if numexpr/Theano offered something as
>> > convenient as
>> >
>> > with lazy:
>> >  arr = A + B + C # with all of these NumPy arrays
>> > # compute upon exiting...
>>
>> More information:
>>
>> The disadvantage today of using Theano (or numexpr) is that they require
>> using a different API, so that one has to learn and use Theano "from the
>> ground up", rather than just slap it on in an optimization phase.
>>
>> The alternative would require extensive changes to NumPy, so I guess
>> Theano authors or Francesc would need to push for this.
>>
>> The alternative would be (with A, B, C ndarray instances):
>>
>> with theano.lazy:
>> arr = A + B + C
>>
>> On __enter__, the context manager would hook into NumPy to override its
>> arithmetic operators. Then it would build a Theano symbolic tree instead
>> of performing computations right away.
>>
>> In addition to providing support for overriding arithmetic operators,
>> slicing etc., it would be necessary for "arr" to be an ndarray instance
>> which is "not yet computed" (data-pointer set to NULL, and store a
>> compute-me callback and some context information).
>>
>> Finally, the __exit__ would trigger computation. For other operations
>> which need the data pointer (e.g., single element lookup) one could
>> either raise an exception or trigger computation.
>>
>> This is just a rough sketch. It is not difficult "in principle", but of
>> course there's really a massive amount of work involved to build support
>> for this into the NumPy APIs.
>>
>> Probably, we're talking a NumPy 3.0 thing, after the current round of
>> refactorings have settled...
>>
>> Please: Before discussing this further one should figure out if there's
>> manpower available for it; no sense in hashing out a castle in the sky
>> in detail. Also, it would be better to talk in person about this if
>> possible (I'm in Berkeley now and will attend PyData and PyCon).
>>
>> Dag
>>
>
> Thanks for the additional details.
>
> I feel like this must be a stupid question, but I have to ask: what is the
> point of being lazy here, since the computation is performed on exit anyway?
>
> -=- Olivier
>
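
As an aside, a toy sketch (not Theano's API, purely illustrative, and untested
against any lazy-evaluation library) of the defer-then-compute idea Dag
describes above:

import numpy as np

class lazy:
    """Toy deferred-evaluation context: record expressions, evaluate on exit."""
    def __init__(self):
        self.pending = []
        self.results = None
    def defer(self, fn, *args):
        self.pending.append((fn, args))
        return len(self.pending) - 1          # handle standing in for the future result
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self.results = [fn(*args) for fn, args in self.pending]
        return False

A, B, C = np.ones(3), np.ones(3), np.ones(3)
with lazy() as ctx:
    h = ctx.defer(lambda a, b, c: a + b + c, A, B, C)
# the computation only happens when the block exits:
print(ctx.results[h])                         # prints the array [3. 3. 3.]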
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to modify an array

2012-02-26 Thread Olivier Delalleau
This should do what you want:

array_copy = my_array.copy()
array_copy[array_copy == 2] = 0
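
For example, on a small array (just a sketch of the idea):

import numpy as np

my_array = np.array([[0, 1, 2], [2, 1, 0]])
array_copy = my_array.copy()           # copy first so the original stays untouched
array_copy[array_copy == 2] = 0        # boolean mask selects the 2's and sets them to 0
print(my_array)                        # [[0 1 2] [2 1 0]]  (unchanged)
print(array_copy)                      # [[0 1 0] [0 1 0]]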

-=- Olivier

On 26 February 2012 at 19:53,  wrote:

> Dear sirs,
>
>
> Please allow me to ask you a beginner's question.
>
> I have a numpy array whose shape is (144, 91, 1). The elements of this array
> are integer "0", "1" or "2", but I don't know which of the three integers
> is assigned to each element.
> I would like to make a copy of this array, and then replace only the
> elements whose value is "2" with "0". Could you tell me how to make such a
> modification?
>
>
> Sincerely yours,
>
> Tetsuro Kikuchi
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy fitting

2012-03-01 Thread Olivier Delalleau
Sorry I can't help, but I'd suggest posting this on the scipy mailing
list, as you may get more replies there.

-=- Olivier

On 1 March 2012 at 10:24, Pierre Barthelemy wrote:

> Dear all,
>
> I am writing a program for data analysis. One of the functions of this
> program gives the possibility to fit functions. I followed the recipe
> described in: http://www.scipy.org/Cookbook/FittingData under
> the section "Simplifying the syntax".
>
> To fit, i use the function:
>
> out=optimize.leastsq(f, p, full_output=1)
>
> where f is my function and p a list of parameters.
>
> One thing that I would like to know is how can I get the error on the
> parameters? From what I understood from the "Cookbook" page, and from the
> scipy manual (
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq),
> the second argument returned by the leastsq function gives access to these
> errors.
> std_error=std(y-function(x))
> param_error=sqrt(diagonal(out[1])*std_error)
>
> The param_errors that I get in this case are extremely small. Much smaller
> than what I expected, and much smaller than what I can get fitting the
> function with matlab. So I guess I made an error here.
>
> Can someone tell me what I should do to retrieve the parameter errors?
>
> Bests,
>
> Pierre
>
> PS: I got the impression something went wrong with my previous message,
> sorry for that.
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Floating point "close" function?

2012-03-03 Thread Olivier Delalleau
On 3 March 2012 at 10:27, Robert Kern wrote:

> On Sat, Mar 3, 2012 at 14:34, Robert Kern  wrote:
> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers 
> wrote:
>
> >> Because this is also bad:
> >> >>> np.
> >> Display all 561 possibilities? (y or n)
> >
> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
> > deprecating np.allclose() in favor of np.close().all().
>
> I screwed up this paragraph. I meant that as "Another alternative
> would be to deprecate ...".
>

np.close().all() would probably be a lot less efficient in terms of CPU /
memory though, wouldn't it?

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Floating point "close" function?

2012-03-03 Thread Olivier Delalleau
On 3 March 2012 at 11:03, Robert Kern wrote:

> On Sat, Mar 3, 2012 at 15:51, Olivier Delalleau  wrote:
> > On 3 March 2012 at 10:27, Robert Kern wrote:
> >>
> >> On Sat, Mar 3, 2012 at 14:34, Robert Kern 
> wrote:
> >> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <
> ralf.gomm...@googlemail.com>
> >> > wrote:
> >>
> >> >> Because this is also bad:
> >> >>>>> np.
> >> >> Display all 561 possibilities? (y or n)
> >> >
> >> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
> >> > deprecating np.allclose() in favor of np.close().all().
> >>
> >> I screwed up this paragraph. I meant that as "Another alternative
> >> would be to deprecate ...".
> >
> >
> > np.close().all() would probably be a lot less efficient in terms of CPU /
> > memory though, wouldn't it?
>
> No. np.allclose() is essentially doing exactly this already.
>

Ok. What about then, np.allclose() could theoretically be a lot more
efficient in terms of CPU / memory? ;)

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Floating point "close" function?

2012-03-03 Thread Olivier Delalleau
On 3 March 2012 at 13:07, Joe Kington wrote:

>
>
> On Sat, Mar 3, 2012 at 9:26 AM, Robert Kern  wrote:
>
>> On Sat, Mar 3, 2012 at 15:22, Benjamin Root  wrote:
>> >
>> >
>> > On Saturday, March 3, 2012, Robert Kern  wrote:
>> >> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <
>> ralf.gomm...@googlemail.com>
>> >> wrote:
>> >>>
>> >>>
>> >>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern 
>> >>> wrote:
>> 
>>  On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <
>> ralf.gomm...@googlemail.com>
>>  wrote:
>>  >
>>  >
>>  > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington 
>>  > wrote:
>>  >>
>>  >> Is there a numpy function for testing floating point equality that
>>  >> returns
>>  >> a boolean array?
>>  >>
>>  >> I'm aware of np.allclose, but I need a boolean array.  Properly
>>  >> handling
>>  >> NaN's and Inf's (as allclose does) would be a nice bonus.
>>  >>
>>  >> I wrote the function below to do this, but I suspect there's a
>> method
>>  >> in
>>  >> numpy that I missed.
>>  >
>>  >
>>  > I don't think such a function exists, would be nice to have. How
>> about
>>  > just
>>  > adding a keyword "return_array" to allclose to do so?
>> 
>>  As a general design principle, adding a boolean flag that changes the
>>  return type is worse than making a new function.
>> >>>
>> >>>
>> >>> That's certainly true as a general principle. Do you have a concrete
>> >>> suggestion in this case though?
>> >>
>> >> np.close()
>> >>
>> >
>> > When I read that, I mentally think of "close" as in closing a file.  I
>> think
>> > we need a synonym.
>>
>> np.isclose()
>>
>
> Would it be helpful if I went ahead and submitted a pull request with the
> function in my original question called "isclose" (along with a complete
> docstring and a few tests)?
>
> One note:
> At the moment, it deliberately compares NaN's as equal. E.g.
>
> isclose([np.nan, np.nan], [np.nan, np.nan])
>
> will return:
>
> [True, True]
>
> This obviously runs counter to the standard way NaN's are handled (and
> indeed the definition of NaN).
>
> However, in the context of a floating point "close to" function, I think
> it makes the most sense.
>
> I've had this sitting around in a small project for awhile now, and it's
> been more useful to have it compare NaN's as "approximately equal" than not
> for my purposes at least.
>
> Nonetheless, it's something that needs additional consideration.
>
> Thanks,
> -Joe
>

It would be confusing if numpy.isclose().all() was different from
numpy.allclose(). That being said, I agree it's useful to have NaNs compare
equal in some cases, maybe it could be a new argument to the function?
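
A rough sketch of what such a function could look like, with NaN-equality as
an optional flag (this is not Joe's implementation, just an illustration;
numpy's eventual isclose did end up with a similar equal_nan argument):

import numpy as np

def isclose_sketch(a, b, rtol=1e-5, atol=1e-8, equal_nan=False):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.abs(a - b) <= (atol + rtol * np.abs(b))   # same tolerance rule as allclose
    out |= (a == b)                                    # catches matching infinities
    if equal_nan:
        out |= np.isnan(a) & np.isnan(b)
    return out

print(isclose_sketch([np.nan, 1.0], [np.nan, 1.0]))                  # [False  True]
print(isclose_sketch([np.nan, 1.0], [np.nan, 1.0], equal_nan=True))  # [ True  True]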

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] copy mask from existing masked array?

2012-03-04 Thread Olivier Delalleau
Should work with:
b = numpy.ma.masked_array(b, mask=a.mask)
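
Applied to the example from the question, a quick sketch:

import numpy as np

a = np.ma.masked_equal(np.arange(10).reshape(2, 5), 2)
a = np.ma.masked_equal(a, 8)
b = np.random.normal(0, 2, size=(2, 5))
b = np.ma.masked_array(b, mask=a.mask)   # b now uses exactly a's mask pattern
print(b.mask)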

-=- Olivier

On 4 March 2012 at 13:01, Chao YUE wrote:

> Dear all,
>
> I have a matrix with dimension of (360,720) but with all global data.
> I have another land-sea mask matrix with only 2 unique values in it
> (land=1, sea=-1).
> So I can easily transform the second array to a masked array.
> the problem is, how can I quickly transform the first one to a masked
> array using the same mask as the land-sea mask array?
>
> I hope my question is clear. If not, here is an example:
>
> In [93]: a=np.arange(10).reshape(2,5)
> In [95]: a=np.ma.masked_equal(a,2)
> In [96]: a=np.ma.masked_equal(a,8)
>
> In [97]: a
> Out[97]:
> masked_array(data =
>  [[0 1 -- 3 4]
>  [5 6 7 -- 9]],
>  mask =
>  [[False False  True False False]
>  [False False False  True False]],
>fill_value = 8)
>
> In [100]: b=np.random.normal(0,2,size=(2,5))
>
> I want to convert b to a masked array using exactly the same mask as a.
>
> thanks to all,
> cheers,
>
> Chao
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Olivier Delalleau
On 5 March 2012 at 14:29, Keith Goodman wrote:

> On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker  wrote:
> > Keith Goodman wrote:
> >
> >> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker 
> wrote:
> >>> What is a simple, efficient way to determine if all elements in an
> array (in
> >>> my case, 1D) are equal?  How about close?
> >>
> >> For the exactly equal case, how about:
> >>
> >> I[1] a = np.array([1,1,1,1])
> >> I[2] np.unique(a).size
> >> O[2] 1# All equal
> >>
> >> I[3] a = np.array([1,1,1,2])
> >> I[4] np.unique(a).size
> >> O[4] 2   # All not equal
> >
> > I considered this - just not sure if it's the most efficient
>
> Yeah, it is slow:
>
> I[1] a = np.ones(10)
> I[2] timeit np.unique(a).size
> 1000 loops, best of 3: 1.56 ms per loop
> I[3] timeit (a == a[0]).all()
> 1000 loops, best of 3: 203 us per loop
>
> I think all() short-circuits for bool arrays:
>
> I[4] a[1] = 9
> I[5] timeit (a == a[0]).all()
> 1 loops, best of 3: 89 us per loop
>
> You could avoid making the bool array by writing a function in cython.
> It could grab the first array element and then return False as soon as
> it finds an element that is not equal to it. And you could check for
> closeness.
>
> Or:
>
> I[8] np.allclose(a, a[0])
> O[8] False
> I[9] a = np.ones(10)
> I[10] np.allclose(a, a[0])
> O[10] True
>

Looks like the following is even faster:
np.max(a) == np.min(a)
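
A self-contained sketch of both checks (the closeness variant is my own
extension of the min/max trick, not something from the thread):

import numpy as np

a = np.ones(10000)

def all_equal(arr):
    return arr.max() == arr.min()            # no temporary boolean array needed

def all_close_to_each_other(arr, rtol=1e-5, atol=1e-8):
    # if the two extremes are close to each other, every pair of elements is close
    return np.allclose(arr.max(), arr.min(), rtol=rtol, atol=atol)

print(all_equal(a), all_close_to_each_other(a))     # True True
a[3] += 1e-9
print(all_equal(a), all_close_to_each_other(a))     # False True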

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-12 Thread Olivier Delalleau
One major difference is that Theano doesn't attempt to parse existing
Python (byte)code: you need to explicitly code with the Theano syntax
(which tries to be close to Numpy, but can end up looking quite different,
especially if you want to control the program flow with loops and ifs for
instance).

A potentially interesting avenue would be to parse Python (byte)code to
generate a Theano graph. It'd be nice if numba could output some
intermediate information that would represent the computational graph being
compiled, so that Theano could re-use it directly :) (probably much easier
said than done though)

-=- Olivier

On 12 March 2012 at 12:57, Till Stensitzki wrote:

> Doesn't Theano do the same, only via GCC compilation?
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-20 Thread Olivier Delalleau
This sounds a lot like Theano, did you look into it?

-=- Olivier

On 20 March 2012 at 13:49, mark florisson wrote:

> On 13 March 2012 18:18, Travis Oliphant  wrote:
> >>>
> >>> (Mark F., how does the above match how you feel about this?)
> >>
> >> I would like collaboration, but from a technical perspective I think
> >> this would be much more involved than just dumping the AST to an IR
> >> and generating some code from there. For vector expressions I think
> >> sharing code would be more feasible than arbitrary (parallel) loops,
> >> etc. Cython as a compiler can make many decisions that a Python
> >> (bytecode) compiler can't make (at least without annotations and a
> >> well-defined subset of the language (not so much the syntax as the
> >> semantics)). I think in numba, if parallelism is to be supported, you
> >> will want a prange-like construct, as proving independence between
> >> iterations can be very hard to near impossible for a compiler.
> >
> > I completely agree that you have to define some kind of syntax to get
> parallelism.  But, a prange construct would not be out of the question, of
> course.
> >
> >>
> >> As for code generation, I'm not sure how llvm would do things like
> >> slicing arrays, reshaping, resizing etc (for vector expressions you
> >> can first evaluate all slicing and indexing operations and then
> >> compile the remaining vector expression), but for loops and array
> >> reassignment within loops this would have to invoke the actual slicing
> >> code from the llvm code (I presume).
> >
> > There could be some analysis on the byte-code, prior to emitting the
> llvm code in order to handle lots of things.   Basically, you have to
> "play" the byte-code on a simple machine anyway in order to emit the
> correct code.   The big thing about Cython is you have to typedef too many
> things that are really quite knowable from the code.   If Cython could
> improve its type inference, then it would be a more suitable target.
> >
> >> There are many other things, like
> >> bounds checking, wraparound, etc, that are all supported in both numpy
> >> and Cython, but going through an llvm layer would as far as I can see,
> >> require re-implementing those, at least if you want top-notch
> >> performance. Personally, I think for non-trivial performance-critical
> >> code (for loops with indexing, slicing, function calls, etc) Cython is
> >> a better target.
> >
> > With libclang it is really quite possible to imagine a cython -> C
> target that itself compiles to llvm so that you can do everything at that
> intermediate layer.   However,  LLVM is a much better layer for
> optimization than C now that there are a lot of people collaborating on
> that layer.   I think it would be great if Cython targeted LLVM actually
> instead of C.
> >
> >>
> >> Finally, as for non-vector-expression code, I really believe Cython is
> >> a better target. cython.inline can have high overhead (at least the
> >> first time it has to compile), but with better (numpy-aware) type
> >> inference or profile guided optimizations (see recent threads on the
> >> cython-dev mailing list), in addition to things like prange, I
> >> personally believe Cython targets most of the use cases where numba
> >> would be able to generate performing code.
> >
> > Cython and Numba certainly overlap.  However, Cython requires:
> >
> >1) learning another language
> >2) creating an extension module --- loading bit-code files and
> dynamically executing (even on a different machine from the one that
> initially created them) can be a powerful alternative for run-time
> compilation and distribution of code.
> >
> > These aren't show-stoppers obviously.   But, I think some users would
> prefer an even simpler approach to getting fast-code than Cython (which
> currently doesn't do enought type-inference and requires building a dlopen
> extension module).
>
> Dag and I have been discussing this at PyCon, and here is my take on
> it (at this moment :).
>
> Definitely, if you can avoid Cython then that is easier and more
> desirable in many ways. So perhaps we can create a third project
> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> abstract syntax tree in a rather simple form, performs code
> optimizations such as rewriting loops with array accesses to vector
> expressions, fusing vector expressions and loops, etc, and spits out a
> transformed AST containing these optimizations. If runtime information
> is given such as actual shape and stride information the
> transformations could figure out there and then whether to do things
> like collapsing, axes swapping, blocking (as in, introducing more axes
> or loops to retain discontiguous blocks in the cache), blocked memory
> copies to contiguous chunks, etc. The AST could then also say whether
> the final expressions are vectorizable. Part of this functionality is
> already in numpy's nditer, except that this would be implicit and do
> more (and h

Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-20 Thread Olivier Delalleau
I doubt Theano is already as smart as you'd want it to be right now;
however, the core mechanisms are there to perform graph optimizations and
move computations to the GPU. It may save time to start from there instead of
starting all over from scratch. I'm not sure, but it looks like it
would be worth considering at least.

-=- Olivier

On 20 March 2012 at 15:40, Dag Sverre Seljebotn wrote:

> ** We talked some about Theano. There are some differences in project
> goals which means that it makes sense to make this a seperate project:
> Cython wants to use this to generate C code up front from the Cython AST at
> compilation time; numba also has a different frontend (parsing of python
> bytecode) and a different backend (LLVM).
>
> However, it may very well be possible that Theano could be refactored so
> that the more essential algorithms working on the syntax tree could be
> pulled out and shared with cython and numba. Then the question is whether
> the core of Theano is smart enough to compete with Fortran compilers and
> support arbitrarily strided inputs optimally. Otherwise one might as well
> start from scratch. I'll leave that for Mark to figure out...
>
> Dag
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>
>
> Olivier Delalleau  wrote:
>>
>> This sounds a lot like Theano, did you look into it?
>>
>> -=- Olivier
>>
>> On 20 March 2012 at 13:49, mark florisson wrote:
>>
>>> On 13 March 2012 18:18, Travis Oliphant  wrote:
>>> >>>
>>> >>> (Mark F., how does the above match how you feel about this?)
>>> >>
>>> >> I would like collaboration, but from a technical perspective I think
>>> >> this would be much more involved than just dumping the AST to an IR
>>> >> and generating some code from there. For vector expressions I think
>>> >> sharing code would be more feasible than arbitrary (parallel) loops,
>>> >> etc. Cython as a compiler can make many decisions that a Python
>>> >> (bytecode) compiler can't make (at least without annotations and a
>>> >> well-defined subset of the language (not so much the syntax as the
>>> >> semantics)). I think in numba, if parallelism is to be supported, you
>>> >> will want a prange-like construct, as proving independence between
>>> >> iterations can be very hard to near impossible for a compiler.
>>> >
>>> > I completely agree that you have to define some kind of syntax to get
>>> parallelism.  But, a prange construct would not be out of the question, of
>>> course.
>>> >
>>> >>
>>> >> As for code generation, I'm not sure how llvm would do things like
>>> >> slicing arrays, reshaping, resizing etc (for vector expressions you
>>> >> can first evaluate all slicing and indexing operations and then
>>> >> compile the remaining vector expression), but for loops and array
>>> >> reassignment within loops this would have to invoke the actual slicing
>>> >> code from the llvm code (I presume).
>>> >
>>> > There could be some analysis on the byte-code, prior to emitting the
>>> llvm code in order to handle lots of things.   Basically, you have to
>>> "play" the byte-code on a simple machine anyway in order to emit the
>>> correct code.   The big thing about Cython is you have to typedef too many
>>> things that are really quite knowable from the code.   If Cython could
>>> improve its type inference, then it would be a more suitable target.
>>> >
>>> >> There are many other things, like
>>> >> bounds checking, wraparound, etc, that are all supported in both numpy
>>> >> and Cython, but going through an llvm layer would as far as I can see,
>>> >> require re-implementing those, at least if you want top-notch
>>> >> performance. Personally, I think for non-trivial performance-critical
>>> >> code (for loops with indexing, slicing, function calls, etc) Cython is
>>> >> a better target.
>>> >
>>> > With libclang it is really quite possible to imagine a cython -> C
>>> target that itself compiles to llvm so that you can do everything at that
>>> intermediate layer.   However,  LLVM is a much better layer for
>>> optimization than C now that there are a lot of people collaborating on
>>> that layer.   I think it would be great if Cython targeted LLVM actually
>>> instead of C.
>>&

Re: [Numpy-discussion] How to Extract the Number of Rows and Columns in a Matrix

2012-03-26 Thread Olivier Delalleau
len(M) will give you the number of rows of M.
For columns I just use M.shape[1] myself; I don't know if there is a
shortcut.
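
For example (assuming M is a 2-D numpy array):

import numpy as np

M = np.arange(12).reshape(3, 4)
nrows, ncols = M.shape       # unpacking shape gives both at once
print(len(M), M.shape[1])    # 3 4
print(nrows, ncols)          # 3 4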

-=- Olivier

On 26 March 2012 at 19:03, Stephanie Cooke wrote:

> Hello,
>
> I would like to extract the number of rows and columns of a matrix
> individually. The shape command outputs the rows and columns together,
> but are there commands that will separately give the rows and
> separately give the columns?
>
> Thanks
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] AttributeError with shape command

2012-03-26 Thread Olivier Delalleau
It means "array" is a regular Python list and not a numpy array. Use
numpy.array(array) to convert it into an array.
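
For example:

import numpy as np

array = [[1, 2, 3], [4, 5, 6]]    # plain Python list: it has no .shape attribute
arr = np.array(array)             # convert it to an ndarray
print(arr.shape)                  # (2, 3)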

-=- Olivier

On 26 March 2012 at 20:07, Stephanie Cooke wrote:

> Hello,
>
> I am new to numpy. When I try to use the command array.shape, I get
> the following error:
>
> AttributeError: 'list' object has no attribute 'shape'
>
> Is anyone familiar with this type of error?
>
> Thanks
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy Memory Error with corrcoef

2012-03-27 Thread Olivier Delalleau
On 27 March 2012 at 06:04, Nicole Stoffels wrote:

> **
> Hi Pierre,
>
> thanks for the fast answer!
>
> I actually have timeseries of 24 hours for 459375 gridpoints in Europe.
> The timeseries of every grid point is stored in a column. That's why in my
> real program I already transposed the data, so that the correlation is made
> column by column. What I finally need is the correlation of each gridpoint
> with every other gridpoint. I'm afraid that this results in a 459375*459375
> matrix.
>
> The correlation is actually just an interim result. So I'm currently
> trying to loop over every gridpoint to get single correlations which will
> then be processed further. Is this the right approach?
>
> for column in range(len(data_records)):
>     for columnnumber in range(len(data_records)):
>         correlation = corrcoef(data_records[column],
>                                data_records[columnnumber])
>
> Best wishes,
> Nicole
>

It may be painfully slow... You should make sure you don't compute twice
each off-diagonal element.
Also, if all your computations can be vectorized, you'll probably get a
significant performance boost by computing your matrix by blocks instead of
element-by-element. Take blocks as big as can fit in memory.
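
A rough sketch of the blocked approach (the block size is hypothetical, tune
it to your memory; np.corrcoef on two stacked blocks of rows gives all
pairwise correlations between them in one call):

import numpy as np

def blockwise_corr(data, block=1000):
    """Yield (i, j, corr_block) for upper-triangular pairs of row blocks.

    data: (n_points, n_times) array, one grid point's time series per row.
    Only j >= i is computed, so each pair of grid points is done once.
    """
    n = data.shape[0]
    for i in range(0, n, block):
        a = data[i:i + block]
        for j in range(i, n, block):
            b = data[j:j + block]
            # the cross-block correlations are the upper-right quadrant
            c = np.corrcoef(a, b)[:len(a), len(a):]
            yield i, j, c

for i, j, c in blockwise_corr(np.random.rand(50, 24), block=20):
    pass   # process each block of correlations here instead of storing them all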

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how to check type of array?

2012-03-29 Thread Olivier Delalleau
if type(a) == numpy.ndarray:
   ...
if a.dtype == 'int32':
   ...
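
For example, as a small self-contained check (isinstance(a, np.ndarray) also
works and additionally accepts ndarray subclasses):

import numpy as np

a = np.arange(10, dtype=np.int32)
if type(a) == np.ndarray:
    print("a is an ndarray")
if a.dtype == 'int32':        # comparing a dtype against its string name is fine
    print("a.dtype is int32")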

-=- Olivier

On 29 March 2012 at 07:54, Chao YUE wrote:

> Dear all,
>
> how can I check type of array in if condition expression?
>
> In [75]: type(a)
> Out[75]: 
>
> In [76]: a.dtype
> Out[76]: dtype('int32')
>
> a.dtype=='int32'?
>
> thanks!
>
> Chao
>
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ndarray sub-classing and append function

2012-03-31 Thread Olivier Delalleau
It doesn't work because numpy.append(a, ...) doesn't modify the array a
in-place: it returns a copy.
Then in your append method, doing "self = numpy.append(...)" won't have any
effect: in Python such a syntax means the "self" local variable will now
point to the result of numpy.append, but it won't modify the object that
self previously pointed to.
Note also that numpy.ndarray itself has no append method, and ndarrays have a
fixed size, so there is no in-place append to fall back on. The simplest fix
is probably to have your append method return the new array and rebind it on
the caller's side (see the sketch below).
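
A minimal sketch of such an append, assuming VectorArray is a plain ndarray
subclass (hypothetical class body, untested against the original code):

import numpy

class VectorArray(numpy.ndarray):
    def __new__(cls, data):
        return numpy.asarray(data, dtype=float).view(cls)

    def append(self, other):
        # numpy.append returns a new, larger array; re-view it as our subclass
        # and return it (the caller must rebind: vary = vary.append(v1)).
        return numpy.append(numpy.asarray(self), [other], axis=0).view(type(self))

v1, v2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
vary = VectorArray([v1, v2])
vary = vary.append(v1)
print(vary.shape)   # (3, 3)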

-=- Olivier

On 31 March 2012 at 02:25, Prashant Saxena wrote:

> Hi,
>
> I am sub-classing numpy.ndarry for vector array representation. The append
> function is like this:
>
> def append(self, other):
>self = numpy.append(self, [other], axis=0)
>
> Example:
> vary = VectorArray([v1, v2])
> #vary = numpy.append(vary, [v1], axis=0)
> vary.append(v1)
>
> The commented syntax (numpy syntax) is working but "vary.append(v1)" is
> not working.
>
> Any help?
>
> Cheers
>
> Prashant
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] small bug in ndarray.flatten()?

2012-04-05 Thread Olivier Delalleau
It works for me, which version of numpy are you using?
What do you get when you type help(b.flatten)?

-=- Olivier

On 5 April 2012 at 04:45, Chao YUE wrote:

> Dear all,
>
> Is there a small bug in following?
>
> In [2]: b
> Out[2]:
> array([[ 0,  1,  2,  3,  4,  5],
>[ 6,  7,  8,  9, 10, 11],
>[12, 13, 14, 15, 16, 17],
>[18, 19, 20, 21, 22, 23]])
>
>
>
> In [3]: b.flatten(order='C')
> ---
> TypeError Traceback (most recent call last)
>
> /mnt/f/ in ()
>
> TypeError: flatten() takes no keyword arguments
>
> order='F' gave the same.
>
> cheers,
>
> chao
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] small bug in ndarray.flatten()?

2012-04-05 Thread Olivier Delalleau
Ok, it looks weird indeed. I was using numpy 1.6.1 myself, not sure if it's
a bug that's been fixed in 1.6.
Try without the keyword argument (b.flatten('C')), see if at least that
works.

-=- Olivier

On 5 April 2012 at 08:12, Chao YUE wrote:

> Hi,
>
> I use 1.5.1.
> In [69]: np.__version__
> Out[69]: '1.5.1'
>
> the help information seems OK.
>
> In [70]: b.flatten?
> Type:           builtin_function_or_method
> Base Class:     <type 'builtin_function_or_method'>
> String Form:    <built-in method flatten of numpy.ndarray object at 0xb5d4a58>
> Namespace:      Interactive
> Docstring:
> a.flatten(order='C')
>
> Return a copy of the array collapsed into one dimension.
>
> Parameters
> ----------
> order : {'C', 'F'}, optional
> Whether to flatten in C (row-major) or Fortran (column-major)
> order.
> The default is 'C'.
>
> Returns
> -------
> y : ndarray
> A copy of the input array, flattened to one dimension.
>
> See Also
> --------
> ravel : Return a flattened array.
> flat : A 1-D flat iterator over the array.
>
> Examples
> --------
>     >>> a = np.array([[1,2], [3,4]])
> >>> a.flatten()
> array([1, 2, 3, 4])
> >>> a.flatten('F')
> array([1, 3, 2, 4])
>
> cheers,
>
> Chao
>
>
> 2012/4/5 Olivier Delalleau 
>
>> It works for me, which version of numpy are you using?
>> What do you get when you type help(b.flatten)?
>>
>> -=- Olivier
>>
>> On 5 April 2012 at 04:45, Chao YUE wrote:
>>
>>> Dear all,
>>>
>>> Is there a small bug in following?
>>>
>>> In [2]: b
>>> Out[2]:
>>> array([[ 0,  1,  2,  3,  4,  5],
>>>[ 6,  7,  8,  9, 10, 11],
>>>[12, 13, 14, 15, 16, 17],
>>>[18, 19, 20, 21, 22, 23]])
>>>
>>>
>>>
>>> In [3]: b.flatten(order='C')
>>>
>>> ---
>>> TypeError Traceback (most recent call
>>> last)
>>>
>>> /mnt/f/ in ()
>>>
>>> TypeError: flatten() takes no keyword arguments
>>>
>>> order='F' gave tha same.
>>>
>>> cheers,
>>>
>>> chao
>>>
>>> --
>>>
>>> ***
>>> Chao YUE
>>> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
>>> UMR 1572 CEA-CNRS-UVSQ
>>> Batiment 712 - Pe 119
>>> 91191 GIF Sur YVETTE Cedex
>>> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>>>
>>> 
>>>
>>>
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] apply 'getitem' to each element of obj array?

2012-04-05 Thread Olivier Delalleau
On 5 April 2012 at 11:45, Neal Becker wrote:

> Adam Hughes wrote:
>
> > If you are storing objects, then can't you store them in a list and just
> do:
> >
> > for obj in objectlist:
> >  obj.attribute = value
> >
> > Or am I misunderstanding?
> >
>
> It's multi-dimensional, and I wanted to avoid writing explicit loops.
>

You can do:

f = numpy.frompyfunc(lambda x: x.some_attribute == 0, 1, 1)

Then
f(array_of_objects_x)
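
A small self-contained sketch of the idea (the class and attribute name are
just placeholders; note that frompyfunc returns an object-dtype array):

import numpy

class Obj(object):
    def __init__(self, v):
        self.some_attribute = v

objs = numpy.empty((2, 3), dtype=object)
for k, idx in enumerate(numpy.ndindex(objs.shape)):
    objs[idx] = Obj(k % 2)

f = numpy.frompyfunc(lambda x: x.some_attribute == 0, 1, 1)
print(f(objs))   # (2, 3) object array of True/False, no explicit loop needed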

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] apply 'getitem' to each element of obj array?

2012-04-05 Thread Olivier Delalleau
On 5 April 2012 at 12:50, Neal Becker wrote:

> Ken Watford wrote:
>
> > On Thu, Apr 5, 2012 at 11:57 AM, Olivier Delalleau 
> wrote:
> >> On 5 April 2012 at 11:45, Neal Becker wrote:
> >>
> >> You can do:
> >>
> >> f = numpy.frompyfunc(lambda x: x.some_attribute == 0, 1, 1)
> >>
> >> Then
> >> f(array_of_objects_x)
> >
> > This is handy too:
> >
> > agetattr = numpy.frompyfunc(getattr, 2, 1)
> >
> > array_of_values = agetattr(array_of_objects, 'some_attribute')
>
> I suppose for setitem something similar, except I don't think you can do
> that
> with lambda since lambda doesn't allow an assignment.
>

You can call setattr in a lambda though, to bypass the assignment
limitation.
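
For instance, a quick sketch (the attribute name is again just a placeholder;
the object array returned by the ufunc only holds the Nones from setattr and
can be ignored):

import numpy

class Obj(object):
    pass

objs = numpy.array([Obj(), Obj()], dtype=object)
asetattr = numpy.frompyfunc(lambda obj, val: setattr(obj, 'x', val), 2, 1)
asetattr(objs, numpy.array([10, 20]))   # sets obj.x element-wise
print(objs[0].x, objs[1].x)             # 10 20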

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-15 Thread Olivier Delalleau
2012/5/15 Travis Oliphant 

>
> On May 14, 2012, at 7:07 PM, Stéfan van der Walt wrote:
>
> > Hi Zach
> >
> > On Mon, May 14, 2012 at 4:33 PM, Zachary Pincus 
> wrote:
> >> The below seems to be a bug, but perhaps it's unavoidably part of the
> indexing mechanism?
> >>
> >> It's easiest to show via example... note that using "[0,1]" to pull two
> columns out of the array gives the same shape as using ":2" in the simple
> case, but when there's additional slicing happening, the shapes get
> transposed or something.
> >
> > When fancy indexing and slicing is mixed, the resulting shape is
> > essentially unpredictable.  The "correct" way to do it is to only use
> > fancy indexing, i.e. generate the indices of the sliced dimension as
> > well.
>
> This is not quite accurate.   It is not unpredictable.  It is very
> predictable, but a bit (too) complicated in the most general case.  The
> problem occurs when you "intermingle" fancy indexing with slice notation
> (and for this purpose integer selection is considered "fancy-indexing").
> While in simple cases you can think that [0,1] is equivalent to :2 --- it
> is not because fancy-indexing uses "zip-based ideas" instead of
> cross-product based ideas.
>
> The problem in general is how to make sense of something like
>
> a[:, :, in1, in2]
>
> If you keep fancy indexing to one side of the slice notation only, then
> you get what you expect.   The shape of the output will be the first two
> dimensions of a + the broadcasted shape of in1 and in2 (where integers are
> interpreted as fancy-index arrays).
>
> So, let's say a is (10,9,8,7)  and in1 is (3,4) and in2 is (4,)
>
> The shape of the output will be (10,9,3,4) filled with essentially
> a[:,:,i,j] = a[:,:,in1[i,j], in2[j]]
>
> What happens, though when you have
>
> a[:, in1, :, in2]?
>
> in1 and in2 are broadcasted together to create a two-dimensional
> "sub-space" that must fit somewhere.   Where should it go?   Should it
> replace in1 or in2?I.e. should the output be
>
> (10,3,4,8) or (10,8,3,4).
>
> To "resolve" this ambiguity, the code sends the (3,4) sub-space to the
> front of the "dimensions" and returns (3,4,10,8).   In retrospect, the
> code should raise an error as I doubt anyone actually relies on this
> behavior, and then we could have "done the right" thing for situations like
> in1 being an integer which actually makes some sense and should not have
> been confused with the "general case"
>
> In this particular case you might also think that we could say the result
> should be (10,3,8,4) but there is no guarantee that the number of
> dimensions that should be appended by the "fancy-indexing" objects will be
> the same as the number of dimensions replaced.Again, this is how
> fancy-indexing combines with other fancy-indexing objects.
>
> So, the behavior is actually quite predictable, it's just that in some
> common cases it doesn't do what you would expect --- especially if you
> think that [0,1] is "the same" as :2.   When I wrote this code to begin
> with I should have raised an error and then worked in the cases that make
> sense.This is a good example of making the mistake of thinking that
> it's better to provide something very general rather than just raise an
> error when an obvious and clear solution is not available.
>
> There is the possibility that we could now raise an error in NumPy when
> this situation is encountered because I strongly doubt anyone is actually
> relying on the current behavior.I would like to do this, actually, as
> soon as possible.  Comments?
>

+1 to raise an error instead of an unintuitive behavior.
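
For reference, a tiny example reproducing the shapes Travis describes (the
index values are arbitrary, only the shapes matter here):

import numpy as np

a = np.empty((10, 9, 8, 7))
in1 = np.zeros((3, 4), dtype=int)
in2 = np.zeros((4,), dtype=int)

# fancy indices grouped on one side of the slices: the subspace stays in place
print(a[:, :, in1, in2].shape)   # (10, 9, 3, 4)

# fancy indices separated by a slice: the broadcast (3, 4) subspace moves to the front
print(a[:, in1, :, in2].shape)   # (3, 4, 10, 8)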

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Olivier Delalleau
2012/5/23 Nathaniel Smith 

> On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant 
> wrote:
> > I just realized that the pull request doesn't do what I thought it did
> which
> > is just add the flag to warn users who are writing to an array that is a
> > view when it used to be a copy. It's more cautious and also "copies"
> the
> > data for 1.7.
> >
> > Is this really a necessary step?   I guess it depends on how many
> use-cases
> > there are where people are relying on .diagonal() being a copy.   Given
> that
> > this is such an easy thing for people who encounter the warning to fix
> their
> > code, it seems overly cautious to *also* make a copy (especially for a
> rare
> > code-path like this --- although I admit that I don't have any
> reproducible
> > data to support that assertion that it's a rare code-path).
> >
> > I think we have a mixed record of being cautious (not cautious enough in
> > some changes), but this seems like swinging in the other direction of
> being
> > overly cautious on a minor point.
>
> The reason this isn't a "minor point" is that if we just switched it
> then it's possible that existing, working code would start returning
> incorrect answers, and the only indication would be some console spew.
> I think that such changes should be absolutely verboten for a library
> like numpy. I'm already paranoid enough about my own code!
>
> That's why people up-thread were arguing that we just shouldn't risk
> the change at all, ever.
>
> I admit to some ulterior motive here: I'd like to see numpy be able to
> continue to evolve, but I am also, like I said, completely paranoid
> about fundamental libraries changing under me. So this is partly my
> attempt to find a way to make a potentially "dangerous" change in a
> responsible way. If we can't learn to do this, then honestly I think
> the only responsible alternative going forward would be to never
> change any existing API except in trivial ways (like removing
> deprecated functions).
>
> Basically my suggestion is that every time we alter the behaviour of
> existing, working code, there should be (a) a period when that
> existing code produces a warning, and (b) a period when that existing
> code produces an error. For a change like removing a function, this is
> easy. For something like this diagonal change, it's trickier, but
> still doable.
>

/agree with Nathaniel. Overly cautious is good!

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] command for retrieving unmasked data from a mask array?

2012-05-23 Thread Olivier Delalleau
Should be dt3.compressed()
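
For example (a quick sketch):

import numpy as np

dt3 = np.ma.masked_equal(np.arange(6).reshape(2, 3), 3)
print(dt3.compressed())   # 1-D array of only the unmasked values: [0 1 2 4 5]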

-=- Olivier

2012/5/23 Chao YUE 

> Dear all,
>
> is there a command for retrieving unmasked data from a masked array,
> other than using dt3[~dt3.mask].flatten()?
>
> thanks,
>
> Chao
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Olivier Delalleau
2012/5/23 Travis Oliphant 

>
> On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:
>
> 2012/5/23 Nathaniel Smith 
>
>> On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant 
>> wrote:
>> > I just realized that the pull request doesn't do what I thought it did
>> which
>> > is just add the flag to warn users who are writing to an array that is a
>> > view when it used to be a copy. It's more cautious and also
>> "copies" the
>> > data for 1.7.
>> >
>> > Is this really a necessary step?   I guess it depends on how many
>> use-cases
>> > there are where people are relying on .diagonal() being a copy.   Given
>> that
>> > this is such an easy thing for people who encounter the warning to fix
>> their
>> > code, it seems overly cautious to *also* make a copy (especially for a
>> rare
>> > code-path like this --- although I admit that I don't have any
>> reproducible
>> > data to support that assertion that it's a rare code-path).
>> >
>> > I think we have a mixed record of being cautious (not cautious enough in
>> > some changes), but this seems like swinging in the other direction of
>> being
>> > overly cautious on a minor point.
>>
>> The reason this isn't a "minor point" is that if we just switched it
>> then it's possible that existing, working code would start returning
>> incorrect answers, and the only indication would be some console spew.
>> I think that such changes should be absolutely verboten for a library
>> like numpy. I'm already paranoid enough about my own code!
>>
>> That's why people up-thread were arguing that we just shouldn't risk
>> the change at all, ever.
>>
>> I admit to some ulterior motive here: I'd like to see numpy be able to
>> continue to evolve, but I am also, like I said, completely paranoid
>> about fundamental libraries changing under me. So this is partly my
>> attempt to find a way to make a potentially "dangerous" change in a
>> responsible way. If we can't learn to do this, then honestly I think
>> the only responsible alternative going forward would be to never
>> change any existing API except in trivial ways (like removing
>> deprecated functions).
>>
>> Basically my suggestion is that every time we alter the behaviour of
>> existing, working code, there should be (a) a period when that
>> existing code produces a warning, and (b) a period when that existing
>> code produces an error. For a change like removing a function, this is
>> easy. For something like this diagonal change, it's trickier, but
>> still doable.
>>
>
> /agree with Nathaniel. Overly cautious is good!
>
>
> Then are you suggesting that we need to back out the changes to the
> casting rules as well, because this will also cause code to stop working.
> This is part of my point.   We are not being consistently cautious.
>
> -Travis
>

Well, about casting rules... they've already been broken multiple times in
previous releases (at least between 1.5 and 1.6, although I think I
remember seeing some inconsistent behavior with older versions as well, but
I'm less sure). So in some sense it's already too late, and it shouldn't
hurt much more to break them again :P
But yes, breaking them in the first place was bad. I spent a lot of time
trying to figure out what was going on.

Although I just said I don't think it's a big deal to break them again, if
it's easy enough to add a warning on operations whose casting behavior
changed, with an option to disable this warning (would probably need to be
a global numpy setting -- is there a better way?), I would actually like it
even better.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Olivier Delalleau
2012/5/23 Nathaniel Smith 

> On Wed, May 23, 2012 at 6:29 PM, Travis Oliphant 
> wrote:
> > Then are you suggesting that we need to back out the changes to the
> casting
> > rules as well, because this will also cause code to stop working.   This
> is
> > part of my point.   We are not being consistently cautious.
>
> I never understood exactly what changed with the casting rules, but
> yeah, maybe. Still, the question of what our deprecation rules
> *should* be is somewhat separate from the question of what we've
> actually done (or even will do). You have to have ideals before you
> can ask whether you're living up to them :-).
>
> Didn't the casting rules become strictly stricter, i.e. some
> questionable operations that used to succeed now throw an error? If so
> then that's not a *major* violation of my suggested rules, but yeah, I
> guess it'd probably be better if they did warn. I imagine it wouldn't
> be terribly difficult to implement (add a new
> NPY_WARN_UNSAFE_CASTING_INTERNAL value, use it everywhere that used to
> be UNSAFE but now will be SAFE?), but someone who understands better
> what actually changed (Mark?) would have do it.
>

It wasn't just stricter rules. Some operations involving in particular
mixed scalar / array computations resulted in different outputs (with no
warning).

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Meta: help, devel and stackoverflow

2012-06-28 Thread Olivier Delalleau
+1 for a numpy-users list without "dev noise".

-=- Olivier

2012/6/28 Travis Oliphant 

> There are some good ideas here.
>
> I propose splitting this list into devel and users lists.
>
> This might best be done by creating a new list for users and using this
> list for development.
>
> Travis
>
> --
> Travis Oliphant
> (on a mobile)
> 512-826-7480
>
>
> On Jun 27, 2012, at 11:38 PM, srean  wrote:
>
> > Hi List,
> >
> > this has been brought up several times, and the response has been
> > generally positive but it has fallen through the cracks. So here are a
> > few repeat requests. Am keeping it terse just for brevity
> >
> > i) Split the list into [devel] and [help] and as was mentioned
> > recently [rant/flame]:
> >
> >   some requests for help get drowned out during active
> > development-related discussions, and simple help requests pollute more
> > urgent development-related matters.
> >
> > ii) Stackoverflow like site for help as well as for proposals.
> >
> >The silent majority has been referred to a few times recently. I
> > suspect there does exist many lurkers on the list who do prefer one
> > discussed solution over the other but for various reasons do not break
> > out of their lurk mode to send a mail saying "I prefer this solution".
> > Such an interface will also help in keeping track of the level of
> > support as compared to mails that are large hunks of quoted text with
> > a line or two stating one's preference or seconding a proposal.
> >
> > One thing I have learned from traffic accidents is that if one asks
> > for a help of the assembled crowd, no one knows how to respond. On the
> > other hand if you say "hey there in a blue shirt could you get some
> > water"  you get instant results. So pardon me for taking the
> > presumptuous liberty to request Travis to please set it up or
> > delegate.
> >
> > Splitting the lists shouldn't be hard work, setting up overflow might
> > be more work in comparison.
> >
> > Best
> > -- srean
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dropping support for Python 2.4 in NumPy 1.8

2012-06-28 Thread Olivier Delalleau
2012/6/28 David Cournapeau 

> Hi Travis,
>
> On Thu, Jun 28, 2012 at 1:25 PM, Travis Oliphant 
> wrote:
> > Hey all,
> >
> > I'd like to propose dropping support for Python 2.4 in NumPy 1.8 (not
> the 1.7 release).  What does everyone think of that?
>
> I think it would depend on 1.7 state. I am unwilling to drop support
> for 2.4 in 1.8 unless we make 1.7 a LTS, that would be supported up to
> 2014 Q1 (when RHEL5 stops getting security fixes - RHEL 5 is the one
> platform that warrants supporting 2.4 IMO)
>
> In my mind, it means 1.7 needs to be stable. Ondrej (and others) work
> to make sure we break neither API or ABI since a few releases would
> help achieving that.
>
> David
>

As a user stuck with Python 2.4 for an undefined period of time, I would
definitely appreciate a long-term support release that would retain Python
2.4 compatibility.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dropping support for Python 2.4 in NumPy 1.8

2012-06-28 Thread Olivier Delalleau
2012/6/28 Ralf Gommers 

>
>
> On Thu, Jun 28, 2012 at 4:44 PM, Olivier Delalleau  wrote:
>
>> 2012/6/28 David Cournapeau 
>>
>>> Hi Travis,
>>>
>>> On Thu, Jun 28, 2012 at 1:25 PM, Travis Oliphant 
>>> wrote:
>>> > Hey all,
>>> >
>>> > I'd like to propose dropping support for Python 2.4 in NumPy 1.8 (not
>>> the 1.7 release).  What does everyone think of that?
>>>
>>> I think it would depend on 1.7 state. I am unwilling to drop support
>>> for 2.4 in 1.8 unless we make 1.7 a LTS, that would be supported up to
>>> 2014 Q1 (when RHEL5 stops getting security fixes - RHEL 5 is the one
>>> platform that warrants supporting 2.4 IMO)
>>>
>>> In my mind, it means 1.7 needs to be stable. Ondrej (and others) work
>>> to make sure we break neither API or ABI since a few releases would
>>> help achieving that.
>>>
>>> David
>>>
>>
>> As a user stuck with Python 2.4 for an undefined period of time, I would
>> definitely appreciate a long-term support release that would retain Python
>> 2.4 compatibility.
>>
>
> Hi, I have an honest question for you (and other 2.4 users). Many packages
> have long since dropped 2.4 compatibility. IPython and scikit-learn require
> 2.6 as a minimum, scikits-image and statsmodels 2.5. So what do you do
> about those packages, not use them at all, or use an older version?
>
> All those packages are improving (in my opinion) at a much faster rate
> than numpy. So if you do use them, up-to-date versions of those are likely
> to be more useful than a new version of numpy. In that light, does keeping
> 2.4 support really add significant value for you?
>

I just don't use any package that is not Python 2.4-compatible. The
application I currently work with requires numpy, scipy and theano.
I might not need new features from newer numpy versions (not sure), but
fixes for bugs and future compatibility issues that may come up would be
nice.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2012-11-12 Thread Olivier Delalleau
2012/11/12 Nathaniel Smith 

> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett 
> wrote:
> > Hi,
> >
> > I wanted to check that everyone knows about and is happy with the
> > scalar casting changes from 1.6.0.
> >
> > Specifically, the rules for (array, scalar) casting have changed such
> > that the resulting dtype depends on the _value_ of the scalar.
> >
> > Mark W has documented these changes here:
> >
> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
> >
> > Specifically, as of 1.6.0:
> >
> > In [19]: arr = np.array([1.], dtype=np.float32)
> >
> > In [20]: (arr + (2**16-1)).dtype
> > Out[20]: dtype('float32')
> >
> > In [21]: (arr + (2**16)).dtype
> > Out[21]: dtype('float64')
> >
> > In [25]: arr = np.array([1.], dtype=np.int8)
> >
> > In [26]: (arr + 127).dtype
> > Out[26]: dtype('int8')
> >
> > In [27]: (arr + 128).dtype
> > Out[27]: dtype('int16')
> >
> > There's discussion about the changes here:
> >
> >
> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
> > http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
> >
> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
> >
> > It seems to me that this change is hard to explain, and does what you
> > want only some of the time, making it a false friend.
>
> The old behaviour was that in these cases, the scalar was always cast
> to the type of the array, right? So
>   np.array([1], dtype=np.int8) + 256
> returned 1? Is that the behaviour you prefer?
>
> I agree that the 1.6 behaviour is surprising and somewhat
> inconsistent. There are many places where you can get an overflow in
> numpy, and in all the other cases we just let the overflow happen. And
> in fact you can still get an overflow with arr + scalar operations, so
> this doesn't really fix anything.
>
> I find the specific handling of unsigned -> signed and float32 ->
> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
> representable as a float32, but it doesn't *overflow*, it just gives
> you 2.0**16... if I'm using float32 then I presumably don't care that
> much about exact representability, so it's surprising that numpy is
> working to enforce it, and definitely a separate decision from what to
> do about overflow.)
>
> None of those threads seem to really get into the question of what the
> best behaviour here *is*, though.
>
> Possibly the most defensible choice is to treat ufunc(arr, scalar)
> operations as performing an implicit cast of the scalar to arr's
> dtype, and using the standard implicit casting rules -- which I think
> means, raising an error if !can_cast(scalar, arr.dtype,
> casting="safe")


I like this suggestion. It may break some existing code, but I think it'd
be for the best. The current behavior can be very confusing.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2012-11-13 Thread Olivier Delalleau
2012/11/12 Matthew Brett 

> Hi,
>
> On Mon, Nov 12, 2012 at 8:15 PM, Benjamin Root  wrote:
> >
> >
> > On Monday, November 12, 2012, Olivier Delalleau wrote:
> >>
> >> 2012/11/12 Nathaniel Smith 
> >>>
> >>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett <
> matthew.br...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > I wanted to check that everyone knows about and is happy with the
> >>> > scalar casting changes from 1.6.0.
> >>> >
> >>> > Specifically, the rules for (array, scalar) casting have changed such
> >>> > that the resulting dtype depends on the _value_ of the scalar.
> >>> >
> >>> > Mark W has documented these changes here:
> >>> >
> >>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
> >>> >
> >>> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
> >>> >
> >>> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
> >>> >
> >>> > Specifically, as of 1.6.0:
> >>> >
> >>> > In [19]: arr = np.array([1.], dtype=np.float32)
> >>> >
> >>> > In [20]: (arr + (2**16-1)).dtype
> >>> > Out[20]: dtype('float32')
> >>> >
> >>> > In [21]: (arr + (2**16)).dtype
> >>> > Out[21]: dtype('float64')
> >>> >
> >>> > In [25]: arr = np.array([1.], dtype=np.int8)
> >>> >
> >>> > In [26]: (arr + 127).dtype
> >>> > Out[26]: dtype('int8')
> >>> >
> >>> > In [27]: (arr + 128).dtype
> >>> > Out[27]: dtype('int16')
> >>> >
> >>> > There's discussion about the changes here:
> >>> >
> >>> >
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
> >>> >
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
> >>> >
> >>> > It seems to me that this change is hard to explain, and does what you
> >>> > want only some of the time, making it a false friend.
> >>>
> >>> The old behaviour was that in these cases, the scalar was always cast
> >>> to the type of the array, right? So
> >>>   np.array([1], dtype=np.int8) + 256
> >>> returned 1? Is that the behaviour you prefer?
> >>>
> >>> I agree that the 1.6 behaviour is surprising and somewhat
> >>> inconsistent. There are many places where you can get an overflow in
> >>> numpy, and in all the other cases we just let the overflow happen. And
> >>> in fact you can still get an overflow with arr + scalar operations, so
> >>> this doesn't really fix anything.
> >>>
> >>> I find the specific handling of unsigned -> signed and float32 ->
> >>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
> >>> representable as a float32, but it doesn't *overflow*, it just gives
> >>> you 2.0**16... if I'm using float32 then I presumably don't care that
> >>> much about exact representability, so it's surprising that numpy is
> >>> working to enforce it, and definitely a separate decision from what to
> >>> do about overflow.)
> >>>
> >>> None of those threads seem to really get into the question of what the
> >>> best behaviour here *is*, though.
> >>>
> >>> Possibly the most defensible choice is to treat ufunc(arr, scalar)
> >>> operations as performing an implicit cast of the scalar to arr's
> >>> dtype, and using the standard implicit casting rules -- which I think
> >>> means, raising an error if !can_cast(scalar, arr.dtype,
> >>> casting="safe")
> >>
> >>
> >> I like this suggestion. It may break some existing code, but I think
> it'd
> >> be for the best. The current behavior can be very confusing.
> >>
> >> -=- Olivier
> >
> >
> >
> > "break some existing code"
> >
> > I really should set up an email filter for this phrase and have it send
> back
> > an email automatically: "Are you nuts?!"
>
> Well, hold on though, I was asking earlier in the thread what we
> thought the behavior should be in 2.0 or maybe better put, sometime in
> the future.
>
> If we know what we think the best answer is, and we think the best
> answer is worth shooting for, then we can try to think of sensible
> ways of getting there.
>
> I guess that's what Nathaniel and Olivier were thinking of but they
> can correct me if I'm wrong...
>
> Cheers,
>
> Matthew
>

This is indeed what I had in mind, thanks.
I definitely agree a (long) period with a deprecation warning would be
needed if this is changed.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy's policy for releasing memory

2012-11-13 Thread Olivier Delalleau
How are you monitoring memory usage?
Personally I've been using psutil and it seems to work well, although I've
used it only on Windows and not in applications with large numpy arrays, so
I can't tell whether it would work for you.
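
For example, a minimal check along these lines should work (assuming a
reasonably recent psutil -- older releases spelled the method
get_memory_info()):

import os
import psutil  # third-party package, as mentioned above

proc = psutil.Process(os.getpid())
print("resident memory: %.1f MB" % (proc.memory_info().rss / 1e6))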

Also, keep in mind that:
- The "auto-delete object when it goes out of scope" behavior is specific
to the CPython implementation and not part of the Python standard, so if
you're actually using a different implementation you may see a different
behavior.
- CPython deals with small objects in a special way, not actually releasing
allocated memory. For more info:
http://deeplearning.net/software/theano/tutorial/python-memory-management.html#internal-memory-management

-=- Olivier

2012/11/13 Austin Bingham 

> OK, if numpy is just subject to Python's behavior then what I'm seeing
> must be due to the vagaries of Python. I've noticed that things like
> removing a particular line of code or reordering seemingly unrelated calls
> (unrelated to the memory issue, that is) can affect when memory is reported
> as free. I'll just assume that everything is in order and carry on. Thanks!
>
> Austin
>
>
> On Tue, Nov 13, 2012 at 9:41 AM, Nathaniel Smith  wrote:
>
>> On Tue, Nov 13, 2012 at 8:26 AM, Austin Bingham
>>  wrote:
>> > I'm trying to understand how numpy decides when to release memory and
>> > whether it's possible to exert any control over that. The situation is
>> that
>> > I'm profiling memory usage on a system in which a great deal of the
>> overall
>> > memory is tied up in ndarrays. Since numpy manages ndarray memory on
>> its own
>> > (i.e. without the python gc, or so it seems), I'm finding that I can't
>> do
>> > much to convince numpy to release memory when things get tight. For
> Python
> objects, for example, I can explicitly run gc.collect().
>> >
>> > So, in an effort to at least understand the system better, can anyone
>> tell
>> > me how/when numpy decides to release memory? And is there any way via
>> either
>> > the Python or C-API to explicitly request release? Thanks.
>>
>> Numpy array memory is released when the corresponding Python objects
>> are deleted, so it exactly follows Python's rules. You can't
>> explicitly request release, because by definition, if memory is not
>> released, then it means that it's still accessible somehow, so
>> releasing it could create segfaults. Perhaps you have stray references
>> sitting around that you have forgotten to clear -- that's a common
>> cause of memory leaks in Python. gc.get_referrers() can be useful to
>> debug such things.
>>
>> Some things to note:
>> - Numpy uses malloc() instead of going through the Python low-level
>> memory allocation layer (which itself is a wrapper around malloc with
>> various optimizations for small objects). This is really only relevant
>> because it might create some artifacts depending on how your memory
>> profiler gathers data.
>> - gc.collect() doesn't do that much in Python... it only matters if
>> you have circular references. Mostly Python releases the memory
>> associated with objects as soon as the object becomes unreferenced.
>> You could try avoiding circular references, and then gc.collect()
>> won't even do anything.
>> - If you have multiple views of the same memory in numpy, then they
>> share the same underlying memory, so that memory won't be released
>> until all of the views objects are released. (The one thing to watch
>> out for is you can do something like 'huge_array = np.zeros((2,
>> 1000)); tiny_array = huge_array[:, 100]' and now, since tiny_array is a
>> view onto huge_array, the full big memory allocation will remain as long
>> as a reference to tiny_array exists.)
>>
>> -n
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float32 to float64 casting

2012-11-16 Thread Olivier Delalleau
2012/11/16 Charles R Harris 

>
>
> On Thu, Nov 15, 2012 at 8:24 PM, Gökhan Sever wrote:
>
>> Hello,
>>
>> Could someone briefly explain why these two operations are casting my
>> float32 arrays to float64?
>>
>> I1 (np.arange(5, dtype='float32')).dtype
>> O1 dtype('float32')
>>
>> I2 (10*np.arange(5, dtype='float32')).dtype
>> O2 dtype('float64')
>>
>
> This one depends on the size of the multiplier and is first present in
> 1.6.0. I suspect it is a side effect of making the type conversion code
> sensitive to magnitude.
>
>
>>
>>
>>
>> I3 (np.arange(5, dtype='float32')[0]).dtype
>> O3 dtype('float32')
>>
>> I4 (1*np.arange(5, dtype='float32')[0]).dtype
>> O4 dtype('float64')
>>
>
> This one probably depends on the fact that the element is a scalar, but
> doesn't look right. Scalars are promoted differently. Also holds in numpy
> 1.5.0 so is of old provenance.
>
> Chuck
>

My understanding is that non-mixed operations (scalar/scalar or
array/array) use casting rules that don't depend on magnitude, and the
upcast of int{32,64} mixed with float32 has always been float64 (probably
because the result has to be a kind of float, and float64 makes it possible
to represent exactly a larger integer range than float32). Note that if you
cast 1 into int16 the result will be float32 (I guess float32 can represent
exactly all int16 integers).
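
These dtype-level promotions can be checked directly with promote_types;
the results below are dtype/dtype promotions, so they shouldn't depend on
values:

import numpy as np

print(np.promote_types(np.float32, np.int16))  # float32 (float32 holds all int16 exactly)
print(np.promote_types(np.float32, np.int32))  # float64
print(np.promote_types(np.float32, np.int64))  # float64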

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float32 to float64 casting

2012-11-16 Thread Olivier Delalleau
2012/11/16 Olivier Delalleau 

> 2012/11/16 Charles R Harris 
>
>>
>>
>> On Thu, Nov 15, 2012 at 11:37 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Nov 15, 2012 at 8:24 PM, Gökhan Sever wrote:
>>>
>>>> Hello,
>>>>
>>>> Could someone briefly explain why these two operations are casting
>>>> my float32 arrays to float64?
>>>>
>>>> I1 (np.arange(5, dtype='float32')).dtype
>>>> O1 dtype('float32')
>>>>
>>>> I2 (10*np.arange(5, dtype='float32')).dtype
>>>> O2 dtype('float64')
>>>>
>>>
>>> This one depends on the size of the multiplier and is first present
>>> in 1.6.0. I suspect it is a side effect of making the type conversion code
>>> sensitive to magnitude.
>>>
>>>
>>>>
>>>>
>>>>
>>>> I3 (np.arange(5, dtype='float32')[0]).dtype
>>>> O3 dtype('float32')
>>>>
>>>> I4 (1*np.arange(5, dtype='float32')[0]).dtype
>>>> O4 dtype('float64')
>>>>
>>>
>>> This one probably depends on the fact that the element is a scalar, but
>>> doesn't look right. Scalars are promoted differently. Also holds in numpy
>>> 1.5.0 so is of old provenance.
>>>
>>>
>> This one has always bothered me:
>>
>> In [3]: (-1*arange(5, dtype=uint64)).dtype
>> Out[3]: dtype('float64')
>>
>
> My interpretation here is that since the possible results when multiplying
> an int64 with an uint64 can be signed, and can go beyond the range of
> int64, numpy prefers to cast everything to float64, which can represent
> (even if approximately) a larger range of signed values.
>

Actually, thinking about it a bit more, I suspect the logic is not related
to the result of the operation, but to the fact numpy needs to cast both
arguments into a common dtype before doing the operation, and it has no
integer dtype available that can hold both int64 and uint64 numbers, so it
uses float64 instead.
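
At the dtype level this shows up directly (a quick check):

import numpy as np

# no integer dtype can hold both int64 and uint64, so promotion falls back to float64
print(np.promote_types(np.int64, np.uint64))  # float64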

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float32 to float64 casting

2012-11-17 Thread Olivier Delalleau
2012/11/17 Gökhan Sever 

>
>
> On Sat, Nov 17, 2012 at 9:47 AM, Nathaniel Smith  wrote:
>
>> On Fri, Nov 16, 2012 at 9:53 PM, Gökhan Sever 
>> wrote:
>> > Thanks for the explanations.
>> >
>> > For either case, I was expecting to get float32 as a resulting data
>> type.
>> > Since, float32 is large enough to contain the result. I am wondering if
>> > changing the casting rule this way requires a lot of modification in the
>> NumPy
>> > code. Maybe as an alternative to the current casting mechanism?
>> >
>> > I like the way that NumPy can convert to float64, as if these
>> data-types are
>> > a continuation of each other. But the conversion might happen too
>> early
>> > -- at least in my opinion, as demonstrated in my example.
>> >
>> > For instance comparing this example to IDL surprises me:
>> >
>> > I16 np.float32()*5e38
>> > O16 2.77749998e+42
>> >
>> > I17 (np.float32()*5e38).dtype
>> > O17 dtype('float64')
>>
>> In this case, what's going on is that 5e38 is a Python float object,
>> and Python float objects have double-precision, i.e., they're
>> equivalent to np.float64's. So you're multiplying a float32 and a
>> float64. I think most people will agree that in this situation it's
>> better to use float64 for the output?
>>
>> -n
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
> OK, I see your point. Python numeric data objects and NumPy data objects
> mixed operations require more attention.
>
> The following causes float32 overflow --rather than casting to float64 as
> in the case for Python float multiplication, and behaves like in IDL.
>
> I3 (np.float32()*np.float32(5e38))
> O3 inf
>
> However, these two still surprises me:
>
> I5 (np.float32()*1).dtype
> O5 dtype('float64')
>
> I6 (np.float32()*np.int32(1)).dtype
> O6 dtype('float64')
>

That's because the current way of finding out the result's dtype is based
on input dtypes only (not on numeric values), and numpy.can_cast('int32',
'float32') is False, while numpy.can_cast('int32', 'float64') is True (and
same for int64).
Thus it decides to cast to float64.
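
For reference, the dtype-level checks I'm referring to:

import numpy as np

print(np.can_cast(np.int32, np.float32))  # False
print(np.can_cast(np.int32, np.float64))  # True
print(np.can_cast(np.int16, np.float32))  # True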

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] the mean, var, std of empty arrays

2012-11-21 Thread Olivier Delalleau
Current behavior looks sensible to me. I personally would prefer no warning
but I think it makes sense to have one as it can be helpful to detect
issues faster.
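
For instance (just an illustration; the exact warning text may differ
between versions):

import numpy as np
import warnings

a = np.array([], dtype=np.int64)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(a.mean())                          # nan
print([str(w.message) for w in caught])      # the RuntimeWarning(s) mentioned above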

-=- Olivier

2012/11/21 Charles R Harris 

> What should be the value of the mean, var, and std of empty arrays?
> Currently
>
> In [12]: a
> Out[12]: array([], dtype=int64)
>
> In [13]: a.mean()
> Out[13]: nan
>
> In [14]: a.std()
> Out[14]: nan
>
> In [15]: a.var()
> Out[15]: nan
>
> I think the nan comes from 0/0. All of these also raise warnings the first
> time they are called.
>
> Chuck
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ImportError: libatlas.so.3: cannot open shared object file

2012-12-10 Thread Olivier Delalleau
2012/12/10 Allan Kamau 

> I did add the paths to LD_LIBRARY_PATH as advised (see below), then
> "python setup.py clean;python setup.py build;python setup.py install;"  but
> the same error persists.
>
> export LAPACK=/usr/lib/lapack/liblapack.so;export
> ATLAS=/usr/lib/atlas-base/libatlas.so;
> export
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/lapack:/usr/lib/atlas-base;
>

Is the file libatlas.so.3 present in /usr/lib/lapack or /usr/lib/atlas-base?

-=- Olivier



> On Mon, Dec 10, 2012 at 2:54 PM, Alexander Eberspächer <
> alex.eberspaec...@gmail.com> wrote:
>
>> On Mon, 10 Dec 2012 13:57:04 +0300
>> Allan Kamau  wrote:
>>
>> > I have built and installed numpy on Debian from source successfully as
>> > follows.
>> [...]
>> > ImportError: libatlas.so.3: cannot open shared object file: No such
>> > file or directory
>>
>> Are the paths to ATLAS in your $LD_LIBRARY_PATH? If not, try adding
>> those.
>>
>> Hope that helps!
>>
>> Cheers,
>>
>> Alex
>>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?

2012-12-13 Thread Olivier Delalleau
I'd say it's a good idea, although I hope 1.7.x will still be maintained
for a while for those who are still stuck with Python 2.4-5 (sometimes you
don't have a choice).

-=- Olivier

2012/12/13 Charles R Harris 

> The previous proposal to drop python 2.4 support garnered no opposition.
> How about dropping support for python 2.5 also?
>
> Chuck
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?

2012-12-13 Thread Olivier Delalleau
2012/12/13 Chris Barker - NOAA Federal 

> On Thu, Dec 13, 2012 at 3:01 PM, Bradley M. Froehle
>  wrote:
> > Yes, but the point was that since you can live with an older version on
> > Python you can probably live with an older version of NumPy.
>
> exactly -- also:
>
> How likely are you to need the latest and greatest numpy but not a new
> PyGTK, or a new name_your_package_here. And, in fact, other packages
> drop support for older Python's too.
>
> However, what I can imagine is pretty irrelevant -- sorry I brought it
> up -- either there are a significant number of folks for whom support
> for old Pythons is important, or there aren't.
>

I doubt it's a common situation, but just to give an example: I am
developing some machine learning code that heavily relies on Numpy, and it
is meant to run in a large Python 2.4 software environment, which can't
easily be upgraded because it contains lots of libraries that have been
built against Python 2.4. And even if I could rebuild it, they wouldn't let
me ;) This Python code is mostly proprietary and doesn't require external
dependencies to be upgraded... except my little module that may take
advantage of Numpy improvements.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-03 Thread Olivier Delalleau
2013/1/3 Andrew Collette :
> Hi Dag,
>
>> If neither is objectively better, I think that is a very good reason to
>> kick it down to the user. "Explicit is better than implicit".
>
> I agree with you, up to a point.  However, we are talking about an
> extremely common operation that I think most people (myself included)
> would not expect to raise an exception: namely, adding a number to an
> array.
>
>> It's a good solution to encourage bug-free code. It may not be a good
>> solution to avoid typing.
>
> Ha!  But seriously, checking every time I make an addition?  And in
> the current version of numpy it's not buggy code to add 128 to an int8
> array; it's documented to give you an int16 with the result of the
> addition.  Maybe it shouldn't, but that's what it does.
>
>> I think you usually have a bug in your program when this happens, since
>> either the dtype is wrong, or the value one is trying to store is wrong.
>> I know that's true for myself, though I don't claim to know everybody
>> elses usecases.
>
> I don't think it's unreasonable to add a number to an int16 array (or
> int32), and rely on specific, documented behavior if the number is
> outside the range.  For example, IDL will clip the value.  Up until
> 1.6, in NumPy it would roll over. Currently it upcasts.
>
> I won't make the case for upcasting vs rollover again, as I think
> that's dealt with extensively in the threads linked in the bug.  I am
> concerned about the tests I need to add wherever I might have a
> scalar, or the program blows up.
>
> It occurs to me that, if I have "a = b + c" in my code, and "c" is
> sometimes a scalar and sometimes an array, I will get different
> behavior.  If I have this right, if "c" is an array of larger dtype,
> including a 1-element array, it will upcast, if it's the same dtype,
> it will roll over regardless, but if it's a scalar and the result
> won't fit, it will raise ValueError.
>
> By the way, how do I test for this?  I can't test just the scalar
> because the proposed behavior (as I understand it) considers the
> result of the addition.  Should I always compute amax (nanmax)? Do I
> need to try adding them and look for ValueError?
>
> And things like this suddenly become dangerous:
>
> try:
> some_function(myarray + something)
> except ValueError:
>print "Problem in some_function!"

Actually, the proposed behavior considers only the value of the
scalar, not the result of the addition.
So the correct way to do things with this proposal would be to be sure
you don't add to an array a scalar value that can't fit in the array's
dtype.
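
A simple way to do that check -- just a sketch, the fits() helper below is
not a NumPy function:

import numpy as np

def fits(value, dtype):
    # does an integer `value` fit in integer `dtype`?
    info = np.iinfo(dtype)
    return info.min <= value <= info.max

print(fits(127, np.int8), fits(128, np.int8))  # True False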

In 1.6.1, you should make this check anyway, since otherwise your
computation can be doing something completely different without
telling you (and I doubt it's what you'd want):
In [50]: np.array([2], dtype='int8') + 127
Out[50]: array([-127], dtype=int8)
In [51]: np.array([2], dtype='int8') + 128
Out[51]: array([130], dtype=int16)

If the decision is to always roll-over, the first thing to decide is
whether this means the scalar is downcasted, or the output of the
computation. It doesn't matter for +, but for instance for the
"maximum" ufunc, I don't think it makes sense to perform the
computation at higher precision then downcast the output, as you would
otherwise have:
np.maximum(np.ones(1, dtype='int8'), 128) == [-128]
So out of consistency (across ufuncs) I think it should always
downcast the scalar (it has the advantage of being more efficient too,
since you don't need to do an upcast to perform the computation). But
then you're up for some nasty surprise if your scalar overflows and
you didn't expect it. For instance the "maximum" example above would
return [1], which may be expected... or not (maybe you wanted to
obtain [128] instead?).
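
To make that concrete, here is roughly what "always downcast the scalar"
would mean, sketched with an explicit cast:

import numpy as np

a = np.ones(1, dtype=np.int8)
s = np.array(128).astype(np.int8)  # explicit downcast: 128 wraps around to -128
print(np.maximum(a, s))            # [1], probably not what maximum(a, 128) was meant to return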

Another solution is to forget about trying to be smart and always
upcast the operation. That would be my 2nd preferred solution, but it
would make it very annoying to deal with Python scalars (typically
int64 / float64) that would be upcasting lots of things, potentially
breaking a significant amount of existing code.

So, personally, I don't see a straightforward solution without
warning/error, that would be safe enough for programmers.

-=- Olivier

>
> Nathaniel asked:
>
>> But if this is something you're running into in practice then you may have a 
>> better idea than us about the practical effects. Do you have any examples 
>> where this has come up that you can share?
>
> The only time I really ran into the 1.5/1.6 change was some old code
> ported from IDL which did odd things with the wrapping behavior.  But
> what I'm really trying to get a handle on here is the proposed future
> behavior.  I am coming to this from the perspective of both a user and
> a library developer (h5py) trying to work out what if anything I have
> to do when handling arrays and values I get from users.
>
> Andrew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-04 Thread Olivier Delalleau
2013/1/3 Andrew Collette :
>> Another solution is to forget about trying to be smart and always
>> upcast the operation. That would be my 2nd preferred solution, but it
>> would make it very annoying to deal with Python scalars (typically
>> int64 / float64) that would be upcasting lots of things, potentially
>> breaking a significant amount of existing code.
>>
>> So, personally, I don't see a straightforward solution without
>> warning/error, that would be safe enough for programmers.
>
> I guess what's really confusing me here is that I had assumed that this:
>
> result = myarray + scalar
>
> was equivalent to this:
>
> result = myarray + numpy.array(scalar)
>
> where the dtype of the converted scalar was chosen to be "just big
> enough" for it to fit.  Then you proceed using the normal rules for
> array addition.  Yes, you can have upcasting or rollover depending on
> the values involved, but you have that anyway with array addition;
> it's just how arrays work in NumPy.

A key difference is that with arrays, the dtype is not chosen "just
big enough" for your data to fit. Either you set the dtype yourself,
or you're using the default inferred dtype (int/float). In both cases
you should know what to expect, and it doesn't depend on the actual
numeric values (except for the auto int/float distinction).

>
> Also, have I got this (proposed behavior) right?
>
> array([127], dtype=int8) + 128 -> ValueError
> array([127], dtype=int8) + 127 -> -2
>
> It seems like all this does is raise an error when the current rules
> would require upcasting, but still allows rollover for smaller values.
>  What error condition, specifically, is the ValueError designed to
> tell me about?   You can still get "unexpected" data (if you're not
> expecting rollover) with no exception.

The ValueError is here to warn you that the operation may not be doing
what you want. The rollover for smaller values would be the documented
(and thus hopefully expected) behavior.

Taking the addition as an example may be misleading, as it makes it
look like we could just "always rollover" to obtain consistent
behavior, and programmers are to some extent used to integer rollover
on this kind of operation. However, I gave examples with "maximum"
that I believe show it's not that easy (this behavior would just
appear "wrong"). Another example is with the integer division, where
casting the scalar silently would result in
array([-128], dtype=int8) // 128 -> [1]
which is unlikely to be something someone would like to obtain.
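
Sketched with an explicit cast, just for illustration:

import numpy as np

a = np.array([-128], dtype=np.int8)
s = np.array(128).astype(np.int8)  # silent downcast would wrap 128 to -128
print(a // s)                      # [1], since -128 // -128 == 1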

To summarize the goals of the proposal (in my mind):
1. Low cognitive load (simple and consistent across ufuncs).
2. Low risk of doing something unexpected.
3. Efficient by default.
4. Most existing (non buggy) code should not be affected.

If we always do the silent cast, it will significantly break existing
code relying on the 1.6 behavior, and increases the risk of doing
something unexpected (bad on #2 & #4)
If we always upcast, we may break existing code and lose efficiency
(bad on #3 and #4).
If we keep current behavior, we stay with something that's difficult
to understand and has high risk of doing weird things (bad on #1 and
#2).

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Scalar casting rules use-case reprise

2013-01-04 Thread Olivier Delalleau
2013/1/4 Nathaniel Smith :
> On Fri, Jan 4, 2013 at 11:09 AM, Matthew Brett  
> wrote:
>> Hi,
>>
>> Reading the discussion on the scalar casting rule change I realized I
>> was hazy on the use-cases that led to the rule that scalars cast
>> differently from arrays.
>>
>> My impression was that the primary use-case was for lower-precision
>> floats. That is, when you have a large float32 arr, you do not want to
>> double your memory use with:
>>
> large_float32 + 1.0 # please no float64 here
>>
>> Probably also:
>>
> large_int8 + 1 # please no int32 / int64 here.
>>
>> That makes sense.  On the other hand these are more ambiguous:
>>
> large_float32 + np.float64(1) # really - you don't want float64?
>>
> large_int8 + np.int32(1) # ditto
>>
>> I wonder whether the main use-case was to deal with the automatic
>> types of Python floats and scalars?  That is, I wonder whether it
>> would be worth considering (in the distant long term), doing fancy
>> guess-what-you-mean stuff with Python scalars, on the basis that they
>> are of unspecified dtype, and make 0 dimensional scalars follow the
>> array casting rules.  As in:
>>
> large_float32 + 1.0
>> # no upcast - we don't know what float type you meant for the scalar
> large_float32 + np.float64(1)
>> # upcast - you clearly meant the scalar to be float64
>
> Hmm, but consider this, which is exactly the operation in your example:
>
> In [9]: a = np.arange(3, dtype=np.float32)
>
> In [10]: a / np.mean(a) # normalize
> Out[10]: array([ 0.,  1.,  2.], dtype=float32)
>
> In [11]: type(np.mean(a))
> Out[11]: numpy.float64
>
> Obviously the most common situation where it's useful to have the rule
> to ignore scalar width is for avoiding "width contamination" from
> Python float and int literals. But you can easily end up with numpy
> scalars from indexing, high-precision operations like np.mean, etc.,
> where you don't "really mean" you want high-precision. And at least
> it's easy to understand the rule: same-kind scalars don't affect
> precision.
>
> ...Though arguably the bug here is that np.mean actually returns a
> value with higher precision. Interestingly, we seem to have some
> special cases so that if you want to normalize each row of a matrix,
> then again the dtype is preserved, but for a totally different
> reasons. In
>
> a = np.arange(4, dtype=np.float32).reshape((2, 2))
> a / np.mean(a, axis=0, keepdims=True)
>
> the result has float32 type, even though this is an array/array
> operation, not an array/scalar operation. The reason is:
>
> In [32]: np.mean(a).dtype
> Out[32]: dtype('float64')
>
> But:
>
> In [33]: np.mean(a, axis=0).dtype
> Out[33]: dtype('float32')
>
> In this respect np.var and np.std behave like np.mean, but np.sum
> always preserves the input dtype. (Which is curious because np.sum is
> just like np.mean in terms of potential loss of precision, right? The
> problem in np.mean is the accumulating error over many addition
> operations, not the divide-by-n at the end.)

IMO having a different dtype depending on whether or not you provide
the "axis" argument to mean() should be considered as a bug.
As to what the correct dtype should be... it's not such an easy
question. Personally I would go with float64 by default to be
consistent across all int / float dtypes. Then someone who wants to
downcast it can use the "out" argument to mean().

To come back to Matthew's use-case question, I agree the most common
use case is to prevent a float32 or small int array from being
upcasted, and most of the time this would come from Python scalars.
However I don't think it's a good idea to have a behavior that is
different between Python and Numpy scalars, because it's a subtle
difference that users could have trouble understanding & foreseeing.
The expected behavior of numpy functions when providing them with
non-numpy objects is they should behave the same as if we had called
numpy.asarray() on these objects, and straying away from this behavior
seems dangerous to me.

As far as I'm concerned, in a world where numpy would be brand new
with no existing codebase using it, I would probably prefer to use the
same casting rules for array/array and array/scalar operations. It may
cause some unwanted array upcasting, but it's a lot simpler to
understand. However, given that there may be a lot of code relying on
the current dtype-preserving behavior, doing it now doesn't sound like
a good idea to me.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-04 Thread Olivier Delalleau
(sorry, no time for full reply, so for now just answering what I
believe is the main point)

2013/1/4 Andrew Collette :
>> The ValueError is here to warn you that the operation may not be doing
>> what you want. The rollover for smaller values would be the documented
>> (and thus hopefully expected) behavior.
>
> Right, but what confuses me is that the only thing this prevents is
> the current upcast behavior.  Why is that so evil it should be
> replaced with an exception?

The evilness lies in the silent switch between the rollover and upcast
behavior, as in the example I gave previously:

In [50]: np.array([2], dtype='int8') + 127
Out[50]: array([-127], dtype=int8)
In [51]: np.array([2], dtype='int8') + 128
Out[51]: array([130], dtype=int16)

If the scalar is the user-supplied value, it's likely you actually
want a fixed behavior (either rollover or upcast) regardless of the
numeric value being provided.

Looking at what other numeric libraries are doing is definitely a good
suggestion.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-06 Thread Olivier Delalleau
2013/1/5 Nathaniel Smith :
> On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette
>  wrote:
>> I agree the current behavior is confusing.  Regardless of the details
>> of what to do, I suppose my main objection is that, to me, it's really
>> unexpected that adding a number to an array could result in an
>> exception.
>
> I think the main objection to the 1.5 behaviour was that it violated
> "Errors should never pass silently." (from 'import this'). Granted
> there are tons of places where numpy violates this but this is the one
> we're thinking about right now...
>
> Okay, here's another idea I'll throw out, maybe it's a good compromise:
>
> 1) We go back to the 1.5 behaviour.
>
> 2) If this produces a rollover/overflow/etc., we signal that using the
> standard mechanisms (whatever is configured via np.seterr). So by
> default things like
>   np.maximum(np.array([1, 2, 3], dtype=uint8), 256)
> would succeed (and produce [1, 2, 3] with dtype uint8), but also issue
> a warning that 256 had rolled over to become 0. Alternatively those
> who want to be paranoid could call np.seterr(overflow="raise") and
> then it would be an error.

That'd work for me as well. Although I'm not sure about the name
"overflow", it sounds generic enough that it may be associated to many
different situations. If I want to have an error but only for this
very specific scenario (an "unsafe" cast in a mixed scalar/array
operation), would that be possible?

Also, do we all agree that "float32 array + float64 scalar" should
cast the scalar to float32 (thus resulting in a float32 array as
output) without warning, even if the scalar can't be represented
exactly in float32?

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-06 Thread Olivier Delalleau
2013/1/6 Nathaniel Smith :
> On Mon, Jan 7, 2013 at 1:43 AM, Olivier Delalleau  wrote:
>> 2013/1/5 Nathaniel Smith :
>>> On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette
>>>  wrote:
>>>> I agree the current behavior is confusing.  Regardless of the details
>>>> of what to do, I suppose my main objection is that, to me, it's really
>>>> unexpected that adding a number to an array could result in an
>>>> exception.
>>>
>>> I think the main objection to the 1.5 behaviour was that it violated
>>> "Errors should never pass silently." (from 'import this'). Granted
>>> there are tons of places where numpy violates this but this is the one
>>> we're thinking about right now...
>>>
>>> Okay, here's another idea I'll throw out, maybe it's a good compromise:
>>>
>>> 1) We go back to the 1.5 behaviour.
>>>
>>> 2) If this produces a rollover/overflow/etc., we signal that using the
>>> standard mechanisms (whatever is configured via np.seterr). So by
>>> default things like
>>>   np.maximum(np.array([1, 2, 3], dtype=uint8), 256)
>>> would succeed (and produce [1, 2, 3] with dtype uint8), but also issue
>>> a warning that 256 had rolled over to become 0. Alternatively those
>>> who want to be paranoid could call np.seterr(overflow="raise") and
>>> then it would be an error.
>>
>> That'd work for me as well. Although I'm not sure about the name
>> "overflow", it sounds generic enough that it may be associated to many
>> different situations. If I want to have an error but only for this
>> very specific scenario (an "unsafe" cast in a mixed scalar/array
>> operation), would that be possible?
>
> I suggested "overflow" because that's how we signal rollover in
> general right now:
>
> In [5]: np.int8(100) * np.int8(2)
> /home/njs/.user-python2.7-64bit/bin/ipython:1: RuntimeWarning:
> overflow encountered in byte_scalars
>   #!/home/njs/.user-python2.7-64bit/bin/python
> Out[5]: -56
>
> Two caveats on this: One, right now this is only implemented for
> scalars, not arrays -- which is bug #593 -- and two, I actually agree
> (?) that integer rollover and float overflow are different things we
> should probably add a new category to np.seterr() for integer rollover
> specifically.
>
> But the proposal here is that we not add a specific category for
> "unsafe cast" (which we would then have to define!), but instead just
> signal it using the standard mechanisms for the particular kind of
> corruption that happened. (Which right now is overflow, and might
> become something else later.)

Hehe, I didn't even know there was supposed to be a warning for arrays... Ok.

But I'm not convinced that re-using the "overflow" category is a good
idea, because to me overflow is typically associated with the result
of an operation (when it goes beyond the dtype's supported range),
while here the problem is with the unsafe cast of an input (even if it
makes no difference for addition, it does for some other ufuncs). I
may also want to have different error settings for operation overflow
vs. input overflow.

It may just be me though... let's see what others think about it.

>
>> Also, do we all agree that "float32 array + float64 scalar" should
>> cast the scalar to float32 (thus resulting in a float32 array as
>> output) without warning, even if the scalar can't be represented
>> exactly in float32?
>
> I guess for consistency, if this proposal is adopted then a float64
> which ends up getting cast to 'inf' or 0.0 should trigger an overflow
> or underflow warning respectively... e.g.:
>
> In [12]: np.float64(1e300)
> Out[12]: 1.0001e+300
>
> In [13]: np.float32(_12)
> Out[13]: inf
>
> ...but otherwise I think yes we agree.

Sounds good to me.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Olivier Delalleau
2013/1/8 Andrew Collette :
> Hi,
>
>> I think you are voting strongly for the current casting rules, because
>> they make it less obvious to the user that scalars are different from
>> arrays.
>
> Maybe this is the source of my confusion... why should scalars be
> different from arrays?  They should follow the same rules, as closely
> as possible.  If a scalar value would fit in an int16, why not add it
> using the rules for an int16 array?

As I mentioned in another post, I also agree that it would make things
simpler and safer to just yield the same result as if we were using a
one-element array.

My understanding of the motivation for the rule "scalars do not upcast
arrays unless they are of a fundamentally different type" is that it
avoids accidentally upcasting arrays in operations like "x + 1" (for
instance if x is a float32 array, the upcast would yield a float64
result, and if x is an int16, it would yield int64), which may waste
memory. I find it a useful feature, however I'm not sure it's worth
the headaches it can lead to.
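
For instance, with Python scalars (this is the behavior under the rules
discussed here):

import numpy as np

x = np.zeros(3, dtype=np.float32)
print((x + 1.0).dtype)  # float32 -- the Python float does not upcast the array
y = np.zeros(3, dtype=np.int16)
print((y + 1).dtype)    # int16  -- same for a small Python int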

However, my first reaction at the idea of dropping this rule
altogether is that it would lead to a long and painful deprecation
process. I may be wrong though, I really haven't thought about it
much.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Olivier Delalleau
2013/1/8 Sebastian Berg :
> On Tue, 2013-01-08 at 19:59 +, Nathaniel Smith wrote:
>> On 8 Jan 2013 17:24, "Andrew Collette"  wrote:
>> >
>> > Hi,
>> >
>> > > I think you are voting strongly for the current casting rules, because
>> > > they make it less obvious to the user that scalars are different from
>> > > arrays.
>> >
>> > Maybe this is the source of my confusion... why should scalars be
>> > different from arrays?  They should follow the same rules, as closely
>> > as possible.  If a scalar value would fit in an int16, why not add it
>> > using the rules for an int16 array?
>>
>> The problem is that the rule for arrays - and for every other part of
>> numpy in general - is that we *don't* pick types based on values.
>> Numpy always uses input types to determine output types, not input
>> values.
>>
>> # This value fits in an int8
>> In [5]: a = np.array([1])
>>
>> # And yet...
>> In [6]: a.dtype
>> Out[6]: dtype('int64')
>>
>> In [7]: small = np.array([1], dtype=np.int8)
>>
>> # Computing 1 + 1 doesn't need a large integer... but we use one
>> In [8]: (small + a).dtype
>> Out[8]: dtype('int64')
>>
>> Python scalars have unambiguous types: a Python 'int' is a C
>> 'long', and a Python 'float' is a C 'double'. And these are the types
>> that np.array() converts them to. So it's pretty unambiguous that
>> "using the same rules for arrays and scalars" would mean, ignore the
>> value of the scalar, and in expressions like
>>   np.array([1], dtype=np.int8) + 1
>> we should always upcast to int32/int64. The problem is that this makes
>> working with narrow types very awkward for no real benefit, so
>> everyone pretty much seems to want *some* kind of special case. These
>> are both absolutely special cases:
>>
>> numarray through 1.5: in a binary operation, if one operand has
>> ndim==0 and the other has ndim>0, ignore the width of the ndim==0
>> operand.
>>
>> 1.6, your proposal: in a binary operation, if one operand has ndim==0
>> and the other has ndim>0, downcast the ndim==0 item to the smallest
>> width that is consistent with its value and the other operand's type.
>>
>
> Well, that leaves the maybe not quite implausible proposal of saying
> that numpy scalars behave like arrays with ndim>0, but python scalars
> behave like they do in 1.6. to allow for easier working with narrow
> types.

I know I already said it, but I really think it'd be a bad idea to
have a different behavior between Python scalars and Numpy scalars,
because I think most people would expect them to behave the same (when
knowing what dtype is a Python float / int). It could lead to very
tricky bugs to handle them differently.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Olivier Delalleau
2013/1/8 Chris Barker - NOAA Federal :
> On Tue, Jan 8, 2013 at 12:43 PM, Alan G Isaac  wrote:
>>> New users don't use narrow-width dtypes... it's important to remember
>
>> 1. I think the first statement is wrong.
>> Control over dtypes is a good reason for
>> a new use to consider NumPy.
>
> Absolutely.
>
>> Because NumPy supports broadcasting,
>> it is natural for array-array operations and
>> scalar-array operations to be consistent.
>> I believe anything else will be too confusing.
>
> Theoretically true -- but in practice, the problem arises because it
> is easy to write literals with the standard python scalars, so one is
> very likely to want to do:
>
> arr = np.zeros((m,n), dtype=np.uint8)
> arr += 3
>
> and not want an upcast.

Note that the behavior with in-place operations is also an interesting
topic, but slightly different, since there is no ambiguity on the
dtype of the output (which is required to match that of the input). I
was actually thinking about this earlier today but decided not to
mention it yet to avoid making the discussion even more complex ;)

The key question is whether the operand should be cast before the
operation, or whether to perform the operation in an upcasted array,
then downcast it back into the original version. I actually think the
latter makes more sense (and that's actually what's being done I think
in 1.6.1 from a few tests I tried), and to me this is an argument in
favor of the upcast behavior for non-inplace operations.
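
A small illustration; only the in-place result is certain here, while the
out-of-place dtype is precisely what is being debated:

import numpy as np

a = np.zeros(3, dtype=np.float32)
a += np.float64(0.1)       # in-place: the output dtype is fixed by `a`
print(a.dtype)             # float32 -- whatever the intermediate precision, the stored result is float32

b = np.zeros(3, dtype=np.float32)
c = b + np.float64(0.1)    # out-of-place: float32 under the 1.6-era scalar rules
print(c.dtype)             # (other promotion rules may give float64 here)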

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Olivier Delalleau
On Tuesday, January 8, 2013, Andrew Collette wrote:

> Hi Dag,
>
> > So you are saying that, for an array x, you want
> >
> > x + random.randint(10)
> >
> > to produce an array with a random dtype?
>
> Under the proposed behavior, depending on the dtype of x and the value
> from random, this would sometimes add-with-rollover and sometimes
> raise ValueError.
>
> Under the 1.5 behavior, it would always add-with-rollover and preserve
> the type of x.
>
> Under the 1.6 behavior, it produces a range of dtypes, each of which
> is at least large enough to hold the random int.
>
> Personally, I prefer the third option, but I strongly prefer either
> the second or the third to the first.
>
> Andrew
>

Keep in mind that in the third option (current 1.6 behavior) the dtype is
large enough to hold the random number, but not necessarily to hold the
result. So for instance if x is an int16 array with only positive values,
the result of this addition may contain negative values (or not, depending
on the number being drawn). That's the part I feel is flawed with this
behavior, it is quite unpredictable.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-08 Thread Olivier Delalleau
On Tuesday, January 8, 2013, Andrew Collette wrote:

> Hi,
>
> > Keep in mind that in the third option (current 1.6 behavior) the dtype is
> > large enough to hold the random number, but not necessarily to hold the
> > result. So for instance if x is an int16 array with only positive values,
> > the result of this addition may contain negative values (or not,
> depending
> > on the number being drawn). That's the part I feel is flawed with this
> > behavior, it is quite unpredictable.
>
> Yes, certainly.  But in either the proposed or 1.5 behavior, if the
> values in x are close to the limits of the type, this can happen also.
>

My previous email may not have been clear enough, so to be sure: in my
above example, if the random number is 3, then the result may
contain negative
values (int16). If the random number is 5, then the result will only
contain positive values (upcast to int32). Do you believe it is a good
behavior?

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Olivier Delalleau
2013/1/14 Matthew Brett :
> Hi,
>
> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld
>  wrote:
>> Robert Kern  gmail.com> writes:
>>
>>>
>>> >>> >
>>> >>> > One alternative that does not expand the API with two-liners is to let
>>> >>> > the ndarray.fill() method return self:
>>> >>> >
>>> >>> >   a = np.empty(...).fill(20.0)
>>> >>>
>>> >>> This violates the convention that in-place operations never return
>>> >>> self, to avoid confusion with out-of-place operations. E.g.
>>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
>>> >>> np.sort(), and in the broader Python world, list.sort() versus
>>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit
>>> >>> reason given for list.sort to not return self, even.)
>>> >>>
>>> >>> Maybe enabling this idiom is a good enough reason to break the
>>> >>> convention ("Special cases aren't special enough to break the rules. /
>>> >>> Although practicality beats purity"), but it at least makes me -0 on
>>> >>> this...
>>> >>>
>>> >>
>>> >> I tend to agree with the notion that inplace operations shouldn't return
>>> >> self, but I don't know if it's just because I've been conditioned this 
>>> >> way.
>>> >> Not returning self breaks the fluid interface pattern [1], as noted in a
>>> >> similar discussion on pandas [2], FWIW, though there's likely some way to
>>> >> have both worlds.
>>> >
>>> > Ah-hah, here's the email where Guido officially proclaims that there
>>> > shall be no "fluent interface" nonsense applied to in-place operators
>>> > in Python, because it hurts readability (at least for Dutch people
>>> > ):
>>> >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html
>>>
>>> That's a statement about the policy for the stdlib, and just one
>>> person's opinion. You, and numpy, are permitted to have a different
>>> opinion.
>>>
>>> In any case, I'm not strongly advocating for it. It's violation of
>>> principle ("no fluent interfaces") is roughly in the same ballpark as
>>> np.filled() ("not every two-liner needs its own function"), so I
>>> thought I would toss it out there for consideration.
>>>
>>> --
>>> Robert Kern
>>>
>>
>> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
>> downsides to breaking the convention but I regularly see a big issue with 
>> there
>> being no way to instantiate an array with a particular value.
>>
>> The one obvious way to do it is use ones and multiply by the value you want. 
>> I
>> work with a lot of inexperienced programmers and I see this idiom all the 
>> time.
>> It takes a fair amount of numpy knowledge to know that you should do it in 
>> two
>> lines by using empty and setting a slice.
>>
>> In [1]: %timeit NaN*ones(1)
>> 1000 loops, best of 3: 1.74 ms per loop
>>
>> In [2]: %%timeit
>>...: x = empty(1, dtype=float)
>>...: x[:] = NaN
>>...:
>> 1 loops, best of 3: 28 us per loop
>>
>> In [3]: 1.74e-3/28e-6
>> Out[3]: 62.142857142857146
>>
>>
>> Even when not in the mythical "tight loop" setting an array to one and then
>> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude 
>> slower
>> than what we know they *should* be doing.
>>
>> I'm agnostic as to whether fill should be modified or new functions provided 
>> but
>> I think numpy is currently missing this functionality and that providing it
>> would save a lot of new users from shooting themselves in the foot 
>> performance-
>> wise.
>
> Is this a fair summary?
>
> => fill(shape, val), fill_like(arr, val) - new functions, as proposed
> For: readable, seems to fit a pattern often used, presence in
> namespace may clue people into using the 'fill' rather than * val or +
> val
> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe
> cluttering already full namespace.
>
> => empty(shape).fill(val) - by allowing return value from arr.fill(val)
> For: readable
> Con: breaks guideline not to return anything from in-place operations,
> no presence in namespace means users may not find this pattern.
>
> => no new API
> For : easy maintenance
> Con : harder for users to discover fill pattern, filling a new array
> requires two lines instead of one.
>
> So maybe the decision rests on:
>
> How important is it that users see these function names in the
> namespace in order to discover the pattern "a = ones(shape) ;
> a.fill(val)"?
>
> How important is it to obey guidelines for no-return-from-in-place?
>
> How important is it to avoid expanding the namespace?
>
> How common is this pattern?
>
> On the last, I'd say that the only common use I have for this pattern
> is to fill an array with NaN.

My 2 cts from a user perspective:

- +1 to have such a function. I usually use numpy.ones * scalar
because honestly, spending two lines of code for such a basic
operation seems like a waste, even if it's slower and potentially
dangerous due to casting rules (a quick sketch of the two idioms is
below).
- I think having a noun rather than a verb makes more sense since we
have numpy.ones and numpy.zeros.
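
For completeness, the two idioms side by side (array size picked
arbitrarily):

import numpy as np

a = np.nan * np.ones(10000)  # the readable one-liner: allocates ones, then multiplies
b = np.empty(10000)
b[:] = np.nan                # the cheaper two-line idiom discussed in this thread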

Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-17 Thread Olivier Delalleau
2013/1/17 Matthew Brett :
> Hi,
>
> On Thu, Jan 17, 2013 at 10:27 PM, Mark Wiebe  wrote:
>>
>> On Thu, Jan 17, 2013 at 2:10 PM, Benjamin Root  wrote:
>>>
>>>
>>>
>>> On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing  wrote:

 On 2013/01/17 4:13 AM, Pierre Haessig wrote:
 > Hi,
 >
 >> On 14/01/2013 20:05, Benjamin Root wrote:
 >> I do like the way you are thinking in terms of the broadcasting
 >> semantics, but I wonder if that is a bit awkward.  What I mean is, if
 >> one were to use broadcasting semantics for creating an array, wouldn't
 >> one have just simply used broadcasting anyway?  The point of
 >> broadcasting is to _avoid_ the creation of unneeded arrays.  But maybe
 >> I can be convinced with some examples.
 >
 > I feel that one of the point of the discussion is : although a new (or
 > not so new...) function to create a filled array would be more elegant
 > than the existing pair of functions "np.zeros" and "np.ones", there are
 > maybe not so many usecases for filled arrays *other than zeros values*.
 >
 > I can remember having initialized a non-zero array *some months ago*.
 > As an anecdote, it was a vector of discretized vehicle speed values
 > which I wanted to be initialized with a predefined mean speed value
 > prior to some optimization. In that usecase, I really didn't care about
 > the performance of this initialization step.
 >
 > So my overall feeling after this thread is
 >   - *yes* a single dedicated fill/init/someverb function would give a
 > slightly better API,
 >   -  but *no* it's not important because np.empty and np.zeros cover
 > 95
 > % of use cases!

 I agree with your summary and conclusion.

 Eric

>>>
>>> Can we at least have a np.nans() and np.infs() functions?  This should
>>> cover an additional 4% of use-cases.
>>>
>>> Ben Root
>>>
>>> P.S. - I know they aren't verbs...
>>
>>
>> Would it be too weird or clumsy to extend the empty and empty_like functions
>> to do the filling?
>>
>> np.empty((10, 10), fill=np.nan)
>> np.empty_like(my_arr, fill=np.nan)
>
> That sounds like a good idea to me.  Someone wanting a fast way to
> fill an array will probably check out the 'empty' docstring first.
>
> See you,
>
> Matthew

+1 from me. Even though it *is* weird to have both "empty" and "fill" ;)
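
Just to be clear about what I'm +1 on, here is a tiny sketch of the
proposed semantics; np.empty has no such 'fill' argument today and the
helper name below is made up:

import numpy as np

def empty_filled(shape, fill_value, dtype=float):
    # hypothetical helper mimicking the proposed np.empty(shape, fill=...) behavior
    out = np.empty(shape, dtype=dtype)
    out[:] = fill_value
    return out

arr = empty_filled((2, 3), np.nan)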

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Casting Bug or a "Feature"?

2013-01-17 Thread Olivier Delalleau
2013/1/16  :
> On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh
>  wrote:
>> Thanks, everyone for chiming in.  Now that I know this behavior exists, I
>> can explicitly prevent it in my code. However, it would be nice if a warning
>> or something was generated to alert users about the inconsistency between
>> var += ... and var = var + ...
>
> Since I also got bitten by this recently in my code, I fully agree.
> I could live with an exception for lossy down casting in this case.

About exceptions: someone mentioned in another thread about casting
how having exceptions can make it difficult to write code. I've
thought a bit more about this issue and I tend to agree, especially on
code that used to "work" (in the sense of doing something -- not
necessarily what you'd want -- without complaining).

Don't get me wrong: when I write code, I love it when a library crashes
and forces me to be more explicit about what I want, thus saving me
the trouble of hunting down a tricky overflow / casting bug. However,
in a production environment for instance, such an unexpected crash
could have much worse consequences than an incorrect output. And
although you may blame the programmer for not being careful enough
about types, they couldn't have anticipated such a crash back when the
code was written.

Long story short, +1 for warning, -1 for exception, and +1 for a
config flag that allows one to change to exceptions by default, if
desired.
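
For reference, a minimal sketch of the behavior that started this thread, as
NumPy behaved at the time (later versions raise an error on the in-place case
instead):

import numpy as np

a = np.zeros(3, dtype=np.uint8)
b = np.array([0.5, 1.5, 2.5])

print((a + b).dtype)   # float64: the out-of-place result is upcast
a += b                 # in-place: b is silently cast down to uint8
print(a)               # [0 1 2]  (fractional parts silently lost)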

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-17 Thread Olivier Delalleau
2013/1/17 Matthew Brett :
> Hi,
>
> On Fri, Jan 18, 2013 at 1:04 AM, Chris Barker - NOAA Federal
>  wrote:
>> On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett  
>> wrote:
>>
>>> I am starting to wonder if we should aim for making
>>>
>>> * scalar and array casting rules the same;
>>> * Python int / float scalars become int32 / 64 or float64;
>>
>> aren't they already? I'm not sure what you are proposing.
>
> Sorry - yes that is what they are already, this sentence refers back
> to an earlier suggestion of mine on the thread, which I am discarding.
>
>>> This has the benefit of being very easy to understand and explain.  It
>>> makes dtypes predictable in the sense they don't depend on value.
>>
>> That is key -- I don't think casting should ever depend on value.
>>
>>> Those wanting to maintain - say - float32 will need to cast scalars to 
>>> float32.
>>>
>>> Maybe the use-cases motivating the scalar casting rules - maintaining
>>> float32 precision in particular - can be dealt with by careful casting
>>> of scalars, throwing the burden onto the memory-conscious to maintain
>>> their dtypes.
>>
>> IIRC this is how it worked "back in the day" (the Numeric day? -- and
>> I'm pretty sure that in the long run it worked out badly. the core
>> problem is that there are only python literals for a couple types, and
>> it was oh so easy to do things like:
>>
>> my_arr = np,zeros(shape, dtype-float32)
>>
>> another_array = my_array * 4.0
>>
>> and you'd suddenly get a float64 array. (of course, we already know
>> all that..) I suppose this has the up side of being safe, and having
>> scalar and array casting rules be the same is of course appealing, but
>> you use a particular size dtype for a reason,and it's a real pain to
>> maintain it.
>
> Yes, I do understand that.  The difference - as I understand it - is
> that back in the day, numeric did not have the the float32 etc
> scalars, so you could not do:
>
> another_array = my_array * np.float32(4.0)
>
> (please someone correct me if I'm wrong).
>
>> Casual users will use the defaults that match the Python types anyway.
>
> I think what we are reading in this thread is that even experienced
> numpy users can find the scalar casting rules surprising, and that's a
> real problem, it seems to me.
>
> The person with a massive float32 array certainly should have the
> ability to control upcasting, but I think the default should be the
> least surprising thing, and that, it seems to me, is for the casting
> rules to be the same for arrays and scalars.   In the very long term.

That would also be my preference, after banging my head against this
problem for a while now, because it's simple and consistent.

Since most of the related issues seem to come from integer arrays, a
middle-ground may be the following:
- Integer-type arrays get upcasted by scalars as in usual array /
array operations.
- Float/Complex-type arrays don't get upcasted by scalars except when
the scalar is complex and the array is float.

It makes the rule a bit more complex, but has the advantage of better
preserving float types while getting rid of most issues related to
integer overflows.
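
For concreteness, a sketch of the 1.6-era scalar casting behavior being
debated here (the value dependence shows up on the integer side):

import numpy as np

f = np.ones(3, dtype=np.float32)
print((f + 4.0).dtype)             # float32: the float scalar does not upcast
print((f + np.float64(4)).dtype)   # float32 too, whatever the scalar's precision

i = np.ones(3, dtype=np.int8)
print((i + 1).dtype)     # int8
print((i + 300).dtype)   # int16: the result depends on the scalar's *value*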

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Casting Bug or a "Feature"?

2013-01-18 Thread Olivier Delalleau
On Friday, January 18, 2013, Chris Barker - NOAA Federal wrote:

> On Thu, Jan 17, 2013 at 5:19 PM, Olivier Delalleau 
> >
> wrote:
> > 2013/1/16  >:
> >> On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh
> >> > wrote:
>
> >> I could live with an exception for lossy down casting in this case.
>
> I'm not sure what the idea here is -- would you only get an exception
> if the value was such that the downcast would be lossy? If so, a major
> -1
>
> The other option would be to always raise an exception if types would
> cause a downcast, i.e:
>
> arr = np.zeros(shape, dtype-uint8)
>
> arr2 = arr + 30 # this would raise an exception
>
> arr2 = arr + np.uint8(30) # you'd have to do this
>
> That sure would be clear and result if few errors of this type, but
> sure seems verbose and "static language like" to me.
>
> > Long story short, +1 for warning, -1 for exception, and +1 for a
> > config flag that allows one to change to exceptions by default, if
> > desired.
>
> is this for value-dependent or any casting of this sort?


What I had in mind here is the situation where the scalar's dtype is
fundamentally different from the array's dtype (i.e. float vs int, complex
vs float) and can't be cast exactly into the array's dtype (so,
value-dependent), which is the situation that originated this thread.
I don't mind removing the second part ("and can't be cast exactly...") to
have it value-independent.
Other tricky situations with integer arrays are to some extent related to
how regular (not in-place) additions are handled, something that should
probably be settled first.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-18 Thread Olivier Delalleau
On Friday, January 18, 2013, Chris Barker - NOAA Federal wrote:

> On Thu, Jan 17, 2013 at 5:34 PM, Olivier Delalleau 
> >
> wrote:
> >> Yes, I do understand that.  The difference - as I understand it - is
> >> that back in the day, numeric did not have the the float32 etc
> >> scalars, so you could not do:
> >>
> >> another_array = my_array * np.float32(4.0)
> >>
> >> (please someone correct me if I'm wrong).
>
> correct, it didn't have any scalars, but you could (and had to) still
> do something like:
>
> another_array = my_array * np.array(4.0, dtype=np.float32)
>
> a bit more verbose, but the verbosity wasn't the key issue -- it was
> doing anything special at all.
>
> >>> Casual users will use the defaults that match the Python types anyway.
> >>
> >> I think what we are reading in this thread is that even experienced
> >> numpy users can find the scalar casting rules surprising, and that's a
> >> real problem, it seems to me.
>
> for sure -- but it's still relevant -- if you want non-default types,
> you need to understand the rules an be more careful.
>
> >> The person with a massive float32 array certainly should have the
> >> ability to control upcasting, but I think the default should be the
> >> least surprising thing, and that, it seems to me, is for the casting
> >> rules to be the same for arrays and scalars.   In the very long term.
>
> "A foolish consistency is the hobgoblin of little minds"
>
> -- just kidding.
>
> But in all seriousness -- accidental upcasting really was a big old
> pain back in the day -- we are not making this up. We re using the
> term "least surprising", but I now I was often surprised that I had
> lost my nice compact array.
>
> The user will need to think about it no matter how you slice it.
>
> > Since most of the related issues seem to come from integer arrays, a
> > middle-ground may be the following:
> > - Integer-type arrays get upcasted by scalars as in usual array /
> > array operations.
> > - Float/Complex-type arrays don't get upcasted by scalars except when
> > the scalar is complex and the array is float.
>
> I'm not sure that integer arrays are any more of an an issue, and
> having integer types and float typed behave differently is really
> asking for trouble!


"A foolish consistency is the hobgoblin of little minds" :P

If you check again the examples in this thread exhibiting surprising /
unexpected behavior, you'll notice most of them are with integers.
The tricky thing about integers is that downcasting can dramatically change
your result. With floats, not so much: you get approximation errors
(usually what you want) and the occasional nan / inf creeping in (usually
noticeable).

I too would prefer similar rules between ints & floats, but after all these
discussions I'm starting to think it may be worth acknowledging they are
different beasts.

Anyway, in my mind we were discussing what might be the desired behavior in
the long term, and my suggestion isn't practical in the short term since it
may break a significant amount of code. So I'm still in favor of
Nathaniel's proposal, except with exceptions replaced by warnings by
default (and no warning for lossy downcasting of e.g. float64 ->
float32 except for zero / inf, as discussed at some point in the thread).
-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2013-01-20 Thread Olivier Delalleau
2013/1/18 Matthew Brett :
> Hi,
>
> On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal
>  wrote:
>> On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau  wrote:
>>> Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a écrit :
>>
>>> If you check again the examples in this thread exhibiting surprising /
>>> unexpected behavior, you'll notice most of them are with integers.
>>> The tricky thing about integers is that downcasting can dramatically change
>>> your result. With floats, not so much: you get approximation errors (usually
>>> what you want) and the occasional nan / inf creeping in (usally noticeable).
>>
>> fair enough.
>>
>> However my core argument is that people use non-standard (usually
>> smaller) dtypes for a reason, and it should be hard to accidentally
>> up-cast.
>>
>> This is in contrast with the argument that accidental down-casting can
>> produce incorrect results, and thus it should be hard to accidentally
>> down-cast -- same argument whether the incorrect results are drastic
>> or not
>>
>> It's really a question of which of these we think should be prioritized.
>
> After thinking about it for a while, it seems to me Olivier's
> suggestion is a good one.
>
> The rule becomes the following:
>
> array + scalar casting is the same as array + array casting except
> array + scalar casting does not upcast floating point precision of the
> array.
>
> Am I right (Chris, Perry?) that this deals with almost all your cases?
>  Meaning that it is upcasting of floats that is the main problem, not
> upcasting of (u)ints?
>
> This rule seems to me not very far from the current 1.6 behavior; it
> upcasts more - but the dtype is now predictable.  It's easy to
> explain.  It avoids the obvious errors that the 1.6 rules were trying
> to avoid.  It doesn't seem too far to stretch to make a distinction
> between rules about range (ints) and rules about precision (float,
> complex).
>
> What do you'all think?

Personally, I think the main issue with my suggestion is that it seems
hard to go there from the current behavior -- without potentially
breaking existing code in non-obvious ways. The main problematic case
I foresee is the typical "small_int_array + 1", which would get
upcasted while it wasn't the case before (neither in 1.5 nor in 1.6).
That's why I think Nathaniel's proposal is more practical.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Sources more confusing in Python

2013-04-07 Thread Olivier Delalleau
The Python Package Index (https://pypi.python.org/pypi) is to my knowledge
the largest centralized source of Python packages. That's where
easy_install and pip typically fetch them so that you can install from the
command line without manual download.

-=- Olivier


2013/4/7 Happyman 

> Hello,
>
> I started using python 4-5 months ago. At that time I didn't realize there
> are incredibly many resource like modules, additional programs (ready one)
> in python. The problem is to which one I can get all I want "properly". I
> mean where (exact place) I can download standard modules without going
> other links?? For example, Excel python module, Image processing module,
> something module..Every time I get modules from different links..
>
> Is there exact place (stable) to get simply rather than picking/jumping
> from one to another site??
>
> Any answer is appreciated
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Sources more confusing in Python

2013-04-07 Thread Olivier Delalleau
2013/4/7 

> On Sun, Apr 7, 2013 at 5:34 PM, Steve Waterbury 
> wrote:
> > On 04/07/2013 05:30 PM, Nathaniel Smith wrote:
> >> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury
> >>  wrote:
> >>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote:
>  On Sun, Apr 7, 2013 at 8:06 AM, Daπid  wrote:
> > On 7 April 2013 16:53, Happyman  wrote:
> 
> > $pip install numpy # to install package "numpy"
> 
>  as a warning, last I checked pip did not support binary installs  ...
> >>>
> >>> Guess you didn't check very recently ;) -- pip does indeed
> >>> support binary installs.
> >>
> >> Binary install in this case means, downloading a pre-built package
> >> containing .so/.dll files -- very useful if you don't have a working C
> >> compiler environment on the system you're installing onto.
> >
> > Point taken -- just didn't want pip to be sold short.
> > I'm one of those spoiled Linux people, obviously ... ;)
>
> However, pip is really awful on Windows.
>
> If you have a virtualenv and you use --upgrade, it wants to upgrade all
> package dependencies (!), but it doesn't know how (with numpy and scipy).
>
> (easy_install was so much nicer.)
>
> Josef
>

You can use --no-deps to prevent pip from trying to upgrade dependencies.
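For instance, to upgrade only numpy itself inside the virtualenv:

pip install --upgrade --no-deps numpy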

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Inconsistent output dtype in mixed scalar-array operations

2011-05-10 Thread Olivier Delalleau
Hi,

I opened a ticket about this (http://projects.scipy.org/numpy/ticket/1827),
but thought I'd also ask on the mailing list whether this is working as
intended.

It looks like the resulting dtype of a "+" operation between a scalar and an
array can depend on the order of the arguments. This seems wrong.

Tested with numpy 1.5.1 under Ubuntu 11.04 (64-bit).

In [4]: (numpy.array(0, dtype='uint16') + numpy.array([1],
dtype='int8')).dtype
Out[4]: dtype('int16')

In [6]: (numpy.array([1], dtype='int8') + numpy.array(0,
dtype='uint16')).dtype
Out[6]: dtype('int8')

Thanks for any feedback,

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Inconsistent output dtype in mixed scalar-array operations

2011-05-10 Thread Olivier Delalleau
2011/5/10 Charles R Harris 

>
>
> On Tue, May 10, 2011 at 11:16 AM, Olivier Delalleau  wrote:
>
>> Hi,
>>
>> I opened a ticket about this (http://projects.scipy.org/numpy/ticket/1827
>> ),
>> but thought I'd also ask on the mailing list whether this is working as
>> intended.
>>
>> It looks like the resulting dtype of a "+" operation between a scalar and
>> an
>> array can depend on the order of the arguments. This seems wrong.
>>
>> Tested with numpy 1.5.1 under Ubuntu 11.04 (64-bit).
>>
>> In [4]: (numpy.array(0, dtype='uint16') + numpy.array([1],
>> dtype='int8')).dtype
>> Out[4]: dtype('int16')
>>
>> In [6]: (numpy.array([1], dtype='int8') + numpy.array(0,
>> dtype='uint16')).dtype
>> Out[6]: dtype('int8')
>>
>> Thanks for any feedback,
>>
>>
> This has been fixed in the coming 1.6.0 release.
>
> In [2]: (numpy.array(0, dtype='uint16') +
> numpy.array([1],dtype='int8')).dtype
> Out[2]: dtype('int8')
>
> In [3]: (numpy.array([1], dtype='int8') +
> numpy.array(0,dtype='uint16')).dtype
> Out[3]: dtype('int8')
>
> Chuck
>

Cool, thanks, I'll close the ticket then :)

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Mapping of dtype to C types

2011-05-11 Thread Olivier Delalleau
2011/5/11 Sturla Molden 

> Den 09.05.2011 15:58, skrev Keith Goodman:
> > On Mon, May 9, 2011 at 1:46 AM, Pauli Virtanen  wrote:
> >> Sun, 08 May 2011 14:45:45 -0700, Keith Goodman wrote:
> >>> I'm writing a function that accepts four possible dtypes: int32, int64,
> >>> float32, float64. The function will call a C extension (wrapped in
> >>> Cython). What are the equivalent C types? int, long, float, double,
> >>> respectively? Will that work on all systems?
> >> Long can be 32-bit or 64-bit, depending on the platform.
> >>
> >> The types available in Numpy are listed here, including
> >> the information which of them are compatible with which C types:
> >>
> >> http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html
> >>
> >> The C long seems not to be listed -- but it's the same as
> >> "Python int", i.e., np.int_ will work.
> >>
> >> IIRC, the Numpy type codes, dtype.kind, map directly to C types.
> > Does this mapping look right?
> >
>
> No.
>
> C int is at least 16 bits, C long is at least 32 bits.
>
> The size of long and size of int depend on compiler and platform.
>
> I'd write a small Cython function and ask the C compiler for an
> authorative answer.
>
> Alternatively you could use ctypes. This is 64-bit Python on Windows:
>
>  >>> import ctypes
>  >>> ctypes.sizeof(ctypes.c_long)
> 4
>  >>> ctypes.sizeof(ctypes.c_int)
> 4
>
>
I think Keith's approach should work, as long as there is a C type listed
at the URL you mention for each of your four dtypes.
Something like (not tested at all):

import numpy as N

ctype_map = {}
for dtype in ('int32', 'int64', 'float32', 'float64'):
    # list here all the compatible C-derived types you want to consider
    for ctype in (N.byte, N.short, N.intc, N.int_, N.longlong,
                  N.single, N.double):
        if N.dtype(ctype) == dtype:
            ctype_map[dtype] = ctype
            break
assert len(ctype_map) == 4

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] cPickle.load says 'invalid signature'

2011-05-18 Thread Olivier Delalleau
It's a wild guess, but try saving your pickle with 'wb' as the mode argument
to open(), and with protocol=-1. Then open it with 'rb'. It helped me fix some
cross-platform issues in the past.
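
A minimal sketch of what I mean (file name and saved object are just
placeholders):

import cPickle
import numpy

obj = numpy.arange(10)             # whatever you need to save

f = open('data.pkl', 'wb')         # note the 'b': write in binary mode
cPickle.dump(obj, f, protocol=-1)  # -1 = highest available pickle protocol
f.close()

f = open('data.pkl', 'rb')         # read it back in binary mode too
obj2 = cPickle.load(f)
f.close()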

-=- Olivier

2011/5/18 Neal Becker 

> The file is pickle saved on i386 and loaded on x86_64.  It contains a numpy
> array (amoungst other things).
>
> On load it says:
>
> RuntimeError: invalid signature
>
> Is binary format not portable?
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How can Import DATA from a Fortran file

2011-05-18 Thread Olivier Delalleau
Seems like something like the following should work (it's not particularly
efficient nor elegant though, and there may be some stupid bug since I
didn't actually try it). Also, there may be a more appropriate numpy
structure for handling named columns (the example below separately computes a
basic numpy array and a map from each column's name to its index).

import numpy

def to_float(x):
    try:
        return float(x)
    except ValueError:
        assert x == "POS-VELOCITY"
        return numpy.nan

data = []
for line_idx, line in enumerate(open("data.txt").readlines()):
    if line_idx == 0:
        # header: map each column name to its column index
        field = dict((field_name, col_idx)
                     for col_idx, field_name in enumerate(line.split()))
    else:
        # data line
        data.append(map(to_float, line.split()))
data = numpy.array(data)
# Then to get a single column...
col_X = data[:, field['X']]

-=- Olivier


2011/5/18 Aradenatorix Veckhom Vacelaevus 

> Hi everybody:
>
> I have a file in simple text with information obtained in Fortran  77 and I
> need to use the data inside for visualize with Mayavi. I was fighting for a
> while with the VTK simple legacy format. Finally I could run an small
> example, but now I need to select specific information from that file. Let
> me explain how is it.
>
> I have a header of two lines where tell me the number of domains I have,
> and later the values of the position and pressure
>
> Later for each domain I have two arrays with 125 rows, the first one has
> another header with two lines where we can find the name of each variable
> (column) and the dimension of the array for each coordinate (x, y, z). Later
> we have the array (125 x 8) where the first column is the number or ID for
> each point (row) the next three contains the coordinates for the point pnx,
> pny, pnz the following three the displacement Dnx, Dny, Dnz finally the last
> column contains the value of the pressure field in each point.
>
> The second array has only one line as header where specifies the number of
> points (rows) and the names of each variable (column). This array has the
> dimension 125 x 11, as in the first array the first column has the number of
> row, the following columns contain the values of the alpha elements used for
> the Finite Element Analysis from where I have to find the velocity (vx = 0.5
> (alpha1 + alpha2), vy = 0.5 (alpha3 + alpha4), vz = 0.5 (alpha5 + alpha6))
> finally the last column is a string that says pos-velocity so we can forget
> it.
>
> In an schematic form:
>
> Header:
>
> 5 NUMBER OF SUBDOMAINS
>
>   4852108.55805672200   4.858791656580212E+008  POSITION-PRESSURE
>
>
> First array
>
>  10.00 10.00 10.00 LX, LY, LZ, ... #header
>
>  5 5 5  LS,MS,NS, - NEE,X,Y,Z,DX,DY,DZ,LAMBDA
> (NEE) #header
>
> 1  .000E+00  .000E+00  .000E+00  .200E+01  .200E+01  .200E+01
> 485879165.658 #125 rows with 8 columns
>
> 2  .200E+01  .000E+00  .000E+00  .200E+01  .200E+01  .200E+01
> 362994604.232
> 3  .400E+01  .000E+00  .000E+00  .200E+01  .200E+01  .200E+01
> 287889668.714
> 4  .600E+01  .000E+00  .000E+00  .200E+01  .200E+01  .200E+01
> 249984468.929
> 5  .800E+01  .000E+00  .000E+00  .200E+01  .200E+01  .200E+01
> 224851296.708
>
> .
>
> .
>
> .
>
> 125  .800E+01  .800E+01  .800E+01  .200E+01  .200E+01  .200E+01
> 192572200.800
>
>
> Second array:
>
> 125 L, X, Y, Z, ALPHA(I1), ALPHA(I2), ALPHA(J1), ALPHA(J2), ALPHA(K1),
> ALPHA(K2), ... #header
>  1 1.000 1.000 1.000  .000E+00  .845E-04  .000E+00
>  .826E-04  .000E+00  .828E-04  POS-VELOCITY
>  2 3.000 1.000 1.000  .845E-04  .308E-04  .000E+00
>  .267E-04  .000E+00  .269E-04  POS-VELOCITY
>  3 5.000 1.000 1.000  .308E-04  .177E-04  .000E+00
>  .633E-05  .000E+00  .666E-05  POS-VELOCITY
>  4 7.000 1.000 1.000  .177E-04  .122E-04  .000E+00
>  .246E-05  .000E+00  .297E-05  POS-VELOCITY
>  5 9.000 1.000 1.000  .122E-04  .908E-05  .000E+00
>  .114E-05  .000E+00  .183E-05  POS-VELOCITY
>
> .
> .
> .
>
>  125 9.000 9.000 9.000  .102E-04  .160E-04  .133E-05  .000E+00
>  .457E-05  .000E+00  POS-VELOCITY
>
>
> And both arrays repeat other 4 times, it means I have 5 pairs of arrays, a
> pair for each domain.
>
> I want to divide the file in five pieces one for each domain, but what I
> really need is can manipulate the arrays by columns. I know that numpy is
> able to import files in many formats, and I want to believe that inside
> numpy I can easily manipulate an array by columns, but I don't how, so all
> this explanation is for ask your helping and find a way to can get from this
> arrays the information I need to write a new file with the info for export
> as vtk file and can visualize in Mayavi2 (do you have a better idea or way
> for visualize this?). Thanks for your helping and your time.
>
> Aradnix!
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] How can Import DATA from a Fortran file

2011-05-20 Thread Olivier Delalleau
StringIO(data_file_name) will not read your file; it will just let you
read the string data_file_name itself. You can probably just use
open(data_file_name) instead (and yes, you'll probably need to open it for
each call to genfromtxt).
Sorry my script didn't work; I didn't expect it to work right
away, it was more to give you an idea of the kind of thing you can do
manually. I'd expect it to be pretty straightforward to fix. I
personally find it easier to write my own custom code to load such small
data files, rather than try to find the right parameters for an existing
function ;) (especially with mixed float / string data).
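
If you do want to go the genfromtxt route, a sketch for one of your sub-files
(the file name and column choice are just placeholders):

import numpy as np

# skip the two header lines and keep only the coordinate columns 1, 2, 3
coords = np.genfromtxt('domain1_array1.txt', skip_header=2,
                       usecols=(1, 2, 3), autostrip=True)
print(coords.shape)   # should be (125, 3) for one of your first-type arrays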

-=- Olivier

2011/5/19 Aradenatorix Veckhom Vacelaevus 

> Thanks both for your helping:
>
> I was checking the script fro Olivier but doesn't works yet, later I tried
> with the asciitable package but I also never could read any of my files
> (finally I decide create a single file for each case, it means to get 2
> arrays for each domain).
>
> I was reading a while about how to import data with genfromtxt in Numpy, it
> was very interesting because I have a lot of options for import data, I can
> skip headers and footers, I can import selected columns and other stuff, so
> I wanna use it for import my columns in each case and later get an output in
> a text file, how can I get it?
>
> My idea is to make something like:
>
> #!/usr/bin/env python
> #-*- coding:utf-8 -*-
>
> import numpy as np
> from StringIO import StringIO
>
> np.genfromtxt(StringIO(data_file_name), autostrip=True, skip_header=2,
> usecols=(1, 2, 3))
>
> np.genfromtxt(StringIO(data_file_name), autostrip=True, skip_header=2,
> usecols=(4, 5, 6))
>
> np.genfromtxt(StringIO(data_file_name), autostrip=True, skip_header=2,
> usecols=(-1))
>
> But I'm not sure how to use it, I have not experience importing that stuff,
> but also I don't know If I need to add a line at the beginning with
>
> data = StringIO(data_file_name)
>
> Could you help me once more?
>
> Regards:
> Aradnix
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How can Import DATA from a Fortran file

2011-05-21 Thread Olivier Delalleau
There are free file hosting services you can google up.

-=- Olivier

2011/5/20 Aradenatorix Veckhom Vacelaevus 

> Hi again:
>
> Thanks Olivier, I understood you only wanna show me a way for solve my
> problem. I'm reading about the options in python for files and stuff. I
> think is better to write a code for solve our problems, but I found very
> useful the genfromtxt tool so I want to use it if I can't I have to parse my
> file with a lot of loops.
>
> And I agree with Bruce also about the complexity of my original data file
> so I divided the original file in ten files each one with an array for
> simplify the task, now I need to see how to open each file for can read it
> and obtain what I need from.
>
> I think I need to make two scripts one for the first type arrays and the
> second for the others where maybe I need to operate over the columns, and I
> think I really need your helping, but how to provide the actual file? is a
> little big for attach here. Thanks again for all your helping and advices.
>
> Regards:
> Aradnix
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] breaking array indices

2011-06-02 Thread Olivier Delalleau
I think this does what you want:

import numpy

def seq_split(x):
    # positions where the sequence breaks (x[i] != x[i-1] + 1), plus both ends
    r = [0] + list(numpy.where(x[1:] != x[:-1] + 1)[0] + 1) + [None]
    return [x[r[i]:r[i + 1]] for i in xrange(len(r) - 1)]
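
For what it's worth, numpy.split can do the slicing part as well; a sketch of
the same idea on your example:

import numpy

x = numpy.array([1, 2, 3, 10, 11])
breaks = numpy.where(numpy.diff(x) != 1)[0] + 1
print(numpy.split(x, breaks))   # [array([1, 2, 3]), array([10, 11])]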

-=- Olivier

2011/6/2 Mathew Yeates 

> Hi
> I have indices into an array I'd like split so they are sequential
> e.g.
> [1,2,3,10,11] -> [1,2,3],[10,11]
>
> How do I do this?
>
> -Mathew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy input with genfromttxt()

2011-06-03 Thread Olivier Delalleau
Here's an ugly one-liner:

numpy.genfromtxt('data.txt', converters=dict(
    [k, lambda x: float(x.replace(',', '.'))]
    for k in range(len(open('data.txt').readline().strip().split()))))
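
A slightly less cryptic version of the same idea ('data.txt' is just the
example file name):

import numpy

ncols = len(open('data.txt').readline().split())
conv = dict.fromkeys(range(ncols), lambda s: float(s.replace(',', '.')))
data = numpy.genfromtxt('data.txt', converters=conv)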

-=- Olivier

2011/6/3 jgrub 

>
> Hello, im actually try to read in  data with genfromtxt(),
> i want to read in numbers which are stored in  a textfile like this:
>
> 0,000,0012210,0012210,001,278076
>  160,102539
>
> 4,00E-7 0,000,000,0024411,279297
>  160,00
>
> 8,00E-7 -0,001221   0,000,0012211,279297
>  159,897461
>
> 1,20E-6 0,000,000,0012211,279297
>  160,00
>
> 1,60E-6 -0,001221   0,000,0036621,278076
>  159,897461
>
> 2,00E-6 0,00-0,001221   0,0036621,279297
>  160,00
>
> my problem is that they are seperated with a comma so when i try to read
> them
> i just get a numpy array with NaN's.  is there a short way to replace the
> "," with "."  ?
>
> --
> View this message in context:
> http://old.nabble.com/numpy-input-with-genfromttxt%28%29-tp31757790p31757790.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy input with genfromttxt()

2011-06-03 Thread Olivier Delalleau
2011/6/3 Bruce Southey 

> On 06/03/2011 10:33 AM, jonasr wrote:
> >
> >
> > Bruce Southey wrote:
> >> On 06/03/2011 09:33 AM, jonasr wrote:
> >>> thank you very much, works much nicer and faster in comparison to the
> >>> script
> >>> i wrote and used before  ,
> >>> im not that much used to lambda forms but it seems quit usefull in
> >>> situations like this
> >>>
> >>>
> >>> Olivier Delalleau-2 wrote:
> >>>> Here's an ugly one-liner:
> >>>>
> >>>> numpy.genfromtxt('data.txt', converters=dict([k, lambda x:
> >>>> float(x.replace(',', '.'))] for k in
> >>>> range(len(open('data.txt').readline().strip().split()
> >>>>
> >>>> -=- Olivier
> >>>>
> >>>> 2011/6/3 jgrub
> >>>>
> >>>>> Hello, im actually try to read in  data with genfromtxt(),
> >>>>> i want to read in numbers which are stored in  a textfile like this:
> >>>>>
> >>>>> 0,000,0012210,0012210,00
> >>>>> 1,278076
> >>>>>160,102539
> >>>>>
> >>>>> 4,00E-7 0,000,000,002441
> >>>>> 1,279297
> >>>>>160,00
> >>>>>
> >>>>> 8,00E-7 -0,001221   0,000,001221
> >>>>> 1,279297
> >>>>>159,897461
> >>>>>
> >>>>> 1,20E-6 0,000,000,001221
> >>>>> 1,279297
> >>>>>160,00
> >>>>>
> >>>>> 1,60E-6 -0,001221   0,000,003662
> >>>>> 1,278076
> >>>>>159,897461
> >>>>>
> >>>>> 2,00E-6 0,00-0,001221   0,003662
> >>>>> 1,279297
> >>>>>160,00
> >>>>>
> >>>>> my problem is that they are seperated with a comma so when i try to
> >>>>> read
> >>>>> them
> >>>>> i just get a numpy array with NaN's.  is there a short way to replace
> >>>>> the
> >>>>> "," with "."  ?
> >>>>>
> >>>>> --
> >>>>> View this message in context:
> >>>>>
> http://old.nabble.com/numpy-input-with-genfromttxt%28%29-tp31757790p31757790.html
> >>>>> Sent from the Numpy-discussion mailing list archive at Nabble.com.
> >>>>>
> >>>>> ___
> >>>>> NumPy-Discussion mailing list
> >>>>> NumPy-Discussion@scipy.org
> >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>>>
> >>>> ___
> >>>> NumPy-Discussion mailing list
> >>>> NumPy-Discussion@scipy.org
> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>>
> >>>>
> >> Isn't this just because of the 'locale' settings?
> >> A quick search showed  ticket 884 that has code changing the locale that
> >> may be useful for you:
> >> http://projects.scipy.org/numpy/ticket/884
> >>
> >> Perhaps a similar bug exists with genfromtxt?
> >>
> >> If it is nicely behaved, just use Python's csv module.
> >>   From the csv documentation:
> >> "Since open()
> >> <http://docs.python.org/release/3.1.3/library/functions.html#open>  is
> >> used to open a CSV file for reading, the file will by default be decoded
> >> into unicode using the system default encoding (see
> >> locale.getpreferredencoding()
> >> <
> http://docs.python.org/release/3.1.3/library/locale.html#locale.getpreferredencoding
> >).
> >> To decode a file using a different encoding, use the encoding argument
> >> of open:"
> >>
> >>
> >> Bruce
> >>
> >>
> >>
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >>
> > i checked my local settings and got
> >
> > In [37]: locale.getlocale()
> > Out[37]: ('de_DE', 'UTF8')
> >
> > when i try
> >
> > locale.setlocale(locale.LC_NUMERIC, 'de_DE')
> >
> > i get: Error: unsupported locale setting
> > same with 'fi_FI'
> > any idea ?
> >
> >
> >
> >
> You don't seem to be using Python.
> The 'In [37]' suggest ipython so you have to find how ipython does this.
>
> However, I am not sure that changing locale will help here but I am
> rather naive on this. If it doesn't then file a ticket with a clear
> example and ideally with a patch (one can hope).
>
> Bruce
>

It should work just the same in IPython; it works fine for me in IPython with
Python 2.5.1.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] k maximal elements

2011-06-06 Thread Olivier Delalleau
I don't really understand your proposed solution, but you can do something
like:

import heapq
q = list(x)
heapq.heapify(q)
k_smallest = [heapq.heappop(q) for i in xrange(k)]

which is in O(n + k log n)
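
Alternatively, heapq has helpers that do exactly this:

import heapq

x = [5, 1, 9, 3, 7]   # example data
k = 2
print(heapq.nsmallest(k, x))   # [1, 3]
print(heapq.nlargest(k, x))    # [9, 7]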

-=- Olivier

2011/6/6 Alex Ter-Sarkissov 

> I have a vector of positive integers length n. Is there a simple (i.e.
> without sorting/ranking) of 'pulling out' k larrgest (or smallest) values.
> Something like
>
> *sum(x[sum(x,1)>(max(sum(x,1)+min(sum(x,1/2,])*
>
> but smarter
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy function error

2011-06-10 Thread Olivier Delalleau
You are overriding your f1 function with a float (with
"f1=f1(self.ptdata[i])"), so trying to call f1(xul) later will raise this
exception.
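
A minimal illustration of the problem (names are made up):

def f1(x):
    return 2.0 * x

f1 = f1(3.0)   # f1 is now the float 6.0; the function object is gone
f1(1.0)        # TypeError: 'float' object is not callable

Simply use a different name for the computed value, e.g.
f1_val = f1(self.ptdata[i]).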

-=- Olivier

2011/6/10 jonasr 

>
> Hello,
> i have the following problem, the following code doesnt work
>
> def f1(x): return self.lgdata[2*i][0]*float(x)+self.lgdata[2*i][1]
> def f2(x): return self.lgdata[2*i+1][0]*float(x)+self.lgdata[2*i+1][1]
> f1=f1(self.ptdata[i])
> 2=f2(self.ptdata[i])
> t=abs(f1-f2)
> deltat.append(t)
> temptot.append(f1)
> rezipr.append(1/t)
> xul , xur = self.ptdata[i]-0.001, self.ptdata[i]+0.001
> print xul,xur,f1,f2,"test \n"
> verts =[(xul,f1(xul)),(xul,f2(xul)),(xur,f2(xur)),(xur,f1(xur))]
>
> it gives me the following error:
>
> Traceback (most recent call last):
>  File
> "/usr/lib/python2.7/site-packages/matplotlib/backends/backend_gtk.py", line
> 265, in key_press_event
>FigureCanvasBase.key_press_event(self, key, guiEvent=event)
>  File "/usr/lib/python2.7/site-packages/matplotlib/backend_bases.py", line
> 1523, in key_press_event
>self.callbacks.process(s, event)
>  File "/usr/lib/python2.7/site-packages/matplotlib/cbook.py", line 265, in
> process
>proxy(*args, **kwargs)
>  File "/usr/lib/python2.7/site-packages/matplotlib/cbook.py", line 191, in
> __call__
>return mtd(*args, **kwargs)
>  File "auswertung.py", line 103, in __call__
>verts =[(xul,f1(xul)),(xul,f2(xul)),(xur,f2(xur)),(xur,f1(xur))]
> TypeError: 'numpy.float64' object is not callable
>
> have no idea where the problem is, since ptdata is an 1dim array there
> should be no problem to
> pass it to a function ?
>
>
>
> --
> View this message in context:
> http://old.nabble.com/numpy-function-error-tp31817481p31817481.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication

2011-06-10 Thread Olivier Delalleau
It may not work for you depending on your specific problem constraints, but
if you could flatten the arrays, each inner product would become a simple dot
product, and you could maybe compute many of them at once by stacking the
flattened arrays into a matrix.
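
A rough sketch of what I mean (shapes and the number of arrays are made up):

import numpy as np

arrays1 = [np.random.rand(4, 5, 6) for _ in range(10)]
arrays2 = [np.random.rand(4, 5, 6) for _ in range(10)]

A = np.array([a.ravel() for a in arrays1])   # shape (10, 120)
B = np.array([b.ravel() for b in arrays2])   # shape (10, 120)

# inner products of matching pairs, equivalent to [np.sum(a * b) ...]:
inner = (A * B).sum(axis=1)

# or all pairwise inner products in one BLAS call:
gram = np.dot(A, B.T)   # gram[i, j] == np.sum(arrays1[i] * arrays2[j])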

-=- Olivier

2011/6/10 Brandt Belson 

> Hi,
> Thanks for getting back to me.
> I'm doing element wise multiplication, basically innerProduct =
> numpy.sum(array1*array2) where array1 and array2 are, in general,
> multidimensional. I need to do many of these operations, and I'd like to
> split up the tasks between the different cores. I'm not using numpy.dot, if
> I'm not mistaken I don't think that would do what I need.
> Thanks again,
> Brandt
>
>
> Message: 1
>> Date: Thu, 09 Jun 2011 13:11:40 -0700
>> From: Christopher Barker 
>> Subject: Re: [Numpy-discussion] Using multiprocessing (shared memory)
>>with numpy array multiplication
>> To: Discussion of Numerical Python 
>> Message-ID: <4df128fc.8000...@noaa.gov>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Not much time, here, but since you got no replies earlier:
>>
>>
>> >  > I'm parallelizing some code I've written using the built in
>> > multiprocessing
>> >  > module. In my application, I need to multiply many large arrays
>> > together
>>
>> is the matrix multiplication, or element-wise? If matrix, then numpy
>> should be using LAPACK, which, depending on how its built, could be
>> using all your cores already. This is heavily dependent on your your
>> numpy (really the LAPACK it uses0 is built.
>>
>> >  > and
>> >  > sum the resulting product arrays (inner products).
>>
>> are you using numpy.dot() for that? If so, then the above applies to
>> that as well.
>>
>> I know I could look at your code to answer these questions, but I
>> thought this might help.
>>
>> -Chris
>>
>>
>>
>>
>>
>> --
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R(206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115   (206) 526-6317   main reception
>>
>> chris.bar...@noaa.gov
>>
>>
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
It's ok to have two different dtypes (in the sense that "d1 is not d2") such
that they represent the same kind of data (in the sense that d1 == d2).

However I think your very first test should have returned True (for what
it's worth, it returns True with 1.5.1 on Windows 32 bit).

-=- Olivier

2011/6/10 Benjamin Root 

> Came across an odd error while using numpy master.  Note, my system is
> 32-bits.
>
> >>> import numpy as np
> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
> False
> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
> True
> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
> True
> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
> True
>
> So, only the summation performed with a np.int32 accumulator results in a
> type that doesn't match the expected type.  Now, for even more strangeness:
>
> >>> type(np.sum([1, 2, 3], dtype=np.int32))
> 
> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
> '0x9599a0'
> >>> hex(id(np.int32))
> '0x959a80'
>
> So, the type from the sum() reports itself as a numpy int, but its memory
> address is different from the memory address for np.int32.
>
> Weirdness...
> Ben Root
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
2011/6/10 Benjamin Root 

>
>
> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root  wrote:
>>
>>> Came across an odd error while using numpy master.  Note, my system is
>>> 32-bits.
>>>
>>> >>> import numpy as np
>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>> False
>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>> True
>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>> True
>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>> True
>>>
>>> So, only the summation performed with a np.int32 accumulator results in a
>>> type that doesn't match the expected type.  Now, for even more strangeness:
>>>
>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>> 
>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>>> '0x9599a0'
>>> >>> hex(id(np.int32))
>>> '0x959a80'
>>>
>>> So, the type from the sum() reports itself as a numpy int, but its memory
>>> address is different from the memory address for np.int32.
>>>
>>>
>> One of them is probably a long, print out the typecode, dtype.char.
>>
>> Chuck
>>
>>
>>
> Good intuition, but odd result...
>
> >>> import numpy as np
> >>> a = np.sum([1, 2, 3], dtype=np.int32)
> >>> b = np.int32(6)
> >>> type(a)
> 
> >>> type(b)
> 
> >>> a.dtype.char
> 'i'
> >>> b.dtype.char
> 'l'
>
> So, the standard np.int32 is getting listed as a long somehow?  To further
> investigate:
>
> >>> a.dtype.itemsize
> 4
> >>> b.dtype.itemsize
> 4
>
> So, at least the sizes are right.
>
> Ben Root
>

A C long on a 32-bit computer is indeed int32.

I think your issue is that in your version of numpy, numpy.dtype('i') !=
numpy.dtype('l') (while they are equal e.g. in Numpy 1.5.1).
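
To illustrate what I mean on a 32-bit box:

import numpy as np
print(np.dtype('i') == np.dtype('l'))  # True on 1.5.1; apparently False for you
print(np.dtype('i').itemsize == np.dtype('l').itemsize)  # True: both are 4 bytes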

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
2011/6/10 Charles R Harris 

>
>
> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root  wrote:
>
>>
>>
>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root  wrote:
>>>


 On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
 charlesr.har...@gmail.com> wrote:

>
>
> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root wrote:
>
>> Came across an odd error while using numpy master.  Note, my system is
>> 32-bits.
>>
>> >>> import numpy as np
>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>> False
>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>> True
>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>> True
>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>> True
>>
>> So, only the summation performed with a np.int32 accumulator results
>> in a type that doesn't match the expected type.  Now, for even more
>> strangeness:
>>
>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>> 
>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>> '0x9599a0'
>> >>> hex(id(np.int32))
>> '0x959a80'
>>
>> So, the type from the sum() reports itself as a numpy int, but its
>> memory address is different from the memory address for np.int32.
>>
>>
> One of them is probably a long, print out the typecode, dtype.char.
>
> Chuck
>
>
>
 Good intuition, but odd result...

 >>> import numpy as np
 >>> a = np.sum([1, 2, 3], dtype=np.int32)
 >>> b = np.int32(6)
 >>> type(a)
 
 >>> type(b)
 
 >>> a.dtype.char
 'i'
 >>> b.dtype.char
 'l'

 So, the standard np.int32 is getting listed as a long somehow?  To
 further investigate:


>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's 64 bits.
>>> On 32 bit systems it is 32 bits.
>>>
>>> Chuck
>>>
>>>
>> Right, that makes sense.  But, the question is why does sum() put out a
>> result dtype that is not identical to the dtype that I requested, or even
>> the dtype of the input array?  Could this be an indication of a bug
>> somewhere?  Even if the bug is harmless (it was only noticed within the test
>> suite of larry), is this unexpected?
>>
>>
> I expect sum is using a ufunc and it acts differently on account of the
> cleanup of the ufunc casting rules. And yes, a long *is* int32 on your
> machine. On mine
>
> In [4]: dtype('q') # long long
> Out[4]: dtype('int64')
>
> In [5]: dtype('l') # long
> Out[5]: dtype('int64')
>
> The mapping from C types to numpy width types isn't 1-1. Personally, I
> think we should drop long ;) But it used to be the standard Python type in
> the C API. Mark has also pointed out the problems/confusion this ambiguity
> causes and someday we should probably think it out and fix it. But I don't
> think it is the most pressing problem.
>
> Chuck
>
>
But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit
computer where both are int32?

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
2011/6/10 Charles R Harris 

>
>
> On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau  wrote:
>
>> 2011/6/10 Charles R Harris 
>>
>>>
>>>
>>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root  wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>>>> charlesr.har...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
>>>>>> charlesr.har...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root wrote:
>>>>>>>
>>>>>>>> Came across an odd error while using numpy master.  Note, my system
>>>>>>>> is 32-bits.
>>>>>>>>
>>>>>>>> >>> import numpy as np
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>>> False
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>>> True
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>>> True
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>>> True
>>>>>>>>
>>>>>>>> So, only the summation performed with a np.int32 accumulator results
>>>>>>>> in a type that doesn't match the expected type.  Now, for even more
>>>>>>>> strangeness:
>>>>>>>>
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>>> 
>>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>>>>>>>> '0x9599a0'
>>>>>>>> >>> hex(id(np.int32))
>>>>>>>> '0x959a80'
>>>>>>>>
>>>>>>>> So, the type from the sum() reports itself as a numpy int, but its
>>>>>>>> memory address is different from the memory address for np.int32.
>>>>>>>>
>>>>>>>>
>>>>>>> One of them is probably a long, print out the typecode, dtype.char.
>>>>>>>
>>>>>>> Chuck
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Good intuition, but odd result...
>>>>>>
>>>>>> >>> import numpy as np
>>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>>> >>> b = np.int32(6)
>>>>>> >>> type(a)
>>>>>> 
>>>>>> >>> type(b)
>>>>>> 
>>>>>> >>> a.dtype.char
>>>>>> 'i'
>>>>>> >>> b.dtype.char
>>>>>> 'l'
>>>>>>
>>>>>> So, the standard np.int32 is getting listed as a long somehow?  To
>>>>>> further investigate:
>>>>>>
>>>>>>
>>>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's 64 
>>>>> bits.
>>>>> On 32 bit systems it is 32 bits.
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>> Right, that makes sense.  But, the question is why does sum() put out a
>>>> result dtype that is not identical to the dtype that I requested, or even
>>>> the dtype of the input array?  Could this be an indication of a bug
>>>> somewhere?  Even if the bug is harmless (it was only noticed within the 
>>>> test
>>>> suite of larry), is this unexpected?
>>>>
>>>>
>>> I expect sum is using a ufunc and it acts differently on account of the
>>> cleanup of the ufunc casting rules. And yes, a long *is* int32 on your
>>> machine. On mine
>>>
>>> In [4]: dtype('q') # long long
>>> Out[4]: dtype('int64')
>>>
>>> In [5]: dtype('l') # long
>>> Out[5]: dtype('int64')
>>>
>>> The mapping from C types to numpy width types isn't 1-1. Personally, I
>>> think we should drop long ;) But it used to be the standard Python type in
>>> the C API. Mark has also pointed out the problems/confusion this ambiguity
>>> causes and someday we should probably think it out and fix it. But I don't
>>> think it is the most pressing problem.
>>>
>>> Chuck
>>>
>>>
>> But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit
>> computer where both are int32?
>>
>>
> Maybe yes, maybe no ;) They have different descriptors, so from numpy's
> perspective they are different, but at the hardware/precision level they are
> the same. It's more of a decision as to what  != means in this case. Since
> numpy started as Numeric with only the c types the current behavior is
> consistent, but that doesn't mean it shouldn't change at some point.
>
> Chuck
>

Well, apparently it was actually changed recently, since in NumPy 1.5.1 on a
32-bit Windows machine they are considered equal with '=='.
Personally, I think that if the string representation of two dtypes is "int32",
then they should compare equal; otherwise it wouldn't make much sense that you
can directly test the equality of a dtype against a string like "int32" (both
dtype('i') == "int32" and dtype('l') == "int32" hold).

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
2011/6/10 Olivier Delalleau 

> 2011/6/10 Charles R Harris 
>
>>
>>
>> On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau  wrote:
>>
>>> 2011/6/10 Charles R Harris 
>>>
>>>>
>>>>
>>>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>>>>> charlesr.har...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
>>>>>>> charlesr.har...@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root wrote:
>>>>>>>>
>>>>>>>>> Came across an odd error while using numpy master.  Note, my system
>>>>>>>>> is 32-bits.
>>>>>>>>>
>>>>>>>>> >>> import numpy as np
>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>>>> False
>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>>>> True
>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>>>> True
>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>>>> True
>>>>>>>>>
>>>>>>>>> So, only the summation performed with a np.int32 accumulator
>>>>>>>>> results in a type that doesn't match the expected type.  Now, for 
>>>>>>>>> even more
>>>>>>>>> strangeness:
>>>>>>>>>
>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>>>> 
>>>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>>>>>>>>> '0x9599a0'
>>>>>>>>> >>> hex(id(np.int32))
>>>>>>>>> '0x959a80'
>>>>>>>>>
>>>>>>>>> So, the type from the sum() reports itself as a numpy int, but its
>>>>>>>>> memory address is different from the memory address for np.int32.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> One of them is probably a long, print out the typecode, dtype.char.
>>>>>>>>
>>>>>>>> Chuck
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Good intuition, but odd result...
>>>>>>>
>>>>>>> >>> import numpy as np
>>>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>>>> >>> b = np.int32(6)
>>>>>>> >>> type(a)
>>>>>>> 
>>>>>>> >>> type(b)
>>>>>>> 
>>>>>>> >>> a.dtype.char
>>>>>>> 'i'
>>>>>>> >>> b.dtype.char
>>>>>>> 'l'
>>>>>>>
>>>>>>> So, the standard np.int32 is getting listed as a long somehow?  To
>>>>>>> further investigate:
>>>>>>>
>>>>>>>
>>>>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>>>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's 64 
>>>>>> bits.
>>>>>> On 32 bit systems it is 32 bits.
>>>>>>
>>>>>> Chuck
>>>>>>
>>>>>>
>>>>> Right, that makes sense.  But, the question is why does sum() put out a
>>>>> result dtype that is not identical to the dtype that I requested, or even
>>>>> the dtype of the input array?  Could this be an indication of a bug
>>>>> somewhere?  Even if the bug is harmless (it was only noticed within the 
>>>

Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Olivier Delalleau
2011/6/10 Benjamin Root 

>
>
> On Fri, Jun 10, 2011 at 9:29 PM, Olivier Delalleau  wrote:
>
>>
>> 2011/6/10 Olivier Delalleau 
>>
>>> 2011/6/10 Charles R Harris 
>>>
>>>>
>>>>
>>>> On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau wrote:
>>>>
>>>>> 2011/6/10 Charles R Harris 
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>>>>>>> charlesr.har...@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
>>>>>>>>> charlesr.har...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Came across an odd error while using numpy master.  Note, my
>>>>>>>>>>> system is 32-bits.
>>>>>>>>>>>
>>>>>>>>>>> >>> import numpy as np
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>>>>>> False
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>>>>>> True
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>>>>>> True
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>>>>>> True
>>>>>>>>>>>
>>>>>>>>>>> So, only the summation performed with a np.int32 accumulator
>>>>>>>>>>> results in a type that doesn't match the expected type.  Now, for 
>>>>>>>>>>> even more
>>>>>>>>>>> strangeness:
>>>>>>>>>>>
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>>>>>> 
>>>>>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>>>>>>>>>>> '0x9599a0'
>>>>>>>>>>> >>> hex(id(np.int32))
>>>>>>>>>>> '0x959a80'
>>>>>>>>>>>
>>>>>>>>>>> So, the type from the sum() reports itself as a numpy int, but
>>>>>>>>>>> its memory address is different from the memory address for 
>>>>>>>>>>> np.int32.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> One of them is probably a long, print out the typecode,
>>>>>>>>>> dtype.char.
>>>>>>>>>>
>>>>>>>>>> Chuck
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Good intuition, but odd result...
>>>>>>>>>
>>>>>>>>> >>> import numpy as np
>>>>>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>>>>>> >>> b = np.int32(6)
>>>>>>>>> >>> type(a)
>>>>>>>>> 
>>>>>>>>> >>> type(b)
>>>>>>>>> 
>>>>>>>>> >>> a.dtype.char
>>>>>>>>> 'i'
>>>>>>>>> >>> b.dtype.char
>>>>>>>>> 'l'
>>>>>>>>>
>>>>>>>>> So, the standard np.int32 is getting listed a

Re: [Numpy-discussion] numpy type mismatch

2011-06-13 Thread Olivier Delalleau
2011/6/10 Olivier Delalleau 

> 2011/6/10 Charles R Harris 
>
>>
>>
>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root  wrote:
>>
>>>
>>>
>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>>> charlesr.har...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
>>>>> charlesr.har...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root wrote:
>>>>>>
>>>>>>> Came across an odd error while using numpy master.  Note, my system
>>>>>>> is 32-bits.
>>>>>>>
>>>>>>> >>> import numpy as np
>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>> False
>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>> True
>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>> True
>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>> True
>>>>>>>
>>>>>>> So, only the summation performed with a np.int32 accumulator results
>>>>>>> in a type that doesn't match the expected type.  Now, for even more
>>>>>>> strangeness:
>>>>>>>
>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>> 
>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32
>>>>>>> '0x9599a0'
>>>>>>> >>> hex(id(np.int32))
>>>>>>> '0x959a80'
>>>>>>>
>>>>>>> So, the type from the sum() reports itself as a numpy int, but its
>>>>>>> memory address is different from the memory address for np.int32.
>>>>>>>
>>>>>>>
>>>>>> One of them is probably a long, print out the typecode, dtype.char.
>>>>>>
>>>>>> Chuck
>>>>>>
>>>>>>
>>>>>>
>>>>> Good intuition, but odd result...
>>>>>
>>>>> >>> import numpy as np
>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>> >>> b = np.int32(6)
>>>>> >>> type(a)
>>>>> 
>>>>> >>> type(b)
>>>>> 
>>>>> >>> a.dtype.char
>>>>> 'i'
>>>>> >>> b.dtype.char
>>>>> 'l'
>>>>>
>>>>> So, the standard np.int32 is getting listed as a long somehow?  To
>>>>> further investigate:
>>>>>
>>>>>
>>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's 64 
>>>> bits.
>>>> On 32 bit systems it is 32 bits.
>>>>
>>>> Chuck
>>>>
>>>>
>>> Right, that makes sense.  But, the question is why does sum() put out a
>>> result dtype that is not identical to the dtype that I requested, or even
>>> the dtype of the input array?  Could this be an indication of a bug
>>> somewhere?  Even if the bug is harmless (it was only noticed within the test
>>> suite of larry), is this unexpected?
>>>
>>>
>> I expect sum is using a ufunc and it acts differently on account of the
>> cleanup of the ufunc casting rules. And yes, a long *is* int32 on your
>> machine. On mine
>>
>> In [4]: dtype('q') # long long
>> Out[4]: dtype('int64')
>>
>> In [5]: dtype('l') # long
>> Out[5]: dtype('int64')
>>
>> The mapping from C types to numpy width types isn't 1-1. Personally, I
>> think we should drop long ;) But it used to be the standard Python type in
>> the C API. Mark has also pointed out the problems/confusion this ambiguity
>> causes and someday we should probably think it out and fix it. But I don't
>> think it is the most pressing problem.
>>
>> Chuck
>>
>>
> But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit
> computer where both are int32?
>
> -=- Olivier
>

So, I tried with the latest 1.6.1rc1.

It turns out I misinterpreted the OP. Although the first test indeed returns
False now, I hadn't taken into account it was a comparison of types, not
dtypes. If you replace it with
(np.sum([1, 2, 3], dtype=np.int32)).dtype == np.dtype(np.int32)
then it returns True.
So we still have numpy.dtype('i') == numpy.dtype('l'), and I'm not concerned
anymore :) Sorry for the false alarm...
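
To spell out the distinction (sketch; the exact True/False values depend on the
platform, since 'l' is int32 on 32-bit systems and typically int64 on 64-bit Linux):

import numpy as np

a = np.sum([1, 2, 3], dtype=np.int32)
print(type(a) == np.int32)            # may be False: 'i' and 'l' can be distinct types
print(a.dtype == np.dtype(np.int32))  # True: comparing dtypes is the robust test
print(np.dtype('i') == np.dtype('l')) # True on 32-bit systems, where both are int32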

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Behaviour of ndarray and other objects with __radd__?

2011-06-15 Thread Olivier Delalleau
I don't really understand this behavior either, but just note that
according to
http://docs.scipy.org/doc/numpy/user/c-info.beyond-basics.html
"This attribute can also be defined by objects that are not sub-types of the
ndarray"

-=- Olivier

2011/6/15 Jonathan Taylor 

> Hi,
>
> I would like to have objects that I can mix with ndarrays in
> arithmetic expressions but I need my object to have control of the
> operation even when it is on the right hand side of the equation.  I
> realize from the documentation that the way to do this is to actually
> subclass ndarray but this is undesirable because I do not need all the
> heavy machinery of a ndarray and I do not want users to see all of the
> ndarray methods.  Is there a way to somehow achieve these goals?
>
> I would also very much appreciate some clarification of what is
> happening in the following basic example:
>
> import numpy as np
> class Foo(object):
>     # THE NEXT LINE IS COMMENTED
>     # __array_priority__ = 0
>     def __add__(self, other):
>         print 'Foo has control over', other
>         return 1
>     def __radd__(self, other):
>         print 'Foo has control over', other
>         return 1
>
> x = np.arange(3)
> f = Foo()
>
> print f + x
> print x + f
>
> yields
>
> Foo has control over [0 1 2]
> 1
> Foo has control over 0
> Foo has control over 1
> Foo has control over 2
> [1 1 1]
>
> I see that I have control from the left side as expected and I suspect
> that what is happening in the second case is that numpy is trying to
> "broadcast" my object onto the left side as if it was an object array?
>
> Now if I uncomment the line __array_priority__ = 0 I do seem to
> accomplish my goals (see below) but I am not sure why.  I am
> surprised, given what I have read in the documentation, that
> __array_priority__ does anything in a non subclass of ndarray.
> Furthermore, I am even more surprised that it does anything when it is
> 0, which is the same as ndarray.__array_priority__ from what I
> understand.  Any clarification of this would be greatly appreciated.
>
> Output with __array_priority__ uncommented:
>
> jtaylor@yukon:~$ python foo.py
> Foo has control over [0 1 2]
> 1
> Foo has control over [0 1 2]
> 1
>
> Thanks,
> Jonathan.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] what python module to handle csv?

2011-06-15 Thread Olivier Delalleau
Using savetxt with delimiter=',' should do the trick.

If you want a more advanced csv interface to e.g. save more than a numpy
array into a single csv, you can probably look into the python csv module.
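
For instance, a minimal round trip could look like this (sketch; 'out.csv' is
just a placeholder name):

import numpy as np

a = np.arange(6).reshape(2, 3)
np.savetxt('out.csv', a, delimiter=',', fmt='%g')   # write as comma-separated values
b = np.genfromtxt('out.csv', delimiter=',')         # read it back
print(b)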

-=- Olivier

2011/6/15 Chao YUE 

> Dear all pythoners,
>
> which Python module do you use to handle csv files (for reading we can use
> numpy.genfromtxt)? Is there anything that makes working with csv files as
> convenient as in R?
> Can numpy.genfromtxt be used for writing? (I didn't try this yet because on
> our server we have only numpy 1.0.1...). This really makes me struggle,
> since csv is a very important interface format (I think so...).
>
> Thanks a lot,
>
> Sincerely,
>
> Chao
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 77 30; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression in choose()

2011-06-16 Thread Olivier Delalleau
If performance is not critical you could just write your own function for a
quick fix, doing something like
numpy.array([choices[j][i] for i, j in enumerate([2, 3, 1, 0])])

This 32-array limitation definitely looks weird to me though, doesn't seem
to make sense.
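
Wrapped up as a function, the quick fix could look like this (slow pure-Python
sketch, following np.choose's semantics output[i] = choices[indices[i]][i]):

import numpy as np

def my_choose(indices, choices):
    # naive fallback without the 32-choice limit
    return np.array([np.asarray(choices[j])[i] for i, j in enumerate(indices)])

choices = [[0, 1, 2, 3], [10, 11, 12, 13],
           [20, 21, 22, 23], [30, 31, 32, 33]]
print(my_choose([2, 3, 1, 0], choices * 8))   # -> [20 31 12  3]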

-=- Olivier

2011/6/16 Ed Schofield 

> Hi all,
>
> I have been investigation the limitation of the choose() method (and
> function) to 32 elements. This is a regression in recent versions of NumPy.
> I have tested choose() in the following NumPy versions:
>
> 1.0.4: fine
> 1.1.1: bug
> 1.2.1: fine
> 1.3.0: bug
> 1.4.x: bug
> 1.5.x: bug
> 1.6.x: bug
> Numeric 24.3: fine
>
> (To run the tests on versions of NumPy prior to 1.4.x I used Python 2.4.3.
> For the other tests I used Python 2.7.)
>
> Here 'bug' means the choose() function has the 32-element limitation. I
> have been helping an organization to port a large old Numeric-using codebase
> to NumPy, and the choose() limitation in recent NumPy versions is throwing a
> spanner in the works. The codebase is currently using both NumPy and Numeric
> side-by-side, with Numeric only being used for its choose() function, with a
> few dozen lines like this:
>
> a = numpy.array(Numeric.choose(b, c))
>
> Here is a simple example that triggers the bug. It is a simple extension of
> the example from the choose() docstring:
>
> 
>
> import numpy as np
>
> choices = [[0, 1, 2, 3], [10, 11, 12, 13],
>   [20, 21, 22, 23], [30, 31, 32, 33]]
>
> np.choose([2, 3, 1, 0], choices * 8)
>
> 
>
> A side note: the exception message (defined in
> core/src/multiarray/iterators.c) is also slightly inconsistent with the
> actual behaviour:
>
> Traceback (most recent call last):
>   File "chooser.py", line 6, in 
> np.choose([2, 3, 1, 0], choices * 8)
>   File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py", line
> 277, in choose
> return _wrapit(a, 'choose', choices, out=out, mode=mode)
>   File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py", line
> 37, in _wrapit
> result = getattr(asarray(obj),method)(*args, **kwds)
> ValueError: Need between 2 and (32) array objects (inclusive).
>
> The actual behaviour is that choose() passes with 31 objects but fails with
> 32 objects, so this should read "exclusive" rather than "inclusive". (And
> why the parentheses around 32?)
>
> Does anyone know what changed between 1.2.1 and 1.3.0 that introduced the
> 32-element limitation to choose(), and whether we might be able to lift this
> limitation again for future NumPy versions? I have a couple of days to work
> on a patch ... if someone can advise me how to approach this.
>
> Best wishes,
> Ed
>
>
> --
> Dr. Edward Schofield
> Python Charmers
> +61 (0)405 676 229
> http://pythoncharmers.com
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
If I understand correctly, your error is that you convert only the second
column, because your converters dictionary contains a single key (1).
If you have it contain keys from 0 to 3 associated to the same function, it
should work.
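
i.e., something along these lines (sketch; assumes the 4-column, width-18
layout of the message quoted below):

conv = lambda x: complex(*eval(x))
b = np.genfromtxt(a, converters={0: conv, 1: conv, 2: conv, 3: conv},
                  dtype=None, delimiter=18, usecols=range(4))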

-=- Olivier

2011/6/17 gary ruben 

> I'm trying to read a file containing data formatted as in the
> following example using genfromtxt and I'm doing something wrong. It
> almost works. Can someone point out my error, or suggest a simpler
> solution to the ugly converter function? I thought I'd leave in the
> commented-out line for future reference, which I thought was a neat
> way to get genfromtxt to show what it is trying to pass to the
> converter.
>
> import numpy as np
> from StringIO import StringIO
>
> a = StringIO('''\
>  (-3.9700,-5.0400) (-1.1318,-2.5693) (-4.6027,-0.1426) (-1.4249, 1.7330)
>  (-5.4797, 0.) ( 1.8585,-1.5502) ( 4.4145,-0.7638) (-0.4805,-1.1976)
>  ( 0., 0.) ( 6.2673, 0.) (-0.4504,-0.0290) (-1.3467, 1.6579)
>  ( 0., 0.) ( 0., 0.) (-3.5000, 0.) ( 2.5619,-3.3708)
> ''')
>
> #~ b = np.genfromtxt(a,converters={1:lambda
> x:str(x)},dtype=object,delimiter=18)
> b = np.genfromtxt(a,converters={1:lambda
> x:complex(*eval(x))},dtype=None,delimiter=18,usecols=range(4))
>
> print b
>
> --
>
> This produces
> [ (' (-3.9700,-5.0400)', (-1.1318-2.5693j), ' (-4.6027,-0.1426)', '
> (-1.4249, 1.7330)')
>  (' (-5.4797, 0.)', (1.8585-1.5502j), ' ( 4.4145,-0.7638)', '
> (-0.4805,-1.1976)')
>  (' ( 0., 0.)', (6.2673+0j), ' (-0.4504,-0.0290)', ' (-1.3467,
> 1.6579)')
>  (' ( 0., 0.)', 0j, ' (-3.5000, 0.)', ' ( 2.5619,-3.3708)')]
>
> which I just need to unpack into a 4x4 array, but I get an error if I
> try to apply a different view.
>
> thanks,
> Gary
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Bruce Southey 

> On 06/17/2011 08:22 AM, gary ruben wrote:
> > Thanks Olivier,
> > Your suggestion gets me a little closer to what I want, but doesn't
> > quite work. Replacing the conversion with
> >
> > c = lambda x:np.cast[np.complex64](complex(*eval(x)))
> > b = np.genfromtxt(a,converters={0:c, 1:c, 2:c,
> > 3:c},dtype=None,delimiter=18,usecols=range(4))
> >
> > produces
> >
> > [[(-3.9702861-5.0396185j) (-1.1318000555-2.56929993629j)
> >   (-4.60270023346-0.14259905j) (-1.42490005493+1.7334005j)]
> >   [(-5.4797000885+0j) (1.8585381-1.5501999855j)
> >   (4.41450023651-0.763800024986j) (-0.480500012636-1.1976706j)]
> >   [0j (6.26730012894+0j) (-0.4503485-0.028991655j)
> >   (-1.34669995308+1.65789997578j)]
> >   [0j 0j (-3.5+0j) (2.56189990044-3.37080001831j)]]
> >
> > which is not yet an array of complex numbers. It seems close to the
> > solution though.
> >
> > Gary
> >
> > On Fri, Jun 17, 2011 at 8:40 PM, Olivier Delalleau
>  wrote:
> >> If I understand correctly, your error is that you convert only the
> second
> >> column, because your converters dictionary contains a single key (1).
> >> If you have it contain keys from 0 to 3 associated to the same function,
> it
> >> should work.
> >>
> >> -=- Olivier
> >>
> >> 2011/6/17 gary ruben
> >>> I'm trying to read a file containing data formatted as in the
> >>> following example using genfromtxt and I'm doing something wrong. It
> >>> almost works. Can someone point out my error, or suggest a simpler
> >>> solution to the ugly converter function? I thought I'd leave in the
> >>> commented-out line for future reference, which I thought was a neat
> >>> way to get genfromtxt to show what it is trying to pass to the
> >>> converter.
> >>>
> >>> import numpy as np
> >>> from StringIO import StringIO
> >>>
> >>> a = StringIO('''\
> >>>   (-3.9700,-5.0400) (-1.1318,-2.5693) (-4.6027,-0.1426) (-1.4249,
> 1.7330)
> >>>   (-5.4797, 0.) ( 1.8585,-1.5502) ( 4.4145,-0.7638)
> (-0.4805,-1.1976)
> >>>   ( 0., 0.) ( 6.2673, 0.) (-0.4504,-0.0290) (-1.3467,
> 1.6579)
> >>>   ( 0., 0.) ( 0., 0.) (-3.5000, 0.) (
> 2.5619,-3.3708)
> >>> ''')
> >>>
> >>> #~ b = np.genfromtxt(a,converters={1:lambda
> >>> x:str(x)},dtype=object,delimiter=18)
> >>> b = np.genfromtxt(a,converters={1:lambda
> >>> x:complex(*eval(x))},dtype=None,delimiter=18,usecols=range(4))
> >>>
> >>> print b
> >>>
> >>> --
> >>>
> >>> This produces
> >>> [ (' (-3.9700,-5.0400)', (-1.1318-2.5693j), ' (-4.6027,-0.1426)', '
> >>> (-1.4249, 1.7330)')
> >>>   (' (-5.4797, 0.)', (1.8585-1.5502j), ' ( 4.4145,-0.7638)', '
> >>> (-0.4805,-1.1976)')
> >>>   (' ( 0., 0.)', (6.2673+0j), ' (-0.4504,-0.0290)', ' (-1.3467,
> >>> 1.6579)')
> >>>   (' ( 0., 0.)', 0j, ' (-3.5000, 0.)', ' (
> 2.5619,-3.3708)')]
> >>>
> >>> which I just need to unpack into a 4x4 array, but I get an error if I
> >>> try to apply a different view.
> >>>
> >>> thanks,
> >>> Gary
> >>> ___
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >>
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> Just an observation for the StringIO object, you have multiple spaces
> within the parentheses but, by default, you are using whitespace
> delimiters in genfromtxt. So, yes, genfromtxt is going have issues.
>
> If you can rewrite the input, then you need a non-space and non-comma
> delimiter then specify that delimiter to genfromtxt. Otherwise you are
> probably going to have to write you own parser - for each line, split on
> ' (' etc.
>
> Bruce


It's funny though because that part (the parsing) actually seems to work.

However I've been playing a bit with his example and indeed I can't get
numpy to return a complex array. It keeps resulting in an "object" dtype.
The only way I found was to convert it temporarily into a list to recast it
in complex64, adding the line:
  b = np.array(map(list, b), dtype=np.complex64)

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Derek Homeier 

> Hi Gary,
>
> On 17.06.2011, at 5:39PM, gary ruben wrote:
> > Thanks for the hints Olivier and Bruce. Based on them, the following
> > is a working solution, although I still have that itchy sense that
> genfromtxt
> > should be able to do it directly.
> >
> > import numpy as np
> > from StringIO import StringIO
> >
> > a = StringIO('''\
> > (-3.9700,-5.0400) (-1.1318,-2.5693) (-4.6027,-0.1426) (-1.4249, 1.7330)
> > (-5.4797, 0.) ( 1.8585,-1.5502) ( 4.4145,-0.7638) (-0.4805,-1.1976)
> > ( 0., 0.) ( 6.2673, 0.) (-0.4504,-0.0290) (-1.3467, 1.6579)
> > ( 0., 0.) ( 0., 0.) (-3.5000, 0.) ( 2.5619,-3.3708)
> > ''')
> >
> > b = np.genfromtxt(a, dtype=str, delimiter=18)[:,:-1]
> > b = np.vectorize(lambda x: complex(*eval(x)))(b)
> >
> > print b
>
> It should, I think you were very close in your earlier attempt:
>
> > On Sat, Jun 18, 2011 at 12:31 AM, Bruce Southey 
> wrote:
> >> On 06/17/2011 08:51 AM, Olivier Delalleau wrote:
> >>
> >> 2011/6/17 Bruce Southey 
> >>>
> >>> On 06/17/2011 08:22 AM, gary ruben wrote:
> >>>> Thanks Olivier,
> >>>> Your suggestion gets me a little closer to what I want, but doesn't
> >>>> quite work. Replacing the conversion with
> >>>>
> >>>> c = lambda x:np.cast[np.complex64](complex(*eval(x)))
> >>>> b = np.genfromtxt(a,converters={0:c, 1:c, 2:c,
> >>>> 3:c},dtype=None,delimiter=18,usecols=range(4))
> >>>>
> >>>> produces
> >>>>
> >>>> [[(-3.9702861-5.0396185j) (-1.1318000555-2.56929993629j)
> >>>>   (-4.60270023346-0.14259905j) (-1.42490005493+1.7334005j)]
> >>>>   [(-5.4797000885+0j) (1.8585381-1.5501999855j)
> >>>>   (4.41450023651-0.763800024986j) (-0.480500012636-1.1976706j)]
> >>>>   [0j (6.26730012894+0j) (-0.4503485-0.028991655j)
> >>>>   (-1.34669995308+1.65789997578j)]
> >>>>   [0j 0j (-3.5+0j) (2.56189990044-3.37080001831j)]]
> >>>>
> >>>> which is not yet an array of complex numbers. It seems close to the
> >>>> solution though.
>
> You were just overdoing it by already creating an array with the converter,
> this apparently caused genfromtxt to create a structured array from the
> input (which could be converted back to an ndarray, but that can prove
> tricky as well) - similar, if you omit the dtype=None. The following
>
> cnv = dict.fromkeys(range(4), lambda x: complex(*eval(x)))
> b = np.genfromtxt(a,converters=cnv, dtype=None, delimiter=18,
> usecols=range(4))
>
> directly produces a shape(4,4) complex array for me (you may have to apply
> an .astype(np.complex64) afterwards if so desired).
>
> BTW I think this is an interesting enough case of reading non-trivially
> structured data that it deserves to appear on some examples or cookbook
> page.
>
> HTH,
> Derek
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

I had tried that as well and it doesn't work with numpy 1.4.1 (I get an
object array). It may have been fixed in a later version.

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
For the hardcoded part, you can easily read the first line of your file and
split it with the same delimiter to know the number of columns.
Of course it would be best to be able to skip this part entirely, but at least
you don't need to hardcode the number of columns into your code.
Something like:
n_cols = len(open('myfile.txt').readline().split(delimiter))

-=- Olivier

2011/6/17 gary ruben 

> Thanks guys - I'm happy with the solution for now. FYI, Derek's
> suggestion doesn't work in numpy 1.5.1 either.
> For any developers following this thread, I think this might be a nice
> use case for genfromtxt to handle in future.
> As a corollary of this problem, I wonder whether there's a
> human-readable text format for complex numbers that genfromtxt can
> currently easily parse into a complex array? Having the hard-coded
> value for the number of columns in the converter and the genfromtxt
> call goes against the philosophy of the function's ability to form an
> array of shape matching the input layout.
>
> Gary
>
> On Sat, Jun 18, 2011 at 7:24 AM, Derek Homeier
>  wrote:
> > On 17.06.2011, at 11:01PM, Olivier Delalleau wrote:
> >
> >>> You were just overdoing it by already creating an array with the
> converter, this apparently caused genfromtxt to create a structured array
> from the input (which could be converted back to an ndarray, but that can
> prove tricky as well) - similar, if you omit the dtype=None. The following
> >>>
> >>> cnv = dict.fromkeys(range(4), lambda x: complex(*eval(x)))
> >>> b = np.genfromtxt(a,converters=cnv, dtype=None, delimiter=18,
> usecols=range(4))
> >>>
> >>> directly produces a shape(4,4) complex array for me (you may have to
> apply an .astype(np.complex64) afterwards if so desired).
> >>>
> >>> BTW I think this is an interesting enough case of reading non-trivially
> structured data that it deserves to appear on some examples or cookbook
> page.
> >>>
> >>> HTH,
> >>>Derek
> >>>
> >>> ___
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >> I had tried that as well and it doesn't work with numpy 1.4.1 (I get an
> object array). It may have been fixed in a later version.
> >
> > OK, I was using the current master from github, but it works in 1.6.0 as
> well. I still noticed some differences between loadtxt and genfromtxt
> behaviour, e.g. where loadtxt would be able to take a string from the
> converter and automatically convert it to a number, whereas in genfromtxt
> the converter still had to include the float() or complex()...
> >
> >Derek
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] A bit of advice?

2011-06-23 Thread Olivier Delalleau
What about :
dict((k, [e for e in arr if (e['x0'], e['x1']) == k]) for k in cases)
?

(note: it is inefficient written this way though)
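
A slightly more array-friendly sketch (with made-up toy data) builds a boolean
mask per case instead of looping over elements in Python:

import numpy as np

arr = np.array([(0, 1, 10), (0, 1, 11), (1, 2, 12)],
               dtype=[('x0', int), ('x1', int), ('y0', int)])
cases = set((int(e['x0']), int(e['x1'])) for e in arr)
groups = dict((k, arr[(arr['x0'] == k[0]) & (arr['x1'] == k[1])])
              for k in cases)
print(groups[(0, 1)]['y0'])   # -> [10 11]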

-=- Olivier

2011/6/23 Neal Becker 

> I have a set of experiments that I want to plot.  There will be many plots.
> Each will show different test conditions.
>
> Suppose I put each of the test conditions and results into a recarray.  The
> recarray could be:
>
> arr = np.empty ((#experiments,), dtype=[('x0',int), ('x1',int), ('y0',int)]
>
> where x0,x1 are 2 test conditions, and y0 is a result.
>
> First I want to group the plots such according to the test conditions.  So,
> I
> want to first find what all the combinations of test conditions are.
>
> Dunno if there is anything simpler than:
>
> cases = tuple (set ((e['x0'], e['x1'])) for e in arr)
>
> Next, need to select all those experiments which match each of these cases.
>  Now
> I know of no easy way.
>
> Any suggestions?  Perhaps I should look again at pytables?
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Olivier Delalleau
2011/6/25 Charles R Harris 

> I think what we really need to see are the use cases and work flow. The
> ones that hadn't occurred to me before were memory mapped files and data
> stored on disk in general. I think we may need some standard format for
> masked data on disk if we don't go the NA value route.
>
> Chuck
>

Sorry I can't follow this thread closely, but since use cases are mentioned,
I'll throw one. It may or may not already be supported by the current NEP, I
don't know. It may not be possible to support it either... anyway:

I typically use NaN to denote missing values in an array. The main
motivation vs. masks is my code typically involves multiple systems working
together, and not all of them are necessarily based on numpy. The nice thing
with NaN is it is preserved e.g. when I turn an array into a list, when I
write the data directly into a double* C array, etc. It's handy to
manipulate arrays with missing data without being constrained to a specific
container.
The one thing that is annoying with NaN is that NaN != NaN. It makes code
involving comparisons quite ugly.
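
For example (numpy has some NaN-aware helpers, but plain comparisons stay awkward):

import numpy as np

a = np.array([1.0, np.nan, 3.0])
print(a == a)        # [ True False  True]: NaN != NaN, even compared to itself
print(np.isnan(a))   # [False  True False]: missing entries must be special-cased
print(np.nansum(a))  # 4.0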

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] broacasting question

2011-06-30 Thread Olivier Delalleau
2011/6/30 Thomas K Gamble 

> I'm trying to convert some IDL code to python/numpy and I'm having some
> trouble understanding the rules for broadcasting during some operations.
> example:
>
> given the following arrays:
> a = array((2048,3577), dtype=float)
> b = array((256,25088), dtype=float)
> c = array((2048,3136), dtype=float)
> d = array((2048,3136), dtype=float)
>
> do:
> a = b * c + d
>
> In IDL, the computation is done without complaint and all array sizes are
> preserved.  In Python I get a value error concerning broadcasting.  I can
> force it to work by taking slices, but the resulting size would be a =
> (256x3136) rather than (2048x3577).  I admit that I don't understand IDL
> (or
> python to be honest) well enough to know how it handles this to be able to
> replicate the result properly.  Does it only operate on the smallest
> dimensions ignoring the larger indices leaving their values unchanged?  Can
> someone explain this to me?
>
> --
> Thomas K. Gamble tkgam...@windstream.net
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

It should be all explained here:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
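
The short version: shapes are compared from the trailing dimension, and each
pair of dimensions must either match or be 1. With shapes like the ones above
(sketch):

import numpy as np

c = np.ones((2048, 3136))
d = np.ones((2048, 3136))
row = np.ones((3136,))       # trailing dimensions match, so this broadcasts
print((row * c + d).shape)   # (2048, 3136)
b = np.ones((256, 25088))
# b * c would raise ValueError: (256, 25088) and (2048, 3136) cannot be
# broadcast together, which is the error you are seeing.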

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] object scalars

2011-07-12 Thread Olivier Delalleau
I found a workaround but it's a bit ugly:
import numpy

def some_call(x):
  rval = numpy.array(None, dtype='object')
  rval.fill(x)
  return rval
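
Used like this it gives the 0-d object scalar you were after:

x = some_call([1, 2])
print(x.shape)   # ()
print(x.item())  # [1, 2]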

-=- Olivier

2011/7/12 Johann Hibschman 

> Is there any way to wrap a sequence (in particular a python list) as a
> numpy object scalar, without it being promoted to an object array?
>
> In particular,
>
>  np.object_([1, 2]).shape == (2,)
>  np.array([1,2], dtype='O').shape == (2,)
>
> while I want
>
>  some_call([1,2]).shape = ()
>
> Thanks,
> Johann
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recommendation for saving data

2011-08-01 Thread Olivier Delalleau
I personally use pickle, which does exactly what you are asking for (and can
be customized with __getstate__ and __setstate__ if needed). What are your
issues with pickle?
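
A rough sketch of what I have in mind (the class and file name are made up;
the point is that __getstate__/__setstate__ let each component decide what
gets persisted):

import cPickle as pickle
import gzip

class Component(object):
    def __init__(self):
        self.params = {'rate': 0.1}
        self._cache = None            # transient state we don't want on disk
    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop('_cache')           # drop what should not be persisted
        return state
    def __setstate__(self, state):
        self.__dict__.update(state)
        self._cache = None

f = gzip.open('simulation.pkl.gz', 'wb')
pickle.dump([Component(), Component()], f, 2)   # protocol 2: compact binary
f.close()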

-=- Olivier

2011/7/31 Brian Blais 

> Hello,
>
> I was wondering if there are any recommendations for formats for saving
> scientific data.  I am running a simulation, which has many
> somewhat-indepedent parts which have their own internal state and
> parameters.  I've been using pickle (gzipped) to save the entire object
> (which contains subobjects, etc...), but it is getting too unwieldy and I
> think it is time to look for a more robust solution.  Ideally I'd like to
> have something where I can call a save method on the simulation object, and
> it will call the save methods on all the children, on down the line all
> saving into one file.  It'd also be nice if it were cross-platform, and I
> could depend on the files being readable into the future for a while.
>
> Are there any good standards for this?  What do you use for saving
> scientific data?
>
>
>thank you,
>
>Brian Blais
>
>
>
> --
> Brian Blais
> bbl...@bryant.edu
> http://web.bryant.edu/~bblais
> http://bblais.blogspot.com/
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Segmentation Fault in Numpy.test()

2011-08-02 Thread Olivier Delalleau
It's a wild guess, but in the past I've had segfault issues on Mac due to
conflicting versions of Python. Do you have multiple Python installs on your
Mac?

-=- Olivier


2011/8/2 Thomas Markovich 

> Hi All,
>
> I installed numpy from the scipy superpack on Snow Leopard with python 2.7
> and it all appears to work but when I do the following, I get a segmentation
> fault.
>
> >>> import numpy
> >>> print numpy.__version__, numpy.__file__
> 2.0.0.dev-b5cdaee
> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc
> >>> numpy.test()
> Running unit tests for numpy
> NumPy version 2.0.0.dev-b5cdaee
> NumPy is installed in
> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy
> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC
> 4.2.1 (Apple Inc. build 5666) (dot 3)]
> nose version 1.1.2
> Segmentation
> fault
> thomasmarkovich:~ Thomas$
>
> What is the best way to trouble shoot this? Do you guys have any
> suggestions? I have also included the core dump in this email as a pastie
> link.
>
> http://pastie.org/2309652
>
> Best,
>
> Thomas
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Segmentation Fault in Numpy.test()

2011-08-02 Thread Olivier Delalleau
Maybe specify which scipy superpack. Your issue was probably because the
superpack you installed was not meant to be used with Python 2.7.

-=- Olivier

2011/8/2 Thomas Markovich 

> Oh okay, that's unfortunate but I guess not unexpected. Regardless, thank
> you so much for all your help Ralf, Bruce, and Oliver! You guys are great.
>
> Just to recap, the issue appears to stem from using the scipy superpack
> with python 2.7 from python.org. This was solved by using the apple python
> along with the scipy superpack.
>
> Thomas
>
>
> On Tue, Aug 2, 2011 at 12:06 PM, Ralf Gommers  > wrote:
>
>>
>>
>> On Tue, Aug 2, 2011 at 6:57 PM, Thomas Markovich <
>> thomasmarkov...@gmail.com> wrote:
>>
>>> It appears that uninstalling python 2.7 and installing the scipy
>>> superpack with the apple standard python removes the segfaulting behavior
>>> from numpy. Now it appears that just scipy is segfaulting at test
>>> "test_arpack.test_hermitian_modes(True, , 'F', 2, 'SM', None,
>>> 0.5, ) ... Segmentation fault"
>>>
>>> That is a known problem (unfortunately hard to fix), see
>> http://projects.scipy.org/scipy/ticket/1472
>> Everything else besides arpack should work fine for you.
>>
>> Cheers,
>> Ralf
>>
>>
>>>
>>>
>>>
>>> On Tue, Aug 2, 2011 at 11:28 AM, Ralf Gommers <
>>> ralf.gomm...@googlemail.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich <
>>>> thomasmarkov...@gmail.com> wrote:
>>>>
>>>>> I just have the default "apple" version of python that comes with Snow
>>>>> Leopard (Python 2.6.1 (r261:67515, Aug  2 2010, 20:10:18)) and python 2.7
>>>>> (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed.
>>>>>
>>>>> Should I just remove 2.7 and reinstall everything with the standard
>>>>> apple python?
>>>>>
>>>>> Did you get it from http://stronginference.com/scipy-superpack/? The
>>>> info on the 10.6 installer has disappeared, but the 10.7 one is built
>>>> against Apple's Python. So conflicting Pythons makes sense. Even if you 
>>>> find
>>>> the right one, it may be worth emailing Chris to ask him to put back the
>>>> info for the 10.6 installer.
>>>>
>>>> Ralf
>>>>
>>>>
>>>> On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote:
>>>>>
>>>>>> It's a wild guess, but in the past I've had seg faults issues on Mac
>>>>>> due to conflicting versions of Python. Do you have multiple Python 
>>>>>> installs
>>>>>> on your Mac?
>>>>>>
>>>>>> -=- Olivier
>>>>>>
>>>>>>
>>>>>> 2011/8/2 Thomas Markovich 
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I installed numpy from the scipy superpack on Snow Leopard with
>>>>>>> python 2.7 and it all appears to work but when I do the following, I 
>>>>>>> get a
>>>>>>> segmentation fault.
>>>>>>>
>>>>>>> >>> import numpy
>>>>>>> >>> print numpy.__version__, numpy.__file__
>>>>>>> 2.0.0.dev-b5cdaee
>>>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc
>>>>>>> >>> numpy.test()
>>>>>>> Running unit tests for numpy
>>>>>>> NumPy version 2.0.0.dev-b5cdaee
>>>>>>> NumPy is installed in
>>>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy
>>>>>>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
>>>>>>> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
>>>>>>> nose version 1.1.2
>>>>>>> S

[Numpy-discussion] Weird upcast behavior with 1.6.x, working as intended?

2011-08-08 Thread Olivier Delalleau
Hi,

This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of
"scalar + array":

>>> import numpy; print (numpy.array(3, dtype=numpy.complex128) +
numpy.ones(3, dtype=numpy.float32)).dtype
complex64

Since it has to upcast my array (float32 is not "compatible enough" with
complex128), why does it upcast it to complex64 instead of complex128?
As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting
to complex128.

Thanks,

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Weird upcast behavior with 1.6.x, working as intended?

2011-08-08 Thread Olivier Delalleau
2011/8/8 Charles R Harris 

>
>
> On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau  wrote:
>
>> Hi,
>>
>> This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism
>> of "scalar + array":
>>
>> >>> import numpy; print (numpy.array(3, dtype=numpy.complex128) +
>> numpy.ones(3, dtype=numpy.float32)).dtype
>> complex64
>>
>> Since it has to upcast my array (float32 is not "compatible enough" with
>> complex128), why does it upcast it to complex64 instead of complex128?
>> As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed
>> upcasting to complex128.
>>
>>
> The 0 dimensional array is being treated as a scalar, hence is cast to the
> type of the 1d array. This seems more consistent with the idea that 0
> dimensional arrays act like scalars, but I suppose that is open to
> discussion.
>
> Chuck
>

I'm afraid I don't understand your reply. I know that the 0d array is a
scalar, and thus should not lead to an upcast "unless the scalar is of a
fundamentally different kind of data (*i.e.*, under a different hierarchy in
the data-type hierarchy) than the array" (quoted from
http://docs.scipy.org/doc/numpy/reference/ufuncs.html).

This is one case where it is under a different hierarchy and thus should
trigger an upcast. What I don't understand it why it upcasts to complex64
instead of complex128.

Note that:
1. When replacing "numpy.ones" with "numpy.array" it yields complex128
(expected upcast of scalar addition of complex128 with float32)
2. The behavior is similar if instead of "3" I use a number which cannot be
represented exactly with a complex64 (so it's not a rule about picking the
smallest data type able to exactly represent the result)
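
For reference, the two observations above correspond to (numpy 1.6.1):

import numpy as np

s = np.array(3, dtype=np.complex128)               # 0-d array, treated as a scalar
print((s + np.ones(3, dtype=np.float32)).dtype)    # complex64
print((s + np.array(3, dtype=np.float32)).dtype)   # complex128 (scalar + scalar)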

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug with assignment into an indexed array?

2011-08-11 Thread Olivier Delalleau
Maybe confusing, but working as expected.

When you write:
  matched_to[np.array([0, 1, 2])] = 3
it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]), 3).
So numpy understand you want to write 3 at these indices.

When you write:
matched_to[:3][match] = 3
it first calls __getitem__ with the slice as argument, which returns a view
of your array, then it calls __setitem__ on this view, and it fills your
matched_to array at the same time.

But when you write:
  matched_to[np.array([0, 1, 2])][match] = 3
it first calls __getitem__ with the array as argument, which returns a
*copy* of your array, so that calling __setitem__ on this copy has no effect
on your original array.
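
Putting it together with the example from the original post (sketch):

import numpy as np

matched_to = np.array([-1] * 5)
match = np.array([False, True, True])
matched_to[:3][match] = 3                     # slice -> view: writes through
print(matched_to)                             # [-1  3  3 -1 -1]

matched_to = np.array([-1] * 5)
matched_to[np.array([0, 1, 2])][match] = 3    # fancy indexing -> copy: no effect
print(matched_to)                             # [-1 -1 -1 -1 -1]
matched_to[np.array([0, 1, 2])] = 3           # a single __setitem__ does work
print(matched_to)                             # [ 3  3  3 -1 -1]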

-=- Olivier


2011/8/10 Benjamin Root 

> Came across this today when trying to determine what was wrong with my
> code:
>
> import numpy as np
> matched_to = np.array([-1] * 5)
> in_ellipse = np.array([False, True, True, True, False])
> match = np.array([False, True, True])
> matched_to[in_ellipse][match] = 3
>
> I would expect matched_to to now be "array([-1, -1, 3, 3, -1])", but
> instead, it is still all -1.
>
> It would seem that unless the view was created by a slice, then the
> assignment into the indexed view would not work as expected.  This works:
>
> >>> matched_to[:3][match] = 3
>
> but not:
>
> >>> matched_to[np.array([0, 1, 2])][match] = 3
>
> Note that the following does work:
>
> >>> matched_to[np.array([0, 1, 2])] = 3
>
> Is this a bug, or was I wrong to expect this to work this way?
>
> Thanks,
> Ben Root
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug with assignment into an indexed array?

2011-08-11 Thread Olivier Delalleau
2011/8/11 Benjamin Root 

>
>
> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau  wrote:
>
>> Maybe confusing, but working as expected.
>>
>>
>> When you write:
>>   matched_to[np.array([0, 1, 2])] = 3
>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]),
>> 3). So numpy understand you want to write 3 at these indices.
>>
>>
>> When you write:
>> matched_to[:3][match] = 3
>> it first calls __getitem__ with the slice as argument, which returns a
>> view of your array, then it calls __setitem__ on this view, and it fills
>> your matched_to array at the same time.
>>
>>
>> But when you write:
>>   matched_to[np.array([0, 1, 2])][match] = 3
>> it first calls __getitem__ with the array as argument, which retunrs a
>> *copy* of your array, so that calling __setitem__ on this copy has no effect
>> on your original array.
>>
>> -=- Olivier
>>
>>
> Right, but I guess my question is does it *have* to be that way?  I guess
> it makes some sense with respect to indexing with a numpy array like I did
> with the last example, because an element could be referred to multiple
> times (which explains the common surprise with '+='), but with boolean
> indexing, we are guaranteed that each element of the view will appear at
> most once.  Therefore, shouldn't boolean indexing always return a view, not
> a copy?  Is the general case of arbitrary array selection inherently
> impossible to encode in a view versus a slice with a regular spacing?
>

Yes, due to the fact that the array interface only supports regular spacing
(otherwise it is more difficult to get efficient access to arbitrary array
positions).

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ULONG not in UINT16, UINT32, UINT64 under 64-bit windows, is this possible?

2011-08-15 Thread Olivier Delalleau
The reason is there can be multiple dtypes (i.e. with different .num)
representing the same kind of data.
Usually in Python this goes unnoticed, because you do not test a dtype
through its .num, instead you use for instance "== 'uint32'", and all works
fine.
However, it can indeed confuse C code in situations like the one you
describe, because of direct comparison of .num.
I guess you have a few options:
- Do not compare .num (I'm not sure what the equivalent of "== 'uint32'" would
be in C, though) => probably slower
- Re-cast your array in the exact dtype you need (in Python you can do this
with .view) => probably cumbersome
- Write a customized comparison function that figures out at initialization
time all dtypes that represent the same data, and then is able to do a fast
comparison based on .num => probably best, but requires more work

Here's some Python code that lists the various scalar dtypes associated to
unique .num in numpy (excerpt slightly modified from code found in Theano --
http://deeplearning.net/software/theano -- BSD license). Call the
"get_numeric_types()" function, and print both the string representation of
the resulting dtypes as well as their .num.

import numpy


def get_numeric_subclasses(cls=numpy.number, ignore=None):
    """
    Return subclasses of `cls` in the numpy scalar hierarchy.

    We only return subclasses that correspond to unique data types.
    The hierarchy can be seen here:
    http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html
    """
    if ignore is None:
        ignore = []
    rval = []
    dtype = numpy.dtype(cls)
    dtype_num = dtype.num
    if dtype_num not in ignore:
        # Safety check: we should be able to represent 0 with this data type.
        numpy.array(0, dtype=dtype)
        rval.append(cls)
        ignore.append(dtype_num)
    for sub in cls.__subclasses__():
        rval += [c for c in get_numeric_subclasses(sub, ignore=ignore)]
    return rval


def get_numeric_types():
    """
    Return numpy numeric data types.

    :returns: A list of unique data type objects. Note that multiple data types
        may share the same string representation, but can be differentiated
        through their `num` attribute.
    """
    rval = []

    def is_within(cls1, cls2):
        # Return True if scalars defined from `cls1` are within the hierarchy
        # starting from `cls2`.
        # The third test below is to catch for instance the fact that
        # one can use ``dtype=numpy.number`` and obtain a float64 scalar, even
        # though `numpy.number` is not under `numpy.floating` in the class
        # hierarchy.
        return (cls1 is cls2 or
                issubclass(cls1, cls2) or
                isinstance(numpy.array([0], dtype=cls1)[0], cls2))

    for cls in get_numeric_subclasses():
        dtype = numpy.dtype(cls)
        rval.append([str(dtype), dtype, dtype.num])
    # We sort it to be deterministic, then remove the string and num elements.
    return [x[1] for x in sorted(rval, key=str)]
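
For instance:

for dt in get_numeric_types():
    print(str(dt), dt.num)   # under Python 2 this prints tuples, e.g.
                             # ('int32', 5) and ('int32', 7) on a 32-bit box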


2011/8/15 Pearu Peterson 

>
> Hi,
>
> A student of mine using 32-bit numpy 1.5 under 64-bit Windows 7 noticed
> that
> giving a numpy array with dtype=uint32 to an extension module the
> following codelet would fail:
>
> switch(PyArray_TYPE(ARR)) {
>   case PyArray_UINT16: /* do smth */ break;
>   case PyArray_UINT32: /* do smth */ break;
>   case PyArray_UINT64: /* do smth */ break;
>   default: /* raise type error exception */
> }
>
> The same test worked fine under Linux.
>
> Checking the value of PyArray_TYPE(ARR) (=8) showed that it corresponds to
> NPY_ULONG (when counting the items in the enum definition).
>
> Is this situation possible where NPY_ULONG does not correspond to a 16 or
> 32 or 64 bit integer?
> Or does this indicate a bug somewhere for this particular platform?
>
> Thanks,
> Pearu
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays?

2011-08-17 Thread Olivier Delalleau
Right now you allocate new memory only when creating your 3d array. When you
do "x = cube[0]" this creates a view that does not allocate more memory.

If your 2d arrays were created independently, I don't think you can avoid
this.
If you have some control on the way your original 2D arrays are created, you
can first initialize the 3d array with correct shape (or an upper bound on
the number of 2d arrays), then use views on this 3d array ("x_i = cube[i]")
to fill your 2D arrays in the same memory space.
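
Something along these lines (sketch, with made-up dimensions):

import numpy as np

n, rows, cols = 4, 512, 512
cube = np.empty((n, rows, cols))   # allocate the 3-d array once
for i in range(n):
    x_i = cube[i]                  # view: no extra memory
    x_i[:] = i                     # fill the 2-d slice in place
print(cube[2].base is cube)        # True: the slice shares the cube's memory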

I can't help with your second question, sorry.

-=- Olivier

2011/8/17 Keith Hughitt 

> Hi all,
>
> I have a method which builds a single 3d ndarray from several
> equal-dimension 2d ndarrays, and another method which extracts the original
> 2d ndarrays back out from the 3d one.
>
> The way I'm doing this right now is pretty simple, e.g.:
>
> cube = np.asarray([arr1, arr2,...])
> ...
> x = cube[0]
>
> I believe the way this is currently handled, is to use new memory locations
> first for the 3d array, and then later for the 2d slices.
>
> Does anyone know if there is a better way to handle this? Ideally, I would
> like to reuse the same memory locations instead of copying it anew each
> time.
>
> Also, when subclassing ndarray and calling obj = data.view(cls) for an
> ndarray "data", does this copy the data into the new object by value or
> reference? The method which extracts the 2d slice actually returns a
> subclass of ndarray created using the extracted data, so this is why I ask.
>
> Any insight or suggestions would be appreciated.
>
> Thanks!
> Keith
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape

2011-08-19 Thread Olivier Delalleau
How could it be possible? If you only have the buffer data, there could be
many different valid shapes associated to this data.

-=- Olivier

2011/8/19 Ian 

> Hello list,
>
> I am storing a multidimensional array as binary in a Postgres 9.04
> database. For retrieval of this array from the database I thought
> frombuffer() was my solution, however I see that this constructs a
> one-dimensional array. I read in the documentation about the buffer
> parameter in the ndarray() constructor, but that requires the shape of the
> array.
>
> Is there a way to re-construct a multidimensional array from a buffer
> without knowing its shape?
>
> Thanks.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


  1   2   3   >