[Numpy-discussion] CASTABLE flag

2008-01-07 Thread Charles R Harris
Hi All,

I'm thinking that one way to make the automatic type conversion a bit safer
to use would be to add a CASTABLE flag to arrays. Then we could write
something like

a[...] = typecast(b)

where typecast returns a view of b with the CASTABLE flag set so that the
assignment operator can check whether to implement the current behavior or
to raise an error. Maybe this could even be part of the dtype scalar,
although that might cause a lot of problems with the current default
conversions. What do folks think?

Chuck
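
(Editorial sketch, not part of the original mail: NumPy has no CASTABLE flag, so the `typecast` wrapper and `checked_assign` helper below are hypothetical stand-ins that just model the proposed semantics in Python.)

```python
import numpy as np

class typecast:
    """Hypothetical wrapper: marks an array as explicitly OK to down-cast."""
    def __init__(self, arr):
        self.arr = np.asarray(arr)

def checked_assign(target, value):
    """Model of 'a[...] = b' under the proposal: refuse silent down-casts."""
    if isinstance(value, typecast):
        target[...] = value.arr  # explicit opt-in: cast as NumPy does today
        return
    value = np.asarray(value)
    if not np.can_cast(value.dtype, target.dtype):
        raise TypeError("unsafe cast from %s to %s" % (value.dtype, target.dtype))
    target[...] = value

a = np.zeros(3, dtype=np.int32)
checked_assign(a, typecast([1.7, 2.7, 3.7]))  # allowed: explicitly requested
# checked_assign(a, [1.7, 2.7, 3.7])          # would raise TypeError
```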
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Travis E. Oliphant
Charles R Harris wrote:
> Hi All,
>
> I'm thinking that one way to make the automatic type conversion a bit 
> safer to use would be to add a CASTABLE flag to arrays. Then we could 
> write something like
>
> a[...] = typecast(b)
>
> where typecast returns a view of b with the CASTABLE flag set so that 
> the assignment operator can check whether to implement the current 
> behavior or to raise an error. Maybe this could even be part of the 
> dtype scalar, although that might cause a lot of problems with the 
> current default conversions. What do folks think?

That is an interesting approach. The issue of having to convert lines 
of code that currently rely on implicit casting (as in ndimage) would 
still be there, but this approach would not cause unnecessary data 
copying, and it would help with this complaint (which I've heard before 
and have some sympathy toward).

I'm intrigued.

-Travis O.



Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Charles R Harris
On Jan 7, 2008 12:00 PM, Travis E. Oliphant <[EMAIL PROTECTED]> wrote:

> Charles R Harris wrote:
> > Hi All,
> >
> > I'm thinking that one way to make the automatic type conversion a bit
> > safer to use would be to add a CASTABLE flag to arrays. Then we could
> > write something like
> >
> > a[...] = typecast(b)
> >
> > where typecast returns a view of b with the CASTABLE flag set so that
> > the assignment operator can check whether to implement the current
> > behavior or to raise an error. Maybe this could even be part of the
> > dtype scalar, although that might cause a lot of problems with the
> > current default conversions. What do folks think?
>
> That is an interesting approach.The issue raised of having to
> convert lines of code that currently work (which does implicit casting)
> would still be there (like ndimage), but it would not cause unnecessary
> data copying, and would help with this complaint (that I've heard before
> and have some sympathy towards).
>
> I'm intrigued.


Maybe we could also add a global flag, typesafe, that in the current Numpy
version would default to false, preserving the current behavior, but could be
set to true to get the new behavior. Then when Numpy 1.1 comes out we could
make the global default true. That way folks could keep their current code
working until they are ready to use the typesafe feature.

Chuck


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Scott Ransom
On Monday 07 January 2008 02:13:56 pm Charles R Harris wrote:
> On Jan 7, 2008 12:00 PM, Travis E. Oliphant <[EMAIL PROTECTED]> wrote:
> > Charles R Harris wrote:
> > > Hi All,
> > >
> > > I'm thinking that one way to make the automatic type conversion a
> > > bit safer to use would be to add a CASTABLE flag to arrays. Then
> > > we could write something like
> > >
> > > a[...] = typecast(b)
> > >
> > > where typecast returns a view of b with the CASTABLE flag set so
> > > that the assignment operator can check whether to implement the
> > > current behavior or to raise an error. Maybe this could even be
> > > part of the dtype scalar, although that might cause a lot of
> > > problems with the current default conversions. What do folks
> > > think?
> >
> > That is an interesting approach.The issue raised of having to
> > convert lines of code that currently work (which does implicit
> > casting) would still be there (like ndimage), but it would not
> > cause unnecessary data copying, and would help with this complaint
> > (that I've heard before and have some sympathy towards).
> >
> > I'm intrigued.
>
> Maybe  we could also set a global flag, typesafe, that in the current
> Numpy version would default to false, giving current behavior, but
> could be set true to get the new behavior. Then when Numpy 1.1 comes
> out we could make the global default true. That way folks could keep
> the current code working until they are ready to use the typesafe
> feature.

I'm a bit confused as to which types of casting you are proposing to 
change.  As has been pointed out by several people, users very often 
_want_ to "lose information".  And as I pointed out, it is one of the 
reasons why we are all using numpy today as opposed to Numeric!

I'd bet that the vast majority of the people on this list believe that 
the OP's problem of complex numbers being automatically cast to floats 
is a real problem.  Fine.  We should be able to raise an exception in 
that case.

However, two other very common cases of "lost information" are not 
obviously problems, and are (for many of us) the _preferred_ actions.  
These examples are:

1.  Performing floating point math in higher precision, but casting to a 
lower-precision float if that float is on the lhs of the assignment.  
For example:

In [22]: a = arange(5, dtype=float32)

In [23]: a += arange(5.0)

In [24]: a
Out[24]: array([ 0.,  2.,  4.,  6.,  8.], dtype=float32)

To me, that is fantastic.  I've obviously explicitly requested that I 
want "a" to hold 32-bit floats.  And if I'm careful and use in-place 
math, I get 32-bit floats at the end (and no problems with large 
temporaries or memory doubling by an automatic cast to float64).

In [25]: a = a + arange(5.0)

In [26]: a
Out[26]: array([  0.,   3.,   6.,   9.,  12.])

In this case, I'm reassigning "a" from 32-bits to 64-bits because I'm 
not using in-place math.  The temporary array created on the rhs 
defines the type of the new assignment.  Once again, I think this is 
good.

2.  Similarly, if I want to stuff floats into an int array:

In [28]: a
Out[28]: array([0, 1, 2, 3, 4])

In [29]: a += 2.5

In [30]: a
Out[30]: array([2, 3, 4, 5, 6])

Here, I get C-like rounding/casting of my originally integer array 
because I'm using in-place math.  This is often a very useful behavior.

In [31]: a = a + 2.5

In [32]: a
Out[32]: array([ 4.5,  5.5,  6.5,  7.5,  8.5])

But here, without the in-place math, a gets converted to doubles.

I can certainly say that in my code (which is used by a fair number of 
people in my field), each of these use cases is common.  And I think 
they are among the _strengths_ of numpy.

I will be very disappointed if this default behavior changes.

Scott



-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  [EMAIL PROTECTED]            Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Timothy Hochberg
Another possible approach is to treat downcasting similar to underflow. That
is, give it its own flag in the errstate, and people can set it to ignore,
warn or raise on downcasting as desired. One could potentially have two
flags, one for downcasting across kinds (float->int, int->bool) and one for
downcasting within kinds (float64->float32). In this case, I personally
would set the first to raise and the second to ignore, and would suggest
that as the default.

IMO:

   1. It's a no-brainer to raise an exception when assigning a complex
   value to a float or integer target. Using "Z.real" or "Z.imag" is
   clearer and has the same performance.
   2. I'm fairly dubious about assigning floats to ints as is. First off,
   it looks like a bug magnet to me, due to accidentally assigning a floating
   point value to a target that one believes to be float but is in fact
   integer. Second, C-style rounding is pretty evil; it's not always consistent
   across platforms, so relying on it for anything other than truncating
   already integral values is asking for trouble.
   3. Downcasting within kinds seems much less hazardous than downcasting
   across kinds, although I'd still be happy to be able to regulate it with
   errstate.
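
(Editorial sketch, not from the original mail: NumPy's errstate has no such
down-cast flags, so the two-flag behavior is modeled here with a small
context manager and a hypothetical `assign` helper.)

```python
import contextlib
import numpy as np

# Suggested defaults from above: raise across kinds, ignore within kinds.
_downcast = {"across": "raise", "within": "ignore"}

@contextlib.contextmanager
def downcast_state(**modes):
    """errstate-like context manager for the hypothetical down-cast flags."""
    saved = dict(_downcast)
    _downcast.update(modes)
    try:
        yield
    finally:
        _downcast.update(saved)

def assign(target, value):
    """Model assignment under the two-flag proposal."""
    value = np.asarray(value)
    if not np.can_cast(value.dtype, target.dtype):  # some information lost
        within = np.can_cast(value.dtype, target.dtype, casting="same_kind")
        if _downcast["within" if within else "across"] == "raise":
            raise TypeError("down-cast from %s to %s" % (value.dtype, target.dtype))
    target[...] = value

ints = np.zeros(2, dtype=np.int32)
with downcast_state(across="ignore"):
    assign(ints, [1.5, 2.5])  # across-kind cast explicitly allowed here
```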


-tim


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Anne Archibald
On 07/01/2008, Timothy Hochberg <[EMAIL PROTECTED]> wrote:

> I'm fairly dubious about assigning float to ints as is. First off it looks
> like a bug magnet to me due to accidentally assigning a floating point value
> to a target that one believes to be float but is in fact integer. Second,
> C-style rounding is pretty evil; it's not always consistent across
> platforms, so relying on it for anything other than truncating already
> integral values is asking for trouble.

I'm not sure I agree that this is a bug magnet: if your array is
integer and you think it's float, you're going to get burned sooner
rather than later, whether you assign to it or not. The only case I
can see being a problem would be if zeros() and ones() created
integers - which they don't.

Anne


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Zachary Pincus
Hello all,

To make the casting issue under discussion more explicit, let me  
present the following table of potential "down-casts".

(Also, for the record, nobody is proposing automatic up-casting of  
any kind. The proposals on the table focus on preventing some or all  
implicit down-casting.)

(1) complex -> anything else:
Data is lost wholesale.

(2) large float -> smaller float (also large complex -> smaller  
complex):
Precision is lost.

(3) float -> integer:
Precision is lost, to a much greater degree than (2).

(4) large integer -> smaller integer (also signed/unsigned conversions):
Data gets lost/mangled/wrapped around.
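
(An editorial aside, not part of the original mail: the four cases above can
be demonstrated directly with astype, purely as illustration.)

```python
import warnings
import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # case (1) emits a ComplexWarning
    z = np.array([1 + 2j]).astype(np.float64)  # (1) imaginary part dropped
f32 = np.array([1 + 1e-10]).astype(np.float32)  # (2) precision lost: rounds to 1.0
i = np.array([2.75]).astype(np.int64)           # (3) fractional part truncated
w = np.array([300]).astype(np.int8)             # (4) 300 wraps around modulo 256
```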

The original requests for exceptions to be raised focused on case  
(1), where it is most clear that loss-of-data is happening in a way  
that is unlikely to be intentional.

Robert Kern suggested that exceptions might be raised for cases (1)  
and (3), which are cross-kind casts, but not for within-kind  
conversions like (2) and (4). However, I personally don't think that  
down-converting from float to int is any more or less fraught than  
converting from int32 to int8: if you need a warning/exception in one  
case, you'd need it in the rest. Moreover, there's the principle of  
least surprise, which suggests that complex rules for when exceptions  
get raised, based on the kind of conversion being made, are just  
asking for trouble.

So, as a poll, if you are in favor of exceptions instead of automatic  
down-conversion, where do you draw the line? What causes an error?  
Robert seemed to be in favor of (1) and (3). Anne seemed to think  
that only (1) was problematic enough to worry about. I am personally  
cool toward the exceptions, but I think that case (4) is just as  
"bad" as case (3) in terms of data loss, though case (1) seems the  
worst. (I don't really find any of them particularly bad, though I  
agree that case (1) is something of an oddity for newcomers.)

Finally, if people really want these sorts of warnings, I would  
suggest that they be rolled into the get/setoptions mechanism, so  
that there's a fine-grained way to set each one to off, warn, or  
raise an exception.

Zach


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Charles R Harris
Hi,

On Jan 7, 2008 1:16 PM, Timothy Hochberg <[EMAIL PROTECTED]> wrote:

>
> Another possible approach is to treat downcasting similar to underflow.
> That is give it it's own flag in the errstate and people can set it to
> ignore, warn or raise on downcasting as desired. One could potentially have
> two flags, one for downcasting across kinds (float->int, int->bool) and one
> for downcasting within kinds (float64->float32). In this case, I personally
> would set the first to raise and the second to ignore and would suggest that
> as the default.
>
> IMO:
>
>1. It's a no brainer to raise and exception when assigning a complex
>value to a float or integer target. Using "Z.real" or "Z.imag" is
>clearer and has the same performance.
>2. I'm fairly dubious about assigning float to ints as is. First off
>it looks like a bug magnet to me due to accidentally assigning a floating
>point value to a target that one believes to be float but is in fact
>integer. Second, C-style rounding is pretty evil; it's not always 
> consistent
>across platforms, so relying on it for anything other than truncating
>already integral values is asking for trouble.
>3. Downcasting within kinds seems much less hazardous than
>downcasting across kinds, although I'd still be happy to be able regulate 
> it
>with errstate.
>
Maybe a combination of a typecast function and the errstate would work
well. The typecast function would provide a clear local override of the
default errstate flags, while the user would have the option to specify what
sort of behavior they care about in general.

Chuck


Re: [Numpy-discussion] CASTABLE flag

2008-01-07 Thread Timothy Hochberg
On Jan 7, 2008 2:00 PM, Charles R Harris <[EMAIL PROTECTED]> wrote:

> Hi,
>
> On Jan 7, 2008 1:16 PM, Timothy Hochberg <[EMAIL PROTECTED]> wrote:
>
> >
> > Another possible approach is to treat downcasting similar to underflow.
> > That is give it it's own flag in the errstate and people can set it to
> > ignore, warn or raise on downcasting as desired. One could potentially have
> > two flags, one for downcasting across kinds (float->int, int->bool) and one
> > for downcasting within kinds (float64->float32). In this case, I personally
> > would set the first to raise and the second to ignore and would suggest that
> > as the default.
> >
> > IMO:
> >
> >1. It's a no brainer to raise and exception when assigning a
> >complex value to a float or integer target. Using "Z.real" or "
> >Z.imag" is clearer and has the same performance.
> >2. I'm fairly dubious about assigning float to ints as is. First
> >off it looks like a bug magnet to me due to accidentally assigning a
> >floating point value to a target that one believes to be float but is in
> >fact integer. Second, C-style rounding is pretty evil; it's not always
> >consistent across platforms, so relying on it for anything other than
> >truncating already integral values is asking for trouble.
> >3. Downcasting within kinds seems much less hazardous than
> >downcasting across kinds, although I'd still be happy to be able 
> > regulate it
> >with errstate.
> >
> > Maybe a combination of a typecast function and the errstate would work
> well. The typecast function would provide a clear local override of the
> default errstate flags, while the user would have the option to specify what
> sort of behavior they care about in general.
>

Note that using the 'with' statement, you can get reasonably lightweight
local control using errstate. For example, assuming the hypothetical
downcasting flag were named 'downcast':

 with errstate(downcast='ignore'):
  anint[:] = afloat

That would be local enough for me, although it may not be to everyone's
taste.

If we were to go the "a[:] = typecast(b)" route, I have a hankering for some
more fine-grained control of the rounding. Something like "a[:] =
lazy_floor(b)", "a[:] = lazy_truncate(b)" or "a[:] = lazy_ceil(b)", only
with better names. However, I admit that it's not obvious how to implement
that.

Yet another approach is to add an 'out' argument to astype, such as already
exists for many of the other methods. Then "a.astype(int, out=b)" will
efficiently stick an integerized version of a into b.
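
(Editorial sketch: astype has no such out argument today, but plain sliced
assignment already gives much of the no-new-allocation effect this
suggestion is after.)

```python
import numpy as np

a = np.array([1.25, 2.5, 3.75])
b = np.empty(3, dtype=np.int64)

# The hypothetical "a.astype(int, out=b)" can be approximated now:
# sliced assignment casts into b's existing buffer rather than
# allocating a fresh array for the result.
b[...] = a  # truncates toward zero, C-style
```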

FWIW, I suspect that the usefulness of casting in avoiding temporaries is
probably overstated. If you really want to avoid temporaries, you either
need to program in a very tedious fashion, suitable only for very localized
hot spots, or you need to use something like numexpr. If someone comes back
with some measurements from real code that show a big time hit, I'll concede
the case, but so far it all sounds like guessing, and my guess is that it'll
rarely make a difference. (And the cases where it does make a difference
will be places where you're already doing crazy things to optimize the snot
out of the code, and an extra astype or such won't matter.)

-- 
[EMAIL PROTECTED]