Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread Benjamin Root
On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris charlesr.har...@gmail.com
wrote:

 Datetime64 seems to use the highest precision

 In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
 Out[12]: dtype('M8[us]')

 In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
 Out[13]: dtype('M8[D]')



Ah, yes, that's what I'm looking for. +1 from me to have this in
asarray/asanyarray. Of course, there are always the usual caveats about
converting your datetime data in this manner, but this would be helpful in
many situations in writing functions that expect to deal with temporal data
at the resolution of minutes or somesuch.
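
To make the caveat concrete, here is a minimal sketch of what converting datetime data to minute resolution does today with a plain astype call. The sample timestamps are invented for illustration and this is not the proposed keyword, only existing NumPy behaviour as I understand it.

```python
import numpy as np

# Illustrative timestamps at one-second resolution (made-up values).
stamps = np.array(['2015-03-06T07:59:30', '2015-03-06T08:00:10'],
                  dtype='datetime64[s]')

# Casting to a coarser unit drops the sub-minute part instead of rounding.
minutes = stamps.astype('datetime64[m]')
print(minutes)

# Promotion goes the other way: result_type picks the finer of the two units.
print(np.result_type(minutes.dtype, stamps.dtype))
```

So a function that only needs minutes can cast down, but the seconds are gone afterwards, which is the caveat mentioned above.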

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread Charles R Harris
On Thu, Mar 5, 2015 at 10:02 PM, josef.p...@gmail.com wrote:

 On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker chris.bar...@noaa.gov
 wrote:
 
  On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root ben.r...@ou.edu wrote:
 
  dare I say... datetime64/timedelta64 support?
 
 
  well, the precision of those is 64 bits, yes? so if you asked for less
  than that, you'd still get a dt64. If you asked for 64 bits, you'd get
 it,
  if you asked for datetime128  -- what would you get???
 
  a 128 bit integer? or an Exception, because there is no 128bit datetime
  dtype.
 
  But I think this is the same problem with any dtype -- if you ask for a
  precision that doesn't exist, you're going to get an error.
 
  Is there a more detailed description of the proposed feature anywhere?
 Do
  you specify a dtype as a precision? or just the precision, and let the
 dtype
  figure it out for itself, i.e.:
 
  precision=64
 
  would give you a float64 if the passed in array was a float type, but a
  int64 if the passed in array was an int type, or a uint64 if the passed
 in
  array was a unsigned int type, etc.
 
  But in the end, I wonder about the use case. I generally use asarray one
  of two ways:
 
  Without a dtype -- to simply make sure I've got an ndarray of SOME
 dtype.
 
  or
 
  With a dtype - because I really care about the dtype -- usually because
 I
  need to pass it on to C code or something.
 
  I don't think I'd ever need at least some precision, but not care if I
 got
  more than that...
 
 
  The main use that I want to cover is that float64 and complex128 have the
  same precision and it would be good if either is acceptable.  Also, one
  might just want either float32 or float64, not just one of the two.
 Another
  intent is to make the fewest possible copies. The determination of the
  resulting type is made using the result_type function.


 How does this work for object arrays, or datetime?

 Can I specify at least float32 or float64, and it raises an exception
 if it cannot be converted?

 The problem we have in statsmodels is that pandas frequently uses
 object arrays and it messes up patsy or statsmodels if it's not
 explicitly converted.


Object arrays go to object arrays, datetime64 depends.

In [10]: result_type(ones(1, dtype=object_), float32)
Out[10]: dtype('O')


Datetime64 seems to use the highest precision

In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
Out[12]: dtype('M8[us]')

In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
Out[13]: dtype('M8[D]')

but doesn't convert to float

In [11]: result_type(ones(1, dtype='datetime64[D]'), float32)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-e1a09e933dc7> in <module>()
----> 1 result_type(ones(1, dtype='datetime64[D]'), float32)

TypeError: invalid type promotion

What would you like it to do?

Chuck


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread josef.pktd
On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Thu, Mar 5, 2015 at 10:02 PM, josef.p...@gmail.com wrote:

 On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker chris.bar...@noaa.gov
  wrote:
 
  On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root ben.r...@ou.edu wrote:
 
  dare I say... datetime64/timedelta64 support?
 
 
  well, the precision of those is 64 bits, yes? so if you asked for less
  than that, you'd still get a dt64. If you asked for 64 bits, you'd get
  it,
  if you asked for datetime128  -- what would you get???
 
  a 128 bit integer? or an Exception, because there is no 128bit datetime
  dtype.
 
  But I think this is the same problem with any dtype -- if you ask for a
  precision that doesn't exist, you're going to get an error.
 
  Is there a more detailed description of the proposed feature anywhere?
  Do
  you specify a dtype as a precision? or just the precision, and let the
  dtype
  figure it out for itself, i.e.:
 
  precision=64
 
  would give you a float64 if the passed in array was a float type, but a
  int64 if the passed in array was an int type, or a uint64 if the passed
  in
  array was a unsigned int type, etc.
 
  But in the end, I wonder about the use case. I generally use asarray
  one
  of two ways:
 
  Without a dtype -- to simply make sure I've got an ndarray of SOME
  dtype.
 
  or
 
  With a dtype - because I really care about the dtype -- usually because
  I
  need to pass it on to C code or something.
 
  I don't think I'd ever need at least some precision, but not care if I
  got
  more than that...
 
 
  The main use that I want to cover is that float64 and complex128 have
  the
  same precision and it would be good if either is acceptable.  Also, one
  might just want either float32 or float64, not just one of the two.
  Another
  intent is to make the fewest possible copies. The determination of the
  resulting type is made using the result_type function.


 How does this work for object arrays, or datetime?

 Can I specify at least float32 or float64, and it raises an exception
 if it cannot be converted?

 The problem we have in statsmodels is that pandas frequently uses
 object arrays and it messes up patsy or statsmodels if it's not
 explicitly converted.


 Object arrays go to object arrays, datetime64 depends.

 In [10]: result_type(ones(1, dtype=object_), float32)
 Out[10]: dtype('O')


 Datetime64 seems to use the highest precision

 In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
 Out[12]: dtype('M8[us]')

 In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
 Out[13]: dtype('M8[D]')

 but doesn't convert to float

 In [11]: result_type(ones(1, dtype='datetime64[D]'), float32)
 ---------------------------------------------------------------------------
 TypeError                                 Traceback (most recent call last)
 <ipython-input-11-e1a09e933dc7> in <module>()
 ----> 1 result_type(ones(1, dtype='datetime64[D]'), float32)

 TypeError: invalid type promotion

 What would you like it to do?

Note: the dtype handling in statsmodels is still a mess, and we just
plugged some of the worst cases.


What we would need is asarray with at least a minimum precision (e.g.
float32) and raise an exception if it's not numeric, like string,
object, custom dtypes ...
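
A minimal sketch of that behaviour, assuming a hypothetical helper name (asarray_numeric) and using np.issubdtype as the numeric test. Neither is part of the PR; this only illustrates the request.

```python
import numpy as np

def asarray_numeric(a, min_dtype=np.float32):
    """Hypothetical helper: at least `min_dtype`, and an error for
    non-numeric input. Not an existing NumPy function."""
    arr = np.asarray(a)
    # Strings, object arrays and structured dtypes are not np.number subtypes.
    # Note that bool is rejected by this test as well.
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError(f"expected a numeric dtype, got {arr.dtype}")
    target = np.result_type(arr.dtype, min_dtype)
    # copy=False returns the input unchanged when the dtype already matches.
    return arr.astype(target, copy=False)

print(asarray_numeric(np.ones(2, dtype=np.float16)).dtype)

try:
    asarray_numeric(np.array(['a', None], dtype=object))
except TypeError as exc:
    print("rejected:", exc)
```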

However, we need custom dtype handling in statsmodels anyway, so the
enhancement to asarray with exceptions would mainly be convenient to
get something to work with because pandas and numpy are now object
array friendly.

I assume scipy also has insufficient checks for non-numeric dtypes, AFAIR.


Josef



 Chuck




[Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Charles R Harris
Hi All,

This is apropos gh-5634 https://github.com/numpy/numpy/pull/5634, a PR
adding a precision keyword to asarray and asanyarray. The PR description is

 The precision keyword differs from the current dtype keyword in the
 following way.

- It specifies a minimum precision. If the precision of the input is
greater than the specified precision, the input precision is preserved.
- Complex types are preserved. A specified floating precision applies
to the dtype of the real and imaginary parts separately.

 For example, both complex128 and float64 dtypes have the
 same precision and an array of dtype float64 will be unchanged if the
 specified precision is float32.

 Ideally the precision keyword would be pushed down into the array
 constructor so that the resulting dtype could be determined before the
 array is constructed, but that would require adding new functions as the
 current constructors are part of the API and cannot have their
 signatures changed.

The name of the keyword is open to discussion, as well as its acceptable
values. And of course, anything else that might come to mind ;)
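
For readers who want to see the dtype rule in isolation, here is a rough sketch of the two bullets above using np.result_type on dtypes only. The function name min_precision_dtype is made up for illustration and is not the keyword implementation from the PR.

```python
import numpy as np

def min_precision_dtype(input_dtype, precision):
    """Hypothetical helper: the dtype an input would end up with when
    `precision` is treated as a minimum rather than an exact dtype."""
    return np.result_type(input_dtype, precision)

# Request float32 as the minimum and see what each input dtype becomes.
for dt in (np.float16, np.float32, np.float64, np.complex64, np.complex128):
    print(np.dtype(dt).name, "->", min_precision_dtype(dt, np.float32).name)
```

Only float16 is widened here; float64 stays as it is and the complex types stay complex.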

Thoughts?

Chuck


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Benjamin Root
dare I say... datetime64/timedelta64 support?

::ducks::

Ben Root

On Thu, Mar 5, 2015 at 11:40 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 Hi All,

 This is apropos gh-5634 https://github.com/numpy/numpy/pull/5634, a PR
 adding a precision keyword to asarray and asanyarray. The PR description is

  The precision keyword differs from the current dtype keyword in the
 following way.

- It specifies a minimum precision. If the precision of the input is
greater than the specified precision, the input precision is preserved.
- Complex types are preserved. A specified floating precision applies
to the dtype of the real and imaginary parts separately.

 For example, both complex128 and float64 dtypes have the
 same precision and an array of dtype float64 will be unchanged if the
 specified precision is float32.

 Ideally the precision keyword would be pushed down into the array
 constructor so that the resulting dtype could be determined before the
 array is constructed, but that would require adding new functions as the
 current constructors are part of the API and cannot have their
 signatures changed.

 The name of the keyword is open to discussion, as well as its acceptable
 values. And of course, anything else that might come to mind ;)

 Thoughts?

 Chuck





Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Charles R Harris
On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root ben.r...@ou.edu wrote:

 dare I say... datetime64/timedelta64 support?


 well, the precision of those is 64 bits, yes? so if you asked for less
 than that, you'd still get a dt64. If you asked for 64 bits, you'd get it,
 if you asked for datetime128  -- what would you get???

 a 128 bit integer? or an Exception, because there is no 128bit datetime
 dtype.

 But I think this is the same problem with any dtype -- if you ask for a
 precision that doesn't exist, you're going to get an error.

 Is there a more detailed description of the proposed feature anywhere? Do
 you specify a dtype as a precision? or just the precision, and let the
 dtype figure it out for itself, i.e.:

 precision=64

 would give you a float64 if the passed in array was a float type, but a
 int64 if the passed in array was an int type, or a uint64 if the passed in
 array was a unsigned int type, etc.

 But in the end, I wonder about the use case. I generally use asarray one
 of two ways:

 Without a dtype -- to simply make sure I've got an ndarray of SOME dtype.

 or

 With a dtype - because I really care about the dtype -- usually because I
 need to pass it on to C code or something.

 I don't think I'd ever need at least some precision, but not care if I got
 more than that...


The main use that I want to cover is that float64 and complex128 have the
same precision and it would be good if either is acceptable.  Also, one
might just want either float32 or float64, not just one of the two. Another
intent is to make the fewest possible copies. The determination of the
resulting type is made using the result_type function.
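
As a rough illustration of the "fewest possible copies" intent, the following sketch converts only when result_type says the dtype has to change. The helper name as_min_precision is an assumption for this example and is not the code in the PR.

```python
import numpy as np

def as_min_precision(a, precision):
    """Hypothetical stand-in for asanyarray with a minimum precision."""
    arr = np.asanyarray(a)
    target = np.result_type(arr.dtype, precision)
    if arr.dtype == target:
        return arr                # already acceptable: no copy is made
    return arr.astype(target)     # otherwise convert once

x64 = np.ones(3, dtype=np.float64)
z128 = np.ones(3, dtype=np.complex128)
x16 = np.ones(3, dtype=np.float16)

print(as_min_precision(x64, np.float32) is x64)    # float64 passes through
print(as_min_precision(z128, np.float32) is z128)  # complex128 passes through
print(as_min_precision(x16, np.float32).dtype)     # float16 is widened
```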

Chuck


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Chris Barker
On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root ben.r...@ou.edu wrote:

 dare I say... datetime64/timedelta64 support?


well, the precision of those is 64 bits, yes? so if you asked for less than
that, you'd still get a dt64. If you asked for 64 bits, you'd get it, if
you asked for datetime128  -- what would you get???

a 128 bit integer? or an Exception, because there is no 128bit datetime
dtype.

But I think this is the same problem with any dtype -- if you ask for a
precision that doesn't exist, you're going to get an error.

Is there a more detailed description of the proposed feature anywhere? Do
you specify a dtype as a precision? or just the precision, and let the
dtype figure it out for itself, i.e.:

precision=64

would give you a float64 if the passed in array was a float type, but a
int64 if the passed in array was an int type, or a uint64 if the passed in
array was a unsigned int type, etc.
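
A minimal sketch of that reading, where the precision is a bit count and the kind comes from the input. The function name dtype_for_bits is hypothetical and this is one possible interpretation of the question, not what the PR does.

```python
import numpy as np

def dtype_for_bits(arr, bits):
    """Hypothetical: keep the input's kind (float/int/uint) and pick the
    dtype with the requested number of bits."""
    kind = np.asarray(arr).dtype.kind
    if kind not in "iuf":
        raise TypeError(f"unsupported dtype kind {kind!r}")
    # For example 'f' and 64 bits give the dtype string 'f8', i.e. float64.
    return np.dtype(f"{kind}{bits // 8}")

print(dtype_for_bits(np.ones(2, dtype=np.float32), 64))
print(dtype_for_bits(np.ones(2, dtype=np.int8), 64))
print(dtype_for_bits(np.ones(2, dtype=np.uint16), 64))
```

Asking for a width that has no matching dtype should make np.dtype itself raise, which is the error case described above.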

But in the end, I wonder about the use case. I generally use asarray one of
two ways:

Without a dtype -- to simply make sure I've got an ndarray of SOME dtype.

or

With a dtype - because I really care about the dtype -- usually because I
need to pass it on to C code or something.

I don't think I'd ever need at least some precision, but not care if I got
more than that

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Benjamin Root
On Thu, Mar 5, 2015 at 12:04 PM, Chris Barker chris.bar...@noaa.gov wrote:

 well, the precision of those is 64 bits, yes? so if you asked for less
 than that, you'd still get a dt64. If you asked for 64 bits, you'd get it,
 if you asked for datetime128  -- what would you get???

 a 128 bit integer? or an Exception, because there is no 128bit datetime
 dtype.



I was more thinking of datetime64/timedelta64's ability to specify the time
units.
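
For illustration, a short sketch of what "time units" means at the dtype level: the storage is 64 bits either way, and the unit is metadata that np.datetime_data exposes. The sample date is made up.

```python
import numpy as np

days = np.array(['2015-03-05'], dtype='datetime64[D]')

# (unit, count) metadata carried by the dtype.
print(np.datetime_data(days.dtype))

# Both dtypes are 8 bytes wide; only the unit differs.
micro = np.dtype('datetime64[us]')
print(days.dtype.itemsize, micro.itemsize)

# Promoting the two units yields the finer one.
print(np.result_type(days.dtype, micro))
```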

Ben Root


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread josef.pktd
On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root ben.r...@ou.edu wrote:

 dare I say... datetime64/timedelta64 support?


 well, the precision of those is 64 bits, yes? so if you asked for less
 than that, you'd still get a dt64. If you asked for 64 bits, you'd get it,
 if you asked for datetime128  -- what would you get???

 a 128 bit integer? or an Exception, because there is no 128bit datetime
 dtype.

 But I think this is the same problem with any dtype -- if you ask for a
 precision that doesn't exist, you're going to get an error.

 Is there a more detailed description of the proposed feature anywhere? Do
 you specify a dtype as a precision? or just the precision, and let the dtype
 figure it out for itself, i.e.:

 precision=64

 would give you a float64 if the passed in array was a float type, but a
 int64 if the passed in array was an int type, or a uint64 if the passed in
 array was a unsigned int type, etc.

 But in the end, I wonder about the use case. I generally use asarray one
 of two ways:

 Without a dtype -- to simply make sure I've got an ndarray of SOME dtype.

 or

 With a dtype - because I really care about the dtype -- usually because I
 need to pass it on to C code or something.

 I don't think I'd ever need at least some precision, but not care if I got
 more than that...


 The main use that I want to cover is that float64 and complex128 have the
 same precision and it would be good if either is acceptable.  Also, one
 might just want either float32 or float64, not just one of the two. Another
 intent is to make the fewest possible copies. The determination of the
 resulting type is made using the result_type function.


How does this work for object arrays, or datetime?

Can I specify at least float32 or float64, and it raises an exception
if it cannot be converted?

The problem we have in statsmodels is that pandas frequently uses
object arrays and it messes up patsy or statsmodels if it's not
explicitly converted.

Josef





 Chuck


