Re: [Numpy-discussion] Python 3 and isinstance(np.int64(42), int)

2015-06-23 Thread josef.pktd
On Fri, Jun 19, 2015 at 4:15 PM, Chris Barker  wrote:

> On Wed, Jun 17, 2015 at 11:13 PM, Nathaniel Smith  wrote:
>
>>  there's some
>> argument that in Python, doing explicit type checks like this is
>> usually a sign that one is doing something awkward,
>
>
> I tend to agree with that.
>
> On the other hand, numpy itself is kind-of sort-of statically typed. But
> in that case, if you need to know the type of an array -- check the array's
> dtype.
>
> Also:
>
>  >>> a = np.zeros(7, int)
>  >>> n = a[3]
>  >>> type(n)
> <class 'numpy.int64'>
>
> I never liked declaring numpy arrays with the Python types like "int" or
> "float" -- in numpy you usually care more about the exact type, so you
> should simply use "int64" if you want a 64 bit int, and "float64" if you
> want a 64 bit float. Granted, Python floats have always been float64 (on
> all platforms??), and Python ints used to be a reasonable int type, but
> now that Python ints are bigints in py3, it really makes sense to be clear.
>
> And now that I think about it, in py2, int is 32 bit on win64 and 64 bit
> on *nix64 -- so you're really better off being explicit with your numpy
> arrays.
>
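For reference, a minimal sketch contrasting the check the thread title asks
about with checks that stay robust across Python versions and platforms; it
assumes a reasonably recent NumPy, which registers its scalar integer types
with the `numbers` ABCs:

>>> import numbers
>>> import numpy as np
>>> x = np.zeros(7, int)[3]              # a numpy integer scalar, not a Python int
>>> isinstance(x, int)                   # was True on 64-bit *nix under Python 2
False
>>> isinstance(x, (int, np.integer))
True
>>> isinstance(x, numbers.Integral)      # np.integer is registered with the ABC
True
>>> np.issubdtype(type(x), np.integer)   # the dtype-level check
True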


Being late, checking some examples:

>>> a = np.zeros(7, int)
>>> a.dtype
dtype('int32')
>>> np.__version__
'1.9.2rc1'
>>> type(a[3])
<class 'numpy.int32'>


>>> a = np.zeros(7, int)
>>> a = np.array([888888888888888888])
>>> a
array([888888888888888888], dtype=int64)

>>> a = np.array([8888888888888888888888])
>>> a
array([8888888888888888888888], dtype=object)

>>> a = np.array([8888888888888888888888], dtype=int)
Traceback (most recent call last):
  File "<pyshell#...>", line 1, in <module>
    a = np.array([8888888888888888888888], dtype=int)
OverflowError: Python int too large to convert to C long


Looks like we need to be a bit more careful now.
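
A small sketch of the explicit-dtype usage being advocated above; the
default-integer behaviour is the platform C long that the session above runs
into, and the limits shown are just the standard int32/int64 ranges:

>>> import numpy as np
>>> np.zeros(3, int).dtype           # follows the platform C long
dtype('int32')
>>> np.iinfo(np.dtype(int)).max      # 2**31 - 1 here, 2**63 - 1 on *nix64
2147483647
>>> a = np.array([888888888888888888], dtype=np.int64)   # explicit, fits everywhere
>>> a.dtype
dtype('int64')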

Josef
Python 3.4.3



>
> -CHB
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PR added: frozen dimensions in gufunc signatures

2015-06-23 Thread Oscar Villellas
On Fri, Aug 29, 2014 at 10:55 AM, Jaime Fernández del Río <jaime.f...@gmail.com> wrote:

> On Thu, Aug 28, 2014 at 5:40 PM, Nathaniel Smith  wrote:
>
>> Some thoughts:
>>
>> But, for your computed dimension idea I'm wondering if what we should
>> do instead is just let a gufunc provide a C callback that looks at the
>> input array dimensions and explicitly says somehow which dimensions it
>> wants to treat as the core dimensions and what its output shapes will
>> be. There's no rule that we have to extend the signature mini-language
>> to be Turing complete, we can just use C :-).
>>
>> It would be good to have a better motivation for computed gufunc
>> dimensions, though. Your "all pairwise cross products" example would
>> be *much* better handled by implementing the .outer method for binary
>> gufuncs: pairwise_cross(a) == cross.outer(a, a). This would make
>> gufuncs more consistent with ufuncs, plus let you do
>> all-pairwise-cross-products between two different sets of cross
>> products, plus give us all-pairwise-matrix-products for free, etc.
>>
>
> The outer for binary gufuncs sounds like a good idea. A reduce for binary
> gufuncs that allow it (like square matrix multiplication) would also be
> nice. But going back to the original question, the pairwise whatevers were
> just an example: one could come up with several others, e.g.:
>
> (m),(n)->($p),($q) with $p = m - n and $q = n - 1, could be (I think)
> the signature of a polynomial division gufunc
> (m),(n)->($p), with $p = m - n + 1, could be the signature of a
> convolution or correlation gufunc
> (m)->($n), with $n = m / 2, could be some form of downsampling gufunc
>
>
An example where a computed output dimension would be useful is linalg.svd,
as some of the resulting dimensions for an (m, n) matrix are based on
min(m, n). This, coupled with the required keyword support, makes it
necessary to have 6 gufuncs to cover the functionality.
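
For concreteness, a quick illustration (plain NumPy, nothing gufunc-specific)
of why a single fixed signature cannot describe svd: the output shapes depend
on min(m, n) and on the keyword arguments:

>>> import numpy as np
>>> a = np.ones((3, 5))                          # m = 3, n = 5, min(m, n) = 3
>>> [x.shape for x in np.linalg.svd(a, full_matrices=False)]
[(3, 3), (3,), (3, 5)]
>>> [x.shape for x in np.linalg.svd(a, full_matrices=True)]
[(3, 3), (3,), (5, 5)]
>>> np.linalg.svd(a, compute_uv=False).shape     # number of outputs changes too
(3,)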

I do think the C callback solution would be enough: just allow the signature
to have unbound variables that are resolved by that callback... no need to
change the syntax:

(m),(n)->(p),(q)

When registering such a gufunc, a callback function that resolves the
missing dimensions would be required.
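
A purely hypothetical sketch of that registration, written in Python for
brevity (none of this is an existing NumPy API): the resolver fills in the
unbound dimension of "(m),(n)->(p)", here for a "valid"-mode correlation
where p = m - n + 1:

# Hypothetical API sketch -- illustrates the proposal, not real NumPy.
def resolve_valid_correlate(bound_dims, kwargs):
    """Compute the unbound output dimensions from the bound input ones."""
    m, n = bound_dims["m"], bound_dims["n"]
    return {"p": m - n + 1}      # output length of a "valid" correlation

# register_gufunc(valid_correlate_loop, "(m),(n)->(p)",
#                 dim_resolver=resolve_valid_correlate)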

Extra niceties that could be built on top of that:
- pass keyword arguments to that function, so that things like full_matrices
could be resolved inside the gufunc. Maybe even allow modifying the number of
results (harder), which would be needed to support things like "compute_uv"
in svd as well.

- allow context to be created during that resolution and passed into the
ufunc kernel itself (note that this might be *necessary*). If context is
created, another function would be needed to dispose of it.


In my experience implementing the linalg gufuncs, a very common pattern was
needing some buffers for the actual LAPACK calls (as those functions work
in place, a temporary buffer was always needed). Some setup and buffer
allocation was performed before looping; every iteration of the inner loop
reused that data, and at the end of the loop the buffers were released. That
means the initialization/allocation/release is done once per inner-loop
call. If hooks to allocate/dispose of the context existed, that
initialization/allocation/release could be done once per ufunc call. AFAIK,
a ufunc call can involve several inner-loop calls, depending on the outer
dimensions and the layout of the operands.


> While you're messing around with the gufunc dimension matching logic,
>> any chance we can tempt you to implement the "optional dimensions"
>> needed to handle '@', solve, etc. elegantly? The rule would be that
>> you can write something like
>>(n?,k),(k,m?)->(n?,m?)
>> and the ? dimensions are allowed to take on an additional value
>> "nothing at all". If there's no dimension available in the input, then
>> we act like it was reshaped to add a dimension with shape 1, and then
>> in the output we squeeze this dimension out again. I guess the rules
>> would be that (1) in the input, you can have ? dimensions at the
>> beginning or the end of your shape, but not both at the same time, (2)
>> any dimension that has a ? in one place must have it in all places,
>> (3) when checking argument conformity, "nothing at all" only matches
>> against "nothing at all", not against 1; this is because if we allowed
>> (n?,m),(n?,m)->(n?,m) to be applied to two arrays with shapes (5,) and
>> (1, 5), then it would be ambiguous whether the output should have
>> shape (5,) or (1, 5).
>>
>
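
As a present-day reference point, np.matmul (and the `@` operator, with
NumPy >= 1.10 and Python >= 3.5) implements exactly this promote-then-squeeze
treatment of 1-d operands:

>>> import numpy as np
>>> A = np.ones((4, 3))
>>> x = np.ones(3)
>>> np.matmul(A, x).shape                   # x acts as (3, 1); the 1 is squeezed
(4,)
>>> np.matmul(x, np.ones((3, 2))).shape     # x acts as (1, 3); the 1 is squeezed
(2,)
>>> np.matmul(x, x).shape                   # both promoted, both squeezed: a scalar
()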
> I definitely do not mind taking a look into it. I need to think a little
> more about the rules to convince myself that there is a consistent set of
> them that we can use. I also thought there may be a performance concern,
> that you may want to have different implementations when dimensions are
> missing, not automatically add a 1 and then remove it. It doesn't seem to
be the case with either `np.dot` or `np.solve`, so maybe I am