> On Saturday, Nov 10, 2018 at 9:16 PM, Stephan Hoyer <sho...@gmail.com> wrote:
> On Sat, Nov 10, 2018 at 9:49 AM Marten van Kerkwijk 
> <m.h.vankerkw...@gmail.com> wrote:
> > Hi Hameer,
> >
> > I do not think we should change `asanyarray` itself to special-case matrix; 
> > rather, we could start converting `asarray` to `asanyarray` and solve the 
> > problems that produces for matrices in `matrix` itself (e.g., by overriding 
> > the relevant function with `__array_function__`).
> >
> > I think the idea of providing an `__anyarray__` method (in analogy with 
> > `__array__`) might work. Indeed, the default in `ndarray` (and thus all its 
> > subclasses) could be to let it return `self` and to override it for 
> > `matrix` to return an ndarray view.
>
> Yes, we certainly would rather implement a matrix.__anyarray__ method (if 
> we're already doing a new protocol) rather than special case np.matrix 
> explicitly.
>
> Unfortunately, per Nathaniel's comments about NA skipping behavior, it seems 
> like we will also need MaskedArray.__anyarray__ to return something other 
> than itself. In principle, we should probably write a new version of 
> MaskedArray that doesn't deviate from ndarray semantics, but that's a rather 
> large project (we'd also probably want to stop subclassing ndarray).
>
> Changing the default aggregation behavior for the existing MaskedArray is 
> also an option, but that would be a serious annoyance to users and a 
> backwards compatibility break. If the only way MaskedArray violates Liskov 
> is in terms of NA skipping aggregations by default, then this might be 
> viable. In practice, this would require adding an explicit skipna argument 
> so FutureWarnings could be silenced. The plus side of this option is that it 
> would make it easier to use np.asanyarray() or any new coercion function 
> throughout the internal NumPy code base.
>
> To summarize, I think these are our options:
> 1. Change the behavior of np.asanyarray() to check for an __anyarray__() 
> protocol. Change np.matrix.__anyarray__() to return a base numpy array (this 
> is a minor backwards compatibility break, but probably for the best). Start 
> issuing a FutureWarning for any MaskedArray operations that violate Liskov 
> and add a skipna argument that in the future will default to skipna=False.
>
> 2. Introduce a new coercion function, e.g., np.duckarray(). This is the 
> easiest option because we don't need to clean up NumPy's existing ndarray 
> subclasses.

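For concreteness, a rough sketch of the protocol check in option 1 follows. Nothing here exists in NumPy: `__anyarray__` is only the method proposed in this thread, and `coerce_anyarray`, `MatrixLike` and `WellBehaved` are made-up names for illustration.

```python
import numpy as np


def coerce_anyarray(obj):
    """Hypothetical stand-in for an np.asanyarray() that honors __anyarray__."""
    # Look the method up on the type, the way Python resolves special methods.
    method = getattr(type(obj), "__anyarray__", None)
    if method is not None:
        return method(obj)
    # No protocol defined: fall back to today's behavior.
    return np.asanyarray(obj)


class MatrixLike(np.ndarray):
    """Toy subclass standing in for np.matrix: opts out of pass-through."""

    def __anyarray__(self):
        # Return a base ndarray view (no copy), as suggested for matrix.
        return self.view(np.ndarray)


class WellBehaved(np.ndarray):
    """Toy subclass that keeps its type, like the proposed ndarray default."""

    def __anyarray__(self):
        return self
```

With this sketch, `coerce_anyarray(np.arange(4).view(MatrixLike))` comes back as a plain ndarray sharing memory with the input, while a `WellBehaved` instance passes through unchanged.
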
My vote is still for 1. I don’t have an issue with PyData/Sparse depending on 
recent-ish NumPy versions; it’ll need a lot of the recent protocols anyway, 
although I could be convinced otherwise if major package devs (scikits, SciPy, 
Dask) were to weigh in and say they’ll jump on it (which seems unlikely given 
SciPy’s policy to support old NumPy versions).

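The skipna transition in option 1 could be staged roughly as below. This is only a sketch of the usual FutureWarning pattern: `masked_sum` is a hypothetical free function, not a proposed MaskedArray API, and returning `np.ma.masked` for skipna=False is an assumption about what "not skipping" would mean.

```python
import warnings

import numpy as np

_UNSET = object()  # sentinel: lets us tell "not passed" apart from True/False


def masked_sum(marr, skipna=_UNSET):
    """Sum a masked array, warning when the caller relies on the implicit default."""
    if skipna is _UNSET:
        warnings.warn(
            "skipna currently defaults to True but will default to False "
            "in the future; pass skipna explicitly to silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        skipna = True  # keep today's behavior during the transition
    if skipna:
        # Current MaskedArray semantics: masked entries are ignored.
        return marr.sum()
    # Assumed future semantics: any masked entry makes the result masked.
    if np.ma.getmaskarray(marr).any():
        return np.ma.masked
    return marr.sum()
```

Callers who pass skipna=True or skipna=False explicitly never see the warning, which is the point of adding the argument before flipping the default.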
>
> P.S. I'm just glad pandas stopped subclassing ndarray a while ago -- there's 
> no way pandas.Series() could be fixed up to not violate Liskov :). 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion