On Wed, Jul 6, 2016 at 1:56 PM, Ralf Gommers <ralf.gomm...@gmail.com> wrote:
>
> On Wed, Jul 6, 2016 at 6:26 PM, Nathaniel Smith <n...@pobox.com> wrote:
>
>> On Jul 5, 2016 11:21 PM, "Ralf Gommers" <ralf.gomm...@gmail.com> wrote:
>> >
>> > On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <n...@pobox.com> wrote:
>> >
>> >> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz"
>> >> <jfoxrabinov...@gmail.com> wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
>> >> > function np.atleast_nd in PR#7804
>> >> > (https://github.com/numpy/numpy/pull/7804).
>> >> >
>> >> > As a result of this PR, I have a couple of questions about
>> >> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
>> >> > the dimensions: If the input is 1D, it prepends and appends a size-1
>> >> > dimension. If the input is 2D, it appends a size-1 dimension. This is
>> >> > inconsistent with `np.atleast_2d`, which always prepends (as does
>> >> > `np.atleast_nd`).
>> >> >
>> >> > - Is there any reason for this behavior?
>> >> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
>> >> > terms of `np.atleast_nd`, which is actually much simpler)? This would
>> >> > be a slight API change since the output would not be exactly the
>> >> > same.
>> >>
>> >> Changing atleast_3d seems likely to break a bunch of stuff...
>> >>
>> >> Beyond that, I find it hard to have an opinion about the best design
>> >> for these functions, because I don't think I've ever encountered a
>> >> situation where they were actually what I wanted. I'm not a big fan of
>> >> coercing dimensions in the first place, for the usual "refuse to guess"
>> >> reasons. And then generally if I do want to coerce an array to another
>> >> dimension, then I have some opinion about where the new dimensions
>> >> should go, and/or I have some opinion about the minimum acceptable
>> >> starting dimension, and/or I have a maximum dimension in mind. (E.g.
>> >> "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error"
>> >> -- atleast_2d is zero-for-three on that requirements list.)
>> >>
>> >> I don't know how typical I am in this. But it does make me wonder if
>> >> the atleast_* functions act as an attractive nuisance, where new users
>> >> take their presence as an implicit recommendation that they are
>> >> actually a useful thing to reach for, even though they... aren't that.
>> >> And maybe we should be recommending folk move away from them rather
>> >> than trying to extend them further?
>> >>
>> >> Or maybe they're totally useful and I'm just missing it. What's your
>> >> use case that motivates atleast_nd?
>> >
>> > I think you're just missing it:) atleast_1d/2d are used quite a bit in
>> > Scipy and Statsmodels (those are the only ones I checked), and in the large
>> > majority of cases it's the best thing to use there. There's a bunch of
>> > atleast_2d calls with a transpose appended because the input needs to be
>> > treated as columns instead of rows, but that's still efficient and readable
>> > enough.
>>
>> I know people *use* it :-). What I'm confused about is in what situations
>> you would invent it if it didn't exist. Can you point me to an example or
>> two where it's "the best thing"? I actually had statsmodels in mind with my
>> example of wanting the semantics "coerce 1d inputs into a column matrix; 0d
>> or 3d inputs are an error". I'm surprised if there are places where you
>> really want 0d arrays converted into 1x1,
>
> Scalar to shape (1,1) is less common, but 1-D to 2-D or scalar to shape (1,)
> is very common.

That's ravel, though, not atleast_*, right?

> Example is at the top of scipy/stats/stats.py: the
> _chk_asarray functions (used in many other functions)

I feel like this actually argues for my point :-). scipy.stats needs some
uniform prepping of input, so there's a helper function to do that, and the
helper function's semantics are not at all the semantics of atleast_*. And
they don't even use atleast_* in any necessary way -- the only thing they do
is

    if arr.ndim == 0:
        arr = np.atleast_1d(arr)

but this could be written just as well as

    if arr.ndim == 0:
        arr = arr[np.newaxis]

(In any case, atleast_1d definitely makes more sense to me than any of the
others, since it so obviously corresponds to exactly that 2-line incantation
as the only reasonable implementation.)

> take care to never
> return scalar arrays because those are plain annoying to deal with. If that
> sounds weird to you, you're probably one of those people who was never
> surprised by this:
>
> In [3]: x0 = np.array(1)
>
> In [4]: x1 = np.array([1])
>
> In [5]: x0[0]
> ---------------------------------------------------------------------------
> IndexError                                Traceback (most recent call last)
> <ipython-input-5-6a57e371ca72> in <module>()
> ----> 1 x0[0]
>
> IndexError: too many indices for array
>
> In [6]: x1[0]
> Out[6]: 1

I was surprised by it the first time I hit it, but then thought it over and
decided that it was better than the alternatives :-).

(It does strike me as really odd, I would even say a bug, that e.g.
scipy.stats.mode returns a 1d array for 1d input, and a 2d array (!) for 2d
input. Mode is semantically a reduction operation, and we have pretty strong
conventions for how those work -- drop a dimension unless keepdims=True.
Obviously this is old existing code, and that's fine, but it's not how we'd
recommend people write new code, I think? I guess this is orthogonal to the
whole atleast_* discussion anyway :-).)

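For reference, the shape behaviour discussed in this thread can be checked directly. The snippet below is only a small sketch: the variable names are made up for illustration, and it exercises nothing beyond the existing np.atleast_1d, np.atleast_2d and np.atleast_3d functions plus np.newaxis indexing.

```python
import numpy as np

vec = np.ones(3)       # 1-D input, shape (3,)
mat = np.ones((2, 3))  # 2-D input, shape (2, 3)

# atleast_2d prepends the size-1 axis for 1-D input.
shape_2d_from_1d = np.atleast_2d(vec).shape

# atleast_3d prepends AND appends a size-1 axis for 1-D input,
# but only appends one for 2-D input.
shape_3d_from_1d = np.atleast_3d(vec).shape
shape_3d_from_2d = np.atleast_3d(mat).shape

# For a 0-d array, atleast_1d and indexing with np.newaxis agree.
x0 = np.array(1)
via_atleast = np.atleast_1d(x0)
via_newaxis = x0[np.newaxis]

print(shape_2d_from_1d, shape_3d_from_1d, shape_3d_from_2d)
print(via_atleast.shape, via_newaxis.shape)
```

Both 0-d conversions yield a one-element 1-D array, so the plain x[0] indexing that fails on np.array(1) works on either result.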
>> or want to allow high dimensional arrays to pass through - and if you do
>> want to allow high dimensional arrays to pass through, then transposing
>> might help with 2d cases but will silently mangle high-d cases, right?
>
> >2d input handling is usually irrelevant. The vast majority of cases is
> "function that accepts scalar and 1-D array" or "function that accepts 1-D
> and 2-D arrays".

So maybe we should have functions that actually handle those cases,
instead of recommending atleast_*?

-n

--
Nathaniel J. Smith -- https://vorpus.org
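As one possible illustration of a function that handles the "1-D and 2-D arrays" case explicitly, here is a minimal sketch of the semantics quoted earlier in the thread ("coerce 1d inputs into a column matrix; 0d or 3d inputs are an error"). The name as_column_matrix is hypothetical; it is not an existing or proposed NumPy function, and the choice to pass 2-D input through unchanged is an assumption.

```python
import numpy as np


def as_column_matrix(x):
    """Hypothetical helper, not part of NumPy.

    1-D input becomes an (n, 1) column; 2-D input is returned unchanged
    (an assumption); any other dimensionality raises ValueError instead
    of being silently coerced.
    """
    arr = np.asarray(x)
    if arr.ndim == 1:
        # Append the new axis so the data forms a column, not a row.
        return arr[:, np.newaxis]
    if arr.ndim == 2:
        return arr
    raise ValueError("expected 1-D or 2-D input, got %d-D" % arr.ndim)


column = as_column_matrix([1.0, 2.0, 3.0])
print(column.shape)
```

Compared with np.atleast_2d(x).T, this sketch rejects 0-d and 3-D input outright rather than producing a 1x1 result or transposing a higher-dimensional array.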