Passing a list of arrays would be useful (aside from discriminating between list and array_like).
In that case I would add a keyword like "within=True" to compute the
additional statistics like std or iqr on the group-demeaned data. This
would remove the effect of (mean-)shifted datasets on those auxiliary
statistics.

Aside: An alternative to using a list of arrays would be to include a
"groups" indicator as a keyword and, if it is not None, compute based on
averages across groups or pooled within-group statistics.

Josef

On Fri, Mar 16, 2018 at 3:06 AM, Nathaniel Smith <n...@pobox.com> wrote:
> Oh sure, I'm not suggesting it be impossible to calculate for a single data
> set. If nothing else, if we had a version that accepted a list of data sets,
> then you could always pass in a single-element list :-).
>
> On Mar 15, 2018 22:10, "Eric Wieser" <wieser.eric+nu...@gmail.com> wrote:
>>
>> That sounds like a reasonable extension - but I think there still exist
>> cases where you want to treat the data as one uniform set when computing
>> bins (toggling between orthogonal subsets of data), so it isn't really a
>> useful replacement.
>>
>> I suppose this becomes relevant when `density` is passed to the individual
>> histogram invocations. Does matplotlib handle that correctly for stacked
>> histograms?
>>
>> On Thu, Mar 15, 2018, 20:14 Nathaniel Smith <n...@pobox.com> wrote:
>>>
>>> Instead of an nobs argument, maybe we should have a version that accepts
>>> multiple data sets, so that we have the full information and can improve the
>>> algorithm over time.
>>>
>>> On Mar 15, 2018 7:57 PM, "Thomas Caswell" <tcasw...@gmail.com> wrote:
>>>>
>>>> Yes I like the name.
>>>>
>>>> The primary use-case for Matplotlib is that our `hist` method can take
>>>> in a list of arrays and produce N histograms in one shot. Currently with
>>>> 'auto' we only use the first data set to sort out what the bins should be
>>>> and then re-use those for the rest of the data sets. This will let us get
>>>> the bins on the merged input, but I take Josef's point that this is not
>>>> actually what we want....
>>>>
>>>> Tom
>>>>
>>>> On Mon, Mar 12, 2018 at 11:35 PM <josef.p...@gmail.com> wrote:
>>>>>
>>>>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser
>>>>> <wieser.eric+nu...@gmail.com> wrote:
>>>>> >> Given that the bin selections are data driven, transferring them
>>>>> >> across datasets might not be so useful.
>>>>> >
>>>>> > The main application would be to compute bins across the union of all
>>>>> > datasets. This is already possible by using `np.histogram` and
>>>>> > discarding the first result, but that's super wasteful.
>>>>>
>>>>> assuming "union" means a combined dataset.
>>>>>
>>>>> If you stack datasets, then the number of observations will not be
>>>>> correct for individual datasets.
>>>>>
>>>>> In that case an additional keyword like nobs, or whatever name would
>>>>> be appropriate for numpy, would be useful, e.g. use the average number
>>>>> of observations across datasets.
>>>>> Auxiliary statistics like std could then be computed on the total
>>>>> dataset (if that makes sense, which would not be the case if the
>>>>> variance across datasets is larger than the variance within datasets).
>>>>>
>>>>> Josef

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
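
For readers of the archive, the two ideas in this thread (bins computed on the combined dataset, versus auxiliary statistics computed on group-demeaned data with an average number of observations) can be sketched roughly as below. This is only an illustration under stated assumptions, not NumPy's or Matplotlib's implementation: `pooled_bin_edges` and `within_scott_bin_edges` are hypothetical helper names, Scott's rule is chosen arbitrarily as the bin-width estimator, and `np.histogram_bin_edges` assumes NumPy 1.15 or later.

```python
import numpy as np


def pooled_bin_edges(datasets, bins="auto"):
    """Bin edges computed on the concatenation ("union") of all datasets."""
    pooled = np.concatenate([np.ravel(d) for d in datasets])
    return np.histogram_bin_edges(pooled, bins=bins)


def within_scott_bin_edges(datasets):
    """Hypothetical variant of the "within=True" idea.

    Uses Scott's rule, but with the std taken on group-demeaned data and
    the sample size set to the average number of observations per dataset.
    """
    arrays = [np.ravel(np.asarray(d, dtype=float)) for d in datasets]
    pooled = np.concatenate(arrays)

    # Demean each group so that mean shifts between datasets do not
    # inflate the spread estimate.
    demeaned = np.concatenate([a - a.mean() for a in arrays])
    sigma_within = demeaned.std()

    # Average number of observations across datasets (the "nobs" idea).
    nobs = np.mean([a.size for a in arrays])

    # Scott's rule: h = (24 * sqrt(pi) / n)**(1/3) * sigma
    width = (24.0 * np.sqrt(np.pi) / nobs) ** (1.0 / 3.0) * sigma_within

    lo, hi = pooled.min(), pooled.max()
    # Fall back to a single bin if the data have no spread.
    n_bins = int(np.ceil((hi - lo) / width)) if width > 0 else 1
    return np.linspace(lo, hi, max(n_bins, 1) + 1)


# Two datasets with the same spread but strongly shifted means.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=500)
b = rng.normal(10.0, 1.0, size=500)

edges_pooled = pooled_bin_edges([a, b])
edges_within = within_scott_bin_edges([a, b])

# The same edges are then reused for every dataset.
counts = [np.histogram(d, bins=edges_within)[0] for d in (a, b)]
```

With mean-shifted inputs like these, the pooled std is dominated by the gap between the groups, so a rule applied to the stacked data tends to pick wider bins than the within-group version. Whether that is desirable depends on the use case raised earlier in the thread, where one sometimes wants to treat the data as one uniform set.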