Oh sure, I'm not suggesting it should be impossible to compute bins for a single data set. If nothing else, if we had a version that accepted a list of data sets, then you could always pass in a single-element list :-).
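To make the "list of data sets" idea concrete, here is a minimal sketch, not a proposed API. The name `shared_bin_edges` is hypothetical, and it simply pools the inputs, which is the merged-input behaviour discussed below rather than anything smarter. It assumes NumPy 1.15 or later for `np.histogram_bin_edges` (and 1.17 or later for `default_rng` in the usage lines).

```python
import numpy as np


def shared_bin_edges(datasets, bins="auto"):
    """Hypothetical helper: one set of bin edges for a list of data sets.

    This version just concatenates the inputs, so the estimator sees the
    pooled number of observations, not the per-data-set counts.
    """
    arrays = [np.asarray(d, dtype=float).ravel() for d in datasets]
    pooled = np.concatenate(arrays)
    # Computes the edges only; no histogram counts are built here.
    return np.histogram_bin_edges(pooled, bins=bins)


# Usage: two data sets of different size, histogrammed on common edges.
rng = np.random.default_rng(0)
datasets = [rng.normal(0.0, 1.0, 200), rng.normal(3.0, 0.5, 50)]
edges = shared_bin_edges(datasets)
counts = [np.histogram(d, bins=edges)[0] for d in datasets]

# The single-data-set case is just a one-element list.
single_edges = shared_bin_edges([datasets[0]])
```

Because the function receives every data set, the pooling step could later be swapped for something better without changing the call signature.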
On Mar 15, 2018 22:10, "Eric Wieser" <wieser.eric+nu...@gmail.com> wrote:

> That sounds like a reasonable extension - but I think there still exist
> cases where you want to treat the data as one uniform set when computing
> bins (toggling between orthogonal subsets of data) so this isn't really a
> useful replacement.
>
> I suppose this becomes relevant when `density` is passed to the individual
> histogram invocations. Does matplotlib handle that correctly for stacked
> histograms?
>
> On Thu, Mar 15, 2018, 20:14 Nathaniel Smith <n...@pobox.com> wrote:
>
>> Instead of an nobs argument, maybe we should have a version that accepts
>> multiple data sets, so that we have the full information and can improve
>> the algorithm over time.
>>
>> On Mar 15, 2018 7:57 PM, "Thomas Caswell" <tcasw...@gmail.com> wrote:
>>
>>> Yes I like the name.
>>>
>>> The primary use-case for Matplotlib is that our `hist` method can take
>>> in a list of arrays and produce N histograms in one shot. Currently with
>>> 'auto' we only use the first data set to sort out what the bins should be
>>> and then re-use those for the rest of the data sets. This will let us get
>>> the bins on the merged input, but I take Josef's point that this is not
>>> actually what we want....
>>>
>>> Tom
>>>
>>> On Mon, Mar 12, 2018 at 11:35 PM <josef.p...@gmail.com> wrote:
>>>
>>>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser
>>>> <wieser.eric+nu...@gmail.com> wrote:
>>>> >> Given that the bin selection is data driven, transferring them
>>>> >> across datasets might not be so useful.
>>>> >
>>>> > The main application would be to compute bins across the union of all
>>>> > datasets. This is already possible by using `np.histogram` and
>>>> > discarding the first result, but that's super wasteful.
>>>>
>>>> assuming "union" means a combined dataset.
>>>>
>>>> If you stack datasets, then the number of observations will not be
>>>> correct for individual datasets.
>>>>
>>>> In that case an additional keyword like nobs, or whatever name would
>>>> be appropriate for numpy, would be useful, e.g. use the average number
>>>> of observations across datasets.
>>>> Auxiliary statistics like std could then be computed on the total
>>>> dataset (if that makes sense, which would not be the case if the
>>>> variance across datasets is larger than the variance within datasets).
>>>>
>>>> Josef
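For reference, Josef's point about the observation count can be sketched as follows. This is only an illustration under stated assumptions: `nobs` is the keyword name floated in the thread, not an existing NumPy parameter, both helper functions are hypothetical, and Sturges' rule is used only because its bin count depends on nothing but the number of observations.

```python
import math

import numpy as np


def sturges_bin_count(nobs):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n >= 1 observations."""
    return math.ceil(math.log2(nobs)) + 1


def edges_with_nobs(datasets, nobs=None):
    """Hypothetical sketch: range from the pooled data, bin count from nobs.

    With nobs=None the pooled size is used, which overstates the number of
    observations that each individual data set actually has.
    """
    pooled = np.concatenate(
        [np.asarray(d, dtype=float).ravel() for d in datasets]
    )
    if nobs is None:
        nobs = pooled.size
    n_bins = sturges_bin_count(nobs)
    return np.linspace(pooled.min(), pooled.max(), n_bins + 1)


# Four data sets of 256 points each.
datasets = [np.linspace(0.0, 1.0, 256) for _ in range(4)]

pooled_edges = edges_with_nobs(datasets)  # nobs = 1024, so 11 bins

# Average number of observations across data sets, as suggested above.
mean_nobs = sum(len(d) for d in datasets) / len(datasets)  # 256.0
average_edges = edges_with_nobs(datasets, nobs=mean_nobs)  # 9 bins
```

Stacking the four data sets asks for 11 bins, while the average count asks for 9, so the pooled estimate is finer than any single data set would justify on its own.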
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion