On Sun, Jan 25, 2015 at 1:48 PM, Warren Weckesser < warren.weckes...@gmail.com> wrote:
>
>
> On Wed, Aug 13, 2014 at 6:17 PM, Eelco Hoogendoorn <
> hoogendoorn.ee...@gmail.com> wrote:
>
>> Its pretty easy to implement this table functionality and more on top of
>> the code I linked above. I still think such a comprehensive overhaul of
>> arraysetops is worth discussing.
>>
>> import numpy as np
>> import grouping
>> x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
>> y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
>> z = np.random.randint(0,2,(9,2))
>> def table(*keys):
>>     """
>>     desired table implementation, building on the index object
>>     cleaner, and more functionality
>>     performance should be the same
>>     """
>>     indices = [grouping.as_index(k, axis=0) for k in keys]
>>     uniques = [i.unique for i in indices]
>>     inverses = [i.inverse for i in indices]
>>     shape = [i.groups for i in indices]
>>     t = np.zeros(shape, np.int)
>>     np.add.at(t, inverses, 1)
>>     return tuple(uniques), t
>> #here is how to use
>> print table(x,y)
>> #but we can use fancy keys as well; here a composite key and a row-key
>> print table((x,y), z)
>> #this effectively creates a sparse matrix equivalent of your desired table
>> print grouping.count((x,y))
>>
>>
>> On Wed, Aug 13, 2014 at 11:25 PM, Warren Weckesser <
>> warren.weckes...@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>>>
>>>> The ever-wonderful pylab mode in matplotlib has a table function for
>>>> plotting a table of text in a plot. If I remember correctly, what would
>>>> happen is that matplotlib's table() function will simply obliterate the
>>>> numpy's table function. This isn't a show-stopper, I just wanted to point
>>>> that out.
>>>>
>>>> Personally, while I wasn't a particular fan of "count_unique" because I
>>>> wouldn't necessarially think of it when needing a contingency table, I do
>>>> like that it is verb-ish. "table()", in this sense, is not a verb. That
>>>> said, I am perfectly fine with it if you are fine with the name collision
>>>> in pylab mode.
>>>>
>>>>
>>>
>>> Thanks for pointing that out. I only changed it to have something that
>>> sounded more table-ish, like the Pandas, R and Matlab functions. I won't
>>> update it right now, but if there is interest in putting it into numpy,
>>> I'll rename it to avoid the pylab conflict. Anything along the lines of
>>> `crosstab`, `xtable`, etc., would be fine with me.
>>>
>>> Warren
>>>
>>>
>>>
>>>> On Wed, Aug 13, 2014 at 4:57 PM, Warren Weckesser <
>>>> warren.weckes...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
>>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>>
>>>>>> ah yes, that's also an issue I was trying to deal with. the semantics
>>>>>> I prefer in these type of operators, is (as a default), to have every array
>>>>>> be treated as a sequence of keys, so if calling unique(arr_2d), youd get
>>>>>> unique rows, unless you pass axis=None, in which case the array is
>>>>>> flattened.
>>>>>>
>>>>>> I also agree that the extension you propose here is useful; but
>>>>>> ideally, with a little more discussion on these subjects we can converge on
>>>>>> an even more comprehensive overhaul
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington <joferking...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>>>>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks. Prompted by that stackoverflow question, and similar
>>>>>>>> problems I had to deal with myself, I started working on a much more
>>>>>>>> general extension to numpy's functionality in this space. Like you noted,
>>>>>>>> things get a little panda-y, but I think there is a lot of panda's
>>>>>>>> functionality that could or should be part of the numpy core, a robust set
>>>>>>>> of grouping operations in particular.
>>>>>>>>
>>>>>>>> see pastebin here:
>>>>>>>> http://pastebin.com/c5WLWPbp
>>>>>>>>
>>>>>>>
>>>>>>> On a side note, this is related to a pull request of mine from
>>>>>>> awhile back: https://github.com/numpy/numpy/pull/3584
>>>>>>>
>>>>>>> There was a lot of disagreement on the mailing list about what to
>>>>>>> call a "unique slices along a given axis" function, so I wound up closing
>>>>>>> the pull request pending more discussion.
>>>>>>>
>>>>>>> At any rate, I think it's a useful thing to have in "base" numpy.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> NumPy-Discussion mailing list
>>>>>>> NumPy-Discussion@scipy.org
>>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> Update: I renamed the function to `table` in the pull request:
>>>>> https://github.com/numpy/numpy/pull/4958
>>>>>
>>>>>
>>>>> Warren
>>>>>
>>>>>
>
> Hey all,
>
> I'm reviving this thread about the proposed `table` enhancement in
> https://github.com/numpy/numpy/pull/4958, because Chuck has poked me (via
> the pull request) about it, so I'm poking the mailing list. Ignoring the
> issue of the name for the moment, is there any opposition to adding the
> proposed `table` function to numpy? I don't think it would preclude adding
> more powerful tools later, but that's not something I have time to work on
> at the moment.
>
> If the only issue is the name, I'm open to any suggestions. I started
> with `count_unique`, and changed it to `table`, but Benjamin pointed out
> the potential conflict of `table` with a matplotlib function.
>
> Warren
>

Looks like the original email in the thread is not part of the quoted (and
somewhat disordered) emails.
Here's my original email from last August:

http://mail.scipy.org/pipermail/numpy-discussion/2014-August/070941.html

Warren
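
For anyone reading without the `grouping` module from the pastebin, the counting idea in the quoted snippet can be approximated with plain NumPy. This is only a minimal sketch under assumptions, not the code in the pull request: the name `crosstab_sketch` is hypothetical, and it assumes every key is an equal-length 1-D sequence of sortable values.

```python
import numpy as np


def crosstab_sketch(*keys):
    """Count co-occurrences of values across equal-length 1-D sequences.

    Hypothetical helper for illustration only; not the function proposed
    in the pull request.
    """
    uniques = []
    inverses = []
    for k in keys:
        # return_inverse maps each element to its index among the unique values
        u, inv = np.unique(np.asarray(k), return_inverse=True)
        uniques.append(u)
        inverses.append(inv.ravel())
    # One axis per key, one slot per unique value of that key
    counts = np.zeros(tuple(len(u) for u in uniques), dtype=int)
    # Unbuffered add, so repeated index combinations accumulate correctly
    np.add.at(counts, tuple(inverses), 1)
    return tuple(uniques), counts


x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
uniques, counts = crosstab_sketch(x, y)
print(uniques)  # unique values of x, then of y
print(counts)   # rows follow unique x, columns follow unique y
```

With the `x` and `y` from the quoted example, the rows of `counts` correspond to x = 1, 2 and the columns to y = 3, 4, 5. `np.add.at` is used rather than `counts[idx] += 1` because the latter would count each repeated index pair only once.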