On Sun, Jan 25, 2015 at 1:48 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

>
>
> On Wed, Aug 13, 2014 at 6:17 PM, Eelco Hoogendoorn <
> hoogendoorn.ee...@gmail.com> wrote:
>
>> It's pretty easy to implement this table functionality, and more, on top of
>> the code I linked above. I still think such a comprehensive overhaul of
>> arraysetops is worth discussing.
>>
>> import numpy as np
>> import grouping
>> x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
>> y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
>> z = np.random.randint(0,2,(9,2))
>> def table(*keys):
>>     """
>>     desired table implementation, building on the index object
>>     cleaner, and more functionality
>>     performance should be the same
>>     """
>>     indices  = [grouping.as_index(k, axis=0) for k in keys]
>>     uniques  = [i.unique  for i in indices]
>>     inverses = [i.inverse for i in indices]
>>     shape    = [i.groups  for i in indices]
>>     t = np.zeros(shape, int)  # np.int was removed in NumPy 1.24
>>     np.add.at(t, tuple(inverses), 1)  # a tuple indexes one axis per key
>>     return tuple(uniques), t
>> # here is how to use it
>> print(table(x, y))
>> # but we can use fancy keys as well; here a composite key and a row-key
>> print(table((x, y), z))
>> # this effectively creates a sparse matrix equivalent of your desired table
>> print(grouping.count((x, y)))
>>
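For readers without the `grouping` module, the same dense table can be
sketched with plain NumPy. This is only an illustration, not the code from
the pastebin or the pull request: it assumes every key is a 1-D sequence of
equal length and uses `np.unique(..., return_inverse=True)` in place of the
index object, so composite keys and row-keys are not handled.

```python
import numpy as np


def table(*keys):
    """Cross-tabulate 1-D key sequences of equal length.

    Returns (uniques, counts), where counts[i, j, ...] is the number of
    positions at which keys[0] == uniques[0][i], keys[1] == uniques[1][j], ...
    """
    uniques, inverses = [], []
    for k in keys:
        # inv maps each element of k to its position in the sorted uniques
        u, inv = np.unique(np.asarray(k), return_inverse=True)
        uniques.append(u)
        inverses.append(inv.ravel())
    counts = np.zeros([len(u) for u in uniques], dtype=int)
    # Unbuffered add: every repeated index tuple contributes its own 1.
    np.add.at(counts, tuple(inverses), 1)
    return tuple(uniques), counts


x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
(ux, uy), t = table(x, y)
print(ux, uy)  # row labels, column labels
print(t)       # t[i, j] counts the pairs (ux[i], uy[j])
```

A plain `counts[tuple(inverses)] += 1` would count each repeated pair only
once, which is why the sketch uses `np.add.at`.
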
>>
>> On Wed, Aug 13, 2014 at 11:25 PM, Warren Weckesser <
>> warren.weckes...@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>>>
>>>> The ever-wonderful pylab mode in matplotlib has a table function for
>>>> plotting a table of text in a plot. If I remember correctly, matplotlib's
>>>> table() function would simply shadow numpy's table function. This isn't a
>>>> show-stopper; I just wanted to point that out.
>>>>
>>>> Personally, while I wasn't a particular fan of "count_unique" because I
>>>> wouldn't necessarily think of it when needing a contingency table, I do
>>>> like that it is verb-ish. "table()", in this sense, is not a verb. That
>>>> said, I am perfectly fine with it if you are fine with the name collision
>>>> in pylab mode.
>>>>
>>>>
>>>
>>> Thanks for pointing that out.  I only changed it to have something that
>>> sounded more table-ish, like the Pandas, R and Matlab functions.   I won't
>>> update it right now, but if there is interest in putting it into numpy,
>>> I'll rename it to avoid the pylab conflict.  Anything along the lines of
>>> `crosstab`, `xtable`, etc., would be fine with me.
>>>
>>> Warren
>>>
>>>
>>>
>>>> On Wed, Aug 13, 2014 at 4:57 PM, Warren Weckesser <
>>>> warren.weckes...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
>>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>>
>>>>>> Ah yes, that's also an issue I was trying to deal with. The semantics
>>>>>> I prefer in these types of operators is (as a default) to have every
>>>>>> array treated as a sequence of keys, so if calling unique(arr_2d), you'd
>>>>>> get unique rows, unless you pass axis=None, in which case the array is
>>>>>> flattened.
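
As a small illustration of the two behaviours described above: later NumPy
releases (1.13 and up) accept an `axis` argument in `np.unique`. Note that
the default there is `axis=None` (flatten), the opposite of the default
preferred in this message, so unique rows have to be requested explicitly.
The array `a` below is made-up sample data.

```python
import numpy as np

a = np.array([[1, 3],
              [1, 4],
              [1, 3],
              [2, 5]])

# Treat each row as one key: duplicates of whole rows are removed.
rows, counts = np.unique(a, axis=0, return_counts=True)

# Default (axis=None): the array is flattened before deduplication.
flat = np.unique(a)

print(rows)    # unique rows, sorted lexicographically
print(counts)  # how many times each unique row occurs
print(flat)    # unique scalar values
```
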
>>>>>>
>>>>>> I also agree that the extension you propose here is useful; but
>>>>>> ideally, with a little more discussion on these subjects, we can
>>>>>> converge on an even more comprehensive overhaul.
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington <joferking...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>>>>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks. Prompted by that stackoverflow question, and similar
>>>>>>>> problems I had to deal with myself, I started working on a much more
>>>>>>>> general extension to numpy's functionality in this space. Like you
>>>>>>>> noted, things get a little panda-y, but I think there is a lot of
>>>>>>>> pandas functionality that could or should be part of the numpy core, a
>>>>>>>> robust set of grouping operations in particular.
>>>>>>>>
>>>>>>>> see pastebin here:
>>>>>>>> http://pastebin.com/c5WLWPbp
>>>>>>>>
>>>>>>>
>>>>>>> On a side note, this is related to a pull request of mine from
>>>>>>> a while back: https://github.com/numpy/numpy/pull/3584
>>>>>>>
>>>>>>> There was a lot of disagreement on the mailing list about what to
>>>>>>> call a "unique slices along a given axis" function, so I wound up
>>>>>>> closing the pull request pending more discussion.
>>>>>>>
>>>>>>> At any rate, I think it's a useful thing to have in "base" numpy.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> NumPy-Discussion mailing list
>>>>>>> NumPy-Discussion@scipy.org
>>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Update: I renamed the function to `table` in the pull request:
>>>>> https://github.com/numpy/numpy/pull/4958
>>>>>
>>>>>
>>>>> Warren
>>>>>
>>>>>
>
> Hey all,
>
> I'm reviving this thread about the proposed `table` enhancement in
> https://github.com/numpy/numpy/pull/4958, because Chuck has poked me (via
> the pull request) about it, so I'm poking the mailing list.  Ignoring the
> issue of the name for the moment, is there any opposition to adding the
> proposed `table` function to numpy?  I don't think it would preclude adding
> more powerful tools later, but that's not something I have time to work on
> at the moment.
>
> If the only issue is the name, I'm open to any suggestions.  I started
> with `count_unique`, and changed it to `table`, but Benjamin pointed out
> the potential conflict of `table` with a matplotlib function.
>
> Warren
>


Looks like the original email in the thread is not part of the quoted (and
somewhat disordered) emails.  Here's my original email from last August:
http://mail.scipy.org/pipermail/numpy-discussion/2014-August/070941.html

Warren




