On Fri, Dec 6, 2013 at 9:32 AM, <josef.p...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 4:39 AM, Sebastian Berg
> <sebast...@sipsolutions.net> wrote:
>> On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote:
>>> On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky <ndar...@mac.com>
>>> wrote:
>>> > On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
>>> > <sebast...@sipsolutions.net>
>>> > wrote:
>>> >> there was a discussion that for numpy booleans math operators +,-,* (and
>>> >> the unary -), while defined, are not very helpful.
>>> >
>>> > It has been suggested at the Github that there is an area where it is
>>> > useful
>>> > to have linear algebra operations like matrix multiplication to be defined
>>> > over a semiring:
>>> >
>>> > http://en.wikipedia.org/wiki/Logical_matrix
>>> >
>>> > This still does not justify having unary or binary -, so I suggest that we
>>> > first discuss deprecation of those.
>>>
>>> Does it make sense to only remove - and maybe / ?
>>>
>>> would python sum still work? (I almost never use it.)
>>>
>>> >>> sum(mask)
>>> 2
>>> >>> sum(mask.tolist())
>>> 2
>>>
>>> is accumulate the same as sum and would keep working?
>>>
>>> >>> np.add.accumulate(mask)
>>> array([0, 0, 0, 1, 2])
>>>
>>>
>>> In operation with other dtypes, do they still dominate so these work?
>>>
>>
>> Hey,
>
>
> In statistics and econometrics (and economic theory) we just use an
> indicator function 1_{x=5} which has largely the same properties as a
> numpy bool array, at least in my code.
>
> some of the common operations are *, dot and kron.
>
> So far this has worked quite well as intuition, plus numpy casting rules.
>
> dot is the main surprise, because I thought that it would upcast. (I
> always think of dot as a np.linalg.)
>
>
>>
>> of course the other types will always dominate interpreting bools as 0
>> and 1. This would only affect operations with only booleans.
>
> My guess is that this would leave then 90% of our (statsmodels)
> possible usage alone.
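
To make the "other types dominate" point concrete, here is a small sketch. It assumes `mask` and `x` have the values implied by the outputs quoted in this thread (they are my reconstruction, not taken from the ticket), and the other variable names are only for illustration.

```python
import numpy as np

# Assumed values, reconstructed from the outputs quoted in this thread.
mask = np.array([False, False, False, True, True])
x = np.arange(5)

# Mixed operands: the non-bool dtype wins and bools act as 0 and 1.
prod = mask * x        # integer array
shifted = mask + 1.5   # float array

# Bool-only operands stay bool.
both = mask * mask     # behaves like logical and
either = mask + mask   # behaves like logical or, saturates at True

# Counting goes through an integer accumulator, not bool addition.
count = mask.sum()
py_count = sum(mask.tolist())

print(prod.dtype, shifted.dtype, either.dtype, count, py_count)
```

So only the bool-with-bool cases would be touched by a change to `+` or `*`; anything that already involves an int or float operand keeps its current result.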
>
> There is still the case that with * we can calculate the intersection.
>
>
>> There is a good point that * is well defined however you define it,
>> though. (Btw. / is not defined for bools,
>> `np.bool_(True)/np.bool_(True)` will upcast to
>> int8 to do the operation)
>>
>> However, while well defined, + is not defined like it is for python
>> bools (which are just ints) so that is the reason to consider
>> deprecation there (if we allow upcast to int8 -- or maybe the default
>> int -- in the future, in-place += and -= operations would not behave
>> differently, since they just cast back...).
>
> Actually, I used + once:
>
> The calculation in terms of indicator functions is
>
> 1_{A} + 1_{B} - 1_{A & B}
>
> The last part avoids double counting, which is not necessary if numpy
> casts back to bool.
> Nothing that couldn't be replaced by logical operators, but the
> (linear) algebra is not "logical".
>
> In this case I did care about memory because the arrays are (nobs,
> nobs) (nobs is the number of observations shape[0]) which can be
> large, and I have a sparse version also. In most other case we use
> astype(int) already very often, because eventually we still have to
> cast and memory won't be a big problem.
>
> The mental model is set membership and set operations with indicator
> functions, not "logical", and I don't remember running into problems
> with this so far, and happily ignored logical_xxx when I do linear
> algebra instead of just working with masks of booleans.
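
A minimal sketch of that union calculation, in both flavours. The masks `a` and `b` are hypothetical membership indicators made up for the illustration, and the sketch deliberately avoids `-` on bool arrays.

```python
import numpy as np

# Hypothetical membership masks for two sets A and B over six observations.
a = np.array([True, True, True, False, False, False])
b = np.array([False, True, True, True, False, False])

# Bool arithmetic: + saturates at True, so no correction term is needed.
union_bool = a + b
inter_bool = a * b

# Integer indicators: subtract the intersection to avoid double counting,
# i.e. 1_{A} + 1_{B} - 1_{A & B}.
ai, bi = a.astype(int), b.astype(int)
union_int = ai + bi - ai * bi

print(union_bool)
print(union_int)
```

Both versions mark the same elements. The bool version stays at one byte per element, which is the memory argument for the (nobs, nobs) case.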
http://en.wikipedia.org/wiki/Indicator_function
with the added advantage that we have also the version where + constrains
to (0, 1).

However `-` doesn't work properly because

>>> np.bool_(-5)
True

instead of False, except in the case `1 - mask`.

We really have two kinds of addition:

bool sum: for indicating set membership
counting sum: for counting number of elements.

from my viewpoint:
I would keep + and * since they work well (bool + and count +)
minus - is partially broken and `/` looks useless

this casts anyway

>>> 1 - m1
array([1, 1, 0, 0, 0])

and I never thought of doing this

>>> True - m1
array([ True, True, False, False, False], dtype=bool)

(python set defines minus but raises error on plus)

Josef

>
> Nevertheless: If I'm forced to, then I will get used to logical_xxx. (*)
> And the above bool addition hasn't made it into statsmodels yet. I
> used a simpler version because I thought initially it's too cute. (And
> I was using an older numpy that couldn't do broadcasted dot.)
>
> (*) how do you search in the documentation of `&` or `|`, I cannot
> find what the other symbols are, if there are any.
>
>>
>> I suppose python sum works because it first tries using the C-Api number
>> protocol, which also means it is not affected. If you were to write a
>> sum which just uses the `+` operator, it would be affected, but that
>> would seem good to me.
>
> based on the ticket example, I'm not sure whether `+` should upcast or not.
>
> >>> mm.dtype
> dtype('bool')
> >>> mm.sum(0)
> array([48, 45, 56, 47])
>
> >>> mm.sum(0, bool)
> array([ True, True, True, True], dtype=bool)
> I would just use any
>
> but what happens with logical cumsum
>
> >>> mm[:5].cumsum(0, bool)
> array([[False, True, True, True],
>        [ True, True, True, True],
>        [ True, True, True, True],
>        [ True, True, True, True],
>        [ True, True, True, True]], dtype=bool)
>
> same as mm[:5].astype(int).cumsum(0, bool) without casting
>
> Josef
>
>
>>
>> - Sebastian
>>
>>
>>> >>> x / mask
>>> array([0, 0, 0, 3, 4])
>>> >>> x * 1. / mask
>>> array([ nan, inf, inf, 3., 4.])
>>> >>> x**mask
>>> array([1, 1, 1, 3, 4])
>>> >>> mask - 5
>>> array([-5, -5, -5, -4, -4])
>>>
>>> Josef
>>>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion