I started a ufunc to compute the sum of square differences here <https://gist.github.com/mattharrigan/6f678b3d6df5efd236fc23bfb59fd3bd>. It is about 4x faster and uses half the memory compared to np.sum(np.square(x-c)). This could significantly speed up common operations like std and var (where c=np.mean(x). It faster because its a single pass instead of 3, and also because the inner loop is specialized for the common reduce case, which might be an optimization considering more generally.
I think I have answered my question #2&3 above. Can someone please point me to an example where "data" was used in a ufunc inner loop? How can that value be set at runtime? Thanks On Fri, Nov 4, 2016 at 5:33 PM, Sebastian Berg <sebast...@sipsolutions.net> wrote: > On Fr, 2016-11-04 at 15:42 -0400, Matthew Harrigan wrote: > > I didn't notice identity before. Seems like frompyfunc always sets > > it to None. If it were zero maybe it would work as desired here. > > > > In the writing your own ufunc doc, I was wondering if the pointer to > > data could be used to get a constant at runtime. If not, what could > > that be used for? > > static void double_logit(char **args, npy_intp *dimensions, > > npy_intp* steps, void* data) > > Why would the numerical accuracy be any different? The subtraction > > and square operations look identical and I thought np.sum just calls > > np.add.reduce, so the reduction step uses the same code and would > > therefore have the same accuracy. > > > > Sorry, did not read it carefully, I guess `c` is the mean, so you are > doing the two pass method. > > - Sebastian > > > > Thanks > > > > On Fri, Nov 4, 2016 at 1:56 PM, Sebastian Berg <sebastian@sipsolution > > s.net> wrote: > > > On Fr, 2016-11-04 at 13:11 -0400, Matthew Harrigan wrote: > > > > I was reading this and got thinking about if a ufunc could > > > compute > > > > the sum of squared differences in a single pass without a > > > temporary > > > > array. The python code below demonstrates a possible approach. > > > > > > > > import numpy as np > > > > x = np.arange(10) > > > > c = 1.0 > > > > def add_square_diff(x1, x2): > > > > return x1 + (x2-c)**2 > > > > ufunc = np.frompyfunc(add_square_diff, 2, 1) > > > > print(ufunc.reduce(x) - x[0] + (x[0]-c)**2) > > > > print(np.sum(np.square(x-c))) > > > > > > > > I have (at least) 4 questions: > > > > 1. Is it possible to pass run time constants to a ufunc written > > > in C > > > > for use in its inner loop, and if so how? > > > > > > I don't think its anticipated, since a ufunc could in most cases > > > use a > > > third argument, but a 3 arg ufunc can't be reduced. Not sure if > > > there > > > might be some trickery possible. > > > > > > > 2. Is it possible to pass an initial value to reduce to avoid the > > > > clean up required for the first element? > > > > > > This is the identity normally. But the identity can only be 0, 1 or > > > -1 > > > right now I think. The identity is what the output array gets > > > initialized with (which effectively makes it the first value passed > > > into the inner loop). > > > > > > > 3. Does that ufunc work, or are there special cases which cause > > > it to > > > > fall apart? > > > > 4. Would a very specialized ufunc such as this be considered for > > > > incorporating in numpy since it would help reduce time and memory > > > of > > > > functions already in numpy? > > > > > > > > > > Might be mixing up things, however, IIRC the single pass approach > > > has a > > > bad numerical accuracy, so that I doubt that it is a good default > > > algorithm. > > > > > > - Sebastian > > > > > > > > > > Thank you, > > > > Matt > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion@scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion@scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion@scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion