On Mon, 2020-07-13 at 15:45 +0300, Ram Rachum wrote:
> Thank you Sebastian and Andras for your detailed replies.
>
> Sebastian, your suggestion of adding `item.item()` solved my problem!
> Now the for loop is still slower than vectorize, but by a smaller
> factor, and that's fast enough for my demonstration. My problem is
> solved and I'm very happy!
>
> I also tried your `out=` suggestion for vectorize, but I think you
> made a mistake, as it doesn't seem that it takes that argument. If I
> missed something and it does (maybe it's a very new feature?) that
> would be even better for me than the `.item()` solution.
>
You are right. I thought `vectorize` might be a proper ufunc internally
in this branch (like `frompyfunc`), but `frompyfunc` currently does not
support dtypes other than object (supporting them could be a nice
improvement that would make `vectorize` easier to replace).

- Sebastian

> > On Sun, Jul 12, 2020 at 5:03 PM Sebastian Berg
> > <sebast...@sipsolutions.net> wrote:
> >
> > > On Sun, 2020-07-12 at 16:00 +0300, Ram Rachum wrote:
> > > > Hi everyone,
> > > >
> > > > Here's a problem I've been dealing with. I wonder whether NumPy
> > > > has a tool that will help me, or whether this could be a useful
> > > > feature request.
> > > >
> > > > In the upcoming EuroPython 2020, I'll do a talk about live-coding
> > > > a music synthesizer. It's going to be a fun talk, I'll use the
> > > > sounddevice
> > > > <https://github.com/spatialaudio/python-sounddevice/> module to
> > > > make a program that plays music. Do attend, or watch it on
> > > > YouTube when it's out :)
> > >
> > > Sounds like a fun talk :).
> > >
> > > > There's a part in my talk that I could make simpler, and thus
> > > > shave 3-4 minutes of cumbersome explanations. These 3-4 minutes
> > > > matter a great deal to me. But for that I need to do something
> > > > with NumPy and I don't know whether it's possible or not.
> > > >
> > > > The sounddevice library takes an ndarray of sound data and plays
> > > > it. Currently I use `vectorize` to produce that array:
> > > >
> > > >     output_array = np.vectorize(f, otypes='d')(input_array)
> > > >
> > > > And I'd like to replace it with this code, which is supposed to
> > > > give the same output:
> > > >
> > > >     output_array = np.ndarray(input_array.shape, dtype='d')
> > >
> > > Maybe use `np.empty(input_array.shape, dtype="d")` instead.
> > > `np.ndarray` works but is pretty low-level, and I would usually
> > > avoid it for array creation.
> > >
> > > >     for i, item in enumerate(input_array):
> > > >         output_array[i] = f(item)
> > >
> > > OK, one hack that you can try is to replace `item` with
> > > `item.item()`. That will convert the NumPy scalar to a Python
> > > scalar, which is quite a lot more lightweight and faster. It might
> > > also give PyPy more chance to optimize `f`, I suppose.
> > >
> > > > The reason I want the second version is that I can then have
> > > > sounddevice start playing `output_array` in a separate thread,
> > > > while it's being calculated. (Yes, I know about the GIL, I
> > > > believe that sounddevice releases it.)
> > >
> > > `np.vectorize` will definitely not release the GIL. This loop may
> > > release it between iterations (I am not sure), but it also adds
> > > quite a bit of overhead compared to `vectorize`. The best thing of
> > > course would be if you can rewrite `f` to accept an array?
> > >
> > > > Unfortunately, the for loop is very slow, even when I'm not
> > > > processing the data on a separate thread. I benchmarked it on
> > > > both CPython and PyPy3, which is my target platform. On CPython
> > > > it's 3 times slower than vectorize, and on PyPy3 it's 67 times
> > > > slower than vectorize! That's despite the fact that the NumPy
> > > > documentation says "The `vectorize` function is provided
> > > > primarily for convenience, not for performance. The
> > > > implementation is essentially a `for` loop."
> > >
> > > PyPy is nice because it makes NumPy just work. Unfortunately, that
> > > also adds some overhead, so at least some slowdown is probably
> > > expected. I am not sure why it is so much.
> > > I would not be surprised if a list comprehension is not much
> > > faster, especially on PyPy (assuming you cannot modify `f` to work
> > > with arrays).
> > >
> > > > So here are a few questions:
> > > >
> > > > 1. Is there something like `vectorize`, except you get to access
> > > > the output array before it's finished? If not, what do you think
> > > > about adding that as an option to `vectorize`?
> > >
> > > vectorize should allow an `out=` argument to pass in the output
> > > array, would that help you? So you can access it, but I am not
> > > sure how that will help you. Although you could create a big
> > > result array and then access chunks of it:
> > >
> > >     final_arr = np.empty(...)
> > >     newly_written = slice(0, 1000)
> > >     run_calculation(final_arr[newly_written])
> > >
> > > where newly_written is defined by the input chunk you got, I
> > > suppose.
> > >
> > > > 2. Is there a more efficient way of writing the `for` loop I've
> > > > written above? Or any other kind of solution to my
> > >
> > > As said, the main thing would be to modify `f` in whatever way
> > > possible. For that it would be useful to know what `f` does
> > > exactly. Maybe you can move `f` to Cython or numba, or maybe write
> > > it in a way that works on arrays...
> > >
> > > > Thanks for your help,
> > > > Ram Rachum.
> > > > _______________________________________________
> > > > NumPy-Discussion mailing list
> > > > NumPy-Discussion@python.org
> > > > https://mail.python.org/mailman/listinfo/numpy-discussion
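
For reference, here is a minimal sketch that combines the two suggestions from the thread: a preallocated `np.empty` output that is filled chunk by chunk, with `.item()` converting each NumPy scalar to a Python float before `f` is called. Everything specific in it is an assumption for illustration: `f` is a hypothetical stand-in (a 440 Hz sine), and `fill_in_chunks` and `chunk_size` are made-up names, not part of NumPy or sounddevice.

```python
import math

import numpy as np


def f(t):
    # Hypothetical per-sample function: a 440 Hz sine at time t (seconds).
    return math.sin(2 * math.pi * 440.0 * t)


def fill_in_chunks(func, input_array, chunk_size=1000):
    """Fill a preallocated output chunk by chunk, yielding each finished slice."""
    output_array = np.empty(input_array.shape, dtype="d")
    for start in range(0, len(input_array), chunk_size):
        newly_written = slice(start, min(start + chunk_size, len(input_array)))
        for i in range(newly_written.start, newly_written.stop):
            # .item() turns the NumPy scalar into a plain Python float.
            output_array[i] = func(input_array[i].item())
        # Only output_array[newly_written] (and earlier chunks) is valid here.
        yield output_array, newly_written


input_array = np.arange(4410) / 44100.0  # 0.1 s of sample times at 44.1 kHz
for output_array, newly_written in fill_in_chunks(f, input_array):
    pass  # a consumer could read output_array[newly_written] at this point

# The chunked loop should agree with the original vectorize call.
expected = np.vectorize(f, otypes="d")(input_array)
assert np.allclose(output_array, expected)
```

The generator only shows where a consumer could safely read finished data. Whether handing those slices to a playback thread is fast enough depends on `f` and on the interpreter, as discussed above.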