Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey,

one of my new year's resolutions is to get my pull requests accepted (or closed). So here we go...

Here is the updated pull request: https://github.com/numpy/numpy/pull/5057
Here is the docstring: https://github.com/sotte/numpy/commit/3d4c5d19a8f15b35df50d945b9c8853b683f7ab6#diff-2270128d50ff15badd1aba4021c50a8cR358

The new `block` function is very similar to matlab's `[A, B; C, D]`.

Pros:
- it's very useful (in my experience)
- less friction for people coming from matlab
- it's conceptually simple
- the implementation is simple
- it's documented
- it's tested

Cons:
- the implementation is not super efficient; temporary copies are created. However, `bmat` also does that.

Feedback is very welcome!

Best,
 Stefan

On Sun, May 10, 2015 at 12:33 PM, Stefan Otte <stefan.o...@gmail.com> wrote:
> Hey,
>
> Just a quick update. I updated the pull request and renamed `stack` to
> `block`. Have a look: https://github.com/numpy/numpy/pull/5057
>
> I'm sticking with the simple initial implementation because it's simple and
> does what you think it does.
>
> Cheers,
> Stefan

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
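For readers coming to this thread later: the `block` proposal from this PR is essentially what current NumPy ships as `np.block`, which takes the same nested-list layout as the matlab notation `[A, B; C, D]`. A minimal sketch (array values are illustrative):

```python
import numpy as np

A = np.ones((2, 2))
B = 2 * np.ones((2, 3))
C = 3 * np.ones((1, 2))
D = 4 * np.ones((1, 3))

# matlab's [A, B; C, D] as a nested list of blocks
M = np.block([[A, B], [C, D]])  # shape (3, 5)
```

Rows are concatenated along the last axis, then stacked; as the thread notes, temporary copies are created along the way.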
Re: [Numpy-discussion] Create a n-D grid; meshgrid alternative
Hey,

I just created a pull request: https://github.com/numpy/numpy/pull/5874

Best,
 Stefan

On Tue, May 12, 2015 at 3:29 PM Stefan Otte stefan.o...@gmail.com wrote:

Hey,

here is an ipython notebook with benchmarks of all implementations (scroll to the bottom for plots):
https://github.com/sotte/ipynb_snippets/blob/master/2015-05%20gridspace%20-%20cartesian.ipynb

Overall, Jaime's version is the fastest.

On Tue, May 12, 2015 at 2:01 PM Jaime Fernández del Río jaime.f...@gmail.com wrote:

On Tue, May 12, 2015 at 1:17 AM, Stefan Otte stefan.o...@gmail.com wrote:

Hello,

indeed I was looking for the cartesian product. I timed the two stackoverflow answers and the winner is not quite as clear:

    n_elements:   10   cartesian  0.00427   cartesian2  0.00172
    n_elements:  100   cartesian  0.02758   cartesian2  0.01044
    n_elements: 1000   cartesian  0.97628   cartesian2  1.12145
    n_elements: 5000   cartesian 17.14133   cartesian2 31.12241

(This is for two arrays as parameters: np.linspace(0, 1, n_elements).) cartesian2 seems to be slower for bigger inputs.

On my system, the following variation on Pauli's answer is 2-4x faster than his for your test cases:

    def cartesian4(arrays, out=None):
        arrays = [np.asarray(x).ravel() for x in arrays]
        dtype = np.result_type(*arrays)
        n = np.prod([arr.size for arr in arrays])
        if out is None:
            out = np.empty((len(arrays), n), dtype=dtype)
        else:
            out = out.T
        for j, arr in enumerate(arrays):
            n //= arr.size
            out.shape = (len(arrays), -1, arr.size, n)
            out[j] = arr[np.newaxis, :, np.newaxis]
            out.shape = (len(arrays), -1)
        return out.T

I'd really appreciate it if this were part of numpy. Should I create a pull request?

There hasn't been any opposition, quite the contrary, so yes, I would go ahead and create that PR. I somehow feel this belongs with the set operations, rather than with the indexing ones. Other thoughts? Also for consideration: should it work on flattened arrays? Or should we give it an axis argument, and then broadcast on the rest, a la generalized ufunc?
Jaime

--
(\__/)
( O.o)
( ) This is Conejo. Copy Conejo into your signature and help him with his plans for world domination.
Re: [Numpy-discussion] Create a n-D grid; meshgrid alternative
Hello,

indeed I was looking for the cartesian product. I timed the two stackoverflow answers and the winner is not quite as clear:

    n_elements:   10   cartesian  0.00427   cartesian2  0.00172
    n_elements:  100   cartesian  0.02758   cartesian2  0.01044
    n_elements: 1000   cartesian  0.97628   cartesian2  1.12145
    n_elements: 5000   cartesian 17.14133   cartesian2 31.12241

(This is for two arrays as parameters: np.linspace(0, 1, n_elements).) cartesian2 seems to be slower for bigger inputs.

I'd really appreciate it if this were part of numpy. Should I create a pull request?

Regarding combinations and permutations: it could be convenient to have those as well.

Cheers,
 Stefan
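For anyone who wants to try the idea without the thread's `cartesian`/`cartesian2` implementations (which come from the stackoverflow answers and aren't reproduced here), a cartesian product can be sketched with `meshgrid`; `cartesian_sketch` is my illustrative name, not a numpy function:

```python
import numpy as np

def cartesian_sketch(arrays):
    # One output row per combination of the input 1-D arrays.
    # indexing="ij" keeps the rows in lexicographic order.
    grids = np.meshgrid(*arrays, indexing="ij")
    return np.column_stack([g.ravel() for g in grids])

pts = cartesian_sketch([np.array([1, 2]), np.array([10, 20, 30])])
# shape (6, 2): (1,10), (1,20), (1,30), (2,10), (2,20), (2,30)
```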
Re: [Numpy-discussion] Create a n-D grid; meshgrid alternative
Hey,

here is an ipython notebook with benchmarks of all implementations (scroll to the bottom for plots):
https://github.com/sotte/ipynb_snippets/blob/master/2015-05%20gridspace%20-%20cartesian.ipynb

Overall, Jaime's version is the fastest.
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey,

Just a quick update. I updated the pull request and renamed `stack` to `block`. Have a look: https://github.com/numpy/numpy/pull/5057

I'm sticking with the simple initial implementation because it's simple and does what you think it does.

Cheers,
 Stefan
Re: [Numpy-discussion] Create a n-D grid; meshgrid alternative
I just drafted different versions of the `gridspace` function:
https://tmp23.tmpnb.org/user/1waoqQ8PJBJ7/notebooks/2015-05%20gridspace.ipynb

Best regards,
 Stefan

On Sun, May 10, 2015 at 1:40 PM, Stefan Otte stefan.o...@gmail.com wrote:

Hey,

quite often I want to evaluate a function on a grid in a n-D space. What I end up doing (and what I really dislike) looks something like this:

    x = np.linspace(0, 5, 20)
    M1, M2 = np.meshgrid(x, x)
    X = np.column_stack([M1.flatten(), M2.flatten()])
    X.shape  # (400, 2)

    fancy_function(X)

I don't think I ever used `meshgrid` in any other way. Is there a better way to create such a grid space?

I wrote myself a little helper function:

    def gridspace(linspaces):
        return np.column_stack([space.flatten()
                                for space in np.meshgrid(*linspaces)])

But maybe something like this should be part of numpy?

Best,
 Stefan
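A self-contained version of the helper described above, for readers who want to run it (`gridspace` is the name proposed in this thread, not an existing numpy function):

```python
import numpy as np

def gridspace(linspaces):
    # Stack all meshgrid coordinate arrays into one (n_points, n_dims)
    # array, i.e. one grid point per row, ready for fancy_function(X).
    return np.column_stack([m.ravel() for m in np.meshgrid(*linspaces)])

x = np.linspace(0, 5, 20)
X = gridspace([x, x])
# X.shape == (400, 2), matching the M1/M2 example above
```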
Re: [Numpy-discussion] numpy.stack -- which function, if any, deserves the name?
Hey,

1. np.stack for stacking like np.asarray(np.bmat(...))
http://thread.gmane.org/gmane.comp.python.numeric.general/58748/
https://github.com/numpy/numpy/pull/5057

I'm the author of this proposal. I'll just give some context real quickly.

My `stack` started really simple, basically allowing a matlab-like notation for stacking:

    matlab: [ a b; c d ]
    numpy:  stack([[a, b], [c, d]]) or even stack([a, b], [c, d])

where a, b, c, and d are arrays.

During the discussion people asked for fancier stacking and auto-filling of non-explicitly-set blocks (think of an eye matrix where only certain blocks are set).

Alternatively, we thought of refactoring the core of bmat [2] so that it can be used with arrays and matrices. This would allow stack("a b; c d") where a, b, c, and d are the names of arrays/matrices. (Also bmat would get better documentation during the refactoring :)).

Summarizing, my proposal is mostly concerned with how to create block arrays from given arrays.

I don't care about the name `stack`. I just used stack because it replaced hstack/vstack for me. Maybe `bstack` for block stack, or `barray` for block array?

I have the feeling [1] that my use case is more common, but I like the second proposal.

Cheers,
 Stefan

[1] Everybody generalizes from oneself. At least I do.
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.bmat.html
[Numpy-discussion] Subscribing to mailinglist not possible / sites down
Hey *,

The websites to subscribe to the numpy/scipy mailinglists seem to be down:

- http://mail.scipy.org/mailman/listinfo/scipy-user
- http://mail.scipy.org/mailman/listinfo/scipy-user
- http://projects.scipy.org/pipermail/scipy-dev/

And it's not just me:
http://www.downforeveryoneorjustme.com/http://projects.scipy.org/pipermail/scipy-dev/

Best,
 Stefan
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
To make the last point more concrete the implementation could look something like this (note that I didn't test it and that it still takes some work):

    def bmat(obj, ldict=None, gdict=None):
        return matrix(stack(obj, ldict, gdict))

    def stack(obj, ldict=None, gdict=None):
        # the old bmat code minus the matrix calls
        if isinstance(obj, str):
            if gdict is None:
                # get previous frame
                frame = sys._getframe().f_back
                glob_dict = frame.f_globals
                loc_dict = frame.f_locals
            else:
                glob_dict = gdict
                loc_dict = ldict
            return _from_string(obj, glob_dict, loc_dict)

        if isinstance(obj, (tuple, list)):
            # [[A, B], [C, D]]
            arr_rows = []
            for row in obj:
                if isinstance(row, N.ndarray):  # not 2-d
                    return concatenate(obj, axis=-1)
                else:
                    arr_rows.append(concatenate(row, axis=-1))
            return concatenate(arr_rows, axis=0)

        if isinstance(obj, N.ndarray):
            return obj

I basically turned the old `bmat` into `stack` and removed the matrix calls.

Best,
 Stefan
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey,

there are several ways how to proceed.

- My proposed solution covers the 80% case quite well (at least I use it all the time). I'd convert the doctests into unittests and we're done.

- We could slightly change the interface to leave out the surrounding square brackets, i.e. turning `stack([[a, b], [c, d]])` into `stack([a, b], [c, d])`.

- We could extend it even further, allowing a "filler value" for non-set values and a "shape" argument. This could be done later as well.

- `bmat` is not really matrix specific. We could refactor `bmat` a bit to use the same logic in `stack`. Except for the `matrix` calls, `bmat` and `_from_string` are pretty agnostic to the input.

I'm in favor of the first or last approach. The first: because it already works and is quite simple. The last: because the logic and tests of both `bmat` and `stack` would be the same, and the feature to specify a string representation of the block matrix is nice.

Best,
 Stefan

On Tue, Oct 28, 2014 at 7:46 PM, Nathaniel Smith n...@pobox.com wrote:

On 28 Oct 2014 18:34, Stefan Otte stefan.o...@gmail.com wrote:

Hey,

In the last weeks I tested `np.asarray(np.bmat())` as `stack` function and it works quite well. So the question persists: If `bmat` already offers something like `stack`, should we even bother implementing `stack`? More code leads to more bugs and maintenance work. (However, the current implementation is only 5 lines, and using `bmat` would reduce that even more.)

In the long run we're trying to reduce usage of np.matrix and ideally deprecate it entirely. So yes, providing ndarray equivalents of matrix functionality (like bmat) is valuable.

-n
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey,

@Josef, I wasn't aware of `bmat`, and `np.asarray(np.bmat())` does basically what I want and what I'm already using.

Regarding the Tetris problem: that never happened to me, but stack, as Josef pointed out, can handle that already :)

I like the idea of removing the redundant square brackets:

    stack([[a, b], [c, d]])  -->  stack([a, b], [c, d])

However, if the brackets are there, there is no difference between creating a `np.array` and stacking arrays with `np.stack`.

If we want to get fancy and turn this PR into something bigger (working our way up to a NP-complete problem ;)) then how about this. I sometimes have arrays that look like:

    A B 0
    0 C

where 0 is a scalar but is supposed to fill the rest of the array. Having something like 0 in there might lead to ambiguities though. What does

    A B C
    0 D 0

mean? One could limit the filler to appear only on the left or the right:

    A B 0
    0 C D

But even then the shape is not completely determined. So we could require one row that consists only of arrays and determines the shape. Alternatively we could have a keyword parameter `shape`:

    stack([A, B, 0], [0, C, D], shape=(8, 8))

Colin, with `bmat` you can do what you're asking for. Directly taken from the example:

    >>> np.bmat('A,B; C,D')
    matrix([[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 4, 7, 8],
            [5, 6, 9, 0]])

General question: If `bmat` already offers something like `stack`, should we even bother implementing `stack`? More code leads to more bugs and maintenance work.

Best,
 Stefan

On Tue, Sep 9, 2014 at 12:14 AM, cjw c...@ncf.ca wrote:

On 08-Sep-14 4:40 PM, Joseph Martinot-Lagarde wrote:

On 08/09/2014 15:29, Stefan Otte wrote:

Hey,

quite often I work with block matrices. Matlab offers the convenient notation

    [ a b; c d ]

to stack matrices.

This would appear to be a desirable way to go. Numpy has something similar for strings. The above is neater.

Colin W.

The numpy equivalent is kinda clumsy:

    vstack([hstack([a,b]), hstack([c,d])])

I wrote the little function `stack` that does exactly that:

    stack([[a, b], [c, d]])

In my case `stack` replaced `hstack` and `vstack` almost completely. If you're interested in including it in numpy I created a pull request [1]. I'm looking forward to getting some feedback!

Best,
 Stefan

[1] https://github.com/numpy/numpy/pull/5057

The outside brackets are redundant; stack([[a, b], [c, d]]) should be stack([a, b], [c, d]).
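The `bmat` example quoted above, as a runnable snippet. I pass `ldict`/`gdict` explicitly so the name lookup doesn't depend on inspecting the calling frame (which is what `bmat` does by default, per the code earlier in the thread); array values are chosen to match the quoted docstring output:

```python
import numpy as np

A = np.matrix('1 1; 1 1')
B = np.matrix('2 2; 2 2')
C = np.matrix('3 4; 5 6')
D = np.matrix('7 8; 9 0')

# bmat resolves the names in the string against ldict, then gdict.
M = np.bmat('A, B; C, D', ldict={'A': A, 'B': B, 'C': C, 'D': D}, gdict={})
arr = np.asarray(M)  # plain ndarray instead of np.matrix, as discussed
```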
[Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey,

quite often I work with block matrices. Matlab offers the convenient notation

    [ a b; c d ]

to stack matrices. The numpy equivalent is kinda clumsy:

    vstack([hstack([a,b]), hstack([c,d])])

I wrote the little function `stack` that does exactly that:

    stack([[a, b], [c, d]])

In my case `stack` replaced `hstack` and `vstack` almost completely. If you're interested in including it in numpy I created a pull request [1]. I'm looking forward to getting some feedback!

Best,
 Stefan

[1] https://github.com/numpy/numpy/pull/5057
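The helper described above can be sketched in a couple of lines; this is my reconstruction of the idea, not the PR's exact code, and `stack_sketch` is a hypothetical name:

```python
import numpy as np

def stack_sketch(rows):
    # [[a, b], [c, d]] -> hstack each row, then vstack the rows,
    # exactly the "clumsy" one-liner from the email, wrapped up.
    return np.vstack([np.hstack(row) for row in rows])

a = np.zeros((2, 2)); b = np.ones((2, 2))
c = np.ones((2, 2));  d = np.zeros((2, 2))
m = stack_sketch([[a, b], [c, d]])  # shape (4, 4)
```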
Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Hey guys,

I just pushed an updated version to github: https://github.com/sotte/numpy_mdot
Here is an ipython notebook with some experiments:
http://nbviewer.ipython.org/urls/raw2.github.com/sotte/numpy_mdot/master/2014-02_numpy_mdot.ipynb

- I added (almost numpy compliant) documentation.
- I use a function for len(args) == 3 to improve the speed.
- Some general cleanup.

Before I create a pull request I have a few questions:

- Should there be an `optimize` argument, or should we always optimize the parentheses? There is an overhead, but maybe we could neglect it? I think we should keep the flag, but set it to True by default.
- I currently use a recursive algorithm to do the multiplication. Any objections?
- In which file should `mdot` live?
- I wrote a function `print_optimal_chain_order(D, A, B, C, names=list("DABC"))` which determines the optimal parentheses and prints out a numpy expression. It's kinda handy, but do we actually need it?

Best regards,
 Stefan

On Thu, Feb 20, 2014 at 8:39 PM, Nathaniel Smith n...@pobox.com wrote:

On Thu, Feb 20, 2014 at 1:35 PM, Stefan Otte stefan.o...@gmail.com wrote:

Hey guys,

I quickly hacked together a prototype of the optimization step: https://github.com/sotte/numpy_mdot

I think there is still room for improvements so feedback is welcome :) I'll probably have some time to code on the weekend.

@Nathaniel, I'm still not sure about integrating it in dot. Don't a lot of people use the optional out parameter of dot?

The email you're replying to below about deprecating stuff in 'dot' was in reply to Eric's email about using dot on arrays with shape (k, n, n), so those comments are unrelated to the mdot stuff.

I wouldn't mind seeing out= arguments become kw-only in general, but even if we decided to do that it would take a long deprecation period, so yeah, let's give up on 'dot(A, B, C, D)' as syntax for mdot.

However, the suggestion of supporting np.dot([A, B, C, D]) still seems like it might be a good idea...? I have mixed feelings about it -- one less item cluttering up the namespace, but it is weird and magical to have two totally different calling conventions for the same function.

-n

--
Nathaniel J. Smith
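For later readers: the `mdot` idea discussed here is available in current NumPy as `np.linalg.multi_dot`, which picks a cheap parenthesization automatically. A quick sketch with illustrative shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 100))
B = rng.standard_normal((100, 5))
C = rng.standard_normal((5, 50))

# multi_dot chooses the multiplication order; the result equals the
# naive left-to-right chain, just computed with fewer scalar multiplies.
fast = np.linalg.multi_dot([A, B, C])
naive = A.dot(B).dot(C)
```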
Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Hey,

so I propose the following. I'll implement a new function `mdot`. Incorporating the changes in `dot` is unlikely. Later, one can still include the features in `dot` if desired.

`mdot` will have a default parameter `optimize`. If `optimize == True` the reordering of the multiplication is done. Otherwise it simply chains the multiplications.

I'll test and benchmark my implementation and create a pull request.

Cheers,
 Stefan
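The `optimize=True` reordering mentioned above is the classic matrix-chain-order problem; a minimal dynamic-programming sketch of the cost computation (the textbook algorithm, not the PR's implementation):

```python
def chain_cost(dims):
    # Matrix i has shape (dims[i], dims[i+1]). Returns the minimal number
    # of scalar multiplications needed for the whole chain (CLRS 15.2).
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)        # split point
            )
    return cost[0][n - 1]

# (10x100) @ (100x5) @ (5x50): (A @ B) @ C needs 7500 multiplies,
# while A @ (B @ C) would need 75000.
best = chain_cost([10, 100, 5, 50])
```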
Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Hey guys, I quickly hacked together a prototype of the optimization step: https://github.com/sotte/numpy_mdot I think there is still room for improvements so feedback is welcome :) I'll probably have some time to code on the weekend. @Nathaniel, I'm still not sure about integrating it in dot. Don't a lot of people use the optional out parameter of dot? Best, Stefan On Thu, Feb 20, 2014 at 4:02 PM, Nathaniel Smith n...@pobox.com wrote: If you send a patch that deprecates dot's current behaviour for ndim2, we'll probably merge it. (We'd like it to function like you suggest, for consistency with other gufuncs. But to get there we have to deprecate the current behaviour first.) While I'm wishing for things I'll also mention that it would be really neat if binary gufuncs would have a .outer method like regular ufuncs do, so anyone currently using ndim2 dot could just switch to that. But that's a lot more work than just deprecating something :-). -n On 20 Feb 2014 09:27, Eric Moore e...@redtetrahedron.org wrote: On Thursday, February 20, 2014, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote: If the standard semantics are not affected, and the most common two-argument scenario does not take more than a single if-statement overhead, I don't see why it couldn't be a replacement for the existing np.dot; but others mileage may vary. On Thu, Feb 20, 2014 at 11:34 AM, Stefan Otte stefan.o...@gmail.com wrote: Hey, so I propose the following. I'll implement a new function `mdot`. Incorporating the changes in `dot` are unlikely. Later, one can still include the features in `dot` if desired. `mdot` will have a default parameter `optimize`. If `optimize==True` the reordering of the multiplication is done. Otherwise it simply chains the multiplications. I'll test and benchmark my implementation and create a pull request. 
Cheers, Stefan

Another consideration here is that we need a better way to work with stacked matrices such as np.linalg handles now. I.e., I want to compute the matrix product of two (k, n, n) arrays, producing a (k, n, n) result. As near as I can tell, there isn't a way to do this right now that doesn't involve an explicit loop, since dot will return a (k, n, k, n) result. Yes, this output contains what I want, but it also computes a lot of things that I don't want. It would also be nice to be able to do a matrix product reduction, (k, n, n) -> (n, n), in a single line too. Eric
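As a workaround for the stacked case Eric describes (not part of the proposal itself), `np.einsum` can express the per-slice product, and a plain fold gives the product reduction. A hedged sketch:

```python
import numpy as np
from functools import reduce

k, n = 5, 3
a = np.random.randn(k, n, n)
b = np.random.randn(k, n, n)

# Stacked product: multiply matching matrices, giving a (k, n, n) result
# without the (k, n, k, n) cross terms that np.dot would compute.
stacked = np.einsum('kij,kjl->kil', a, b)

# Matrix-product reduction: (k, n, n) -> (n, n) by chaining the k matrices.
# Iterating over a yields its k matrices of shape (n, n).
reduced = reduce(np.dot, a)
```

The einsum version computes exactly the k wanted products; the reduction still multiplies pairwise in a Python-level loop, so it is a convenience, not a fused kernel.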
Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Just to give an idea of the performance implications, I timed the operations on my machine:

%timeit reduce(dotp, [x, v, x.T, y]).shape
1 loops, best of 3: 1.32 s per loop

%timeit reduce(dotTp, [x, v, x.T, y][::-1]).shape
1000 loops, best of 3: 394 µs per loop

I was mainly interested in nicer formulas, but if the side effect is a performance improvement, I can live with that. Pauli Virtanen pointed in the issue to an older discussion on the mailing list: http://thread.gmane.org/gmane.comp.python.numeric.general/14288/

Best regards, Stefan

On Tue, Feb 18, 2014 at 12:52 AM, josef.p...@gmail.com wrote: On Mon, Feb 17, 2014 at 4:57 PM, josef.p...@gmail.com wrote: On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte stefan.o...@gmail.com wrote: Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays, so I can write

mdot(A, B, C, D, E)

instead of

A.dot(B).dot(C).dot(D).dot(E)
np.dot(np.dot(np.dot(np.dot(A, B), C), D), E)

I know you can use `numpy.matrix` to get nicer formulas. However, most numpy/scipy functions return arrays instead of numpy.matrix, so sometimes you actually use elementwise array multiplication when you think you are using matrix multiplication. `mdot` is a simple way to avoid numpy.matrix while improving readability. What do you think? Is this useful and worth integrating into numpy? I already created an issue for this: https://github.com/numpy/numpy/issues/4311 jaimefrio also suggested doing some reordering of the arrays to minimize computation: https://github.com/numpy/numpy/issues/4311#issuecomment-35295857

statsmodels has a convenience chaindot, but most of the time I don't like its usage because of the missing brackets.
say you have a (1, 10) array, and you use an intermediate (1, 1) array instead of a (10, 10) array:

nobs = 1
v = np.diag(np.ones(4))
x = np.random.randn(nobs, 4)
y = np.random.randn(nobs, 3)

reduce(np.dot, [x, v, x.T, y]).shape

def dotp(x, y):
    xy = np.dot(x, y)
    print xy.shape
    return xy

reduce(dotp, [x, v, x.T, y]).shape
(1, 4)
(1, 1)
(1, 3)
(1, 3)

def dotTp(x, y):
    xy = np.dot(x.T, y.T)
    print xy.shape
    return xy.T

reduce(dotTp, [x, v, x.T, y][::-1]).shape
(3, 4)
(3, 4)
(3, 1)
(1, 3)

Josef

IIRC, for reordering I looked at this: http://www.mathworks.com/matlabcentral/fileexchange/27950-mmtimes-matrix-chain-product

Josef (don't make it too easy for people to shoot themselves in ...)

Best, Stefan
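The ordering effect Josef demonstrates can be made concrete by counting scalar multiplications for the two evaluation orders. A sketch — the `nobs` used for the timings above is not stated in the thread, so a large value is assumed here to show the asymmetry:

```python
def pair_cost(a_shape, b_shape):
    # Multiplying an (m, k) by a (k, n) matrix costs m*k*n scalar multiplications.
    m, k = a_shape
    k2, n = b_shape
    assert k == k2
    return m * k * n

nobs = 10000  # assumed value, not from the original post
# Shapes in the chain x v x.T y:
# x is (nobs, 4), v is (4, 4), x.T is (4, nobs), y is (nobs, 3).

# Left to right, ((x v) x.T) y, builds a huge (nobs, nobs) intermediate:
left = (pair_cost((nobs, 4), (4, 4))
        + pair_cost((nobs, 4), (4, nobs))
        + pair_cost((nobs, nobs), (nobs, 3)))

# Right to left, x (v (x.T y)), keeps every intermediate at most (4, nobs):
right = (pair_cost((4, nobs), (nobs, 3))
         + pair_cost((4, 4), (4, 3))
         + pair_cost((nobs, 4), (4, 3)))

print(left, right)
```

With a large `nobs`, the right-to-left order is cheaper by several orders of magnitude, which is the kind of gap an automatic reordering in `mdot` would close.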
[Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays, so I can write

mdot(A, B, C, D, E)

instead of

A.dot(B).dot(C).dot(D).dot(E)
np.dot(np.dot(np.dot(np.dot(A, B), C), D), E)

I know you can use `numpy.matrix` to get nicer formulas. However, most numpy/scipy functions return arrays instead of numpy.matrix, so sometimes you actually use elementwise array multiplication when you think you are using matrix multiplication. `mdot` is a simple way to avoid numpy.matrix while improving readability. What do you think? Is this useful and worth integrating into numpy? I already created an issue for this: https://github.com/numpy/numpy/issues/4311 jaimefrio also suggested doing some reordering of the arrays to minimize computation: https://github.com/numpy/numpy/issues/4311#issuecomment-35295857

Best, Stefan
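A minimal version of the helper described above is just a left fold over np.dot. This is a sketch of the idea, not the submitted implementation:

```python
import numpy as np
from functools import reduce

def mdot(*arrays):
    # Chain np.dot left to right: mdot(A, B, C) == np.dot(np.dot(A, B), C).
    return reduce(np.dot, arrays)

A = np.random.randn(4, 5)
B = np.random.randn(5, 2)
C = np.random.randn(2, 3)
result = mdot(A, B, C)
```

This naive version fixes only the readability problem; the reordering suggested in the issue would be layered on top.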