Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 6:47 PM, josef.p...@gmail.com wrote:

snip

Terri can still make it editable on Melange if necessary. Arink, you still have work to do for a PR.

Chuck
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
I hardly found anything to improve or correct -- not even a typo in the docs. Where do we need to avoid the version checks?

On Fri, May 3, 2013 at 10:52 PM, Charles R Harris charlesr.har...@gmail.com wrote:

snip

--
Arink
Computer Science and Engineering
Indian Institute of Technology Ropar
www.arinkverma.in
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
I have created a new PR and removed one irrelevant version check: https://github.com/numpy/numpy/pull/3304/files

On Fri, May 3, 2013 at 11:29 PM, Arink Verma arinkve...@iitrpr.ac.in wrote:

snip
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Fri, May 3, 2013 at 12:13 PM, Arink Verma arinkve...@iitrpr.ac.in wrote: I have created a new PR and removed one irrelevant version check: https://github.com/numpy/numpy/pull/3304/files

I made some remarks on the PR. The convention on numpy-discussion is bottom posting, so you should do that to avoid future complaints.

snip

Chuck
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
Yes, we need to ensure that. A code generator can be made which creates the code for the table of registered dtypes at build time. Also, at present there is a lot of duplicate code that attempts to work around these slow paths; simplifying that code is also required.

On Thu, May 2, 2013 at 10:12 AM, David Cournapeau courn...@gmail.com wrote: On Thu, May 2, 2013 at 5:25 AM, Arink Verma arinkve...@iitrpr.ac.in wrote: @Raul I will pull the new version and try to include that also. What is wrong with macros for inline functions? Yes, the time for a ufunc is reduced to almost half; for the lookup table, I am generating a key from the argument types and returning the appropriate value.[1] @Chuck Yes, I did some profiling with oprofile for python -m timeit -n 100 -s 'import numpy as np; x = np.asarray(1.0)' 'x+x'. See the data sheet.[2] Every time a ufunc is invoked, the code has to check every single possible data type (bool, int, double, etc.) until it finds the best match for the data that the operation is being performed on. For scalars, we can supply the best match from a pre-populated table. At present the implementation is not well structured and supports only addition for int+int and float+float.[1]

You are pointing out something that may well be the main difficulty: the code there is messy, and we need to ensure that optimisations don't preclude later extensions (especially with regard to new dtype addition).

David
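The timeit command quoted above can also be run from a short script; a minimal sketch, assuming numpy is installed (timings are machine-dependent, so none are shown):

```python
# Same measurement as the shell command above: time `x + x` for a
# Python float versus a 0-d numpy array built with np.asarray(1.0).
import timeit

py_time = timeit.timeit("x + x", setup="x = 1.0", number=10000)
np_time = timeit.timeit(
    "x + x",
    setup="import numpy as np; x = np.asarray(1.0)",
    number=10000,
)

# The ratio is the per-operation overhead being discussed; on most
# machines of this era the 0-d array add is several times slower.
ratio = np_time / py_time
```

A profiler (oprofile, gperftools) then shows where that extra time goes.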
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 11:26 AM, Arink Verma arinkve...@iitrpr.ac.in wrote: Yes, we need to ensure that. A code generator can be made which creates the code for the table of registered dtypes at build time.

So dtypes can be registered at runtime as well. In an ideal world, 'native' numpy types would not be special cases. This is too big for a GSoC, but we should make sure we don't make it worse.

Also, at present there is a lot of duplicate code that attempts to work around these slow paths; simplifying that code is also required.

That there is room for consolidation would be an understatement :)

David

snip
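David's caveat that dtypes can be registered at runtime has a practical consequence for any lookup table: cached resolutions must not go stale when a new loop appears. A minimal sketch of one possible policy; all names here are hypothetical stand-ins, not NumPy internals:

```python
# Illustrative sketch (not NumPy's actual machinery): if inner loops for
# new dtypes can be registered at runtime, a dispatch cache must not
# keep returning stale answers. Simplest policy: clear it on registration.
_loops = {}   # signature -> inner loop name
_cache = {}   # memoized resolutions

def register_loop(signature, loop):
    _loops[signature] = loop
    _cache.clear()  # drop memoized results, including cached misses

def resolve(signature):
    # Memoize both hits and misses; a miss returns None.
    if signature not in _cache:
        _cache[signature] = _loops.get(signature)
    return _cache[signature]
```

Clearing wholesale is crude but correct; a finer-grained invalidation would only matter if registration were frequent, which it is not.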
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
Updating the table at runtime seems a good option, but then we have to maintain a separate file for caching and storing. I will look at both op2calltree.py (http://vorpus.org/~njs/op2calltree.py) and gperftools.

* Instead of making a giant table of everything that needs to be done to make stuff fast first, before writing any code, I'd suggest picking one operation, figuring out what change would be the biggest improvement for it, making that change, checking that it worked, and then repeat until that operation is really fast.

Working like that only: first optimizing the sum operation specifically for int scalars, then moving on to others.

* Did you notice this line on the requirements page? Having your first pull request merged before the GSoC application deadline (May 3) is required for your application to be accepted.

Thanks for reminding me! I was too busy with my university exams and forgot to do that. Does the merge have to be related to the GSoC project, or can any other improvement be considered?

On Thu, May 2, 2013 at 6:44 PM, Nathaniel Smith n...@pobox.com wrote: On Thu, May 2, 2013 at 6:26 AM, Arink Verma arinkve...@iitrpr.ac.in wrote: Yes, we need to ensure that. A code generator can be made which creates the code for the table of registered dtypes at build time.

I'd probably just generate it at run-time on an as-needed basis. (I.e., use the full lookup logic the first time, then save the result.) New dtypes can be registered, which will mean the tables need to change size at runtime anyway. If someone does some strange thing like adding float16's and float64's, we can do the lookup to determine that this should be handled by the float64/float64 loop, and then store that information so that the next time it's fast (but we probably don't want to be calculating all combinations at build time, which would require running the full type resolution machinery, esp. since it wouldn't really bring any benefits that I can see).

* Re: the profiling, I wrote a full oprofile-to-callgrind format script years ago: http://vorpus.org/~njs/op2calltree.py Haven't used it in years either, but neither oprofile nor kcachegrind are terribly fast-moving projects, so it's probably still working, or could be made so without much work. Or easier is to use the gperftools CPU profiler: https://gperftools.googlecode.com/svn/trunk/doc/cpuprofile.html Instead of linking to it at build time, you can just use ctypes:

In [7]: profiler = ctypes.CDLL("libprofiler.so.0")
In [8]: profiler.ProfilerStart("some-file-name-here")
Out[8]: 1
In [9]: # do stuff here
In [10]: profiler.ProfilerStop()
PROFILE: interrupts/evictions/bytes = 2/0/592
Out[10]: 46

Then all the pprof analysis tools are available as described on that webpage.

* Please don't trust those random suggestions for possible improvements I threw out when writing the original description. Probably it's true that FP flag checking and ufunc type lookup are expensive, but one should fix what the profile says to fix, not what someone guessed might be good to fix based on a few minutes' thought.

* Instead of making a giant table of everything that needs to be done to make stuff fast first, before writing any code, I'd suggest picking one operation, figuring out what change would be the biggest improvement for it, making that change, checking that it worked, and then repeating until that operation is really fast. Then, if there's still time, pick another operation. Producing a giant todo list isn't very productive by itself if there's no time then to actually do all the things on the list :-).

* Did you notice this line on the requirements page? Having your first pull request merged before the GSoC application deadline (May 3) is required for your application to be accepted.

-n
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 7:14 AM, Nathaniel Smith n...@pobox.com wrote:

snip

* Did you notice this line on the requirements page? Having your first pull request merged before the GSoC application deadline (May 3) is required for your application to be accepted.

Where is that last requirement? It seems out of line to me. Arink now has a pull request, but it looks intrusive enough and needs enough work that I don't think we can just put it in.

Chuck
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
For the sake of completeness: I don't think I ever mentioned what I used to profile when I was working on speeding up the scalars. I used AQTime 7. It is commercial and Windows-only (as far as I know). It works great, and it gave me fairly accurate timings and all sorts of visual navigation features. I do have to muck around with the numpy code every time I want to compile it, to get it to play nicely with Visual Studio and generate the proper bindings for the profiler.

Raul

On 02/05/2013 7:14 AM, Nathaniel Smith wrote:

snip
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
Charles R Harris charlesr.harris at gmail.com writes: [clip] Where is that last requirement? It seems out of line to me. Arink now has a pull request, but it looks intrusive enough and needs enough work that I don't think we can just put it in.

Well, we wrote so here: http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas but that's maybe just a mistake -- PSF states exactly the opposite: http://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2013

--
Pauli Virtanen
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 6:45 PM, Pauli Virtanen p...@iki.fi wrote:

snip

Well, we wrote so here: http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas but that's maybe just a mistake -- PSF states exactly the opposite: http://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2013

It wasn't a mistake - the part of a PR process that is most interesting in the context of evaluating GSoC applications is the dialogue and how the submitter deals with feedback. I forgot to add on that page (although I think it was in one of my emails) that the patch shouldn't be completely trivial - fixing a typo doesn't really tell us all that much. But in this case Chuck's suggestion on the PR of how to get something merged looks fine.

Cheers, Ralf
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 11:49 AM, Ralf Gommers ralf.gomm...@gmail.com wrote:

snip

It wasn't a mistake - the part of a PR process that is most interesting in the context of evaluating GSoC applications is the dialogue and how the submitter deals with feedback.

My feeling is that learning to work with the community is part of the process after acceptance and one of the reasons there are mentors. You might get some bad choices skipping the submission/acceptance bit, but you might also close the door on people who are new to the whole thing. Ideally, the applicants would already have involved themselves with the community; practically, that may often not be the case.

Chuck
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 9:54 PM, Charles R Harris charlesr.har...@gmail.com wrote:

snip

My feeling is that learning to work with the community is part of the process after acceptance and one of the reasons there are mentors. You might get some bad choices skipping the submission/acceptance bit, but you might also close the door on people who are new to the whole thing. Ideally, the applicants would already have involved themselves with the community; practically, that may often not be the case.

You may be right in all of that, but since there's a good chance that there are more applicants than slots, I'd rather not make those bad choices if they're acceptable. Right now we have three solid proposals, from Arink, Blake and Surya. If we're lucky we'll get three slots, but if not then we'll have a tough choice to make. The application deadline is tomorrow, so now is the time for final tweaks to the proposals. After that of course the plan can still be worked out more, but it can't be edited on Melange anymore.

Ralf
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Fri, May 3, 2013 at 12:29 AM, Ralf Gommers ralf.gomm...@gmail.com wrote:

snip

You may be right in all of that, but since there's a good chance that there are more applicants than slots I'd rather not make those bad choices if they're acceptable.

acceptable -- avoidable

snip

Ralf
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 6:30 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, May 3, 2013 at 12:29 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Thu, May 2, 2013 at 9:54 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, May 2, 2013 at 11:49 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Thu, May 2, 2013 at 6:45 PM, Pauli Virtanen p...@iki.fi wrote: Charles R Harris charlesr.harris at gmail.com writes: [clip] * Did you notice this line on the requirements page? Having your first pull request merged before the GSoC application deadline (May 3) is required for your application to be accepted. Where is that last requirement? It seems out of line to me. Arink now has a pull request, but it looks intrusive enough and needs enough work that I don't think we can just put it in. Well, we wrote so here: http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas but that's maybe just a mistake -- PSF states exactly the opposite: http://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2013 It wasn't a mistake - the part of a PR process that is most interesting in the context of evaluating GSoC applications is the dialogue and how the submitter deals with feedback. I forgot to add on that page (although I think it was in one of my emails) that the patch shouldn't be completely trivial - fixing a typo doesn't really tell us all that much. But in this case Chuck's suggestion on the PR of how to get something merged looks fine. My feeling is that learning to work with the community is part of the process after acceptance and one of the reasons there are mentors. You might get some bad choices skipping the submission/acceptance bit, but you might also close the door on people who are new to the whole thing. Ideally, the applicants would already have involved themselves with the community; practically that may often not be the case.
You may be right in all of that, but since there's a good chance that there are more applicants than slots I'd rather not make those bad choices if they're acceptable. acceptable -- avoidable Right now we have three solid proposals, from Arink, Blake and Surya. If we're lucky we'll get three slots, but if not then we'll have a tough choice to make. The application deadline is tomorrow, so now is the time for final tweaks to the proposals. After that of course the plan can still be worked out more, but it can't be edited on Melange anymore. Terri can still make it editable on Melange if necessary. Josef
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
It is great that you are looking into this! We are currently running on a fork of numpy because we really need these performance improvements. I noticed that, as suggested, you took from the pull request I posted a while ago for the PyObject_GetAttrString / PyObject_GetBuffer issues (https://github.com/raulcota/numpy). A couple of comments on that: - Seems like you did not grab the latest revisions of that code that I posted, which fix the style of the comments and 'attempt' to fix an issue reported about Python 3. I say 'attempt' because I thought it was fixed, but someone mentioned this was not correct. - There was also some feedback from Nathaniel about not liking the macros and siding with inline functions. I have not gotten around to it, but it would be nice if you jump on that boat. On the hash lookup table, I haven't looked at the implementation but the speedup is remarkable. Cheers! Raul On 30/04/2013 8:26 PM, Arink Verma wrote: Hi all! I have written my application [1] for Performance parity between numpy arrays and Python scalars [2]. It would be a great help if you review it. Does it look achievable and deliverable according to the project? [1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/arinkverma/40001# [2] http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas -- Arink Computer Science and Engineering Indian Institute of Technology Ropar www.arinkverma.in
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Tue, Apr 30, 2013 at 8:26 PM, Arink Verma arinkve...@iitrpr.ac.in wrote: Hi all! I have written my application [1] for *Performance parity between numpy arrays and Python scalars [2].* It would be a great help if you review it. Does it look achievable and deliverable according to the project? [1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/arinkverma/40001# [2] http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas Hi Arink, Have you already done some profiling? That could be tricky at the C level. I'm also curious about the hash table: what gets hashed, and where do you get the improved efficiency? Admittedly, the way in which ufuncs currently detect scalars is a bit heavyweight, and a fast path for certain input values could help. Is that what you are doing? As to the schedule, I suspect that it may be a bit ambitious, but I don't see that as fatal by any means. Identifying bottlenecks and experimenting with solutions would be useful work. Chuck
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
@Raul I will pull the new version, and try to include that also. What is wrong with macros for inline functions? Yes, the time for a ufunc is reduced to almost half; for the lookup table, I am generating a key from the argument types and returning the appropriate value. [1] @Chuck Yes, I did some profiling with oprofile for python -m timeit -n 100 -s 'import numpy as np;x = np.asarray(1.0)' 'x+x'; see the data sheet. [2] Every time a ufunc is invoked, the code has to check every single data type possible (bool, int, double, etc.) until it finds the best match for the data that the operation is being performed on. For scalars, we can send the best match from a pre-populated table. At present the implementation is not well-structured and supports only addition for int+int and float+float. [1] https://github.com/arinkverma/numpy/commit/e2d8de7e7b643c7a76ff92bc1219847f9328aad0 [2] https://docs.google.com/spreadsheet/ccc?key=0AnPqyp8kuQw0dG1hdjZiazE2dGtTY1JXVGFsWEEzbXc#gid=0 On Thu, May 2, 2013 at 12:09 AM, Raul Cota r...@virtualmaterials.com wrote: It is great that you are looking into this! We are currently running on a fork of numpy because we really need these performance improvements. I noticed that, as suggested, you took from the pull request I posted a while ago for the PyObject_GetAttrString / PyObject_GetBuffer issues (https://github.com/raulcota/numpy). A couple of comments on that: - Seems like you did not grab the latest revisions of that code that I posted, which fix the style of the comments and 'attempt' to fix an issue reported about Python 3. I say 'attempt' because I thought it was fixed, but someone mentioned this was not correct. - There was also some feedback from Nathaniel about not liking the macros and siding with inline functions. I have not gotten around to it, but it would be nice if you jump on that boat. On the hash lookup table, I haven't looked at the implementation but the speedup is remarkable. Cheers! Raul On 30/04/2013 8:26 PM, Arink Verma wrote: Hi all!
I have written my application [1] for *Performance parity between numpy arrays and Python scalars [2].* It would be a great help if you review it. Does it look achievable and deliverable according to the project? [1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/arinkverma/40001# [2] http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas -- Arink Computer Science and Engineering Indian Institute of Technology Ropar www.arinkverma.in
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
On Thu, May 2, 2013 at 5:25 AM, Arink Verma arinkve...@iitrpr.ac.in wrote: @Raul I will pull the new version, and try to include that also. What is wrong with macros for inline functions? Yes, the time for a ufunc is reduced to almost half; for the lookup table, I am generating a key from the argument types and returning the appropriate value. [1] @Chuck Yes, I did some profiling with oprofile for python -m timeit -n 100 -s 'import numpy as np;x = np.asarray(1.0)' 'x+x'; see the data sheet. [2] Every time a ufunc is invoked, the code has to check every single data type possible (bool, int, double, etc.) until it finds the best match for the data that the operation is being performed on. For scalars, we can send the best match from a pre-populated table. At present the implementation is not well-structured and supports only addition for int+int and float+float. [1] You are pointing out something that may well be the main difficulty: the code there is messy, and we need to ensure that optimisations don't preclude later extensions (especially with regard to new dtype addition). David
Re: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
A few comments on the topic: The postings I did to the list were on numpy 1.6, but the pull requests were done on the latest code at that time, I believe 1.7. There are still a few comments pending that I have not had a chance to look into, but that is a separate topic. The main problems I addressed in those pull requests are related to the scalars. For example, timeit.timeit("x + x", setup="import numpy as np;x = np.array([1.0])[0]") The line above is much faster with the pull request I posted. The problem as posted in the GSoC is a different one: timeit.timeit("x + x", setup="import numpy as np;x = np.asarray(1.0)") The line above is not much faster with the pull requests I posted. It would be if the test was "x + 1.0". I profiled the line above and the main results are shown in the links below: https://docs.google.com/file/d/0B3hgR3Pc2vPgUC1Kbng3SUx0OUE/edit?usp=sharing https://docs.google.com/file/d/0B3hgR3Pc2vPgS3ZCUDVJUTZScEE/edit?usp=sharing https://docs.google.com/file/d/0B3hgR3Pc2vPgZ3B5alFfYW5vLWc/edit?usp=sharing https://docs.google.com/file/d/0B3hgR3Pc2vPgZnpaZEVFSkhhTkE/edit?usp=sharing They show the main time consumers and then a graphical representation of calls and the time spent in their children. The different images traverse the calls starting in ufunc_generic_call. You can see a fair bit of time is spent in find_best_ufunc_inner_loop. The point here is that this particular example is not bottlenecked by error checking. Things to keep in mind: - these profile results are based on numpy 1.6. It takes me quite a bit of doing/hacking to get the profiler to play nicely with getting me to the proper lines of code, and I have not done it for the latest code. - I timed it, and in the latest code of numpy it is not any faster, so I assume all the findings still apply (note the word assume). - all my timings are on Windows.
I point this out because Nathaniel had pointed me to a specific line of code that was a bottleneck to him which does not apply to Windows, and I found out that the Windows counterpart of the code is not a bottleneck. Raul On 17/04/2013 9:03 AM, Arink Verma wrote: Hello everyone, I am Arink, a computer science student and open source enthusiast. This year I am interested to work on the project "Performance parity between numpy arrays and Python scalars" [1]. I tried to adopt Raul's work on numpy 1.7 [2] (which was done for numpy 1.6 [3]). Till now, by avoiding a) the unnecessary checking for floating point errors, which is slow, and b) the unnecessary creation/destruction of scalar array types, I am getting a speedup of ~1.05 times, which is marginal of course. In the project's description it is mentioned that the ufunc lookup code is slow and inefficient. A few questions: 1. Does it have to check every single data type possible until it finds the best match for the data that the operation is being performed on, or is there a better way to find the best possible match? 2. If yes, where are the bottlenecks? Are the checks for proper data types very expensive? [1] http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas [2] https://github.com/arinkverma/numpy/compare/master...gsoc_performance [3] http://article.gmane.org/gmane.comp.python.numeric.general/52480 -- Arink Computer Science and Engineering Indian Institute of Technology Ropar www.arinkverma.in
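The two benchmark setups Raul distinguishes in his message can be reproduced directly: `np.array([1.0])[0]` yields a NumPy scalar (`np.float64`), which his pull request sped up, while `np.asarray(1.0)` yields a 0-d ndarray, whose arithmetic goes through the full ufunc machinery and is the case targeted by the GSoC proposal. A small sketch of the comparison (absolute timings will of course vary by machine and NumPy version):

```python
import timeit
import numpy as np

# Case 1: a NumPy scalar -- the object Raul's pull request sped up.
scalar = np.array([1.0])[0]
print(type(scalar).__name__)  # float64

# Case 2: a 0-d ndarray -- the case targeted by the GSoC proposal.
zero_d = np.asarray(1.0)
print(type(zero_d).__name__)  # ndarray
print(zero_d.ndim)            # 0

# Time "x + x" for both cases; numbers are machine-dependent.
t_scalar = timeit.timeit(
    "x + x", setup="import numpy as np; x = np.array([1.0])[0]", number=100000)
t_zero_d = timeit.timeit(
    "x + x", setup="import numpy as np; x = np.asarray(1.0)", number=100000)
print(t_scalar, t_zero_d)
```

The distinction matters because the two objects dispatch differently: the scalar's `__add__` can short-circuit, while the 0-d array goes through ufunc type resolution on every call.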